OULU 1998
MULTIUSER DEMODULATION FOR DS-CDMA SYSTEMS IN FADING CHANNELS
MARKKU JUNTTI
Department of Electrical Engineering
MULTIUSER DEMODULATION FOR DS-CDMA SYSTEMS IN FADING CHANNELS
MARKKU JUNTTI
Academic Dissertation to be presented with the assent of The Faculty of Technology, University of Oulu, for public discussion in Raahensali (Auditorium L 10), Linnanmaa, on October 17th, 1997, at 12 noon.
OULUN YLIOPISTO, OULU 1998
Copyright © 1998
Oulu University Library, 1998
Manuscript received 16 September 1997
Accepted 18 September 1997
Communicated by
Professor Timo Laakso
Professor Sergio Verdú
ALSO AVAILABLE IN PRINTED FORMAT
ISBN 9514247558 (URL: http://herkules.oulu.fi/isbn9514247558/)
ISBN 9514246322
ISSN 03553213 (URL: http://herkules.oulu.fi/issn03553213/)
OULU UNIVERSITY LIBRARY
OULU 1998
Dedicated to my family.
Juntti, Markku, Multiuser demodulation for DS-CDMA systems in fading channels
Department of Electrical Engineering, University of Oulu, FIN-90570 Oulu, Finland
Acta Univ. Oul. C 106, 1997
Oulu, Finland
(Manuscript received 16 September, 1997)
Abstract
Multiuser demodulation algorithms for centralized receivers of asynchronous direct-sequence (DS) spread-spectrum code-division multiple-access (CDMA) systems in frequency-selective fading channels are studied. DS-CDMA systems with both short (one symbol interval) and long (several symbol intervals) spreading sequences are considered.
Ideally, linear multiuser receivers process the complete received data block. The approximation of ideal infinite memory-length (IIR) linear multiuser detectors by finite memory-length (FIR) detectors is studied. It is shown that the FIR detectors can be made near-far resistant under a given ratio between the maximum and minimum received power of users by selecting an appropriate memory-length. Numerical examples demonstrate that moderate memory-lengths of the FIR detectors are sufficient to achieve the performance of the ideal IIR detectors even under severe near-far conditions.
Multiuser demodulation in relatively fast fading channels is analyzed. The optimal maximum likelihood sequence detection receiver and suboptimal receivers are considered. The parallel interference cancellation (PIC) receiver is demonstrated to achieve better performance in known channels than the decorrelating receiver, but it is observed to be more sensitive to channel coefficient estimation errors than the decorrelator. At high channel loads the PIC receiver suffers from bit error rate (BER) saturation, whereas the decorrelating receiver does not. The choice of channel estimation filters is shown to be crucial if low BER is required. Data-aided channel estimation is shown to be more robust than decision-directed channel estimation, which may suffer from BER saturation caused by hang-ups at high signal-to-noise ratios.
Multiuser receivers for dynamic CDMA systems are studied. Algorithms for ideal linear detector computation are derived and their complexity is analyzed. The complexity of the linear detector computation is a cubic function of KL, where K and L are the number of users and multipath components, respectively. Iterative steepest descent, conjugate gradient, and preconditioned conjugate gradient algorithms are proposed to reduce the complexity. The computational requirements for one iteration are a quadratic function of KL. The iterative detectors are also shown to be applicable for parallel implementation. Simulation results demonstrate that a moderate number of iterations yields the performance of the corresponding ideal linear detectors. A quantitative analysis shows that the PIC receivers are significantly simpler to implement than the linear receivers and only moderately more complex than the conventional matched filter bank receiver.
Keywords: channel estimation, interference cancellation, decorrelation, iterative detection
Preface
Research for this thesis has been carried out mostly in the spread-spectrum research group in the Telecommunication Laboratory, Department of Electrical Engineering, University of Oulu, Oulu, Finland. The work towards the thesis was initiated in August 1993 after I had completed the Master’s thesis in May 1993. The first year of the research consisted mostly of literature study. From September 1994 to October 1995 I was a visiting scholar at the Department of Electrical and Computer Engineering, Rice University, Houston, Texas, USA in Professor Behnaam Aazhang’s research group. During that time the major part of the research leading to the results in Chapters 3 and 5 was conducted. After completing the work carried out at Rice University the research for the results in Chapter 4 was performed at the University of Oulu in the second half of the year 1996.
I wish to express my gratitude to my advisors Docent Jorma Lilleberg and Professor Behnaam Aazhang for their comments and guidance throughout the work. My warmest thanks are also due to my supervisor Associate Professor Pentti Leppänen for providing me with the opportunities to work on this interesting topic and for guidance in my postgraduate studies. The help provided by Professor Savo Glisic, Professor Emeritus Juhani Oksman and Dr. Juha Ylitalo on various issues and especially in getting the necessary financial support has also been invaluable. I am grateful to the reviewers of the thesis, Professor Timo Laakso from Helsinki University of Technology, Espoo, Finland, and Professor Sergio Verdú from Princeton University, Princeton, New Jersey, USA.
I would like to thank numerous members of the spread-spectrum research group at the Telecommunication Laboratory of the University of Oulu, as well as the faculty and students at the Department of Electrical and Computer Engineering of Rice University for providing challenging environments in which to pursue research. The cooperation and joint work with Matti Latva-aho has proven to be very fruitful. I am also grateful to Markku Heikkilä, who performed the simulations for Chapter 4. The discussions with Dr. Kishore Kota and Dr. Markus Lang have been of great value. Thanks are due to Jari Iinatti, Pertti Järvensivu, Matti Latva-aho, Harri Saarnisaari, Pekka Kaasila, and Tero Ojanperä, who read and commented on the manuscript. The help on various computer problems provided by Veikko Hovinen, Dr. Markus Lang, Pekka Nissinaho, Dr. Jan Erik Odegard, Harri Saarnisaari, and Jari Sillanpää has been invaluable.
The financial support provided by the Academy of Finland, Elektrobit, Ltd., the Graduate School in Electronics, Telecommunications, and Automation, Finnish Air Force, Nokia Mobile Phones, Ltd., Nokia Telecommunications, Ltd., Rice University, the Technology Development Centre of Finland, the University of Oulu, as well as the following Finnish foundations: Emil Aaltosen Säätiö, Jenny ja Antti Wihurin Rahasto, Oy Nokia Ab:n Säätiö, Tauno Tönningin Säätiö, and Tekniikan Edistämissäätiö enabled this work and is thus gratefully acknowledged.
I am grateful to my father Aarno and to my late mother Kaija for the help, support, and love they have provided for me throughout my life. The positive attitude towards education in our home has been an important driving force for my later studies.
I wish to express my deepest thanks to my family: to my wife Hanna for the love and support she has shown to me, and to my children Tuomas (three years) and Kaisa (one year) for the understanding they have shown in their own ways. Without Hanna’s highly positive attitude the thesis would not have been completed.
Oulu, September 15, 1997 Markku Juntti
List of original publications
The thesis is in part based on the following original publications, which are referred to in the text by Roman numerals:
I Juntti M & Glisic S (1997) Advanced CDMA for wireless communications. In: Glisic SG & Leppänen PA (eds) Wireless Communications: TDMA Versus CDMA, Kluwer Academic Publishers, Chapter 4, p 447–490.
II Juntti MJ & Aazhang B (1995) Linear finite memory-length multiuser detectors. Proc. Communication Theory Mini-Conference (CTMC’95) in conjunction with IEEE Global Telecommunications Conference (GLOBECOM’95), Singapore, November 13–17, p 126–130.
III Juntti MJ, Aazhang B & Lilleberg JO (1996) Linear multiuser detection for R-CDMA. Proc. Communication Theory Mini-Conference (CTMC’96) in conjunction with IEEE Global Telecommunications Conference (GLOBECOM’96), London, U.K., November 18–22, p 127–131.
IV Juntti MJ & Aazhang B (1997) Finite memory-length linear multiuser detection for asynchronous CDMA communications. IEEE Transactions on Communications 45(5): p 611–622.
V Juntti MJ (1997) Performance of decorrelating multiuser receiver with data-aided channel estimation. Proc. Communication Theory Mini-Conference (CTMC’97) in conjunction with IEEE Global Telecommunications Conference (GLOBECOM’97), Phoenix, Arizona, USA, November 5–7.
VI Juntti MJ, Latva-aho M & Heikkilä M (1997) Performance Comparison of PIC and Decorrelating Multiuser Receivers in Fading Channels. Proc. IEEE Global Telecommunications Conference (GLOBECOM’97), Phoenix, Arizona, USA, November 3–8.
VII Juntti MJ (1995) Linear multiuser detector update in synchronous dynamic CDMA systems. Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC’95), Toronto, Ontario, Canada, September 27–29, 3: p 980–984.
VIII Juntti MJ, Aazhang B & Lilleberg JO (1996) Iterative implementation of linear multiuser detection. Proceedings of Conference on Information Sciences and Systems (CISS’96), Princeton University, Princeton, New Jersey, USA, March 20–22, 1: p 343–348.
IX Juntti MJ & Lilleberg JO (1996) Implementation aspects of linear multiuser detectors in asynchronous CDMA systems. Proceedings of IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA’96), Mainz, Germany, September 22–25, 2: p 842–846.
X Juntti MJ, Aazhang B & Lilleberg JO (1996) Iterative implementation of linear multiuser detectors for dynamic CDMA systems. IEEE Transactions on Communications, preliminarily accepted.
For clarity, the thesis is presented as a monograph and the original publications are therefore not reprinted.
List of symbols and abbreviations
$A$   diagonal matrix of transmitted complex amplitudes of all users at one symbol interval $(K \times K)$
$A_k$   transmitted complex amplitude of user k
$A$   diagonal matrix of transmitted complex amplitudes of all users over all symbol intervals $(N_b K \times N_b K)$
$A^{(n)}$   diagonal matrix of transmitted complex amplitudes of all users over symbol intervals inside the processing window $(NK \times NK)$
$A^{(n)}$   diagonal matrix of transmitted complex amplitudes of all users over symbol intervals inside the processing window plus the edge symbol intervals $((N+4)K \times (N+4)K)$
$A_e^{(n)}$   diagonal matrix of transmitted complex amplitudes of all users over the edge symbol intervals $(4K \times 4K)$
$b$   vector of data symbols of all users over all symbol intervals $(N_b K \times 1)$
$b^{(n)}$   vector of data symbols of all users over symbol intervals inside the processing window $(NK \times 1)$
$b^{(n)}$   vector of data symbols of all users over symbol intervals inside the processing window plus the edge symbol intervals $((N+4)K \times 1)$
$b^{(n)}$   vector of data symbols of all users at symbol interval n $(K \times 1)$
$b_e^{(n)}$   vector of data symbols of all users over the edge symbol intervals $(4K \times 1)$
$b_k^{(n)}$   data symbol of user k at symbol interval n
$c$   vector of the channel coefficients of all users over all symbol intervals $(N_b K L \times 1)$
$c^{(n)}$   vector of the channel coefficients of all users at symbol interval n $(KL \times 1)$
$c_k^{(n)}$   vector of the channel coefficients of user k at symbol interval n $(L \times 1)$
$c_k^{(n)}$   combining vector $(L \times 1)$
$c_k^{(n)}(t)$   channel impulse response of user k at symbol interval n
$c_{k,l}^{(n)}$   channel complex coefficient (gain) of lth multipath of user k at symbol interval n
$C^{(n)}$   matrix of the channel coefficient vectors of all users at symbol interval n $(KL \times K)$
$C^{(n)}$   combining matrix
$C$   matrix of the channel coefficient vectors of all users over all symbol intervals $(N_b K L \times N_b K)$
$C^{(n)}$   matrix of the channel coefficient vectors of all users over symbol intervals inside the processing window $(NKL \times NK)$
$C^{(n)}$   matrix of the channel coefficient vectors of all users over symbol intervals inside the processing window plus the edge symbol intervals $((N+4)KL \times (N+4)K)$
$C_e^{(n)}$   matrix of the channel coefficient vectors of all users over the edge symbol intervals $(4KL \times 4K)$
$\mathrm{CAP}_k$   channel capacity of user k
$\mathbb{C}$   set of complex numbers
$D(i)$   detector block $(KL \times KL)$
$D(z)$   z-transform of a linear detector $(KL \times KL)$
$D$   linear infinite memory-length multiuser detector
$D_N$   truncated linear finite memory-length multiuser detector of length N $(NKL \times KL)$
$D_N$   optimal linear finite memory-length multiuser detector of length N $(NKL \times KL)$
$E_k$   transmitted energy per symbol of user k
$F$   convolution of the multiuser channel impulse response and multiuser detector $(NKL \times KL)$
$h$   vector of data-amplitude products of all users over all symbol intervals $(N_b K \times 1)$
$h^{(n)}$   vector of data-amplitude products of all users over symbol intervals inside the processing window $(NK \times 1)$
$h^{(n)}$   vector of data-amplitude products of all users over symbol intervals inside the processing window plus the edge symbol intervals $((N+4)K \times 1)$
$h^{(n)}$   vector of data-amplitude products of all users at symbol interval n $(K \times 1)$
$h_e^{(n)}$   vector of data-amplitude products of all users over the edge symbol intervals $(4K \times 1)$
$h_k^{(n)}$   data-amplitude product of user k at symbol interval n
$I$   identity matrix
$I_L$   identity matrix $(L \times L)$
$J$   distance of channel estimation filter taps in symbol intervals
$J_0$   zero-order Bessel function of the first kind
$k$   user index
$K$   number of active users
$l$   propagation path index
$L$   number of propagation paths
$L$   Cholesky factor of the correlation matrix $R(0)$
$L^{(n)}$   Cholesky factor of the correlation matrix $R^{(n)}$
$n$   discrete symbol interval index
$N$   number of symbols in the processing window
$N_b$   number of symbols in the data packet
$N_c$   processing gain
$N_p$   distance of the pilot symbols
$N_s$   number of samples in symbol interval
$P$   “half” of the processing window length
$P_k$   probability of bit error for user k
$P_{pr}$   number of coefficients in the prediction part of the channel estimation filter
$P_{sm}$   number of coefficients in the smoothing part of the channel estimation filter
$q_{k,l}^{(n)}$   channel estimation filter input vector
$r$   discrete-time sampled received signal vector over all symbol intervals $(N_b N_s \times 1)$
$r^{(n)}$   discrete-time sampled received signal vector over symbol intervals inside the processing window $(N N_s \times 1)$
$r(t)$   complex envelope of received continuous-time signal
$R_{k,k'}^{(n)}(i)$   matrix of crosscorrelations of signature waveforms for all multipath components of users k and k′ with delay of i symbols at symbol interval n $(L \times L)$
$R^{(n)}(i)$   matrix of crosscorrelations of signature waveforms for all multipath components of all users with delay of i symbols at symbol interval n $(KL \times KL)$
$R$   matrix of crosscorrelations of signature waveforms for all multipath components of all users over all symbol intervals $(N_b K L \times N_b K L)$
$R^{(n)}$   matrix of crosscorrelations of signature waveforms of multipath components of all users over symbol intervals inside the processing window $(NKL \times NKL)$
$R^{(n)}$   matrix of crosscorrelations of signature waveforms of multipath components of all users over symbol intervals inside the processing window plus the edge symbol intervals $(NKL \times (N+4)KL)$
$R_e^{(n)}$   matrix of crosscorrelations of signature waveforms of multipath components of all users over the edge symbol intervals $(NKL \times 4KL)$
$\mathbb{R}$   set of real numbers
$s_k^{(n)}(t)$   signature waveform of user k at symbol interval n
$s_{k,m}^{(n)}$   chip m of user k at symbol interval n
$S^{(0)}(0)$   matrix of samples of signature waveforms $(N_s \times KL)$
$S$   matrix of samples of signature waveforms $((N+2)N_s \times NKL)$
$t$   continuous-time index
$T$   length of a symbol period
$T_m$   delay spread
$T_c$   length of a chip period
$T$   detector matrix; inverse of $R$ or $R^{(n)}$
$U_N$   block column of inverse matrix $(NKL \times KL)$
$v_{k,l}^{(n)}$   channel estimation filter vector
$w$   vector of the matched filter output noise components of all users over all symbol intervals $(N_b K L \times 1)$
$w^{(n)}$   vector of the matched filter output noise components of all users over symbol intervals inside the processing window $(NKL \times 1)$
$w^{(n)}$   vector of the matched filter output noise components of all users at symbol interval n $(KL \times 1)$
$w_k^{(n)}$   vector of the matched filter output noise components of user k at symbol interval n $(L \times 1)$
$w_{k,l}^{(n)}$   noise component of the sampled output of the matched filter for the lth multipath of user k at symbol interval n
$y$   vector of matched filter outputs of all users over all symbol intervals $(N_b K L \times 1)$
$y^{(n)}$   vector of matched filter outputs of all users over symbol intervals inside the processing window $(NKL \times 1)$
$y^{(n)}$   vector of matched filter outputs of all users at symbol interval n $(KL \times 1)$
$y_k^{(n)}$   vector of matched filter outputs of multipath components of user k at symbol interval n $(L \times 1)$
$y_{k,l}^{(n)}$   sampled output of the filter matched to the kth user’s lth multipath component at symbol interval n
$y_{[\mathrm{MUD}]}$   multiuser detector output vector over all symbol intervals $(N_b K L \times 1)$
$y_{[\mathrm{MUD}]}^{(n)}$   multiuser detector output vector over symbol intervals inside the processing window $(NKL \times 1)$
$y_{[\mathrm{MUD}]}^{(n)}(m)$   multiuser detector output vector over symbol intervals inside the processing window at iteration m $(N_b K L \times 1)$
$y_{[\mathrm{MUD}]}^{(n)}$   multiuser detector output vector at symbol interval n $(KL \times 1)$
$z$   complex discrete-time sampled zero mean additive white Gaussian noise vector over all symbol intervals $(N_b N_s \times 1)$
$z(t)$   complex continuous-time zero mean additive white Gaussian noise
$0_L$   zero matrix $(L \times L)$
$\delta_{k,k'}$   Kronecker delta function
$\delta(t)$   Dirac’s delta function
$\zeta_1^{(n)}$   matrix of the past edge correlations $(NKL \times 2KL)$
$\zeta_2^{(n)}$   matrix of the future edge correlations $(NKL \times 2KL)$
$\eta_k$   asymptotic multiuser efficiency of user k
$\eta_k$   power-limited near-far resistance of user k
$\lambda_i$   eigenvalue of a matrix
$\mu^{(n)}$   response of the edge symbols at linear detector output
$\Xi$   modulation symbol alphabet
$\phi_k$   transmitted carrier phase
$\varphi_{k,l}(\cdot)$   channel autocorrelation (autocovariance) function
$\sigma^2$   two-sided power spectral density of the noise
$\Sigma_c$   covariance matrix of vector $c$
$\tau_k$   delay of kth user’s transmitted signal
$\tau_{k,l}$   delay of lth multipath component of user k
$\psi(t)$   chip waveform
$\Psi$   multiple-access interference estimate
$\Omega(\cdot)$   log-likelihood function

AME   asymptotic multiuser efficiency
AWGN   additive white Gaussian noise
BEP   bit error probability
BER   bit error rate
BPSK   binary phase shift keying
CG   conjugate gradient
CGL   conjugate gradient for solving least squares problems
CDMA   code-division multiple-access
DA   data-aided
DD   decision-directed
DF   decision-feedback
DFE   decision-feedback equalizer
DS   direct-sequence
DSP   digital signal processing
D-CDMA   deterministic code-division multiple-access
EM   expectation-maximization
FDMA   frequency-division multiple-access
FH   frequency-hopping
FIR   finite impulse response
flop   floating point operation
GPIC   groupwise parallel interference cancellation
GSIC   groupwise serial interference cancellation
HD   hard decision
IC   interference cancellation
IIR   infinite impulse response
ISI   intersymbol interference
LMMSE   linear minimum mean squared error
MAI   multiple-access interference
MC   multicarrier
MF   matched filter
ML   maximum likelihood
MLSD   maximum likelihood sequence detection
MMSE   minimum mean squared error
MOE   minimum output energy
MPSK   M-ary phase shift keying
MRC   maximal ratio combining
MSE   mean squared error
MUD   multiuser demodulation
NDA   non-data-aided
NFR   near-far resistance
PCG   preconditioned conjugate gradient
pdf   probability density function
PDMA   polarization-division multiple-access
PIC   parallel interference cancellation
PSK   phase shift keying
R-CDMA   random code-division multiple-access
SAGE   space alternating generalized expectation-maximization
SIC   serial interference cancellation
SD   soft decision; steepest descent
SINR   signal-to-interference-plus-noise ratio
SNR   signal-to-noise ratio
TDMA   time-division multiple-access
SDMA   space-division multiple-access
WSSUS   wide-sense stationary uncorrelated scattering
WCDMA   wideband code-division multiple-access

$*$   convolution
$(\cdot)^*$   complex conjugation
$(\cdot)_{\max}$   maximum of the argument
$(\cdot)_{\min}$   minimum of the argument
$(\cdot)_{[d]}$   decorrelating detector applied to the argument
$(\cdot)_{[\mathrm{HD-PIC}]}$   hard decision parallel interference cancellation detector applied to the argument
$(\cdot)_{[\mathrm{LIN}]}$   linear detector applied to the argument
$(\cdot)_{[\mathrm{MRC}]}$   maximal ratio combining applied to the argument
$(\cdot)_{[ms]}$   linear minimum mean squared error detector applied to the argument
$(\cdot)_{[nw]}$   noise-whitening detector applied to the argument
$(\cdot)_{[\mathrm{PIC}]}$   parallel interference cancellation detector applied to the argument
$(\cdot)$   estimate of the argument
arg   argument
$A^H$   conjugate transpose of $A$
$A^{-1}$   inverse of $A$
$A^\top$   transpose of $A$
$\mathrm{diag}(\cdots)$   diagonal matrix with elements $\cdots$ on main diagonal
$E(\cdot)$   expectation
$\inf(\cdot)$   largest lower bound (infimum)
$\ln(\cdot)$   natural logarithm
$\max(\cdot)$   maximum
$\mathrm{mbc}(\cdot)$   middle block column
$\min(\cdot)$   minimum
$Q(\cdot)$   normalized and scaled Gaussian complementary error function
$\mathrm{Re}(\cdot)$   real part
$\mathrm{sgn}(\cdot)$   signum function
$\sup$   smallest upper bound (supremum)
$|\cdot|$   magnitude
$\|\cdot\|$   Euclidean norm
$\lceil x \rceil$   smallest integer larger than or equal to x
$(A)_{ij}$   element at the ith row and jth column of matrix $A$
$\partial/\partial x$   gradient vector with respect to x
Contents
Abstract
Preface
List of original publications
List of symbols and abbreviations
Contents
1. Introduction
  1.1. Multiple-access techniques
  1.2. Multiuser demodulation
  1.3. Aim and outline of the thesis
  1.4. Author’s contribution to publications
2. Preliminaries
  2.1. System model
    2.1.1. Continuous-time model
    2.1.2. Discrete-time model
    2.1.3. Finite processing window model
    2.1.4. Statistical fading channel model
    2.1.5. Summary of notational conventions
  2.2. Review of earlier and parallel work
    2.2.1. Receivers for fading channel communications
    2.2.2. Optimal multiuser demodulation
    2.2.3. Suboptimal multiuser demodulation
      2.2.3.1. Linear equalizer type multiuser demodulation
      2.2.3.2. Interference cancellation
      2.2.3.3. Other multiuser receivers
  2.3. Problem formulation
3. Finite memory-length linear multiuser detection
  3.1. Linear FIR multiuser detectors
  3.2. Stability of detectors
  3.3. Performance analysis
    3.3.1. Single-path channel
    3.3.2. Multipath channel
  3.4. Numerical examples
    3.4.1. Detector stability
    3.4.2. Detector performance
  3.5. Conclusions
4. Multiuser demodulation in Rayleigh fading channels
  4.1. Optimal receiver
  4.2. Suboptimal receivers
    4.2.1. Channel estimation
    4.2.2. Interference suppression
  4.3. Receiver performance analysis and results
    4.3.1. Performance of linear receivers
      4.3.1.1. MSE of DA channel estimation
      4.3.1.2. BEP of DA decorrelating receiver
      4.3.1.3. Channel capacity of DA decorrelating receiver
      4.3.1.4. BER of DA and DD decorrelating receivers
    4.3.2. Performance comparisons of decorrelating and PIC receivers
      4.3.2.1. Sensitivity of BER to channel estimation errors
      4.3.2.2. BER in optimally estimated channel
      4.3.2.3. BER in suboptimally estimated channel
  4.4. Conclusions
5. Multiuser detection in dynamic CDMA systems
  5.1. Ideal linear detection
    5.1.1. Detection algorithms
    5.1.2. Detector update in synchronous systems
      5.1.2.1. Inverse update
      5.1.2.2. Cholesky factor update
    5.1.3. Detector computation in asynchronous systems
  5.2. Iterative linear detection
    5.2.1. Iterative algorithms
      5.2.1.1. Steepest descent and conjugate gradient algorithms
      5.2.1.2. Preconditioned conjugate gradient algorithm
    5.2.2. Iterative sliding window detection
    5.2.3. Numerical performance evaluation
  5.3. Complexity comparisons
    5.3.1. Summary of implementation complexities
    5.3.2. An example
  5.4. Conclusions
6. Conclusions
  6.1. Summary
  6.2. Discussion
  6.3. Future research directions
References
Appendices
1. Introduction
Transmission of information has become a key feature of the modern way of life. The possibilities offered by telecommunications are changing the way people work, shop, spend their leisure time, etc. The advancing communication and information processing technologies create more markets for new communication services and products. In particular, the demand for wireless communication services has increased rapidly and the trend is expected to continue. Therefore, stringent requirements on the capacity of communication systems are posed in terms of the number of users a system can serve simultaneously. In other, and more appropriate, words, as much information as possible should be transferred. This goal can be achieved by designing efficient source coding methods to compress the nonsystematic redundancies in the information, by using smaller cells in cellular systems, by utilizing spatial signal processing techniques, and by designing efficient multiple-access techniques and transceivers for them.
This thesis analyzes demodulation techniques that jointly demodulate multiple users of a communication system, thereby increasing the capacity of the system. The approach is called multiuser demodulation (MUD) or multiuser detection. Receivers applying multiuser demodulation are called multiuser receivers. As an introduction to the topic, multiple-access techniques, their features, as well as their pros and cons are discussed in Section 1.1. In Section 1.2 facts motivating the need for multiuser receivers are considered, and a short historical overview of multiuser demodulation is also presented. The aims and outline of the thesis are described in Section 1.3.
1.1. Multiple-access techniques
Multiple-access refers to a technique for sharing a common communications channel between multiple users. The degrees of freedom available when designing multiuser communication systems include space, time, and frequency. The time and frequency domains are duals of each other via the Fourier transform, so that the actual options are space domain and time-frequency domain designs. In the space domain users can be separated by making their distance large enough. An example is to use cables to separate communication signals in wireline communication. Another example is to separate transmitters geographically so that the distances are large enough to attenuate the signals and they do not interfere significantly. More advanced techniques include polarization-division multiple-access (PDMA) and space-division multiple-access (SDMA) [1]. In PDMA two users can be separated by using electromagnetic waves with different polarization. In SDMA sectorized antennas are usually applied to separate users at the same frequency.
In time-frequency domain multiple-access each user’s transmitted data signal is modulated by a signature waveform. The receiver can demodulate each user’s data if the signature waveforms of the users are different enough. Various signature waveform designs result in different multiple-access techniques.
The oldest multiple-access technique is frequency-division multiple-access (FDMA). In FDMA each user’s signature waveform occupies its own frequency band and the receiver can separate the users’ signals by simple bandpass filtering. FDMA is a simple scheme and applicable to both analog and digital modulation. It is not, however, very flexible for providing variable bit rates, which is an important requirement in future communication services. Making the bit rate higher requires more frequency channels to be allocated for a user. This implies a need for several bandpass filters.
The introduction of digital modulation enabled the appearance of time-division multiple-access (TDMA), in which each user’s signature waveform is limited to a predetermined time interval. TDMA is relatively simple to implement and it is very flexible for providing variable bit rates. The bit rate can be increased by assigning more transmission intervals to a user. However, the transmissions of all the users must be exactly synchronized to each other. Because more complicated modulation schemes are simpler to implement in TDMA than in FDMA, the capacity of TDMA systems is usually significantly higher than that of FDMA systems.
The invention of spread-spectrum techniques for communication systems with anti-jamming and low probability of undesired interception capabilities led to the idea of code-division multiple-access (CDMA). A review of the spread-spectrum techniques can be found in papers by Scholtz [2] and Pickholtz et al. [3]. More detailed treatments can be found in the books by Simon et al. [4], Dixon [5], Peterson et al. [6], and Viterbi [7]. The history of spread-spectrum has been reviewed in [8, 9, 10] and [4, Part 1, Chap. 2].
CDMA can be implemented in numerous ways including frequency-hopping (FH), time-hopping (TH), and direct-sequence (DS) spread-spectrum techniques [4] as well as multicarrier (MC) techniques [11]. Designs of CDMA signature waveforms based on wavelets [12, 13, 14, 15] or on overlapping signature waveforms (spread-signature CDMA) counteracting fading [16, 17] have also been proposed. Hybrid CDMA systems based on combining all or some of the techniques are also possible. In FH-CDMA users’ signature waveforms are centered on different carrier frequencies at different time intervals. The hopping from one frequency to another is controlled according to a pseudorandom spreading sequence. In DS-CDMA systems each user’s signature waveforms are continuous in the time domain and have a relatively flat spectrum. Therefore, in DS-CDMA systems users are separated neither in the time nor in the frequency domain, but all signature waveforms occupy the whole frequency band allocated for the transmission at all times. However, the data of users can be separated in the receivers, since the signature waveforms of DS-CDMA are formed by spreading sequences which are unique to each user. In multicarrier modulation each user’s data is transmitted using different carrier frequencies [18]. In MC-CDMA the data signal is also spread in the frequency domain as in DS-CDMA [11].
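The separation of DS-CDMA users by their spreading sequences can be illustrated with a minimal sketch. It assumes a synchronous, noiseless, single-path channel with BPSK symbols, which is much simpler than the asynchronous fading channels studied in the thesis. The two length-7 sequences, the bit patterns, and the helper names `spread` and `despread` are hypothetical choices made only for this illustration.

```python
import numpy as np

def spread(bits, code):
    """Multiply each +/-1 symbol by the user's chip sequence (one code period per symbol)."""
    return np.kron(bits, code)

def despread(received, code):
    """Conventional matched filter: correlate each symbol interval with one user's code."""
    chips = received.reshape(-1, len(code))
    return chips @ code / len(code)

# Two hypothetical +/-1 spreading sequences; their normalized crosscorrelation is 1/7.
codes = np.array([[1, 1, 1, -1, -1, 1, -1],
                  [1, -1, 1, 1, -1, -1, -1]], dtype=float)
bits = np.array([[1, -1, 1, 1],
                 [-1, -1, 1, -1]], dtype=float)

# Both users transmit at the same time and in the same band.
received = spread(bits[0], codes[0]) + spread(bits[1], codes[1])

# Each matched filter output is the own symbol plus a small crosscorrelation term.
soft = np.stack([despread(received, c) for c in codes])
decisions = np.sign(soft)
```

In this sketch each correlator output equals the desired symbol plus 1/7 of the other user's symbol. That residual term is the multiple-access interference, and it grows with the number of users and with unequal received powers.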
Traditional FDMA and TDMA are designed to be orthogonal in the sense that the signature waveforms are mutually orthogonal. DS-CDMA, on the other hand, can be designed to be either orthogonal or nonorthogonal. The spreading sequences can be designed to be orthogonal. If the signals of all users arrive at the receiver with the same time delay (spreading sequence phase) and if the transmission medium does not cause time dispersion, the signature waveforms appear as orthogonal at the receiver. With unequal timing offsets, the signature waveforms are nonorthogonal at the receiver.¹ The spreading sequences may also be designed to be nonorthogonal. Orthogonal CDMA is in many respects similar to FDMA or TDMA. Nonorthogonal CDMA is more flexible than orthogonal multiple-access techniques, since there is no hard limit on the number of users, as there is in orthogonal multiple-access techniques due to the finite dimensionality of the signal space.
The question of which multiple-access technique gives the maximal system capacity is very controversial². One significant answer is given by information theory. In the so-called Gaussian multiple-access channel, i.e., in a time-frequency channel distorted by additive white Gaussian noise (AWGN) with several transmitters and one centralized receiver, the maximum Shannon capacity is obtained by letting all the users use all the bandwidth at all time instants [22]. A DS-CDMA system is clearly a good approximation of such a system. Another answer for cellular systems is provided by the fact that a DS-CDMA system can provide a frequency reuse factor of one [7], whereas TDMA has so far been limited to a reuse factor of three or four. However, antenna diversity may push the reuse factor for TDMA lower in the future. With a frequency reuse factor of one, FH-CDMA cannot avoid frequency hits between users at the same frequencies in adjacent cells, which causes a severe performance degradation. Furthermore, coherent demodulation is not practical in FH systems, which causes a performance penalty in comparison to DS systems with coherent demodulation. The CDMA signature waveforms usually have a significantly larger bandwidth than FDMA or TDMA waveforms. Thus, CDMA signature waveforms offer protection against fading, which is an impairment of mobile radio channels [23]. The advantages of TDMA in comparison to CDMA include its simpler implementation in many cases and the infrastructure of some existing systems.
¹ Similar nonidealities, such as a time- or frequency-dispersive channel, often remove also the orthogonality of FDMA, TDMA, or FH-CDMA.
² As an example, for DS-CDMA applied to cellular mobile communications contrary views are presented, e.g., by Viterbi & Vembu [19, 20] and by Verdú [21].
CDMA has attracted most interest in application to wireless cellular terrestrial [1, 24, 25, 26, 27, 7] or satellite [28] communications. The IS-95 cellular system [29, 30] is a second generation cellular wireless communication system applying CDMA technology. Since the bandwidth of IS-95 is relatively narrow (1.25 MHz), IS-95 is often called a narrowband CDMA system. There are several third generation systems under development which utilize the DS-CDMA technique. One of them is a CDMA system utilizing multiuser detection [31, 32] proposed in the European research project FRAMES [33, 34]. Another developing DS-CDMA system is the so-called wideband CDMA (WCDMA) system proposed for Japan [35, 36]. The modified IS-95 standard IS-665 also introduces a wideband CDMA system [37]. All the above-mentioned third generation CDMA systems are proposed to utilize multiuser receivers, namely some form of multiple-access interference (MAI) cancellation.
1.2. Multiuser demodulation
As discussed in the previous section, nonorthogonal multiple-access can potentially offer better system capacity than orthogonal schemes. The price to be paid for nonorthogonal signature waveforms is the fact that the conventional single-user matched filter or correlator receiver is not optimal for demodulation. Actually, obtaining the maximum Shannon capacity requires joint decoding of the data of all users [22]. The problem becomes increasingly significant if the received power levels of the users are dissimilar. This is the so-called near-far problem. Strong signals may completely bury the weak ones if the conventional receiver is applied. Therefore, the design of conventional CDMA systems relies on accurate power control [7, 38] to alleviate the near-far problem, and on spreading sequence design [4, 5, 6, 7, 39] to reduce crosscorrelations between the signature waveforms of the users. If the number of users is large, the performance of the conventional single-user receiver is poor even in the absence of the near-far problem due to the large level of MAI.
An alternative to the conventional receiver is to apply a receiver designed to take the multiple-access interference into consideration, i.e., multiuser demodulation. Multiuser demodulation is related to co-channel interference rejection [40]. Co-channel interference is caused by signals of users transmitting in the same frequency band, and it is usually rejected by adaptive filtering [41, 42]. This can be seen as a special case of multiuser demodulation. A multiuser detector can also make a joint detection of the data of all users.
The first publication on multiuser detection was presented by Schneider [43], who studied the zero-forcing decorrelating detector. Later Kashihara [44] and Kohno et al. [45] studied multiple-access interference cancellation receivers. Both Schneider and Kohno also suggested the use of the Viterbi algorithm for optimal detection in asynchronous multiuser communications. The real trigger to the increasing interest in multiuser detection was Verdú's work on multiuser detection [46, 47, 48], where the application of the Viterbi algorithm for optimal maximum likelihood sequence detection (MLSD) was developed, and its performance was analyzed. Verdú showed that CDMA systems are neither interference nor near-far limited, but both are actually limitations of the conventional single-user receiver.
Since optimal multiuser detection is prohibitively complex to implement for many practical applications, numerous suboptimal schemes have been investigated. A review of the multiuser demodulation literature will be presented in Chapter 2. Tutorial reviews can also be found in [49, 50, 51] and an overview in [52]. The work on multiuser receivers has demonstrated that even suboptimal detectors with a significantly lower implementation complexity than the optimal detector can greatly improve the detection performance and capacity of multiuser communication systems. Furthermore, robust detection in the presence of a near-far problem was shown to be possible.
1.3. Aim and outline of the thesis
In spite of the major research effort invested in multiuser demodulation techniques, several practical as well as theoretical open problems still exist in the field of multiuser receivers. Some of them are considered in more detail in this thesis. The aim of the thesis is to develop practical multiuser demodulation algorithms for mobile communication systems with frequency-selective fading channels, and to analyze their implementation complexity. The emphasis is restricted to the uplink (i.e., reverse link) of asynchronous DS-CDMA systems where users transmit in an uncoordinated manner and are received by one centralized receiver.
The thesis is presented as a monograph for clarity and to make it easier to read. However, parts of the literature review in Chapter 2 and parts of the main contributions in Chapters 3–5 have been published earlier or submitted for publication. The rest of the thesis is organized as follows.
Chapter 2, the literature review of which is in part included in Paper I, presents the background knowledge for the main contributions of the thesis. Notations and a mathematical model of a CDMA system utilized in the later chapters are introduced. Relevant literature on single-user fading channel receivers as well as on multiuser demodulation is reviewed. Based on the literature review, open problems to be considered in Chapters 3–5 are pointed out.
Chapter 3, the results of which are in part included in Papers II–IV, considers the approximation of ideal infinite memory-length (infinite impulse response, IIR) linear multiuser detectors by finite memory-length (finite impulse response, FIR) detectors in asynchronous CDMA systems. The stability and performance of the FIR detectors are analyzed and numerical examples are presented.
Chapter 4, the results of which are in part included in Papers V–VI, considers multiuser demodulation in relatively fast fading channels. An optimal multiuser receiver is derived, and the performance of two suboptimal receivers, namely the decorrelating and parallel interference cancellation receivers, is studied. In particular, the performance of different channel estimation filters, data-aided (DA) and decision-directed (DD) channel estimation, and the bit error rates of the decorrelating and the parallel interference cancellation receivers are compared.
Chapter 5, the results of which are in part included in Papers III, VII–X, focuses on the implementation issues of the linear receivers. The computational complexity of updating the receivers to changes in the communication scenario is analyzed and compared to that of the parallel interference cancellation receivers.
Chapter 6 concludes the thesis. The results and contributions are summarized and discussed. Furthermore, some open problems for future research are pointed out.
1.4. Author’s contribution to publications
The thesis is in part based on ten original publications. The author has had the main responsibility for making the analysis and writing all the Papers I–X. The author has also implemented the software to perform the numerical analysis and computer simulations except in Paper VI, where Markku Heikkilä compiled the software under the guidance of the author and Matti Latva-aho.
In Paper I, the author has compiled the literature review of multiuser demodulation utilized in Chapter 2. In Papers II–III, the author invented the main ideas, developed the analysis, and produced the examples. The second author provided help, ideas, and criticism during the process. In Papers IV and VIII–X, the idea of applying the conjugate gradient algorithm to multiuser detection is due to Jorma Lilleberg. The author developed the idea and the analysis, and produced the examples. The second and third authors provided help, ideas, and criticism during the process. Papers V and VII are the author's own work. In Paper VI, the author developed the ideas and analysis with the help of the other authors.
2. Preliminaries
Some preliminaries necessary for the analysis in the following chapters are presented in this chapter. In Section 2.1 a multiuser CDMA system is defined in mathematical terms. A review of the earlier and parallel work regarding receiver design for fading channel communications and multiuser demodulation is presented in Section 2.2. The open problems addressed in this thesis are defined in Section 2.3.
2.1. System model
A general multiuser CDMA system is illustrated in Fig. 2.1. The so-called multiple-access channel [22, Chap. 14] is considered in this thesis. In this model K users share the same communication medium and the signals transmitted by the users pass through separate and independent channels. The outputs of the channels are added to a common noise process. The transmitted data is demodulated in a centralized multiuser receiver, which makes a joint decision on the data of all users. In a mobile communication system, for example, the setup is valid for the uplink. The mathematical formulation of the transmission system presented in this section has been inspired by several earlier papers, e.g., [53, 47, 54, 55, 56, 57]. In Section 2.1.1, the system model is defined in a conventional way with continuous-time variables. The corresponding discrete-time model, which is more suitable for algorithm derivations than the continuous-time one, is defined in Section 2.1.2. In Section 2.1.3, the system model for a truncated observation window is presented. The statistical model of the fading channel is described in Section 2.1.4. To ease the reading of the thesis some notational principles are summarized in Section 2.1.5.
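[Figure 2.1: block diagram. The data symbols $b_k^{(n)}$ of each user $k = 1, \ldots, K$ are modulated by the signature waveform $s_k^{(n)}(t)$, scaled by the amplitude $A_k$, and passed through the channel $c_k(t)$; the channel outputs and the noise $z(t)$ are summed to form the received signal $r(t)$.]

Fig. 2.1. CDMA system.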
2.1.1. Continuous-time model

A user $k \in \{1, 2, \ldots, K\}$ transmits in the $n$th symbol interval $t \in [(n-1)T, nT)$ the complex signal

$$ b_k^{(n)} A_k s_k^{(n)}(t - \tau_k), \tag{2.1} $$

where $T$ is the length of the symbol period, $b_k^{(n)} \in \Xi$ is the transmitted complex data symbol¹, $\Xi$ is the modulation symbol alphabet, $A_k = \sqrt{E_k}\, e^{j\phi_k}$ is the transmitted complex amplitude of user $k$ (assumed to be constant over the transmission), $E_k$ is the energy per symbol of the corresponding real bandpass signal, $\phi_k$ is the carrier phase, $\tau_k \in [0, T)$ is the delay of the $k$th user's transmitted signal, and $s_k^{(n)}(t)$ is the signature waveform of user $k$. For convenience, $s_k^{(n)}(t)$ is assumed to be real (the analysis can be straightforwardly generalized to the complex case) and normalized so that $s_k^{(n)}(t) = 0$ if $t \notin [0, T)$, and $\int_0^T |s_k^{(n)}(t)|^2\, dt = 1$. In a DS-CDMA system the signature waveforms are of the form

$$ s_k^{(n)}(t) = \sum_{m=0}^{N_c - 1} s_{k,m}^{(n)}\, \psi(t - mT_c), \tag{2.2} $$

where $s_{k,m}^{(n)}$ is the $m$th chip of user $k$ on the symbol interval $n$, $T_c$ is the length of the chip period, $N_c = T/T_c$ is the processing gain, and $\psi(t)$ is the chip waveform.

¹ Uncoded transmission is studied in this thesis, i.e., the data symbols $b_k^{(n)}$, $\forall\, k, n$, are assumed to be i.i.d. random variables uniformly distributed over $\Xi$.
In this work the chips are assumed binary, i.e., $s_{k,m}^{(n)} \in \{-1, 1\}$. If the signature waveforms are periodic with period $T$, i.e., $s_k^{(n)}(t) = s_k^{(i)}(t)\ \forall\, n, i$, or for DS signals $s_{k,m}^{(n)} = s_{k,m}^{(i)}\ \forall\, n, i$, they will be called time-invariant, otherwise time-varying. Constant envelope modulation (e.g., M-PSK) is assumed; therefore, $|b| = 1,\ \forall\, b \in \Xi$. It is assumed that the CDMA system under investigation is asynchronous in the sense that the delays are uniformly distributed over the interval $\tau_k \in [0, T)\ \forall\, k$. The CDMA system is called synchronous if the delays are equal (and, thus, normalized to zero), i.e., $\tau_1 = \tau_2 = \ldots = \tau_K = 0$, and quasi-synchronous if the delays are small compared to the symbol interval.
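The construction in (2.2) can be sketched in discrete time. The following is a minimal illustration, not the thesis's implementation: it assumes a rectangular chip waveform $\psi(t)$, a hypothetical processing gain of 31, four samples per chip, and randomly drawn binary chips. The function name `signature_samples` is introduced here for illustration only.

```python
import math
import random

def signature_samples(chips, samples_per_chip):
    """Sample a DS signature waveform as in (2.2), one symbol interval long.

    Assumption: the chip waveform psi(t) is a rectangular pulse over one
    chip period. The amplitude is chosen so that the sampled waveform has
    unit energy, mirroring the normalization of s_k(t) over [0, T).
    """
    n_samples = len(chips) * samples_per_chip
    amplitude = 1.0 / math.sqrt(n_samples)
    # Each chip is held constant for samples_per_chip samples.
    return [amplitude * chip for chip in chips for _ in range(samples_per_chip)]

rng = random.Random(0)                      # fixed seed for repeatability
N_C = 31                                    # processing gain (assumed value)
chips = [rng.choice((-1, 1)) for _ in range(N_C)]   # binary chips in {-1, 1}
s = signature_samples(chips, samples_per_chip=4)
energy = sum(x * x for x in s)              # discrete analogue of the integral
```

With time-invariant signature waveforms the same `chips` would be reused for every symbol interval; with time-varying (long) sequences a new chip segment would be drawn for each $n$.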
It is assumed that the channel of user $k$ appears as a linear filter with impulse response $c_k^{(n)}(t)$ (Fig. 2.1). It is further assumed that the channel impulse responses consist of discrete multipath components [23, Chap. 14] so that they can be expressed as

$$ c_k^{(n)}(t) = \sum_{l=1}^{L} c_{k,l}^{(n)}\, \delta\!\left(t - \tau_{k,l}^{(n)}\right), \tag{2.3} $$

where $L$ is the number of multipath components² of the channel, $c_{k,l}^{(n)}$ is the complex coefficient (gain) of the $l$th multipath component of user $k$ at symbol interval $n$, $\tau_{k,l}^{(n)} \in [0, T_m)$ is the delay of the $l$th multipath component of user $k$ at symbol interval $n$, $T_m$ is the delay spread of the channel, and $\delta(t)$ is Dirac's delta function. The effect of time-varying delays is not analyzed in this thesis and the delays are assumed to be perfectly tracked. Thus, they will be denoted by $\tau_{k,l}$ in the forthcoming analysis. Furthermore, it is assumed that the delay spread of the channel is less than the symbol interval, i.e., $T_m < T$.
The received CDMA signal is the convolution of the transmitted signal (2.1) and the channel impulse response (2.3) plus the additive channel noise. Thus, the complex envelope of the received signal can be expressed as

$$ \begin{aligned} r(t) &= \sum_{n=0}^{N_b-1} \sum_{k=1}^{K} b_k^{(n)} A_k s_k^{(n)}(t - nT - \tau_k) * c_k^{(n)}(t) + z(t) \\ &= \sum_{n=0}^{N_b-1} \sum_{k=1}^{K} b_k^{(n)} A_k \sum_{l=1}^{L} c_{k,l}^{(n)} s_k^{(n)}(t - nT - \tau_k - \tau_{k,l}) + z(t), \end{aligned} \tag{2.4} $$

where $N_b$ is the number of symbols in the data packet, the asterisk $*$ denotes convolution, and $z(t)$ is a complex zero-mean additive white Gaussian noise process with two-sided power spectral density $\sigma^2$.
It has been shown that the set of matched filter (MF) outputs sampled once in a symbol interval forms sufficient statistics for the detection of the transmitted data³ [47, 56]. The sampled output of the filter matched to the $k$th user's $l$th multipath component is

$$ y_{k,l}^{(n)} = \int_{nT + \tau_k + \tau_{k,l}}^{(n+1)T + \tau_k + \tau_{k,l}} r(t)\, s_k^{(n)}(t - nT - \tau_k - \tau_{k,l})\, dt. \tag{2.5} $$

² The number of propagation paths is assumed to be equal for all users for notational simplicity.
³ This is true due to the key assumption that the channel noise $z(t)$ has a Gaussian complex amplitude distribution and the delays are known so that the MF outputs can be sampled at the correct times.
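A discrete-time sketch may help to connect (2.4) and (2.5). The example below is an illustration under simplifying assumptions that are not taken from the thesis: a single user, a single symbol, two paths with integer sample delays, no noise, and arbitrary illustrative values for the chips, path gains, and amplitude. All function names are hypothetical.

```python
import cmath
import math

def shifted(sig, delay, length):
    """Return sig delayed by `delay` samples inside a zero vector of `length`."""
    out = [0.0] * length
    for i, value in enumerate(sig):
        out[i + delay] = value
    return out

def received_symbol(b, amp, gains, delays, sig, length):
    """Noise-free samples of one symbol of one user, cf. (2.4) with K = 1."""
    r = [0j] * length
    for gain, delay in zip(gains, delays):
        for i, value in enumerate(shifted(sig, delay, length)):
            r[i] += b * amp * gain * value
    return r

def mf_output(r, sig, delay):
    """Correlate r with the signature aligned to one path delay, cf. (2.5)."""
    return sum(r[i + delay] * value for i, value in enumerate(sig))

chips = [1, -1, 1, 1, -1, 1, -1, -1]
sig = [c / math.sqrt(len(chips)) for c in chips]   # unit-energy signature
delays = [0, 3]                                    # path delays in samples (assumed)
gains = [0.8 + 0.1j, 0.3 - 0.4j]                   # path gains, illustrative values
amp = math.sqrt(2.0) * cmath.exp(1j * 0.3)         # A_k = sqrt(E_k) exp(j phi_k)
b = -1                                             # BPSK data symbol
length = len(sig) + max(delays)

r = received_symbol(b, amp, gains, delays, sig, length)
y = [mf_output(r, sig, d) for d in delays]

# Cross-correlations between the delayed copies of the signature waveform.
R = [[sum(p * q for p, q in zip(shifted(sig, d1, length), shifted(sig, d2, length)))
      for d2 in delays] for d1 in delays]
# Without noise each MF output is b * A times the correlation-weighted path gains.
expected = [b * amp * sum(R[l][m] * gains[m] for m in range(len(gains)))
            for l in range(len(delays))]
```

The comparison of `y` with `expected` shows why the MF outputs for different paths are coupled through the signature autocorrelation: each output contains the desired path plus interpath interference weighted by the off-diagonal correlations.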
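Let the vectors of MF output samples for the $n$th symbol interval be defined as

$$ \mathbf{y}_k^{(n)} = \left(y_{k,1}^{(n)}, y_{k,2}^{(n)}, \ldots, y_{k,L}^{(n)}\right)^{\top} \in \mathbb{C}^{L}, \tag{2.6} $$

$$ \mathbf{y}^{(n)} = \left(\mathbf{y}_1^{\top(n)}, \mathbf{y}_2^{\top(n)}, \ldots, \mathbf{y}_K^{\top(n)}\right)^{\top} \in \mathbb{C}^{KL}, \tag{2.7} $$

and their concatenation over the whole data packet

$$ \boldsymbol{y} = \left(\mathbf{y}^{\top(1)}\ \ \mathbf{y}^{\top(2)}\ \cdots\ \mathbf{y}^{\top(N_b)}\right)^{\top} \in \mathbb{C}^{N_b K L}. \tag{2.8} $$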
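Let $\mathbf{R}^{(n)}(i) \in (-1, 1]^{KL \times KL}$ be a crosscorrelation matrix⁴ with the partitioning

$$ \mathbf{R}^{(n)}(i) = \begin{pmatrix} \mathbf{R}_{1,1}^{(n)}(i) & \mathbf{R}_{1,2}^{(n)}(i) & \cdots & \mathbf{R}_{1,K}^{(n)}(i) \\ \mathbf{R}_{2,1}^{(n)}(i) & \mathbf{R}_{2,2}^{(n)}(i) & \cdots & \mathbf{R}_{2,K}^{(n)}(i) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{R}_{K,1}^{(n)}(i) & \mathbf{R}_{K,2}^{(n)}(i) & \cdots & \mathbf{R}_{K,K}^{(n)}(i) \end{pmatrix} \in \mathbb{R}^{KL \times KL}, \tag{2.9} $$

where the matrices $\mathbf{R}_{k,k'}^{(n)}(i) \in \mathbb{R}^{L \times L},\ \forall\, k, k' \in \{1, 2, \ldots, K\}$, have elements

$$ \left(\mathbf{R}_{k,k'}^{(n)}(i)\right)_{l,l'} = \int_{-\infty}^{\infty} s_k^{(n)}(t - \tau_k - \tau_{k,l})\, s_{k'}^{(n-i)}(t + iT - \tau_{k'} - \tau_{k',l'})\, dt, \quad \forall\, l, l' \in \{1, 2, \ldots, L\}. \tag{2.10} $$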
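The vector (2.7) can be expressed as [54]

$$ \begin{aligned} \mathbf{y}^{(n)} ={}& \mathbf{R}^{(n)}(2)\mathbf{C}^{(n-2)}\mathbf{A}\mathbf{b}^{(n-2)} + \mathbf{R}^{(n)}(1)\mathbf{C}^{(n-1)}\mathbf{A}\mathbf{b}^{(n-1)} \\ &+ \mathbf{R}^{(n)}(0)\mathbf{C}^{(n)}\mathbf{A}\mathbf{b}^{(n)} + \mathbf{R}^{(n)}(-1)\mathbf{C}^{(n+1)}\mathbf{A}\mathbf{b}^{(n+1)} \\ &+ \mathbf{R}^{(n)}(-2)\mathbf{C}^{(n+2)}\mathbf{A}\mathbf{b}^{(n+2)} + \mathbf{w}^{(n)}, \end{aligned} \tag{2.11} $$

where

$$ \mathbf{A} = \operatorname{diag}(A_1, A_2, \ldots, A_K) \in \mathbb{C}^{K \times K} \tag{2.12} $$

is a diagonal matrix of transmitted amplitudes,

$$ \mathbf{C}^{(n)} = \operatorname{diag}\left(\mathbf{c}_1^{(n)}, \mathbf{c}_2^{(n)}, \ldots, \mathbf{c}_K^{(n)}\right) \in \mathbb{C}^{KL \times K} \tag{2.13} $$

is the matrix of channel coefficient vectors

$$ \mathbf{c}_k^{(n)} = \left(c_{k,1}^{(n)}, c_{k,2}^{(n)}, \ldots, c_{k,L}^{(n)}\right)^{\top} \in \mathbb{C}^{L}, \tag{2.14} $$

$$ \mathbf{b}^{(n)} = \left(b_1^{(n)}, b_2^{(n)}, \ldots, b_K^{(n)}\right)^{\top} \in \Xi^{K} \tag{2.15} $$

is the vector of the transmitted data, and $\mathbf{w}^{(n)} \in \mathbb{C}^{KL}$ is the output vector due to noise. As in the case of time-invariant signature waveforms [54], it is easy to show that $\mathbf{R}^{(n)}(i) = \mathbf{0}_{KL},\ \forall\, |i| > 2$, and $\mathbf{R}^{(n)}(-i) = \mathbf{R}^{\top(n+i)}(i)$, where $\mathbf{0}_{KL}$ is an all-zero matrix of size $KL \times KL$.

⁴ For notational compactness, the discrete-time index $n$ will be left out from $\mathbf{R}^{(n)}(i)$ when possible without confusion.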
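The concatenation vector of the matched filter outputs (2.8) has the expression

$$ \boldsymbol{y} = \boldsymbol{\mathcal{R}}\boldsymbol{\mathcal{C}}\boldsymbol{\mathcal{A}}\boldsymbol{b} + \boldsymbol{w} = \boldsymbol{\mathcal{R}}\boldsymbol{\mathcal{C}}\boldsymbol{h} + \boldsymbol{w}, \tag{2.16} $$

where

$$ \boldsymbol{\mathcal{R}} = \begin{pmatrix} \mathbf{R}^{(0)}(0) & \mathbf{R}^{\top(1)}(1) & \mathbf{R}^{\top(2)}(2) & \cdots & \mathbf{0}_{KL} \\ \mathbf{R}^{(1)}(1) & \mathbf{R}^{(1)}(0) & \mathbf{R}^{\top(2)}(1) & \cdots & \mathbf{0}_{KL} \\ \mathbf{R}^{(2)}(2) & \mathbf{R}^{(2)}(1) & \mathbf{R}^{(2)}(0) & \cdots & \mathbf{0}_{KL} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}_{KL} & \mathbf{0}_{KL} & \mathbf{0}_{KL} & \cdots & \mathbf{R}^{(N_b-1)}(0) \end{pmatrix} \in \mathbb{R}^{N_b KL \times N_b KL}, \tag{2.17} $$

$$ \boldsymbol{\mathcal{C}} = \operatorname{diag}\left(\mathbf{C}^{(0)}, \mathbf{C}^{(1)}, \ldots, \mathbf{C}^{(N_b-1)}\right) \in \mathbb{C}^{N_b KL \times N_b K}, \tag{2.18} $$

$$ \boldsymbol{\mathcal{A}} = \operatorname{diag}(\mathbf{A}, \mathbf{A}, \ldots, \mathbf{A}) \in \mathbb{C}^{N_b K \times N_b K}, \tag{2.19} $$

$$ \boldsymbol{b} = \left(\mathbf{b}^{\top(0)}, \mathbf{b}^{\top(1)}, \ldots, \mathbf{b}^{\top(N_b-1)}\right)^{\top} \in \Xi^{N_b K}, \tag{2.20} $$

$\boldsymbol{h} = \boldsymbol{\mathcal{A}}\boldsymbol{b}$ is the data-amplitude product vector, and $\boldsymbol{w}$ is the Gaussian noise output vector with zero mean and covariance matrix $\sigma^2 \boldsymbol{\mathcal{R}}$.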
The emphasis in this thesis is on centralized multiuser detectors that process the matched filter output to provide statistics for both channel amplitude estimation and data detection. The multiuser detector output for the $n$th symbol interval is denoted by $\mathbf{y}_{[\mathrm{MUD}]}^{(n)} \in \mathbb{C}^{KL}$. Similarly, as in (2.8), the concatenation of the detector outputs over the whole data symbol packet is denoted by $\boldsymbol{y}_{[\mathrm{MUD}]} \in \mathbb{C}^{N_b KL}$.
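2.1.2. Discrete-time model

The received continuous-time signal is assumed to be sampled after front-end filtering with $N_s$ samples per symbol interval. The received signal vector for the whole data packet over the time interval $t \in [0, (N_b + 1)T)$ is

$$ \boldsymbol{r} = \boldsymbol{\mathcal{S}}\boldsymbol{\mathcal{C}}\boldsymbol{\mathcal{A}}\boldsymbol{b} + \boldsymbol{z}, \tag{2.21} $$

where $\boldsymbol{z}$ is the received complex white Gaussian noise sequence, and $\boldsymbol{\mathcal{S}}$ is a matrix of samples of signature waveforms of the form

$$ \boldsymbol{\mathcal{S}} = \begin{pmatrix} \mathbf{S}^{(0)}(0) & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{S}^{(0)}(-1) & \mathbf{S}^{(1)}(0) & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{S}^{(0)}(-2) & \mathbf{S}^{(1)}(-1) & \mathbf{S}^{(2)}(0) & \cdots & \mathbf{0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{S}^{(N_b-1)}(0) \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{S}^{(N_b-1)}(-1) \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{S}^{(N_b-1)}(-2) \end{pmatrix} \in \mathbb{R}^{(N+2)N_s \times NKL}, \tag{2.22} $$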
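where the matrix $\mathbf{S}^{(n)}(0) \in \mathbb{R}^{N_s \times KL}$ includes the first $N_s$ samples, $\mathbf{S}^{(n)}(-1) \in \mathbb{R}^{N_s \times KL}$ includes the middle $N_s$ samples, and $\mathbf{S}^{(n)}(-2) \in \mathbb{R}^{N_s \times KL}$ includes the last $N_s$ samples of the signature waveforms of the users in the $n$th symbol interval due to the delay differences of users. Assuming $0 = \tau_1 < \tau_2 < \cdots < \tau_K < T$, the element matrices have the structure

$$ \begin{pmatrix} \mathbf{S}^{(n)}(0) \\ \mathbf{S}^{(n)}(-1) \\ \mathbf{S}^{(n)}(-2) \end{pmatrix} \in \mathbb{R}^{3N_s \times KL}. $$

[Schematic of the stacked matrix: each column is drawn as a bar, offset further down from one column to the next, with zeros above and below the bar.]

A column bar in the matrix above describes the nonzero sampled signature waveform of a particular user, and the zeros at the top are used to represent the delays of a propagation path of the particular user.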
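The matched filter output vector has the expression

$$ \boldsymbol{y} = \boldsymbol{\mathcal{S}}^{\mathrm{H}}\boldsymbol{r} = \boldsymbol{\mathcal{R}}\boldsymbol{\mathcal{C}}\boldsymbol{\mathcal{A}}\boldsymbol{b} + \boldsymbol{w} = \boldsymbol{\mathcal{R}}\boldsymbol{\mathcal{C}}\boldsymbol{h} + \boldsymbol{w}, \tag{2.23} $$

where the correlation matrix $\boldsymbol{\mathcal{R}}$ in (2.17) has the expression

$$ \boldsymbol{\mathcal{R}} = \boldsymbol{\mathcal{S}}^{\top}\boldsymbol{\mathcal{S}} \in (-1, 1]^{NKL \times NKL}. \tag{2.24} $$

Therefore the blocks of $\boldsymbol{\mathcal{R}}$ have the expressions

$$ \begin{aligned} \mathbf{R}^{(n)}(0) &= \mathbf{S}^{\top(n)}(0)\mathbf{S}^{(n)}(0) + \mathbf{S}^{\top(n)}(-1)\mathbf{S}^{(n)}(-1) + \mathbf{S}^{\top(n)}(-2)\mathbf{S}^{(n)}(-2), \\ \mathbf{R}^{(n)}(1) &= \mathbf{S}^{\top(n)}(0)\mathbf{S}^{(n-1)}(-1) + \mathbf{S}^{\top(n)}(-1)\mathbf{S}^{(n-1)}(-2), \\ \mathbf{R}^{(n)}(2) &= \mathbf{S}^{\top(n)}(0)\mathbf{S}^{(n-2)}(-2). \end{aligned} $$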
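The relation between the sampled signature matrix and the correlation matrix in (2.24) can be illustrated with a toy example. The sketch below assumes two users, one path per user, two symbols, four samples per symbol, time-invariant signatures, and a delay of two samples for the second user; all of these values and the helper names are chosen for illustration only and do not come from the thesis.

```python
def column(sig, start, length):
    """One column of the signature matrix: sig placed at sample `start`."""
    col = [0.0] * length
    for i, value in enumerate(sig):
        col[start + i] = value
    return col

def gram(columns):
    """Correlation (Gram) matrix of the columns, i.e. S^T S as in (2.24)."""
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]

NS = 4                                   # samples per symbol (assumed)
NB = 2                                   # symbols in the packet (assumed)
signatures = [[0.5, 0.5, 0.5, 0.5],      # user 1, unit energy
              [0.5, 0.5, -0.5, -0.5]]    # user 2, unit energy
delays = [0, 2]                          # user delays in samples (assumed)
length = (NB + 2) * NS                   # room for delayed waveforms

# Columns are ordered symbol by symbol and, within a symbol, user by user.
columns = [column(signatures[k], n * NS + delays[k], length)
           for n in range(NB) for k in range(len(signatures))]
R = gram(columns)

# 2 x 2 blocks: R_same couples users within symbol 1, R_prev couples
# symbol 1 with symbol 0 and is nonzero only because of the asynchronism.
R_same = [row[2:4] for row in R[2:4]]
R_prev = [row[0:2] for row in R[2:4]]
```

In this toy case the matrix is block tridiagonal, because without multipath a waveform overlaps only the adjacent symbol intervals. With multipath delays the second off-diagonal blocks $\mathbf{R}^{(n)}(2)$ can also be nonzero, as in (2.17).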
2.1.3. Finite processing window model
In purely asynchronous, unslotted CDMA systems the data packet lengths $N_b$ are very large. Actually, each user activates and deactivates its terminal independently of the others. Thus, it is not practical to assume that the whole received signal $\boldsymbol{r}$ or the matched filter output vector $\boldsymbol{y}$ would be processed in a receiver. Therefore, a finite processing window model will be defined.

The received signal will be processed in processing windows of length $N = 2P + 1$, where $P$ is a positive integer and $N$ is the window length measured in symbol durations $T$. A concatenation of symbols over a processing window is denoted by

$$ \boldsymbol{b}^{(n)} = \left(\mathbf{b}^{\top(n-P)}, \ldots, \mathbf{b}^{\top(n-1)}, \mathbf{b}^{\top(n)}, \mathbf{b}^{\top(n+1)}, \ldots, \mathbf{b}^{\top(n+P)}\right)^{\top} \in \Xi^{NK}. \tag{2.25} $$

Similarly, the concatenation of the matched filter outputs over the processing window is defined as

$$ \boldsymbol{y}^{(n)} = \left(\mathbf{y}^{\top(n-P)}, \ldots, \mathbf{y}^{\top(n-1)}, \mathbf{y}^{\top(n)}, \mathbf{y}^{\top(n+1)}, \ldots, \mathbf{y}^{\top(n+P)}\right)^{\top} \in \mathbb{C}^{NKL}. \tag{2.26} $$
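The vector of the matched filter outputs has the expressions

$$ \boldsymbol{y}^{(n)} = \boldsymbol{\mathcal{R}}^{(n)}\boldsymbol{\mathcal{C}}^{(n)}\boldsymbol{\mathcal{A}}^{(n)}\boldsymbol{b}^{(n)} + \boldsymbol{\mathcal{R}}_e^{(n)}\boldsymbol{\mathcal{C}}_e^{(n)}\boldsymbol{\mathcal{A}}_e^{(n)}\boldsymbol{b}_e^{(n)} + \boldsymbol{w}^{(n)} \tag{2.27} $$

$$ \phantom{\boldsymbol{y}^{(n)}} = \bar{\boldsymbol{\mathcal{R}}}^{(n)}\bar{\boldsymbol{\mathcal{C}}}^{(n)}\bar{\boldsymbol{\mathcal{A}}}^{(n)}\bar{\boldsymbol{b}}^{(n)} + \boldsymbol{w}^{(n)}, \tag{2.28} $$

where the vector

$$ \boldsymbol{b}_e^{(n)} = \left(\mathbf{b}^{\top(n-P-2)}, \mathbf{b}^{\top(n-P-1)}, \mathbf{b}^{\top(n+P+1)}, \mathbf{b}^{\top(n+P+2)}\right)^{\top} \in \Xi^{4K} \tag{2.29} $$

includes the symbols outside the processing window,

$$ \bar{\boldsymbol{b}}^{(n)} = \left(\mathbf{b}^{\top(n-P-2)}, \mathbf{b}^{\top(n-P-1)}, \boldsymbol{b}^{\top(n)}, \mathbf{b}^{\top(n+P+1)}, \mathbf{b}^{\top(n+P+2)}\right)^{\top} \in \Xi^{(N+4)K} \tag{2.30} $$

includes the symbols both inside and outside the processing window,

$$ \boldsymbol{\mathcal{C}}^{(n)} = \operatorname{diag}\left(\mathbf{C}^{(n-P)}, \mathbf{C}^{(n-P+1)}, \ldots, \mathbf{C}^{(n+P)}\right) \in \mathbb{C}^{NKL \times NK}, \tag{2.31} $$

$$ \boldsymbol{\mathcal{C}}_e^{(n)} = \operatorname{diag}\left(\mathbf{C}^{(n-P-2)}, \mathbf{C}^{(n-P-1)}, \mathbf{C}^{(n+P+1)}, \mathbf{C}^{(n+P+2)}\right) \in \mathbb{C}^{4KL \times 4K}, \tag{2.32} $$

$$ \bar{\boldsymbol{\mathcal{C}}}^{(n)} = \operatorname{diag}\left(\mathbf{C}^{(n-P-2)}, \mathbf{C}^{(n-P-1)}, \boldsymbol{\mathcal{C}}^{(n)}, \mathbf{C}^{(n+P+1)}, \mathbf{C}^{(n+P+2)}\right) \in \mathbb{C}^{(N+4)KL \times (N+4)K}, \tag{2.33} $$

$$ \boldsymbol{\mathcal{A}}^{(n)} = \operatorname{diag}(\mathbf{A}, \mathbf{A}, \ldots, \mathbf{A}) \in \mathbb{C}^{NK \times NK}, \tag{2.34} $$

$$ \boldsymbol{\mathcal{A}}_e^{(n)} = \operatorname{diag}(\mathbf{A}, \mathbf{A}, \mathbf{A}, \mathbf{A}) \in \mathbb{C}^{4K \times 4K}, \tag{2.35} $$

$$ \bar{\boldsymbol{\mathcal{A}}}^{(n)} = \operatorname{diag}\left(\mathbf{A}, \mathbf{A}, \boldsymbol{\mathcal{A}}^{(n)}, \mathbf{A}, \mathbf{A}\right) \in \mathbb{C}^{(N+4)K \times (N+4)K}, \tag{2.36} $$

$$ \boldsymbol{\mathcal{R}}^{(n)} = \begin{pmatrix} \mathbf{R}^{(n-P)}(0) & \mathbf{R}^{\top(n-P+1)}(1) & \mathbf{R}^{\top(n-P+2)}(2) & \cdots & \mathbf{0}_{KL} \\ \mathbf{R}^{(n-P+1)}(1) & \mathbf{R}^{(n-P+1)}(0) & \mathbf{R}^{\top(n-P+2)}(1) & \cdots & \mathbf{0}_{KL} \\ \mathbf{R}^{(n-P+2)}(2) & \mathbf{R}^{(n-P+2)}(1) & \mathbf{R}^{(n-P+2)}(0) & \cdots & \mathbf{0}_{KL} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}_{KL} & \mathbf{0}_{KL} & \mathbf{0}_{KL} & \cdots & \mathbf{R}^{(n+P)}(0) \end{pmatrix} \in \mathbb{R}^{NKL \times NKL}, \tag{2.37} $$

$$ \boldsymbol{\mathcal{R}}_e^{(n)} = \left(\boldsymbol{\zeta}_1^{(n)}, \boldsymbol{\zeta}_2^{(n)}\right) = \begin{pmatrix} \mathbf{R}^{(n-P)}(2) & \mathbf{R}^{(n-P)}(1) & \mathbf{0}_{KL} & \mathbf{0}_{KL} \\ \mathbf{0}_{KL} & \mathbf{R}^{(n-P+1)}(2) & \mathbf{0}_{KL} & \mathbf{0}_{KL} \\ \vdots & \vdots & \vdots & \vdots \\ \mathbf{0}_{KL} & \mathbf{0}_{KL} & \mathbf{R}^{\top(n+P+1)}(2) & \mathbf{0}_{KL} \\ \mathbf{0}_{KL} & \mathbf{0}_{KL} & \mathbf{R}^{\top(n+P+1)}(1) & \mathbf{R}^{\top(n+P+2)}(2) \end{pmatrix} \in \mathbb{R}^{NKL \times 4KL}, \tag{2.38} $$

i.e., $\boldsymbol{\zeta}_1^{(n)} \in \mathbb{R}^{NKL \times 2KL}$ includes the first and $\boldsymbol{\zeta}_2^{(n)} \in \mathbb{R}^{NKL \times 2KL}$ the last $2KL$ columns of $\boldsymbol{\mathcal{R}}_e^{(n)}$, and

$$ \bar{\boldsymbol{\mathcal{R}}}^{(n)} = \left(\boldsymbol{\zeta}_1^{(n)}, \boldsymbol{\mathcal{R}}^{(n)}, \boldsymbol{\zeta}_2^{(n)}\right) \in \mathbb{R}^{NKL \times (N+4)KL}. \tag{2.39} $$

The entries of the last block row of (2.38) are written here so that they agree with the identity $\mathbf{R}^{(n)}(-i) = \mathbf{R}^{\top(n+i)}(i)$ applied to the row of $\mathbf{y}^{(n+P)}$.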
In (2.27) the first term is the response due to the symbols $\boldsymbol{b}$ inside the processing window and the second is the response due to the symbols $\boldsymbol{b}_e$ outside the processing window. The third term $\boldsymbol{w} \in \mathbb{C}^{NKL}$ is the response due to noise, which is a zero-mean Gaussian random vector with covariance matrix $\sigma^2\boldsymbol{\mathcal{R}}$. Expression (2.28) is obtained by writing the first two terms in (2.27) as one matrix-vector product. It is assumed in this work that the matrices $\boldsymbol{\mathcal{R}}$ and $\boldsymbol{\mathcal{R}}^{(n)}$ are positive definite and therefore nonsingular. This is ideally the case with probability one [54]. In practice, $\boldsymbol{\mathcal{R}}$ and $\boldsymbol{\mathcal{R}}^{(n)}$ become singular (positive semidefinite) if the product $KL$ is large in comparison to the processing gain of a DS-CDMA system. The reason is that practical bandwidth constraints pose upper limits on the dimensionality of the signal space spanned by the columns of $\boldsymbol{\mathcal{S}}$. It has been observed by the author that a number of users and multipath components up to $KL \approx 3N_c$ can be tolerated in asynchronous DS-CDMA systems so that the matrix $\boldsymbol{\mathcal{R}}$ is still nonsingular.
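The bookkeeping of the finite processing window can be summarized in a few lines. The sketch below only enumerates symbol indices and vector dimensions implied by (2.25)–(2.39); the function names and the numerical values of $n$, $P$, $K$, and $L$ are illustrative assumptions.

```python
def window_indices(n, p):
    """Symbol indices inside the window, at its edges, and in the extended set.

    The window is centred on symbol n and spans N = 2P + 1 symbols; two
    symbols on each side contribute to the matched filter outputs as well.
    """
    inside = list(range(n - p, n + p + 1))
    edge = [n - p - 2, n - p - 1, n + p + 1, n + p + 2]
    extended = edge[:2] + inside + edge[2:]
    return inside, edge, extended

def window_dimensions(p, k_users, l_paths):
    """Lengths of the windowed vectors for K users and L paths per user."""
    n_win = 2 * p + 1
    return {
        "b": n_win * k_users,                  # symbols inside the window
        "y": n_win * k_users * l_paths,        # matched filter outputs
        "b_edge": 4 * k_users,                 # symbols outside the window
        "b_bar": (n_win + 4) * k_users,        # inside and outside together
    }

inside, edge, extended = window_indices(n=10, p=2)
dims = window_dimensions(p=2, k_users=3, l_paths=2)
```

The extended index list is ordered in the same way as the block columns of the extended correlation matrix in (2.39): two leading edge symbols, the window itself, and two trailing edge symbols.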
2.1.4. Statistical fading channel model
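The channel coefficient vector

$$ \boldsymbol{c} = \left(\mathbf{c}^{\top(0)}, \mathbf{c}^{\top(1)}, \ldots, \mathbf{c}^{\top(N_b-1)}\right)^{\top}, \tag{2.40} $$

where $\mathbf{c}^{(n)} = \left(\mathbf{c}_1^{\top(n)}, \mathbf{c}_2^{\top(n)}, \ldots, \mathbf{c}_K^{\top(n)}\right)^{\top}$, is assumed to be a complex Gaussian random vector with zero mean and covariance matrix $\boldsymbol{\Sigma}_{\boldsymbol{c}}$. It is assumed that the fading channel coefficients have zero mean and a variance normalized for convenience so that

$$ \sum_{l=1}^{L} \mathrm{E}\left(|c_{k,l}^{(n)}|^2\right) = 1, \quad \forall\, k. \tag{2.41} $$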
The channel coefficients are assumed to be independent, i.e., $\mathrm{E}\left(c_{k,l}^{(n)} c_{k',l'}^{*(n)}\right) = \sigma_{c_{k,l}}^2 \delta_{k,k'}\delta_{l,l'}$, where $\delta_{k,k'}$ is the discrete Kronecker delta function, and $\sigma_{c_{k,l}}^2 = \mathrm{E}\left(|c_{k,l}|^2\right)$ is the power of the $l$th path of user $k$. The assumption is equivalent to the common uncorrelated scattering (US) model [23]. The channels are assumed to be stationary over the observation interval so that the channel autocorrelation (autocovariance) function $\varphi_{k,l}(n, n') = \mathrm{E}\left(c_{k,l}^{(n)} c_{k,l}^{*(n')}\right)$ is a function of the time difference $n' - n$ only. The assumption is equivalent to the common wide-sense stationary (WSS) model [23]. In other words, the channel autocorrelation becomes

$$ \varphi_{k,l}(i) = \mathrm{E}\left(c_{k,l}^{(n)} c_{k,l}^{*(n+i)}\right). \tag{2.42} $$
The stationarity assumption is valid if the vehicle speed does not change during the transmission. The Doppler power spectrum is assumed to be the classical Jakes' spectrum [58, Sec. 5.4], which results in Clarke's channel autocorrelation function

$$ \varphi_{k,l}(i) = \sigma_{c_{k,l}}^2 J_0(2\pi f_d\, iT), \tag{2.43} $$

where $J_0$ is the zero-order Bessel function of the first kind,

$$ f_d = \frac{v}{c_{\mathrm{light}}} f_c \tag{2.44} $$

is the maximum Doppler spread, $v$ is the speed of the vehicle, $c_{\mathrm{light}}$ is the speed of light, and $f_c$ is the carrier frequency. The width of the channel autocorrelation function is called the channel coherence time, denoted by $T_{\mathrm{coh}}$. The coherence time satisfies $T_{\mathrm{coh}} \approx 1/f_d$. The channel is said to be slowly fading if $T_{\mathrm{coh}} \gg T$ or $f_d T \ll 1$, and fast fading if $T_{\mathrm{coh}} < T$ or $f_d T > 1$. In the intermediate case $T_{\mathrm{coh}} > T$ or $f_d T < 1$, the channel will be termed relatively fast fading. This is often the case in current mobile communication systems with high vehicle speeds.
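The quantities in (2.43) and (2.44) are easy to evaluate numerically. The sketch below uses only the standard library and computes $J_0$ from its integral representation $J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin\theta)\,d\theta$ with a midpoint rule. The vehicle speed, carrier frequency, symbol rate, and path power are illustrative assumptions, not values from the thesis, and the function names are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def max_doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler spread f_d = (v / c_light) * f_c, cf. (2.44)."""
    return speed_mps / C_LIGHT * carrier_hz

def bessel_j0(x, steps=2000):
    """Zero-order Bessel function of the first kind via a midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((m + 0.5) * h)) for m in range(steps))
    return total * h / math.pi

def clarke_autocorrelation(lag, path_power, fd_hz, symbol_period_s):
    """Channel autocorrelation at a lag of `lag` symbols, cf. (2.43)."""
    return path_power * bessel_j0(2.0 * math.pi * fd_hz * lag * symbol_period_s)

fd = max_doppler_hz(speed_mps=30.0, carrier_hz=2.0e9)   # about 108 km/h at 2 GHz
T = 1.0 / 16_000.0                                      # assumed symbol period
fd_t = fd * T                                           # normalized Doppler spread
phi = [clarke_autocorrelation(i, 0.5, fd, T) for i in range(0, 101, 25)]
```

For these assumed values the product $f_d T$ is on the order of $10^{-2}$, so the coherence time spans many symbols but the channel still changes noticeably within a processing window of tens of symbols.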
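The covariance matrix of the channel can be partitioned as

$$ \boldsymbol{\Sigma}_{\boldsymbol{c}} = \begin{pmatrix} \boldsymbol{\Sigma}_{\mathbf{c}^{(0)}} & \boldsymbol{\Sigma}_{\mathbf{c}^{(0)},\mathbf{c}^{(1)}} & \cdots & \boldsymbol{\Sigma}_{\mathbf{c}^{(0)},\mathbf{c}^{(N_b-1)}} \\ \boldsymbol{\Sigma}^{\mathrm{H}}_{\mathbf{c}^{(0)},\mathbf{c}^{(1)}} & \boldsymbol{\Sigma}_{\mathbf{c}^{(1)}} & \cdots & \boldsymbol{\Sigma}_{\mathbf{c}^{(1)},\mathbf{c}^{(N_b-1)}} \\ \vdots & \vdots & \ddots & \vdots \\ \boldsymbol{\Sigma}^{\mathrm{H}}_{\mathbf{c}^{(0)},\mathbf{c}^{(N_b-1)}} & \boldsymbol{\Sigma}^{\mathrm{H}}_{\mathbf{c}^{(1)},\mathbf{c}^{(N_b-1)}} & \cdots & \boldsymbol{\Sigma}_{\mathbf{c}^{(N_b-1)}} \end{pmatrix}. \tag{2.45} $$

With the WSSUS channel model the blocks in (2.45) can be expressed as

$$ \boldsymbol{\Sigma}_{\mathbf{c}^{(n)},\mathbf{c}^{(n+i)}} = \begin{pmatrix} \boldsymbol{\Sigma}_{\mathbf{c}_1^{(n)},\mathbf{c}_1^{(n+i)}} & \mathbf{0}_L & \cdots & \mathbf{0}_L \\ \mathbf{0}_L & \boldsymbol{\Sigma}_{\mathbf{c}_2^{(n)},\mathbf{c}_2^{(n+i)}} & \cdots & \mathbf{0}_L \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}_L & \mathbf{0}_L & \cdots & \boldsymbol{\Sigma}_{\mathbf{c}_K^{(n)},\mathbf{c}_K^{(n+i)}} \end{pmatrix}, \tag{2.46} $$

and

$$ \boldsymbol{\Sigma}_{\mathbf{c}_k^{(n)},\mathbf{c}_k^{(n+i)}} = \begin{pmatrix} \varphi_{k,1}(i) & 0 & \cdots & 0 \\ 0 & \varphi_{k,2}(i) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \varphi_{k,L}(i) \end{pmatrix}. \tag{2.47} $$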
2.1.5. Summary of notational conventions
A boldface, lowercase, nonitalic symbol with a discrete-time index as a superscript, e.g., $\mathbf{b}^{(n)} \in \Xi^{K}$, $\mathbf{h}^{(n)} \in \mathbb{C}^{K}$, $\mathbf{y}^{(n)} \in \mathbb{C}^{KL}$, $\mathbf{w}^{(n)} \in \mathbb{C}^{KL}$, denotes a vector of $K$ or $KL$ variables over the $n$th symbol interval. A boldface, lowercase, italic symbol, e.g., $\boldsymbol{b} \in \Xi^{N_b K}$, $\boldsymbol{h} \in \mathbb{C}^{N_b K}$, $\boldsymbol{y} \in \mathbb{C}^{N_b KL}$, $\boldsymbol{w} \in \mathbb{C}^{N_b KL}$, denotes a vector of $N_b K$ or $N_b KL$ variables concatenated over the whole data packet of $N_b$ symbol intervals. A boldface, lowercase, italic symbol with a discrete-time index as a superscript, e.g., $\boldsymbol{b}^{(n)} \in \Xi^{NK}$, $\boldsymbol{h}^{(n)} \in \mathbb{C}^{NK}$, $\boldsymbol{y}^{(n)} \in \mathbb{C}^{NKL}$, $\boldsymbol{w}^{(n)} \in \mathbb{C}^{NKL}$, denotes a vector of $NK$ or $NKL$ variables concatenated over the observation window of $N$ symbol intervals. A boldface, lowercase, italic symbol with a discrete-time index as a superscript and the symbol $e$ (denoting edge) as a subscript, e.g., $\boldsymbol{b}_e^{(n)} \in \Xi^{4K}$, $\boldsymbol{h}_e^{(n)} \in \mathbb{C}^{4K}$, $\boldsymbol{y}_e^{(n)} \in \mathbb{C}^{4KL}$, $\boldsymbol{w}_e^{(n)} \in \mathbb{C}^{4KL}$, denotes a vector of $4K$ or $4KL$ variables over the symbol intervals $n-P-2$, $n-P-1$, $n+P+1$, $n+P+2$. A boldface, lowercase, italic symbol with a bar above and a discrete-time index as a superscript, e.g., $\bar{\boldsymbol{b}}^{(n)} \in \Xi^{(N+4)K}$, $\bar{\boldsymbol{h}}^{(n)} \in \mathbb{C}^{(N+4)K}$, $\bar{\boldsymbol{y}}^{(n)} \in \mathbb{C}^{(N+4)KL}$, $\bar{\boldsymbol{w}}^{(n)} \in \mathbb{C}^{(N+4)KL}$, denotes a vector of $(N+4)K$ or $(N+4)KL$ variables concatenated over the observation window of $N$ symbol intervals and the previous and following two symbol intervals causing the edge effect.

Corresponding conventions apply to matrices as well. The boldface, uppercase symbols denote matrices for one symbol interval, and the boldface, uppercase, calligraphic symbols, e.g., $\boldsymbol{\mathcal{R}}$, denote matrices concatenated over several symbol intervals. If the discrete-time index is included as a superscript to the calligraphic symbol, the concatenation is over a processing window of length $N$, otherwise over the data packet length $N_b$. If there is a bar above the calligraphic symbol, the concatenation is over $N + 4$ symbols including the edge effect. The symbol $e$ as a subscript refers to concatenation over the edge symbols only.
2.2. Review of earlier and parallel work
The relevant background literature is reviewed in this section. The main emphasis is on multiuser demodulation techniques for DS-CDMA systems⁵, which are most important either from a practical or theoretical point of view. Most centralized⁶ multiuser receivers can be illustrated as in Figs. 2.2(a) or 2.2(b). The multiuser signal processing can be performed either before the multipath combining, by processing the matched filter bank output vector $\boldsymbol{y}$, or after the multipath combining, by processing the maximal ratio combined matched filter bank output vector

$$ \boldsymbol{y}_{[\mathrm{MRC}]} = \boldsymbol{\mathcal{C}}^{\mathrm{H}}\boldsymbol{y}. \tag{2.48} $$

It should be noted, however, that the block diagrams in Figs. 2.2 are simplified and cannot fit all multiuser receivers into their framework. Most multiuser receivers can also be implemented before matched filtering, i.e., by processing the received spread-spectrum signal samples $\boldsymbol{r}$. It should also be noted that most multiuser receivers alleviate not only the detrimental effects of multiple-access interference, but the intersymbol interference as well.

⁵ Although the multiuser receivers have gained most interest in conjunction with DS-CDMA systems, they can be applied in any nonorthogonal multiple-access scheme. They have been considered for TDMA [59], hybrid DS-CDMA/TDMA [60, 61], FH-CDMA [62, 63], MC-CDMA [64, 65, 66, 67], wavelet packet CDMA [15], and spread-signature CDMA [68] communications.
⁶ Centralized multiuser detectors (sometimes also called joint detectors) make a joint detection of the symbols of different users. Decentralized multiuser detectors (sometimes also called single-user detectors) demodulate a signal of one desired user only.
The performance of multiuser receivers can be measured by bit error probability (BEP) or bit error rate (BER), as well as by the mean squared error (MSE) of the detector output or channel estimates. Furthermore, other performance criteria yielding simpler analysis than the bit error probability have also been considered. They include the asymptotic multiuser efficiency (AME) [47, 48], and the near-far resistance (NFR) [69, 70, 54]. The AME describes the asymptotic limit of the loss in the signal-to-noise ratio (SNR) as the power spectral density of the noise approaches zero. For coherent BPSK modulation in AWGN channels, AME is defined as

$$ \eta_k = \sup\left\{ \varrho \in [0, 1] : \lim_{\sigma^2 \to 0} \frac{P_k}{Q\left(\sqrt{\varrho E_k / \sigma^2}\right)} < \infty \right\}, \tag{2.49} $$

where sup denotes the smallest upper bound, $Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\, dt$ is the normalized and scaled Gaussian complementary error function, and $P_k$ is the bit error probability of user $k$ with the particular multiuser detector. The AME for multiuser detectors in Rayleigh fading channels has been defined in [56, 71, 72]. The Rician fading case has been considered in [73, 74]. The near-far resistance is the value of the AME for the worst possible interfering energy combination and is defined as

$$ \bar{\eta}_k = \inf_{E_l \geq 0,\ l \neq k} \eta_k. \tag{2.50} $$

The detector for user $k$ is said to be near-far resistant if $\bar{\eta}_k > 0$.

In Section 2.2.1, some of the key results on single-user fading channel receiver techniques are reviewed. Optimal multiuser receivers are considered in Section 2.2.2 and suboptimal ones in Section 2.2.3.
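The definitions (2.49) and (2.50) can be made concrete with the simplest textbook case, which is not worked out in this section and is used here only as an assumed illustration: the conventional matched filter detector for two synchronous BPSK users in AWGN with signature crosscorrelation $\rho$. For that case the standard closed-form results are $P_1 = \tfrac{1}{2}Q\big((\sqrt{E_1} - \rho\sqrt{E_2})/\sigma\big) + \tfrac{1}{2}Q\big((\sqrt{E_1} + \rho\sqrt{E_2})/\sigma\big)$ and $\eta_1 = \max\{0,\, 1 - |\rho|\sqrt{E_2/E_1}\}^2$. The function names below are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x), expressed through erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bep_conventional(e1, e2, rho, sigma):
    """Bit error probability of user 1 with the conventional detector.

    Assumption: two synchronous BPSK users in AWGN, crosscorrelation rho.
    """
    a1, a2 = math.sqrt(e1), math.sqrt(e2)
    return (0.5 * q_function((a1 - rho * a2) / sigma)
            + 0.5 * q_function((a1 + rho * a2) / sigma))

def ame_conventional(e1, e2, rho):
    """Closed-form AME of user 1 for the same two-user example."""
    return max(0.0, 1.0 - abs(rho) * math.sqrt(e2 / e1)) ** 2

e1, e2, rho = 1.0, 0.25, 0.4
eta = ame_conventional(e1, e2, rho)
# The ratio in (2.49), evaluated at the AME, stays bounded as sigma shrinks.
ratios = [bep_conventional(e1, e2, rho, s) / q_function(math.sqrt(eta * e1) / s)
          for s in (0.4, 0.2, 0.1)]
# A strong interferer drives the AME to zero, so the infimum in (2.50) is zero.
eta_strong_interferer = ame_conventional(e1, 100.0, rho)
```

In this example the ratio settles near one half as the noise level decreases, which is the bounded limit required by (2.49), whereas the conventional detector has zero near-far resistance whenever $\rho \neq 0$.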
2.2.1. Receivers for fading channel communications
Some key aspects of the receivers for single-user ($K = 1$) fading channel communications will be reviewed briefly in this section. Tutorial expositions of fading channel communications have been presented by Turin [75] and Stein [76]. More complete treatments can be found in books by Proakis [23], and Schwartz et al. [77]. A treatment of the mobile radio channel can be found in [58]. A comprehensive survey of the literature on fading channel communications is included in the thesis of Mämmelä [78].
[Figure 2.2 (block diagrams; graphics not reproduced). (a) The received signal r(t) is fed to a bank of matched filters (MF 1,1 ... MF K,L); the MF outputs of each user are combined by multipath combining, and the combined outputs $y_k^{(n)[\mathrm{MRC}]}$ are processed by a multiuser detector that produces the data decisions. (b) The MF bank outputs $y_{k,l}^{(n)}$ are processed first by a multiuser detector; its outputs $y_{k,l}^{(n)[\mathrm{MUD}]}$ are then passed to multipath combining and detection for each user.]
Fig. 2.2. Multiuser receiver structures.
For slowly fading channels the channel impulse response can be estimated precisely, so that it can be assumed to be known. In that case the optimal receiver (yielding the lowest probability of symbol error) for the single user $k$ includes a filter matched to the convolution of the signature waveform $s_k^{(n)}(t)$ and the channel impulse response $c_k^{(n)}(t)$. In multipath channels such a matched filter is called a coherent RAKE receiver [79, 23]. The output of the coherent RAKE receiver for user $k$ is obtained by maximal ratio combining (MRC) the MF outputs for different propagation paths, i.e., by
\[
y_k^{(n)[\mathrm{MRC}]} = \mathbf{c}_k^{\mathrm{H}(n)} \mathbf{y}_k^{(n)} = \sum_{l=1}^{L} c_{k,l}^{*(n)} y_{k,l}^{(n)}. \tag{2.51}
\]
If the delay spread is significantly smaller than the symbol interval ($T_m \ll T$), the intersymbol interference (ISI) can be assumed to be negligible and a hard decision on the RAKE output $y_k^{(n)[\mathrm{MRC}]}$ yields a (near-)optimal decision. If the channel introduces ISI, the receiver minimizing the error probability is significantly more complicated to implement. Thus, another optimization criterion, namely the minimum symbol sequence error probability, is selected. The optimum receiver then performs maximum likelihood sequence detection (MLSD) [23] in the presence of ISI. The MLSD can be implemented efficiently by applying the Viterbi algorithm [23, 80, 81]. Suboptimal receivers for ISI channels, which are simpler than the MLSD and do not require a separate channel estimator, include linear equalizers and decision-feedback (DF) equalizers (DFE) [23]. The DFEs can also be applied in frequency-selective channels [75]. Their overall impulse response should be such that the equalizer implicitly performs both maximal ratio combining and ISI reduction. The equalizers can be made adaptive so that they automatically tune their impulse response to approximate the desired one [23], or the impulse response can be computed by utilizing a channel impulse response estimate [82].
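The maximal ratio combining rule (2.51) followed by a hard decision can be sketched in a few lines. This is a minimal illustration assuming BPSK symbols, perfectly known channel coefficients, and negligible ISI; the function names are hypothetical and the numbers in the demo are arbitrary.

```python
from typing import Sequence


def mrc_combine(channel_coeffs: Sequence[complex], mf_outputs: Sequence[complex]) -> complex:
    """Maximal ratio combining as in (2.51): sum over paths of conj(c_{k,l}) * y_{k,l}."""
    if len(channel_coeffs) != len(mf_outputs):
        raise ValueError("one channel coefficient is needed per matched filter output")
    return sum(c.conjugate() * y for c, y in zip(channel_coeffs, mf_outputs))


def bpsk_decision(statistic: complex) -> int:
    """Hard BPSK decision on the real part of the combined statistic."""
    return 1 if statistic.real >= 0.0 else -1


if __name__ == "__main__":
    coeffs = [0.8 + 0.3j, -0.2 + 0.5j]      # assumed two-path channel of one user
    bit = -1
    outputs = [c * bit for c in coeffs]     # noise-free MF outputs of the two paths
    combined = mrc_combine(coeffs, outputs)
    print(combined, bpsk_decision(combined))
```

Each path is weighted by the conjugate of its own coefficient, so the path contributions add coherently with weights proportional to their strengths.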
In fast or relatively fast fading channels, the channel impulse response cannot be assumed to be known. Thus, the optimal receiver is somewhat different from that in slowly fading channels. The receiver minimizing the symbol error probability is again complex to implement and difficult to analyze [83]. Therefore, the MLSD is usually selected to be the optimal reference receiver. The MLSD receiver consists of an estimator, which estimates the received noiseless signal, and a correlator, which correlates (multiplies) the received signal with the signal estimate [84, 85, 86, 78]. The receiver structure is called an estimator-correlator [85]. The optimal received noiseless signal estimator with known delays according to the MLSD criterion is the estimator which minimizes the mean squared error at the estimator output. The estimator is thus called the minimum mean squared error (MMSE) estimator. Since the channel noise as well as the complex channel coefficients are assumed to have a Gaussian distribution, the MMSE estimator is a linear filter. The estimation and correlation must be performed for all possible transmitted data sequences. The data sequence yielding the largest correlator output is selected as the maximum likelihood sequence decision. Therefore, the computational complexity of the MLSD receiver depends exponentially on the transmitted data packet size. For that reason the MLSD is not feasible for most practical applications.
Suboptimal receivers which are simpler to implement can be obtained by applying differentially coherent or noncoherent receivers [77], or by decoupling the channel estimation and data detection. Blind sequence detectors not needing explicit channel estimation have also been proposed [87]. However, their applicability to time-varying channels has not been studied. The channel coefficients can be estimated by filtering the MF outputs by a channel estimation filter if the effect of data symbols is removed from them. The channel estimation filter can in principle be either a predictor⁷, a filter⁸, or a smoother [88, p. 400]. A predictor uses only the past samples to estimate the current channel coefficient, whereas a “filter” uses also the current sample. A smoother uses the past, current, and future samples. The removal of the data modulation can be accomplished in a data-aided (DA), decision-directed (DD)⁹, or non-data-aided (NDA) manner [89]. The DA channel estimators utilize MF output samples for which the data is known. This can be accomplished by transmitting a separate channel sounding reference signal (pilot signal) from which the channel is estimated [90, 91, 92]. For example, a code-division duplexed pilot signal is utilized in the IS-95 CDMA system downlink [29]. Another way of implementing DA channel estimation is to utilize known pilot symbols time-division multiplexed into the transmitted data stream [93, 94, 95, 96, 97, 98, 99]. The channel needs to be interpolated between the pilot symbol intervals. The DD channel estimators utilize the decisions of the receiver to remove the effect of data modulation [100, 101, 102, 103]. DD channel estimation often applies prediction of the complex channel coefficients, since only the past decisions are available to the channel estimator [100, 101, 102]. By using tentative decisions, smoother-type channel estimation filters can also be applied [103]. The NDA channel estimators (also called blind channel estimators) estimate the channel without utilizing data or decisions. There has been an increasing interest in blind channel identification [104, 105, 106]. Its application to fast or relatively fast fading channels has gained very little attention [107].
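The difference between a predictor, a “filter”, and a smoother can be sketched with a plain moving average. The averaging window is an assumption chosen for brevity, not the estimation filter used in the thesis, and the data symbols are assumed to have unit modulus (e.g., BPSK) so that multiplying by the conjugate symbol removes the modulation. All names are hypothetical.

```python
from typing import List, Sequence


def remove_modulation(mf_outputs: Sequence[complex], symbols: Sequence[complex]) -> List[complex]:
    """Strip known or decided unit-modulus symbols: z(n) = y(n) * conj(b(n))."""
    return [y * complex(b).conjugate() for y, b in zip(mf_outputs, symbols)]


def _window_mean(samples: Sequence[complex], start: int, stop: int) -> complex:
    """Average of samples[start:stop]; the window must lie inside the record."""
    if start < 0 or stop > len(samples) or start >= stop:
        raise ValueError("window does not fit inside the sample record")
    window = samples[start:stop]
    return sum(window) / len(window)


def predictor(samples: Sequence[complex], n: int, span: int) -> complex:
    """Estimate for interval n from the `span` past samples only."""
    return _window_mean(samples, n - span, n)


def filter_estimate(samples: Sequence[complex], n: int, span: int) -> complex:
    """Estimate for interval n from the past samples and the current one."""
    return _window_mean(samples, n - span, n + 1)


def smoother(samples: Sequence[complex], n: int, span: int) -> complex:
    """Estimate for interval n from past, current, and future samples."""
    return _window_mean(samples, n - span, n + span + 1)


if __name__ == "__main__":
    ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # channel drifting linearly in time
    print(predictor(ramp, 3, 2), filter_estimate(ramp, 3, 2), smoother(ramp, 3, 2))
```

For a channel that drifts linearly, the symmetric smoother window tracks the true value at interval `n`, whereas the predictor and the “filter” lag behind. This is one reason why smoothing with tentative decisions or interpolation between pilot symbols is attractive when the processing delay can be tolerated.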
2.2.2. Optimal multiuser demodulation
The centralized multiuser receiver minimizing the bit error probability of one symbol of a particular user has been studied by Verdú [46] for the known channel case. The minimum error probability receiver must find the most probably transmitted data symbol for all users for all symbol intervals. In other words, $N_b K$ separate minimizations need to be performed. Each minimization computes a metric for all $|\Xi|^{N_b(K-1)}$ possible interfering data symbol combinations, where $|\Xi|$ denotes the cardinality of the set $\Xi$. Although a dynamic programming algorithm can be devised to implement the minimum probability of error detector, the required number of operations grows exponentially with the number of users. Furthermore, the performance of the minimum probability of error detector is difficult to analyze. Therefore, similarly to the single-user ISI channels, the minimum symbol sequence error probability is selected to be the optimization criterion. Thus, the maximum likelihood sequence detector will be the optimal multiuser detector.
⁷ Only forward predictors are considered in this thesis.
⁸ The term “filter” has here unavoidably two meanings: it denotes a general channel estimation filter or it denotes a certain type of filter as in [88, p. 400]. In nearly all cases the term “filter” has the former meaning in this thesis. In the case of the latter meaning, the word will be given in quotes in the sequel.
⁹ Decision-feedback (DF) and decision-directed are synonyms. In this thesis, however, the term decision-directed is used in conjunction with removal of the effect of the data symbols in the channel estimation, whereas the term decision-feedback is used in conjunction with the decisions utilized in intersymbol interference or multiple-access interference cancellation.
The MLSD multiuser receiver minimizes the probability of an erroneous decision on the bit vector $\mathbf{b}$ including the data symbols of all users on all symbol intervals. If the channel is known, the decision can be expressed in the form [47, 71, 108]
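\[
\hat{\mathbf{b}}_{[\mathrm{MLSD}]} = \arg\max_{\mathbf{b} \in \Xi^{N_b K}} \Omega(\mathbf{b}), \tag{2.52}
\]
where the log-likelihood function $\Omega(\mathbf{b})$ is
\[
\Omega(\mathbf{b}) = 2\,\mathrm{Re}\!\left( \mathbf{b}^{\mathrm{H}} \mathbf{A}^{\mathrm{H}} \mathbf{C}^{\mathrm{H}} \mathbf{y} \right) - \mathbf{b}^{\mathrm{H}} \mathbf{A}^{\mathrm{H}} \mathbf{C}^{\mathrm{H}} \mathbf{R} \mathbf{C} \mathbf{A} \mathbf{b}. \tag{2.53}
\]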
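An exhaustive search over the metric (2.53) can be sketched for a very small system. The sketch assumes a single symbol interval, synchronous real-valued BPSK users, one propagation path, and a known channel with $\mathbf{C} = \mathbf{I}$, so that the metric reduces to $2\mathbf{b}^{\top}\mathbf{A}\mathbf{y} - \mathbf{b}^{\top}\mathbf{A}\mathbf{R}\mathbf{A}\mathbf{b}$. The candidate with the largest metric is selected, which corresponds to the smallest Euclidean distance between the received and the hypothesized noiseless signal. Function names and numbers are hypothetical.

```python
import itertools
from typing import List, Sequence, Tuple


def log_likelihood(b: Sequence[int], amplitudes: Sequence[float],
                   corr: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Real-valued special case of (2.53) with C = I: 2 b^T A y - b^T A R A b."""
    k = len(b)
    ab = [amplitudes[i] * b[i] for i in range(k)]
    linear = sum(ab[i] * y[i] for i in range(k))
    quadratic = sum(ab[i] * corr[i][j] * ab[j] for i in range(k) for j in range(k))
    return 2.0 * linear - quadratic


def mlsd_brute_force(amplitudes: Sequence[float], corr: Sequence[Sequence[float]],
                     y: Sequence[float]) -> Tuple[int, ...]:
    """Evaluate the metric for all 2^K BPSK vectors and keep the best one."""
    candidates = itertools.product((-1, 1), repeat=len(y))
    return max(candidates, key=lambda b: log_likelihood(b, amplitudes, corr, y))


def noise_free_mf_outputs(b: Sequence[int], amplitudes: Sequence[float],
                          corr: Sequence[Sequence[float]]) -> List[float]:
    """Matched filter bank output y = R A b without noise."""
    k = len(b)
    return [sum(corr[i][j] * amplitudes[j] * b[j] for j in range(k)) for i in range(k)]


if __name__ == "__main__":
    corr = [[1.0, 0.7], [0.7, 1.0]]
    amplitudes = [1.0, 3.0]                  # user 2 is much stronger (near-far case)
    y = noise_free_mf_outputs((1, -1), amplitudes, corr)
    print(y, mlsd_brute_force(amplitudes, corr, y))
```

The number of candidates doubles with every added user (and with every added symbol interval in the general case), which is the exponential complexity discussed in the text.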
The maximum likelihood detector admits the structure of the receiver in Fig. 2.2(a). If the signature waveforms are time-invariant, the optimization in (2.52) can be implemented by a dynamic programming algorithm so that the implementation complexity depends exponentially on the number of users only, not on the data packet length [46, 47]. However, the implementation complexity makes the MLSD infeasible for many practical applications. The asymptotic multiuser efficiency of the MLSD has been analyzed in [48, 109, 110]. MLSD for trellis-coded modulated CDMA transmissions in AWGN channels has been studied in [111], and for convolutionally encoded transmissions in [112]. The effect of delay estimation errors on MLSD has been considered in [113]. Joint maximum likelihood sequence detection and amplitude estimation in AWGN channels has been analyzed in [114, 115]. MLSD for flat Rician fading channels with synchronous CDMA has been considered in [73] and for two-path Rician fading channels with asynchronous CDMA in [74]. MLSD in unknown slowly fading channels has been considered in [116].
The performance of the MLSD is analyzed in [47, 48]. It turned out to be impossible to derive a closed-form bit error probability expression for the MLSD. Upper and lower bounds, most of which are complicated to calculate, were found. The simplest lower bound is the single-user bound (or matched filter bound), which is the performance of a communication system with one active user (K = 1). The performance results on the MLSD demonstrate that significant performance gains can be obtained over the conventional single-user receiver. It has been demonstrated that CDMA systems are not inherently interference limited; the interference limitation is a property of the conventional detector.
The maximum likelihood sequence detection for relatively fast fading channels has also been analyzed. MLSD for synchronous CDMA in Rayleigh fading channels has been presented in [72, 117]. The resulting MLSD receiver consists of the received noiseless signal estimator for all possible data sequences and a correlator, which multiplies the received signal with the estimated received noiseless signal (estimator-correlator receiver).
The optimal MLSD receiver for channels with unknown user and multipath delays $\tau_k$ and $\tau_{k,l}$ is significantly more difficult to derive. The reason is that the received signal depends nonlinearly on the delays, and the MLSD receiver does not admit a simple estimator-correlator interpretation. One way to approximate the MLSD for the reception of a signal with unknown delays is to perform joint maximum likelihood estimation on the data, the received complex amplitudes, and the delays [118]. The joint ML estimation clearly has extremely high computational complexity, which is exponential in the product of the number of users $K$, the number of propagation paths $L$, and the number of samples per symbol interval $N_s$.
Optimal decentralized multiuser detectors for AWGN channels have been considered in [119], where the multiple-access interference was modeled as non-Gaussian noise. The optimal decentralized multiuser detectors can also utilize the knowledge of a subset of the K − 1 interfering signature waveforms. The computational complexity of the optimal decentralized multiuser detector also depends exponentially on the number of users.
2.2.3. Suboptimal multiuser demodulation
Due to the prohibitive computational complexity of the optimal MLSD multiuser receiver, suboptimal solutions have been studied extensively. They approximate the optimal MLSD receiver in one way or another. Most receivers can process either the matched filter bank output (Fig. 2.2(b)) or its maximal ratio combined version (Fig. 2.2(a)). The latter receivers do not eliminate the effect of MAI on channel estimation. Therefore, the multiuser detectors processing the MF bank output are often more desirable in practice, and the discussion in this thesis will focus on such receivers. Section 2.2.3.1 concentrates on linear equalizer type receivers, whereas interference cancellation receivers are considered in Section 2.2.3.2. Other suboptimal receivers are reviewed in Section 2.2.3.3.
2.2.3.1. Linear equalizer type multiuser demodulation
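Linear equalizer type multiuser receivers process the matched filter output vector $\mathbf{y}$ (or the maximal ratio combined vector $\mathbf{y}_{[\mathrm{MRC}]}$) by a linear operation. In other words, the output $\mathbf{y}_{[\mathrm{LIN}]}$ of a linear multiuser detector $\mathbf{T} \in \mathbb{C}^{N_b K L \times N_b K L}$ is
\[
\mathbf{y}_{[\mathrm{LIN}]} = \mathbf{T}^{\top} \mathbf{y}. \tag{2.54}
\]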
Different choices of the matrix $\mathbf{T}$ yield different multiuser receivers. The identity matrix $\mathbf{T} = \mathbf{I}_{N_b K L}$ is equivalent to the conventional single-user receiver. The linear equalizer type receivers apply the principles of linear equalization, which has been used in ISI reduction [23].
The decorrelating or zero-forcing receiver, which completely removes the MAI, corresponds to the choice [54, 120]
\[
\mathbf{T} = \mathbf{R}^{-1}. \tag{2.55}
\]
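The effect of the decorrelator (2.55) can be illustrated on a small synchronous example. The sketch assumes real-valued BPSK, one path per user, a known channel, and a symmetric correlation matrix, so that $\mathbf{T}^{\top}\mathbf{y} = \mathbf{R}^{-1}\mathbf{y}$. Instead of forming the inverse explicitly, the linear system $\mathbf{R}\mathbf{x} = \mathbf{y}$ is solved by Gauss-Jordan elimination. Names and numbers are hypothetical.

```python
from typing import List, Sequence


def solve(matrix: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[row][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def decorrelate(corr: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Decorrelator output R^{-1} y, see (2.54) and (2.55)."""
    return solve(corr, y)


def hard_decisions(values: Sequence[float]) -> List[int]:
    """BPSK sign decisions."""
    return [1 if v >= 0.0 else -1 for v in values]


if __name__ == "__main__":
    corr = [[1.0, 0.7], [0.7, 1.0]]
    # Noise-free MF outputs y = R A b for amplitudes (1, 3) and bits (+1, -1).
    y = [1.0 * 1 + 0.7 * 3.0 * (-1), 0.7 * 1.0 * 1 + 3.0 * (-1)]
    print(hard_decisions(y), hard_decisions(decorrelate(corr, y)))
```

In the noise-free case the decorrelator output equals the amplitude-scaled bits regardless of the interferer power, whereas the sign of the raw MF output of the weak user is dominated by the strong interferer. The price is noise enhancement, which this noise-free sketch does not show.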
Performance of the decorrelating detector in AWGN channels has been analyzed in [70, 54, 121, 122]. It has been shown that the decorrelating detector is optimally near-far resistant in the sense that it achieves the same NFR as the MLSD. The performance of the decorrelator in known, slowly fading channels has been analyzed in [120, 71, 123, 108, 124]. The differentially coherent case has been considered in [125]. The corresponding analysis for estimated, relatively fast fading channels has been presented in [73, 74, 72, 126, 127]. The performance of the decorrelator utilizing the matched filter bank output $\mathbf{y}$ or the maximal ratio combined MF bank output $\mathbf{y}_{[\mathrm{MRC}]}$ has been compared in [128, 129]. The principle of the decorrelating receiver has been extended to receivers utilizing antenna arrays [130, 131, 132, 133, 134, 135, 129], multiple base stations [136, 137, 138], or multiple data rates [139, 140]. Adaptive implementations of the decorrelating receiver for synchronous CDMA systems have been considered in [141, 142] and for asynchronous CDMA systems in [143]. The decorrelating receiver for convolutionally encoded CDMA transmissions in AWGN channels has been studied in [144]. Decorrelating receivers for quasi-synchronous CDMA systems in AWGN channels without precise delay estimation have been proposed in [145, 146, 147, 148], and for code acquisition in quasi-synchronous CDMA in [149]. The effect of delay estimation errors on the decorrelator performance has been analyzed in [150, 151]. The impact of quantization due to the finite precision representation of the numbers in the receiver has been considered in [152, 153].
A partial decorrelator which also makes the additive channel noise component white, the so-called noise-whitening detector, is defined as
\[
\mathbf{T} = \mathbf{L}^{-1}, \tag{2.56}
\]
where $\mathbf{L}$ is the lower triangular Cholesky factor of $\mathbf{R}$ such that $\mathbf{R} = \mathbf{L}^{\top} \mathbf{L}$ [154]¹⁰. The noise-whitening detector forces the MAI due to past symbols to zero. The MAI due to future symbols may be suppressed by interference cancellation utilizing decision feedback [155, 154], or the MAI may be handled by some suboptimal tree-search algorithm [157, 158, 159].
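Because the factorization $\mathbf{R} = \mathbf{L}^{\top}\mathbf{L}$ uses the upper-times-lower ordering rather than the more common lower-times-upper one, a small sketch of how such a factor can be computed may be helpful. The routine below assumes a real, symmetric, positive definite correlation matrix and works from the last column backwards; it is an illustration with hypothetical names, not the implementation used in the cited works. With $\mathbf{y}_{[\mathrm{LIN}]} = \mathbf{T}^{\top}\mathbf{y}$ as in (2.54) and MF output noise with covariance proportional to $\mathbf{R}$ (the usual matched filter model, assumed here), the output noise covariance is proportional to $\mathbf{L}^{-\top}\mathbf{L}^{\top}\mathbf{L}\mathbf{L}^{-1} = \mathbf{I}$, i.e., the noise is white.

```python
import math
from typing import List, Sequence


def ul_cholesky(corr: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the lower triangular L with corr = L^T L (upper times lower ordering)."""
    n = len(corr)
    upper = [[0.0] * n for _ in range(n)]          # upper holds L^T
    for j in range(n - 1, -1, -1):                 # process columns from last to first
        diag = corr[j][j] - sum(upper[j][k] ** 2 for k in range(j + 1, n))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        upper[j][j] = math.sqrt(diag)
        for i in range(j):
            partial = sum(upper[i][k] * upper[j][k] for k in range(j + 1, n))
            upper[i][j] = (corr[i][j] - partial) / upper[j][j]
    return [[upper[c][r] for c in range(n)] for r in range(n)]   # transpose to get L


def gram(lower: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compute L^T L, which should reproduce the factored matrix."""
    n = len(lower)
    return [[sum(lower[k][i] * lower[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    factor = ul_cholesky([[1.0, 0.6], [0.6, 1.0]])
    print(factor, gram(factor))
```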
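If the information symbols $b_k^{(n)}$ are independent and uniformly distributed and the channel is known, the linear receiver which minimizes the mean squared errors at the detector outputs (the so-called LMMSE detector) is [88]
\[
\mathbf{T} = \left[ \mathbf{R} + \sigma^2 \left( \mathbf{C} \mathbf{A} \mathbf{A}^{\mathrm{H}} \mathbf{C}^{\mathrm{H}} \right)^{-1} \right]^{-1}. \tag{2.57}
\]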
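The structure of (2.57) can be examined in a two-user sketch. It assumes real amplitudes, a known channel with $\mathbf{C} = \mathbf{I}$, and a single symbol interval, so that $\mathbf{C}\mathbf{A}\mathbf{A}^{\mathrm{H}}\mathbf{C}^{\mathrm{H}}$ is diagonal with entries $a_k^2$ and the LMMSE matrix becomes $[\mathbf{R} + \sigma^2 \mathrm{diag}(a_k^{-2})]^{-1}$. The function names are hypothetical.

```python
from typing import List, Sequence


def inverse_2x2(m: Sequence[Sequence[float]]) -> List[List[float]]:
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def lmmse_matrix(corr: Sequence[Sequence[float]], amplitudes: Sequence[float],
                 noise_var: float) -> List[List[float]]:
    """Two-user special case of (2.57) with C = I and real amplitudes."""
    loaded = [list(row) for row in corr]
    for k, amp in enumerate(amplitudes):
        loaded[k][k] += noise_var / (amp * amp)   # sigma^2 * (A A^H)^{-1} on the diagonal
    return inverse_2x2(loaded)


if __name__ == "__main__":
    corr = [[1.0, 0.5], [0.5, 1.0]]
    for noise_var in (0.0, 0.1, 10.0):
        print(noise_var, lmmse_matrix(corr, [1.0, 2.0], noise_var))
```

With vanishing noise the matrix coincides with the decorrelator $\mathbf{R}^{-1}$ of (2.55). As the noise grows, the diagonal loading dominates and the detector approaches a scaled version of the conventional receiver, which reflects the tradeoff between MAI suppression and noise enhancement.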
The LMMSE receiver is equal to the linear receiver maximizing the signal-to-interference-plus-noise ratio (SINR) [160]. Centralized LMMSE receivers have been proposed for AWGN channels in [161], for fading channels in [124, 162, 163], and for antenna array receivers in [132, 133, 164, 165]. Bounds for the NFR and SINR of the LMMSE receiver in AWGN channels have been derived in [166], and the bit error probability has been analyzed in [167].
The LMMSE receivers have attracted most interest due to their applicability to decentralized adaptive implementation. Decentralized LMMSE multiuser receivers for AWGN or slowly fading channels suitable for adaptive implementation based on training have been considered in [168, 169, 160, 170]. The convergence of the adaptive algorithms for the LMMSE multiuser receivers has been considered in [171, 172, 173]. A modified adaptive multiuser receiver applicable to relatively fast fading frequency-selective channels with channel state information has been proposed in [174]. CDMA system capacity with LMMSE receivers has been studied in [175, 176], where the spreading-coding tradeoff¹¹ has been addressed for systems with multiuser receivers. An improved LMMSE receiver, less sensitive to the time delay estimation errors, has been proposed in [177]. Receivers suitable for blind adaptation utilizing the minimum output energy (MOE) criterion have been studied in [178, 179, 180, 181]. It has been shown that the linear filter optimal in the MOE sense is equal to the linear filter optimal in the MMSE sense [178]. A blind receiver performing both the MOE filtering and timing estimation has been studied in [182]. Another blind algorithm, namely a linearly constrained constant modulus algorithm, has been applied in [183].
¹⁰ The definition of Cholesky factorization used in this thesis is an upper triangular matrix times a lower triangular matrix [155, 154], as opposed to the usual lower triangular times upper triangular matrix [156].
Decentralized linear receivers include MAI-whitening filters, which model the multiple-access interference as colored noise. The filters are then designed to whiten the colored MAI noise plus the AWGN. The MAI-whitening filters have been studied in [184, 185, 186, 187, 188]. The implementation of the MAI-whitening filters is difficult, since it requires information on the MAI covariance. Approximate implementation results in adaptive receivers similar to their LMMSE counterparts [186].
The linear multiuser receivers ideally process the complete received data block, the length of which approaches infinity in asynchronous CDMA systems. In other words, the memory-length of the linear equalizer type receivers is infinite. In [54] it was shown that as $N_b \to \infty$ the decorrelating detector approaches a time-invariant, stable digital multichannel infinite impulse response (IIR) filter with the z-domain transfer function
\[
\mathbf{D}_d(z) = \left[ \mathbf{R}(2) z^{-2} + \mathbf{R}(1) z^{-1} + \mathbf{R}(0) + \mathbf{R}(-1) z + \mathbf{R}(-2) z^{2} \right]^{-1}. \tag{2.58}
\]
The input of $\mathbf{D}_d(z)$ is the matched filter bank output vector sequence $\mathbf{y}^{(n)}$. Since the matrix algebraic structure of the LMMSE detector is similar to that of the decorrelating detector, (2.58) can be generalized for it. The corresponding result applies for the noise-whitening detector as well [154]. The detectors can be presented in the form of (2.58) only in systems with time-invariant signature waveforms. The implementation of the multichannel IIR filter of the form (2.58) is not straightforward due to the symbolic computation of the inverse. Any multichannel IIR filter of the form (2.58) can also be represented in the form
\[
\mathbf{D}(z) = \sum_{i=-\infty}^{\infty} \mathbf{D}(-i) z^{i}, \tag{2.59}
\]
where the blocks $\mathbf{D}(-i) \in \mathbb{R}^{KL \times KL}$ define the filter coefficients. Truncation of (2.59) to obtain finite impulse response (FIR) filters has been suggested in [54] for the decorrelating detector. However, the effect of such a truncation on the detector performance was not analyzed. The truncation of the noise-whitening detector has been studied independently in [158], but the effect of the detector memory-length on the performance has not been analyzed.
¹¹ The spreading-coding tradeoff deals with the question of how much of the bandwidth expansion should be invested in forward error-correcting encoding and how much should be invested in spreading.
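The idea of truncating the series (2.59) can be illustrated with a scalar toy analogue of (2.58). The sketch assumes a single correlation sequence with $R(0) = 1$, $R(\pm 1) = \rho$, $R(\pm 2) = 0$ and $0 < |\rho| < 1/2$; the actual detectors are matrix valued, so this is only meant to show the decay of the coefficients. The taps are obtained by inverting a long tridiagonal Toeplitz system and compared with the closed form $d_i = \lambda^{|i|} / \sqrt{1 - 4\rho^2}$, where $\lambda = (\sqrt{1 - 4\rho^2} - 1)/(2\rho)$. All names are hypothetical.

```python
import math
from typing import List, Sequence


def center_column_of_inverse(rho: float, half_width: int) -> List[float]:
    """Solve the (2*half_width+1)-point tridiagonal Toeplitz system for a unit impulse.

    The matrix has 1 on the diagonal and rho on both off-diagonals; the solution
    approximates the taps d_i of 1 / (rho/z + 1 + rho*z). Thomas algorithm.
    """
    n = 2 * half_width + 1
    diag = [1.0] * n
    rhs = [0.0] * n
    rhs[half_width] = 1.0
    for i in range(1, n):                      # forward elimination
        w = rho / diag[i - 1]
        diag[i] -= w * rho
        rhs[i] -= w * rhs[i - 1]
    taps = [0.0] * n
    taps[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        taps[i] = (rhs[i] - rho * taps[i + 1]) / diag[i]
    return taps


def closed_form_tap(rho: float, i: int) -> float:
    """Closed-form tap of the scalar toy filter, valid for 0 < |rho| < 0.5."""
    root = math.sqrt(1.0 - 4.0 * rho * rho)
    lam = (root - 1.0) / (2.0 * rho)
    return lam ** abs(i) / root


def truncation_tail(taps: Sequence[float], center: int, memory: int) -> float:
    """Sum of tap magnitudes discarded by an FIR truncation to |i| <= memory."""
    return sum(abs(t) for idx, t in enumerate(taps) if abs(idx - center) > memory)


if __name__ == "__main__":
    taps = center_column_of_inverse(0.3, 20)
    for memory in (1, 2, 4, 8):
        print(memory, truncation_tail(taps, 20, memory))
```

The taps decay geometrically with $|i|$, so the discarded tail shrinks quickly as the memory-length grows. This is the intuition behind approximating the IIR detectors by FIR detectors with moderate memory-length.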
Several other ways to obtain finite memory-length multiuser detectors have been proposed. The most natural way is to leave symbol intervals regularly without transmission. This results in finite blocks of transmitted symbols, and the detectors then have finite memory-length [189, 190]. In [189], such an approach was called “isolation bit insertion”. This, however, degrades the bandwidth efficiency and requires some form of synchronism between users. Other approaches to obtain finite memory-length multiuser detectors include nonlinear subtraction of estimated multiple-access interference (“edge correction”) [191], and hard decision approximation of the decorrelator [161], which ends up with a decision-directed, nonlinear MAI canceler. One-shot detection [49, 192] has also been studied. FIR designs have been considered in [193].
In addition to the infinite memory, the linear multiuser receivers have relatively high implementation complexity due to the matrix inversion as in (2.55), (2.56), or (2.57). An approximate update algorithm has been proposed in [194, 191]. However, the algorithm is restricted to tracking only small changes in the correlations caused by minor delay changes. Approximate multistage linear equalizer type detectors have been proposed in [190]. Their computational requirements are still a cubic function of the number of users. Another approach called δ-adjusted multiuser detection has been proposed in [195], but its near-far resistance is still an open problem.
2.2.3.2. Interference cancellation
The idea of interference cancellation (IC) receivers is to estimate the multiple-access and multipath induced interference and then subtract the interference estimate from the MF bank (or MRC) output. The interference cancellation can be derived as an approximation of the MLSD receiver with the assumption that the data, amplitudes, and delays of the interfering users (or a subset of them) are known [55]. There are several principles of estimating the interference leading to different IC techniques. The interference can be canceled simultaneously from all users leading to parallel interference cancellation (PIC), or on a user-by-user basis leading to serial (successive) interference cancellation (SIC). Groupwise serial (GSIC) or parallel (GPIC) interference cancellation is also possible. The interference estimation can utilize tentative data decisions, in which case the scheme is called hard decision (HD) interference cancellation. If tentative data decisions are not used, the scheme is called soft decision (SD) interference cancellation. The interference cancellation can also iteratively improve the interference estimates. Such a technique is utilized in multistage receivers.
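The multistage hard-decision parallel interference cancellation (HD-PIC) output at the $m$th stage can be presented as [55]
\[
\mathbf{y}_{[\mathrm{HD\text{-}PIC}]}(m) = \mathbf{y} - \left( \mathbf{R} - \mathbf{I}_{N_b K L} \right) \hat{\mathbf{C}}(m-1) \mathbf{A} \hat{\mathbf{b}}(m-1), \tag{2.60}
\]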
where $\hat{\mathbf{C}}(m-1)$ and $\hat{\mathbf{b}}(m-1)$ denote the tentative channel and data estimates provided by stage $m-1$ of the multistage HD-PIC receiver. The multistage PIC can be initialized by any linear equalizer type receiver. In soft-decision parallel interference cancellation (SD-PIC), either the amplitude-data product is estimated linearly without making an explicit data decision, or a tentative data decision with a soft nonlinearity (such as a hyperbolic tangent or a linear clipper) is made. In other words, the product $\hat{\mathbf{C}}(m-1)\mathbf{A}\hat{\mathbf{b}}(m-1)$ of the estimates $\hat{\mathbf{C}}(m-1)$ and $\hat{\mathbf{b}}(m-1)$ and the matrix $\mathbf{A}$ in (2.60) is replaced by an estimate $\widehat{(\mathbf{C}\mathbf{A}\mathbf{b})}(m-1)$ of the product $\mathbf{C}\mathbf{A}\mathbf{b}$. In contrast to the linear equalizer type multiuser receivers, the PIC receivers have inherently finite memory. The output of the HD-PIC receiver for the $n$th symbol interval is
\[
\mathbf{y}^{(n)}_{[\mathrm{PIC}]}(m) = \mathbf{y}^{(n)} - \hat{\boldsymbol{\Psi}}^{(n)}_{[\mathrm{PIC}]}(m). \tag{2.61}
\]
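The multiple-access interference estimate $\hat{\boldsymbol{\Psi}}^{(n)}_{[\mathrm{PIC}]}(m)$ has the form
\[
\hat{\boldsymbol{\Psi}}^{(n)}_{[\mathrm{PIC}]}(m) = \sum_{i=-P_{\mathrm{PIC}}}^{P_{\mathrm{PIC}}} \left( \mathbf{R}(-i) - \delta_{i,0} \mathbf{I}_{KL} \right) \hat{\mathbf{C}}^{(n+i)}(m-1) \mathbf{A} \hat{\mathbf{b}}^{(n+i)}(m-1), \tag{2.62}
\]
where $\delta_{i,j}$ is the Kronecker delta, $P_{\mathrm{PIC}} = \lceil (T + T_m)/T \rceil$, and $\lceil x \rceil$ denotes the smallest integer larger than or equal to $x$. The tentative estimates and decisions may be replaced by final ones at those symbol intervals for which they are available [161]. The result is a decision-feedback HD-PIC receiver.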
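The multistage recursion (2.60) can be sketched for a simplified case. The example assumes a single symbol interval, synchronous real-valued BPSK users, one path per user, known amplitudes in place of the channel estimates, unit diagonal entries in the correlation matrix, and initialization by the conventional detector. These are simplifications made for illustration; the names are hypothetical.

```python
from typing import List, Sequence


def sign(value: float) -> int:
    """BPSK hard decision."""
    return 1 if value >= 0.0 else -1


def noise_free_mf_outputs(b: Sequence[int], amplitudes: Sequence[float],
                          corr: Sequence[Sequence[float]]) -> List[float]:
    """Matched filter bank output y = R A b without noise."""
    k = len(b)
    return [sum(corr[i][j] * amplitudes[j] * b[j] for j in range(k)) for i in range(k)]


def hd_pic(y: Sequence[float], corr: Sequence[Sequence[float]],
           amplitudes: Sequence[float], stages: int) -> List[int]:
    """Multistage HD-PIC in the spirit of (2.60), with known amplitudes.

    Stage 0 is the conventional detector. Each later stage subtracts the MAI
    rebuilt from the previous stage's decisions, i.e. (R - I) A b_hat(m-1),
    assuming corr[i][i] == 1 so that skipping j == i removes the diagonal.
    """
    k = len(y)
    decisions = [sign(v) for v in y]
    for _ in range(stages):
        interference = [sum(corr[i][j] * amplitudes[j] * decisions[j]
                            for j in range(k) if j != i) for i in range(k)]
        decisions = [sign(y[i] - interference[i]) for i in range(k)]
    return decisions


if __name__ == "__main__":
    corr = [[1.0, 0.7], [0.7, 1.0]]
    amplitudes = [1.0, 3.0]
    y = noise_free_mf_outputs((1, -1), amplitudes, corr)
    for stages in range(3):
        print(stages, hd_pic(y, corr, amplitudes, stages))
```

In this near-far example the conventional first stage decides the weak user incorrectly, but the strong user is detected correctly, so the first cancellation stage already removes its interference from the weak user.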
The multistage HD-PIC receiver has been proposed and analyzed for AWGN channels in [196, 55, 197, 198, 199, 200, 201]. The corresponding receivers for slowly fading channels have been studied in [202, 203, 204, 205, 206, 207, 208], and for relatively fast fading channels in [209, 210, 211, 57]. The HD-PIC receivers for transmissions with diversity encoding have been analyzed in [212], and for systems with multiple data rates in [213]. HD-PIC receivers for trellis-coded modulated CDMA systems in AWGN channels have been studied in [111]. The application of the HD-PIC to multiuser delay estimation in relatively fast fading channels has been considered in [214, 215, 57]. The SD-PIC receivers with linear data-amplitude product estimation for slowly fading channels have been considered in [216, 217], and for multicellular systems in [218]. The SD-PIC receivers with soft nonlinearity have been considered for AWGN channels in [219, 220, 221]. Modifications of the PIC receiver have also been presented. Replacing the matrix $(\mathbf{R} - \mathbf{I}_{N_b K L}) \hat{\mathbf{C}}(m-1) \mathbf{A}$ in (2.60) by an adaptively controlled weighting matrix for AWGN channels has been proposed in [219, 222, 223, 224, 225]. A partial PIC receiver with a weighting matrix in front of the matrix $(\mathbf{R} - \mathbf{I}_{N_b K L}) \hat{\mathbf{C}}(m-1) \mathbf{A}$ in (2.60) has been proposed in [226, 227, 228]. The weights were chosen in an ad hoc manner according to the reliability of the interference estimates. By applying the expectation-maximization (EM) or the space alternating generalized EM (SAGE) algorithm a class of iterative multistage receivers is obtained [229, 230, 231, 232]. The iterative EM or SAGE based receivers lead to the application of different modified interference cancellation principles. The effect of delay estimation errors on the performance of the HD-PIC receiver has been considered in [113], and on the SD-PIC receiver in [216].
The serial interference cancellation is performed on a user-by-user basis [233, 234]. In the SIC, the amplitude and data of user 1 are estimated first. Using the obtained estimates, the MAI estimate of user 1 is subtracted from the MF outputs of the rest of the users. Then the amplitude and data of user 2 are estimated, and the MAI estimate of user 2 is subtracted from the MF outputs of the users k = 3, 4, ..., K, etc. The cancellation should start with the user with the largest average power (indexed as user 1), the second most powerful user (indexed as user 2) should be canceled next, etc. The ordering is a problem in relatively fast fading channels, since it must be updated frequently. SD-SIC has been considered in [234, 235, 217], and HD-SIC in [233, 236, 237, 238, 239]. The SIC for multirate CDMA communications has been studied in [240, 241]. The SIC has the inherent problem that in asynchronous CDMA systems the processing window of user 1 must ideally be K symbols so that the MAI caused by users 1, 2, ..., K − 1 can be canceled from the MF output of user K [51]. Another problem with the SIC is that it may not yield good enough performance in heavily loaded CDMA systems, where the performance of the conventional receiver is poor. The reason is that the SIC is initialized by a conventional receiver for user 1. If the MAI estimate of the signal of user 1 is poor in the cancellations, the estimation errors propagate to all users. The SIC has good performance in systems where the powers of users differ significantly. This cannot be the case in systems with a very large number of users. The effect of delay estimation errors on the SD-SIC has been considered in [242] and on the HD-SIC in [243]. The combination of the PIC and the SIC receivers has been studied in [244].
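The user-by-user procedure described above can be sketched as follows. The example assumes a single symbol interval, synchronous real-valued BPSK users, one path per user, and known amplitudes that stand in for both the amplitude estimates and the average powers used for ordering. These simplifications and all names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def hd_sic(y: Sequence[float], corr: Sequence[Sequence[float]],
           amplitudes: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Hard-decision SIC: detect users in order of decreasing amplitude.

    After each decision the regenerated MAI of the detected user is subtracted
    from the MF outputs of all users that have not been detected yet.
    Returns the decisions (in the original user order) and the detection order.
    """
    k = len(y)
    residual = list(y)
    decisions = [0] * k
    order = sorted(range(k), key=lambda i: amplitudes[i], reverse=True)
    remaining = set(order)
    for user in order:
        remaining.discard(user)
        decisions[user] = 1 if residual[user] >= 0.0 else -1
        for other in remaining:
            residual[other] -= corr[other][user] * amplitudes[user] * decisions[user]
    return decisions, order


if __name__ == "__main__":
    corr = [[1.0, 0.7], [0.7, 1.0]]
    amplitudes = [1.0, 3.0]
    # Noise-free MF outputs y = R A b for bits (+1, -1).
    y = [1.0 - 0.7 * 3.0, 0.7 - 3.0]
    print(hd_sic(y, corr, amplitudes))
```

The strongest user is detected first, since its decision is the most reliable, and the weakest user benefits from all earlier cancellations. If an early decision is wrong, the subtraction doubles that user's interference for the remaining users, which is the error propagation mentioned in the text.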
The groupwise interference cancellation receivers detect the symbols of the users within some group and form an estimate of the MAI caused by the users within that group based on the symbol decisions. The MAI estimate is then subtracted from the other users’ MF outputs. The groupwise interference cancellation can be performed either serially or in parallel. The groupwise serial interference cancellation has been proposed in [245], and the groupwise parallel interference cancellation in [246, 247]. The grouping can also be performed on consecutive symbols of a particular user in time [246, 247]. The groupwise SIC has also been proposed for multiple data rate CDMA systems utilizing multiple processing gains [248, 249]. The detector for a group of users can, in principle, apply any known multiuser detector, such as the conventional detector, the decorrelating detector, the PIC detector, or the maximum likelihood sequence detector. The groupwise interference cancellation is a special case of more general groupwise multiuser receivers [250, 251].
The interference cancellation can also be combined with linear equalizer type reception. Most often this is based on decision feedback of the detected symbols to perform a subtractive cancellation of part of the MAI. The DF-IC in conjunction with the noise-whitening detector has been considered in [155, 154], and in conjunction with an adaptive equalizer in [252, 253]. The DF-IC receiver for convolutionally encoded CDMA transmissions in AWGN channels has been studied in [144].
2.2.3.3. Other multiuser receivers
Most suboptimal multiuser receivers fit into the categories presented in Sections 2.2.3.1 and 2.2.3.2. The other most interesting techniques are reviewed briefly in this section.
In addition to the linear equalization or interference cancellation, the MLSD can be approximated by partial trellis-search algorithms. The log-likelihood metric (2.53) is computed for a subset of all possible data vectors $\mathbf{b}$. Different criteria to choose the subsets result in different partial trellis-search algorithms. The application of sequential decoding has been proposed in [254]. Some of the groupwise multiuser receivers discussed above in Section 2.2.3.2 can also be interpreted as partial trellis-search algorithms [246]. Other partial tree-search algorithms have been proposed in [255, 256]. A partial trellis-search algorithm for trellis-coded modulated CDMA transmissions in AWGN channels has been studied in [111], and for convolutionally encoded transmissions in [112].
Multiuser parameter estimation, i.e., the complex amplitude and delay estimation, has gained increasing interest. Since there is usually no a priori distribution available for the delays, maximum likelihood estimation is usually selected to be the optimal technique for delay acquisition and tracking [257, 258, 118]. This approach has also been considered for amplitude estimation [259]. Suboptimal techniques include subspace estimators [258, 260, 261, 262, 263], a hierarchical ML estimation [264], large sample mean ML estimation [265], an extended Kalman filter [266, 267], a recursive least squares algorithm [267], and sequential estimation [268, 269]. The amplitude estimation in AWGN channels with unknown delays has been the topic in [270]. The estimation of the number of active users in AWGN channels has been studied in [271, 272, 273, 274, 275].
Multiuser detection based on the empirical distribution of the MAI has been proposed in [276, 277, 278]. The distribution of the MAI is estimated by forming a corresponding histogram, and the received symbol is selected so that it best matches the histogram.
Neural networks have been proposed to approximate the decision regions of the optimal receivers. Multilayer perceptron networks for both centralized and decentralized detection in AWGN channels have been proposed in [279] and single-layer perceptron networks in [280]. Self-organizing maps for centralized detection in AWGN channels have been studied in [281]. Radial basis functions for decentralized detection in AWGN channels have been considered in [282]. Hopfield networks for centralized detection in AWGN channels have been proposed in [283, 284].
2.3. Problem formulation
Several interesting open problems exist in the field of multiuser receivers. From the practical point of view, one of the most important questions is whether multiuser receivers are feasible in practical DS-CDMA systems or not. The question can also be posed as whether the price paid in terms of implementation complexity is worth the obtained performance improvement. The final answer is of course not only technical but also commercial and, thus, out of the scope of this thesis. To provide tools for the decision making process, some open problems related to the multiuser receivers which are considered to be most promising from the practical point of view are addressed in this thesis. The multiuser receivers that appear potentially practical include the class of linear equalizer type and interference cancellation receivers¹². As can be seen from the literature review in Section 2.2, the decorrelating multiuser receiver has received extensive attention from the scientific community. Interference cancellation is popular in the proposed CDMA system standards mentioned at the end of Section 1.1. In the class of interference cancellation receivers the attention is focused on the HD-PIC receivers in this thesis. The PIC is applied due to problems associated with SIC in purely asynchronous DS-CDMA systems, and due to the potentially better performance of PIC when hard decisions are applied [51]. Hard decisions usually yield better performance than soft decisions, since HD receivers can utilize efficient channel estimators, whereas the SD receivers cannot [51]. The groupwise interference cancellation receivers are definitely promising and interesting, but they are not treated in this thesis to make the discussion clearer and simpler.
The key problems in the implementation of the linear equalizer type multiuser detectors are the infinite memory-length and the need for matrix inversion in the detector update. The memory-length problem is the topic of Chapter 3, where finite memory-length detection is studied. The matrix inversion problem is considered in Chapter 5, where implementation algorithms for multiuser receivers are proposed. The implementation requirements of both the linear equalizer type and the HD-PIC receivers are also compared in Chapter 5. Emphasis is on detection in dynamic CDMA systems, where the detectors must be updated frequently due to changes in the number of users, in the signature waveforms, in the delays, or in the received amplitudes. The HD-PIC receivers are relatively straightforward to implement in principle, although a large variety of different modifications exist. The performance of the linear equalizer type and the HD-PIC multiuser receivers with real DA or DD channel estimation has been studied only very little. The performance analysis and comparison of the decorrelating and HD-PIC receivers in Rayleigh fading channels is therefore the topic of Chapter 4.
¹² Several other receiver techniques described in Section 2.2.3.3 may very well become practical after a while. However, most of them are currently rather immature.
3. Finite memory-length linear multiuser detection

Most of the linear equalizer type multiuser detectors can be characterized as an inverse of some form of correlation matrix, as discussed in Chapter 2. In an ideal implementation their memory equals the data packet length, which can often be assumed to approach infinity. The linear multiuser detectors for an asynchronous CDMA system can be presented as multichannel IIR filters. Although stable versions of the multiuser detector filters are known to exist in many cases, FIR filters are more robust in practical systems. Multichannel IIR filters are also complicated to update when the correlations change, whereas multichannel FIR filters admit an easier update formulation. Variations in the number of users, in their signature waveforms (e.g., due to a handover in a cellular system), or in their delays change the correlations. In such a case, the multiuser detectors must be updated to match the new received signal. In this chapter, it is shown that the infinite memory-length detectors can be accurately approximated by detectors with finite and relatively short memory-length. In particular, it is shown that a high degree of near-far resistance can be obtained with moderate memory-lengths. This result provides a mechanism to implement near-far resistant linear multiuser detectors in systems in which the number of users or their propagation delays change over time.

The detectors studied in this chapter process the MF bank output vector, and the multipath combining is performed after the multiuser processing. The analysis in this chapter assumes that the channel is constant, i.e., the effects due to fading are neglected. The assumption is justified since the effect of the detector memory-length on the detection performance can be characterized equally well in an additive white Gaussian noise channel as in a fading channel.

The chapter is organized as follows. Linear finite memory-length multiuser detectors are defined in Section 3.1. The results of the stability analysis of finite memory-length detectors are presented in Section 3.2. The effects of the finite memory-length on the bit error probability, the asymptotic multiuser efficiency, and the near-far resistance of the detectors are analyzed in Section 3.3. In Section 3.4, the results are illustrated by numerical examples. Finally, the results are summarized and discussed in Section 3.5.
3.1. Linear FIR multiuser detectors
The idea is to replace the $N_bKL \times N_bKL$ detector matrix $\mathbf{T}$ by an $NKL \times KL$ detector matrix. A finite memory-length linear multiuser detector of length $N = 2P + 1$ (referred to as an FIR detector for brevity) is defined as¹

$$\mathbf{D}_N = \big(\mathbf{D}(P)\ \cdots\ \mathbf{D}(1)\ \ \mathbf{D}(0)\ \ \mathbf{D}(-1)\ \cdots\ \mathbf{D}(-P)\big)^{\top} \in \mathbb{R}^{NKL \times KL}, \tag{3.1}$$

where the blocks $\mathbf{D}(i) \in \mathbb{R}^{KL \times KL}$, $i \in \{-P, \ldots, P\}$, define a partition of the detector $\mathbf{D}_N$. The infinite memory-length linear multiuser detector (referred to as an IIR detector) $\mathbf{D}$ corresponding to the FIR detector $\mathbf{D}_N$ is defined as

$$\mathbf{D} = \mathbf{D}_N, \quad N \to \infty. \tag{3.2}$$
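The linear multiuser FIR detector output

$$\mathbf{y}^{(n)}_{[LIN]} = \mathbf{D}_N^{\top}\mathbf{y}^{(n)} \in \mathbb{C}^{KL} \tag{3.3}$$

provides a decision statistic for the symbols $\mathbf{b}^{(n)}$. By (2.27) and (2.28), the output can be expressed as

$$\mathbf{y}^{(n)}_{[LIN]} = \mathbf{F}^{\top}\mathbf{A}^{(n)}\mathbf{b}^{(n)} + \boldsymbol{\mu}^{(n)}\big(\mathbf{b}_e^{(n)}\big) + \mathbf{w}^{(n)}_{[LIN]} = \bar{\mathbf{F}}^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)} + \mathbf{w}^{(n)}_{[LIN]}, \tag{3.4}$$

where $\mathbf{F} = \mathbf{R}^{(n)}\mathbf{D}_N$ and $\bar{\mathbf{F}} = \bar{\mathbf{R}}^{\top(n)}\mathbf{D}_N$ are the convolutions of the multiuser channel impulse response $\mathbf{R}^{(n)}$ or $\bar{\mathbf{R}}^{(n)}$ and the multiuser detector $\mathbf{D}_N$ (the bar marks the quantities of the finite processing-window model (2.28), which include the edge symbols),

$$\begin{aligned}
\boldsymbol{\mu}^{(n)}\big(\mathbf{b}_e^{(n)}\big) ={}& \mathbf{D}(P)\,\mathbf{R}^{(n-P)}(2)\,\mathbf{C}^{(n-P-2)}\mathbf{A}\,\mathbf{b}^{(n-P-2)} \\
&+ \mathbf{D}(P)\,\mathbf{R}^{(n-P)}(1)\,\mathbf{C}^{(n-P-1)}\mathbf{A}\,\mathbf{b}^{(n-P-1)} \\
&+ \mathbf{D}(P-1)\,\mathbf{R}^{(n-P+1)}(2)\,\mathbf{C}^{(n-P-1)}\mathbf{A}\,\mathbf{b}^{(n-P-1)} \\
&+ \mathbf{D}(-P+1)\,\mathbf{R}^{\top(n+P+1)}(2)\,\mathbf{A}\,\mathbf{C}^{(n+P+1)}\mathbf{b}^{(n+P+1)} \\
&+ \mathbf{D}(-P)\,\mathbf{R}^{\top(n+P+2)}(1)\,\mathbf{A}\,\mathbf{C}^{(n+P+1)}\mathbf{b}^{(n+P+1)} \\
&+ \mathbf{D}(-P)\,\mathbf{R}^{\top(n+P+3)}(2)\,\mathbf{A}\,\mathbf{C}^{(n+P+2)}\mathbf{b}^{(n+P+2)}
\end{aligned} \tag{3.5}$$

is the response of the symbols outside the processing window, i.e., the edge effect due to the finite detector memory-length, and $\mathbf{w}^{(n)}_{[LIN]} = \mathbf{D}_N^{\top}\mathbf{w}^{(n)}$ is a zero-mean Gaussian random vector with covariance matrix $\sigma^2\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N$. In systems with time-varying signature waveforms the above formulation should be interpreted as a snapshot of the time-varying detector on a particular symbol interval. The filtering interpretation of an arbitrary multichannel linear FIR detector is illustrated in Fig. 3.1.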
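The filtering interpretation of (3.3) can be illustrated with a minimal NumPy sketch. It assumes that the window stacks the matched-filter outputs of symbol intervals $n-P, \ldots, n+P$ from top to bottom and that the $j$th $M \times M$ block of $\mathbf{D}_N$ (with $M = KL$) weights the $j$th vector of the window; the function names `fir_detect` and `fir_detect_taps` are hypothetical and not taken from the thesis.

```python
import numpy as np

def fir_detect(D_N, y, P):
    """Apply D_N^T to every full window of matched-filter outputs.

    D_N : (N*M, M) stacked detector, N = 2P + 1
    y   : (num_symbols, M) matched-filter output vectors, one row per symbol
    Row j of the result is the decision statistic for symbol n = P + j.
    """
    num_symbols, M = y.shape
    N = 2 * P + 1
    assert D_N.shape == (N * M, M)
    out = np.empty((num_symbols - 2 * P, M), dtype=np.result_type(D_N, y))
    for j in range(num_symbols - 2 * P):
        window = y[j:j + N].reshape(-1)  # stack y^(n-P), ..., y^(n+P)
        out[j] = D_N.T @ window
    return out

def fir_detect_taps(D_N, y, P):
    """Same output, accumulated tap by tap as in a tapped delay line."""
    num_symbols, M = y.shape
    N = 2 * P + 1
    taps = D_N.reshape(N, M, M)  # taps[j] is the j-th M x M block of D_N
    out = np.zeros((num_symbols - 2 * P, M), dtype=np.result_type(D_N, y))
    for j in range(N):
        # block j weights the output vector that sits j symbols into the window
        out += y[j:j + num_symbols - 2 * P] @ taps[j]
    return out
```

Both functions return the same statistics; the second form makes explicit that only the $N$ blocks of $\mathbf{D}_N$ have to be replaced when the correlations change.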
[Figure 3.1: tapped delay line with unit delays $z^{-1}$; the delayed input vectors $\mathbf{y}^{(n)}$ are weighted by the blocks $\mathbf{D}(P), \ldots, \mathbf{D}(1), \mathbf{D}(0), \mathbf{D}(-1), \ldots, \mathbf{D}(-P)$ and summed to form the output $\mathbf{y}^{(n-P)}_{[LIN]}$.]
Fig. 3.1. An FIR linear multiuser detector.

To design FIR detectors or, in other words, to find in some sense good $NKL \times KL$ matrices $\mathbf{D}_N$, the truncation of IIR detectors is considered first. This was suggested in [54] for the decorrelating detector. A linear multiuser detector $\mathbf{D}_N^{[d]}$ satisfying

$$\mathbf{R}^{(n)}\mathbf{D}_N^{[d]} = \mathbf{U}_N, \tag{3.6}$$

where $\mathbf{U}_N = (\mathbf{0}_{KL}, \ldots, \mathbf{0}_{KL}, \mathbf{I}_{KL}, \mathbf{0}_{KL}, \ldots, \mathbf{0}_{KL})^{\top} \in \{0,1\}^{NKL \times KL}$, will be called the truncated decorrelating detector. It is clear that $\mathbf{D}_N^{[d]}$ is the $NKL \times KL$ middle block column, i.e., the middle $KL$ columns, of the inverse of $\mathbf{R}^{(n)}$. A linear multiuser detector $\mathbf{D}_N^{[ms]}$ satisfying

$$\big[\mathbf{R}^{(n)} + \sigma^2\big(\mathbf{C}^{(n)}\mathbf{A}^{(n)}\mathbf{A}^{H(n)}\mathbf{C}^{H(n)}\big)^{-1}\big]\mathbf{D}_N^{[ms]} = \mathbf{U}_N \tag{3.7}$$

will be called the truncated LMMSE detector.² A linear multiuser detector $\mathbf{D}_N^{[nw]}$ satisfying

$$\mathbf{L}^{(n)}\mathbf{D}_N^{[nw]} = \mathbf{U}_N \tag{3.8}$$

will be called the truncated noise-whitening detector.

An alternative to truncation is to optimize detectors based on the finite processing-window length model (2.28). To generalize the decorrelating detector we should find a zero-forcing detector $\bar{\mathbf{D}}_N^{[d]} \in \mathbb{R}^{NKL \times KL}$ satisfying

$$\bar{\mathbf{R}}^{\top(n)}\bar{\mathbf{D}}_N^{[d]} = \mathbf{U}_N, \tag{3.9}$$

which does not have a unique solution. A unique detector can be found by selecting the pseudoinverse (i.e., the Moore-Penrose generalized inverse), which yields the best least squares solution to (3.9). Since $\mathbf{R}^{(n)}$ is positive definite, $\bar{\mathbf{R}}^{(n)}$ has full rank with more columns than rows. Thus, the pseudoinverse solution defines the optimal FIR decorrelating detector

$$\bar{\mathbf{D}}_N^{[d]} = \operatorname{mbc}\Big[\big(\bar{\mathbf{R}}^{(n)}\bar{\mathbf{R}}^{\top(n)}\big)^{-1}\bar{\mathbf{R}}^{(n)}\Big], \tag{3.10}$$

where mbc denotes "middle block column of". It should be noted that the above detector is the optimal FIR decorrelator in the sense that it minimizes the least squares error in the solution of (3.9). However, there is no guarantee that the detector $\bar{\mathbf{D}}_N^{[d]}$ would yield a lower bit error probability than the truncated decorrelating detector $\mathbf{D}_N^{[d]}$. Since the optimal FIR detector forces the MAI due to the edge symbols $\mathbf{b}_e^{(n)}$ to a minimum, it can no longer force the MAI due to the symbols $\mathbf{b}^{(n)}$ to zero. This tradeoff can introduce a performance penalty in some cases.

The optimal FIR LMMSE detector is by (2.28) and [88, Sec. 12.5]³

$$\bar{\mathbf{D}}_N^{[ms]} = \operatorname{mbc}\Big\{\big(\mathbf{R}^{(n)}\big)^{-1}\bar{\mathbf{R}}^{(n)}\Big[\bar{\mathbf{R}}^{\top(n)}\big(\mathbf{R}^{(n)}\big)^{-1}\bar{\mathbf{R}}^{(n)} + \sigma^2\big(\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{A}}^{H(n)}\bar{\mathbf{C}}^{H(n)}\big)^{-1}\Big]^{-1}\Big\}. \tag{3.11}$$

If all diagonal values of $\sigma^2\big(\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{A}}^{H(n)}\bar{\mathbf{C}}^{H(n)}\big)^{-1}$ are nonzero, the matrix

$$\bar{\mathbf{R}}^{\top(n)}\big(\mathbf{R}^{(n)}\big)^{-1}\bar{\mathbf{R}}^{(n)} + \sigma^2\big(\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{A}}^{H(n)}\bar{\mathbf{C}}^{H(n)}\big)^{-1}$$

is nonsingular and (3.11) has a unique solution. If $\sigma^2 \to 0$, the inverse in (3.11) does not exist. In that case there is no noise term in the model (2.28), the problem can be viewed as deterministic and underdetermined, and there is no LMMSE detector. It should be noted that, contrary to the optimal FIR decorrelator, the optimal FIR LMMSE detector is never inferior to the truncated LMMSE detector. Obviously, it is computationally simpler to update the truncated detectors than the optimal ones. Moreover, computation of the truncated FIR detectors is numerically more stable in practical implementations. However, both classes of FIR detectors are studied for completeness.

¹ Note that the time index $n$ is left out for notational convenience, although the detector $\mathbf{D}_N$ and the convolution matrix $\mathbf{F}$ depend on $n$ if the signature waveforms are time-varying.

² Note that at high signal-to-noise ratios ($\sigma^2 \to 0$) or at high interference levels ($E_k \to \infty$) the LMMSE detector approaches the decorrelating detector.
It is clear that the use of FIR detectors instead of the IIR ones causes some performance loss. The performance analysis of the FIR detectors will be carried out in Section 3.3. However, to be able to quantify the performance loss, the stability of linear multiuser detectors is analyzed in the next section.
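As an illustration of the truncated detectors (3.6) and (3.7), the following sketch builds the block-tridiagonal correlation matrix for a single-path channel ($L = 1$, $\mathbf{C} = \mathbf{I}$) with time-invariant signature waveforms and solves for the middle block column. The two-user correlation values, the function names, and the convention that $\mathbf{R}(1)$ sits below the block diagonal are assumptions made for this example only.

```python
import numpy as np

def block_tridiagonal(R0, R1, N):
    """R_N with R0 on the block diagonal, R1 below it and R1^T above it."""
    return (np.kron(np.eye(N), R0)
            + np.kron(np.eye(N, k=-1), R1)
            + np.kron(np.eye(N, k=1), R1.T))

def truncated_detector(R0, R1, P, loading=None):
    """Solve (R_N + diag(loading)) D_N = U_N for the middle block column.

    loading=None       -> truncated decorrelating detector, as in (3.6)
    loading=sigma2/E_k -> truncated LMMSE detector, as in (3.7) with C = I
    """
    K, N = R0.shape[0], 2 * P + 1
    M = block_tridiagonal(R0, R1, N)
    if loading is not None:
        M = M + np.kron(np.eye(N), np.diag(loading))
    U = np.zeros((N * K, K))
    U[P * K:(P + 1) * K] = np.eye(K)  # identity in the middle block
    return np.linalg.solve(M, U)

# Hypothetical two-user correlations (diagonally dominant, hence positive definite)
R0 = np.array([[1.0, 0.3], [0.3, 1.0]])
R1 = np.array([[0.0, 0.2], [0.0, 0.0]])

for P in (1, 2, 4):
    D_d = truncated_detector(R0, R1, P)
    D_ms = truncated_detector(R0, R1, P, loading=np.array([0.1, 0.1]))
    # largest magnitude in the last edge block of each detector
    print(P, np.abs(D_d[-2:]).max(), np.abs(D_ms[-2:]).max())
```

Solving the linear system with `np.linalg.solve` avoids forming the full inverse when only the middle block column is needed.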
3.2. Stability of detectors
In this section, conditions for the stability of the multiuser detectors are first discussed. Although it proves to be impossible to find an easy test for detector stability, the analysis gives insight into the problem. Furthermore, the analysis provides us with tools to derive two interesting results for stable detectors.

³ The same result in a different form has been derived in [161]. The expression in (3.11) is more appropriate for further derivations in subsequent sections than the expression given in [161, Eq. (4.3)].
For notational simplicity, the analysis in this section is presented for an AWGN channel with one propagation path (i.e., $L = 1$, $\mathbf{C} = \mathbf{I}_{N_b}$, and $\mathbf{R}^{(n)}(2) = \mathbf{0}_K\ \forall\, n$). However, the generalization of the analysis to the multipath channel case ($L \geq 2$) is straightforward.

A multiuser detector $\mathbf{D}_N$ is defined to be stable if and only if the impulse response of the IIR detector is decaying, i.e., $\mathbf{D}(-P) \to \mathbf{0}_K$ and $\mathbf{D}(P) \to \mathbf{0}_K$ as $N \to \infty$ (or equivalently $P \to \infty$). The above definition is a consequence of the standard stability definition of a digital IIR filter [285, pp. 81-82]. It is intuitively clear that, if a multiuser detector is stable, the IIR detector can be truncated to an FIR detector with little performance degradation if the memory-length $N$ is large enough. This can be predicted from (3.5), where the response of the symbols outside the processing window satisfies $\boldsymbol{\mu}^{(n)} \to \mathbf{0}$ as $N \to \infty$.
For systems with time-invariant signature waveforms, it was shown in [54] that the truncated decorrelating detector is stable if and only if⁴

$$\det\big[\mathbf{R}^{\top}(1)e^{j\omega} + \mathbf{R}(0) + \mathbf{R}(1)e^{-j\omega}\big] \neq 0, \quad \forall\, \omega \in [0, 2\pi). \tag{3.12}$$

It is clear that (3.12) is difficult to evaluate for all possible delay combinations. Therefore, the most practical solution is to compute numerical examples to determine whether a detector is stable or not. This is particularly true for systems with time-varying signature waveforms, as will be discussed below.
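For one given delay combination, condition (3.12) can at least be probed numerically. The sketch below evaluates the determinant on a uniform frequency grid and returns its smallest magnitude; a grid can miss a zero lying between grid points, so a clearly positive result is only an indication of stability. The function name and the correlation values are hypothetical.

```python
import numpy as np

def min_abs_det(R0, R1, num_points=512):
    """Smallest |det[R1^T e^{jw} + R0 + R1 e^{-jw}]| on a uniform grid over [0, 2*pi)."""
    omegas = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    phase = np.exp(1j * omegas)[:, None, None]
    # One K x K matrix per grid frequency, stacked along the first axis
    S = R1.T[None] * phase + R0[None] + R1[None] / phase
    return np.abs(np.linalg.det(S)).min()

# Hypothetical two-user example: determinant stays away from zero
R0 = np.array([[1.0, 0.3], [0.3, 1.0]])
R1 = np.array([[0.0, 0.2], [0.0, 0.0]])
print(min_abs_det(R0, R1))

# Hypothetical fully correlated pair: determinant vanishes at w = 0
R0_bad = np.array([[1.0, 0.5], [0.5, 1.0]])
R1_bad = np.array([[0.0, 0.5], [0.0, 0.0]])
print(min_abs_det(R0_bad, R1_bad))
```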
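The result (3.12) was derived via a z-domain approach, which is not applicable in systems with time-varying signature waveforms. For that reason a time-domain analysis is needed⁵. First a dimension symbol $N$ is added to $\mathbf{R}^{(n)}$ in (2.37) and the time interval index $(n)$ is dropped to yield $\mathbf{R}_N$. To simplify the notation, the nonzero blocks in the $i$th block column of $\mathbf{R}_N$ are denoted by $\mathbf{R}_i^{\top}(1)$, $\mathbf{R}_i(0)$, and $\mathbf{R}_{i+1}(1)$ etc.⁶ Let the inverse of $\mathbf{R}_N$ be

$$\mathbf{T}_N = \mathbf{R}_N^{-1} =
\begin{pmatrix}
\mathbf{T}_{11}(N) & \mathbf{T}_{21}^{\top}(N) & \cdots & \mathbf{T}_{N,1}^{\top}(N) \\
\mathbf{T}_{21}(N) & \mathbf{T}_{22}(N) & \cdots & \mathbf{T}_{N,2}^{\top}(N) \\
\vdots & \vdots & & \vdots \\
\mathbf{T}_{N,1}(N) & \mathbf{T}_{N,2}(N) & \cdots & \mathbf{T}_{N,N}(N)
\end{pmatrix}
\in \mathbb{R}^{NK \times NK}, \tag{3.13}$$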
where each $\mathbf{T}_{ij}(N) \in \mathbb{R}^{K \times K}$. The dependence on $N$ is included in the argument, since the blocks are different for different $N$. Note that $\mathbf{D}_d(-P) = \mathbf{T}_{N,P+1}(N)$. Thus, the stability of the truncated decorrelating detector is equivalent to $\mathbf{T}_{N,P+1}(N) \to \mathbf{0}_K$ as $N \to \infty$. The following recursive expressions (3.14) and (3.15), which are proved in Appendix 1, provide the tools to study the stability of the detectors.

⁴ Time index $n$ is not needed in $\mathbf{R}(i)$, since the signature waveforms are time-invariant and the delays are assumed to be constant.

⁵ The analysis also applies to systems with time-invariant signature waveforms, since the time-invariant case can be viewed as a special case of a system with time-varying signature waveforms.

⁶ In the second column, for example, $\mathbf{R}_2^{\top}(1) = \mathbf{R}^{\top(n-P+1)}(1)$, $\mathbf{R}_2(0) = \mathbf{R}^{(n-P+1)}(0)$, $\mathbf{R}_3(1) = \mathbf{R}^{(n-P+2)}(1)$ etc.

For any $i, j \in \{1, 2, \ldots, N-1\}$
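$$\mathbf{T}_{i,j}(N) = \mathbf{T}_{i,j}(N-1) + \mathbf{T}_{N-1,i}^{\top}(N-1)\,\mathbf{R}_N^{\top}(1)\,\mathbf{T}_{N,N}(N)\,\mathbf{R}_N(1)\,\mathbf{T}_{N-1,j}(N-1), \tag{3.14}$$

$$\mathbf{T}_{N,j}(N) = -\mathbf{T}_{N,N}(N)\,\mathbf{R}_N(1)\,\mathbf{T}_{N-1,j}(N-1), \tag{3.15}$$

where the minus sign in (3.15) follows from the block matrix inversion lemma. For any $1 \leq i < N$ it is obtained by induction from (3.15) that

$$\mathbf{T}_{N,i}(N) = \prod_{j=i+1}^{N}\big[-\mathbf{T}_{j,j}(j)\,\mathbf{R}_j(1)\big]\,\mathbf{T}_{i,i}(i). \tag{3.16}$$

A sufficient condition for the stability of the detector, that is, for

$$\mathbf{T}_{N,i}(N) \to \mathbf{0}_K, \quad \text{as } N \to \infty, \tag{3.17}$$

is $\big|\lambda_{\max}\big[\mathbf{R}_j^{\top}(1)\,\mathbf{T}_{j,j}^{-2}(j)\,\mathbf{R}_j(1)\big]\big| < 1,\ \forall\, j \in \{i+1, i+2, \ldots, N\}$ [286, p. 69], where $\lambda_{\max}(\mathbf{A})$ denotes the eigenvalue of the argument matrix $\mathbf{A}$ with the largest absolute value. The above condition is, however, often overly stringent. It was not satisfied in most of the numerical examples computed, but the detectors were still stable in nearly all cases. In other words, it is very difficult to provide necessary conditions for detector stability. Numerical examples are often the only practical way to verify the stability of detectors.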
For systems with time-invariant signature waveforms it has been shown via the z-domain approach that the stability of the decorrelating detector implies the uniqueness of the limiting IIR detector [54]. The corresponding result for systems with time-varying signature waveforms is posed in the following proposition and proved in Appendix 1.

Proposition 1 If the truncated decorrelating, LMMSE, or noise-whitening detectors are stable, the limiting IIR detectors are unique.

The truncated detectors neglect the edge effect caused by the symbols outside the observation window, while the optimal FIR detectors take it into consideration. On the other hand, the stability of the detectors implies that the edge effect at the detector output approaches zero as $N$ becomes large. Thus, it is expected that the truncated and the optimal FIR detectors should approach the same limiting IIR detector, if they are stable. This is indeed the case under mild conditions, as stated below and proved in Appendix 1.

Proposition 2 Assume that the received energies $E_k$ and noise power spectral density $\sigma^2$ satisfy $0 < E_k/\sigma^2 < \infty,\ \forall\, k \in \{1, 2, \ldots, K\}$. Assume also that the decorrelating and LMMSE detectors are stable. Then both the truncated LMMSE detector $\mathbf{D}_N^{[ms]}$ (3.7) and the optimal FIR LMMSE detector $\bar{\mathbf{D}}_N^{[ms]}$ (3.11) converge to the same IIR LMMSE detector as the detector memory-length $N$ approaches infinity, i.e.,

$$\mathbf{D}^{[ms]} = \bar{\mathbf{D}}^{[ms]}. \tag{3.18}$$
The corresponding result for the decorrelating detectors does not have as simple a formulation as Proposition 2. However, both the truncated decorrelating detector $\mathbf{D}_N^{[d]}$ (3.6) and the optimal FIR decorrelating detector $\bar{\mathbf{D}}_N^{[d]}$ (3.10) converge to the same IIR decorrelating detector as the processing window length $N$ approaches infinity, i.e.,

$$\mathbf{D}^{[d]} = \bar{\mathbf{D}}^{[d]} \tag{3.19}$$

under mild conditions. The conditions are described at the end of Appendix 1.

It was noted in Section 3.1 that the truncated detectors are easier to compute than the optimal FIR detectors. Moreover, the above results justify the use of truncated detectors with a large enough memory-length.
3.3. Performance analysis
In the performance analysis it is assumed that BPSK data modulation is applied, and that the carrier phases satisfy $\phi_k = 0$. However, the extension to more general cases is straightforward. The delays are assumed to be fixed. For notational simplicity and clarity the analysis is performed for a single-path channel ($L = 1$) in Section 3.3.1. The extension to a multipath channel is presented in Section 3.3.2. The signature waveforms are assumed to be time-invariant for notational convenience so that the discrete-time index can be removed from the correlation matrices.

3.3.1. Single-path channel
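The $k$th user's average bit error probability of a linear FIR detector is obtained by averaging over all possible interfering symbol combinations [197]. Since in a single-path channel $\mathbf{C} = \mathbf{I}_{N_bK \times N_bK}$, the error probability can be expressed by using (3.4) in the forms

$$P_k = \frac{1}{2^{(N+2)K-1}} \sum_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+2)K} \\ \bar{b}_k^{(n)} = 0}} Q\left(\frac{\sqrt{E_k}\,(\bar{\mathbf{F}})_{(P+1)K+k,k} - \bar{\mathbf{f}}_k^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\sigma^2\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}}\right) \tag{3.20}$$

$$\phantom{P_k} = \frac{1}{2^{(N+2)K-1}} \sum_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+2)K} \\ \bar{b}_k^{(n)} = 0}} Q\left(\frac{\sqrt{E_k}\,(\mathbf{F})_{(P+1)K+k,k} - \mathbf{f}_k^{\top}\mathbf{A}^{(n)}\mathbf{b}^{(n)} - \mu_k^{(n)}\big(\mathbf{b}_e^{(n)}\big)}{\sqrt{\sigma^2\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}}\right), \tag{3.21}$$

where $\mathbf{f}_k$ and $\bar{\mathbf{f}}_k$ are the $k$th columns of $\mathbf{F}$ and $\bar{\mathbf{F}}$ defined after (3.4). The term $\sqrt{E_k}\,(\bar{\mathbf{F}})_{(P+1)K+k,k}$ is the desired signal component, $\bar{\mathbf{f}}_k^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)} = \mathbf{f}_k^{\top}\mathbf{A}^{(n)}\mathbf{b}^{(n)} + \mu_k^{(n)}\big(\mathbf{b}_e^{(n)}\big)$ is the remaining MAI, and $\sigma^2\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}$ is the Gaussian noise variance at the detector output. Note that the bit error probability can also be expressed in a more compact form as

$$P_k = \frac{1}{2^{(N+2)K-1}} \sum_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+2)K} \\ \bar{b}_k^{(n)} = 1}} Q\left(\frac{\bar{\mathbf{f}}_k^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\sigma^2\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}}\right). \tag{3.22}$$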
The expression for the error probability of an IIR detector is as in (3.21) with $\mu_k^{(n)}\big(\mathbf{b}_e^{(n)}\big) = 0$, since there is no edge term with IIR detectors. It is easy to see from (3.5) and (3.21) that in the case of a stable detector the effect of the edge symbols $\mathbf{b}^{(n-P-1)}$ and $\mathbf{b}^{(n+P+1)}$ can be made arbitrarily small by selecting a large enough memory-length $N$. In the case of the decorrelating detector this becomes even more clear. Since by (3.6) $(\mathbf{F})_{(P+1)K+k,k} = 1$, $\mathbf{f}_k = \mathbf{0}$ (except $(\mathbf{f}_k)_{(P+1)K+k} = 1$), and $\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk} = [\mathbf{D}_d(0)]_{kk}$, the error probability of the truncated decorrelator simplifies from (3.21) to

$$P_k^{[d]} = \frac{1}{2^{2K}} \sum_{\mathbf{b}_e^{(n)} \in \{-1,1\}^{2K}} Q\left(\frac{\sqrt{E_k} - \mu_k^{(n)[d]}\big(\mathbf{b}_e^{(n)}\big)}{\sqrt{\sigma^2[\mathbf{D}_d(0)]_{kk}}}\right). \tag{3.23}$$

With a large enough $N$, $[\mathbf{D}_d(0)]_{kk}$ approaches the value of the IIR detector by Proposition 1, and $\mu_k^{(n)[d]}$ approaches zero if the decorrelator is stable. Thus, the stable decorrelator approaches the performance of the IIR detector with a large enough memory-length $N$.
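The averaging over the edge symbols in (3.23) can be sketched numerically for a small system. The example below assumes a single-path channel, time-invariant signature waveforms, received amplitudes $\sqrt{E_l}$, and a block-tridiagonal correlation matrix with $\mathbf{R}(1)$ below the block diagonal; the function names and the two-user numbers used in the check are hypothetical. Exhaustive enumeration of the $2^{2K}$ edge symbol combinations is feasible only for small $K$.

```python
import itertools
import math
import numpy as np

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def truncated_decorrelator(R0, R1, P):
    """Middle block column of the inverse of the block-tridiagonal R_N."""
    K, N = R0.shape[0], 2 * P + 1
    R_N = (np.kron(np.eye(N), R0) + np.kron(np.eye(N, k=-1), R1)
           + np.kron(np.eye(N, k=1), R1.T))
    U = np.zeros((N * K, K))
    U[P * K:(P + 1) * K] = np.eye(K)
    return np.linalg.solve(R_N, U)

def ber_truncated_decorrelator(R0, R1, energies, sigma2, P, k):
    """Average Q(.) over all edge symbols outside the processing window."""
    K = R0.shape[0]
    amp = np.sqrt(np.asarray(energies, dtype=float))
    D = truncated_decorrelator(R0, R1, P)
    # Row k of the responses to the symbols just before / just after the window
    past = (D[:K].T @ R1)[k] * amp
    future = (D[-K:].T @ R1.T)[k] * amp
    noise_std = math.sqrt(sigma2 * D[P * K + k, k])  # sigma^2 [D_d(0)]_kk
    total = 0.0
    for b_e in itertools.product((-1.0, 1.0), repeat=2 * K):
        b_e = np.array(b_e)
        mu = past @ b_e[:K] + future @ b_e[K:]
        total += q_function((amp[k] - mu) / noise_std)
    return total / 2 ** (2 * K)
```

For a stable decorrelator the returned value should settle quickly as `P` grows, in line with the discussion above.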
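In the following, the asymptotic multiuser efficiency (2.49) and the near-far resistance (2.50) of the linear FIR detectors will be analyzed. With large argument values we can approximate $Q(x) \approx \frac{\exp(-x^2/2)}{2}$. At high signal-to-noise ratios the worst case symbol combination dominates the value of the sum in (3.20) or (3.21) [54]. Thus, the AME of an arbitrary linear FIR multiuser detector is

$$\eta_k = \frac{1}{E_k} \max{}^2\left\{0,\ \min_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+2)K} \\ \bar{b}_k^{(n)} = 0}} \frac{\sqrt{E_k}\,(\bar{\mathbf{F}})_{(P+1)K+k,k} - \bar{\mathbf{f}}_k^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}}\right\} \tag{3.24}$$

$$\phantom{\eta_k} = \frac{1}{E_k} \max{}^2\left\{0,\ \min_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+2)K} \\ \bar{b}_k^{(n)} = 1}} \frac{\bar{\mathbf{f}}_k^{\top}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}}\right\}. \tag{3.25}$$

The minimum above is obtained from the worst possible interfering symbol combination, i.e., with symbols $(\bar{\mathbf{b}}^{(n)})_i = \operatorname{sgn}[(\bar{\mathbf{f}}_k)_i],\ \forall\, i \in \{1, 2, \ldots, (P+1)K+k-1, (P+1)K+k+1, \ldots, NK\}$.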
After evaluating the square in (3.24), the AME for the truncated decorrelating detector becomes

$$\eta_k^{[d]} =
\begin{cases}
0, & \text{if } \mu_k^{\max[d]} \geq \sqrt{E_k} \\[4pt]
\dfrac{1 - \rho_k^{[d]}}{D_{kk}(0)}, & \text{if } \mu_k^{\max[d]} < \sqrt{E_k},
\end{cases} \tag{3.26}$$

where $\rho_k^{[d]} = \dfrac{2\sqrt{E_k}\,\mu_k^{\max[d]} - \big(\mu_k^{\max[d]}\big)^2}{E_k}$ describes the degradation due to the edge effect, and

$$\mu_k^{\max[d]} = \max_{\mathbf{b}_e^{(n)} \in \{-1,1\}^{2K}} \mu_k^{(n)[d]}\big(\mathbf{b}_e^{(n)}\big) = \sum_{l \neq k} \Big(\big|[\mathbf{D}_d(P)\mathbf{R}(1)]_{kl}\big| + \big|[\mathbf{D}_d(-P)\mathbf{R}^{\top}(1)]_{kl}\big|\Big)\sqrt{2E_l} \tag{3.27}$$

is the maximum absolute value of $\mu_k^{(n)[d]}$. The AME of the IIR decorrelator is as in (3.26) with $\rho_k^{[d]} = 0$, since the edge effect has reduced to zero.

It can be verified from (3.26) that $\eta_k^{[d]} > 0$ if and only if $\mu_k^{\max[d]} < \sqrt{E_k}$. In other words, the truncated decorrelating detector has a positive AME if and only if the maximum value of the remaining MAI component at the detector output is smaller than the desired user's amplitude. If any interfering amplitude approaches infinity, $\mu_k^{\max[d]}$ at the output of an FIR detector also approaches infinity. Thus, it is clear that the FIR detectors cannot be near-far resistant in a strict sense⁷. For that reason the power limited near-far resistance will be defined as

$$\bar{\eta}_k = \inf_{\substack{0 \leq E_l \leq E_{\max} \\ l \neq k}} \eta_k, \tag{3.28}$$

where $E_{\max}$ is finite. In wireless communication systems, for example, $E_{\max}$ is determined by the accuracy of the power control of the CDMA system. If an FIR detector is stable, $\mu_k^{\max}$ can be made arbitrarily small by selecting $N$ large enough for any $E_{\max}$. This implies that the truncated decorrelating detector (and also the LMMSE and data-aided noise-whitening detectors) with a large enough (but finite) memory-length can be made near-far resistant given an arbitrarily large (but finite) upper bound for the received powers of the interfering users. By (3.19) it is noted that, with $N$ large enough, the above discussion applies to the optimal FIR decorrelator as well.

⁷ The use of truncated MF outputs [49] makes an FIR detector strictly near-far resistant. The price is that the data of only a very small number of users can be detected.
From (3.24) it is seen that the power limited NFR for the FIR detectors can be computed by substituting $E_l = E_{\max}\ \forall\, l \neq k$. Thus, the power limited NFR is a function of the ratio $E_{\max}/E_k$ only. Let $E_{\min}$ be the minimum received energy per symbol a user needs to have to be served by the CDMA system. The worst case power limited near-far resistance $\bar{\eta}_k$ can be computed by substituting $E_k = E_{\min}$ and $E_l = E_{\max}\ \forall\, l \neq k$ in (3.24). In practice $E_{\max}$, $E_{\min}$, and $N$ are design parameters of the CDMA system and a tradeoff between them must be considered. The larger $N$ is, the more complicated the implementation of the detector becomes. On the other hand, a large $N$ poses milder requirements for the power control of the system. In a digital signal processing (DSP) implementation a large $N$ introduces more round-off errors and implementation noise, so that in practice there is a finite optimal value for $N$ given the ratio $E_{\max}/E_{\min}$ and the implementation constraints (filter structure, floating point number word length etc.).
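The substitution $E_k = E_{\min}$, $E_l = E_{\max}$ can be sketched for the truncated decorrelator as follows. The sketch evaluates the worst case of the edge term directly, summing over the edge symbols of all users with amplitudes $\sqrt{E_l}$, rather than through the closed form; it assumes a single-path channel, time-invariant signature waveforms, and a block-tridiagonal correlation matrix with $\mathbf{R}(1)$ below the block diagonal. All function names and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def truncated_decorrelator(R0, R1, P):
    """Middle block column of the inverse of the block-tridiagonal R_N."""
    K, N = R0.shape[0], 2 * P + 1
    R_N = (np.kron(np.eye(N), R0) + np.kron(np.eye(N, k=-1), R1)
           + np.kron(np.eye(N, k=1), R1.T))
    U = np.zeros((N * K, K))
    U[P * K:(P + 1) * K] = np.eye(K)
    return np.linalg.solve(R_N, U)

def ame_truncated_decorrelator(R0, R1, energies, P, k):
    """AME of user k with the worst-case edge symbols."""
    K = R0.shape[0]
    energies = np.asarray(energies, dtype=float)
    amp = np.sqrt(energies)
    D = truncated_decorrelator(R0, R1, P)
    past = (D[:K].T @ R1)[k]       # response to the symbols before the window
    future = (D[-K:].T @ R1.T)[k]  # response to the symbols after the window
    # Worst case: every edge symbol opposes the desired symbol
    mu_max = np.sum((np.abs(past) + np.abs(future)) * amp)
    margin = max(0.0, amp[k] - mu_max)
    return margin ** 2 / (energies[k] * D[P * K + k, k])

def power_limited_nfr(R0, R1, E_min, E_max, P, k):
    """Desired user at E_min, all interfering users at E_max."""
    energies = np.full(R0.shape[0], float(E_max))
    energies[k] = E_min
    return ame_truncated_decorrelator(R0, R1, energies, P, k)

# Hypothetical two-user example with a 20 dB near-far ratio
R0 = np.array([[1.0, 0.3], [0.3, 1.0]])
R1 = np.array([[0.0, 0.2], [0.0, 0.0]])
for P in (0, 1, 2, 4, 8):
    print(P, power_limited_nfr(R0, R1, 1.0, 100.0, P, 0))
```

Sweeping `P` for a fixed ratio $E_{\max}/E_{\min}$ in this way gives a feel for the tradeoff between memory-length and power control accuracy described above.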
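3.3.2. Multipath channel

The performance analysis for a fixed, known multipath channel is conceptually similar to that of a single-path channel. The output vector of the linear detector $\mathbf{y}^{(n)}_{[LIN]}$ is multiplied by the complex conjugate of a multipath combining matrix. The optimal combining vector for user $k$ in the known multipath channel is [129, p. 20]

$$\tilde{\mathbf{c}}_k^{(n)} = \mathbf{D}_{[d]k,k}^{-1}(0)\,\mathbf{c}_k, \tag{3.29}$$

where $\mathbf{D}_{[d]k,k}(0)$ denotes the $k$th $L \times L$ diagonal block of the matrix $\mathbf{D}_{[d]}(0)$. The optimal combiner matrix $\tilde{\mathbf{C}}^{(n)}$ is then formed from the vectors $\tilde{\mathbf{c}}_k^{(n)}$ as in (2.13). Therefore, the maximal ratio combined vector is

$$\mathbf{y}^{(n)}_{[LIN,MRC]} = \tilde{\mathbf{C}}^{H(n)}\mathbf{y}^{(n)}_{[LIN]} = \tilde{\mathbf{C}}^{H(n)}\bar{\mathbf{F}}^{\top}\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)} + \tilde{\mathbf{C}}^{H(n)}\mathbf{w}^{(n)}_{[LIN]}. \tag{3.30}$$

Let $(\tilde{\mathbf{C}}^{(n)})_k$ denote the $k$th column of $\tilde{\mathbf{C}}^{(n)}$. Then the bit error probability expression (3.22) can be generalized to the form

$$P_k = \frac{1}{2^{(N+2)K-1}} \sum_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+4)K} \\ \bar{b}_k^{(n)} = 1}} Q\left(\frac{(\tilde{\mathbf{C}}^{(n)})_k^{H}\,\bar{\mathbf{f}}_k^{\top}\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\sigma^2\,(\tilde{\mathbf{C}}^{(n)})_k^{H}\,\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\,(\tilde{\mathbf{C}}^{(n)})_k}}\right). \tag{3.31}$$

Similarly, the asymptotic multiuser efficiency expression (3.25) for the multipath channel case becomes

$$\eta_k = \frac{1}{E_k} \max{}^2\left\{0,\ \min_{\substack{\bar{\mathbf{b}}^{(n)} \in \{-1,1\}^{(N+4)K} \\ \bar{b}_k^{(n)} = 1}} \frac{(\tilde{\mathbf{C}}^{(n)})_k^{H}\,\bar{\mathbf{f}}_k^{\top}\bar{\mathbf{C}}^{(n)}\bar{\mathbf{A}}^{(n)}\bar{\mathbf{b}}^{(n)}}{\sqrt{\big[(\tilde{\mathbf{C}}^{(n)})_k^{H}\,\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\,(\tilde{\mathbf{C}}^{(n)})_k\big]_{k}}}\right\}. \tag{3.32}$$

The near-far resistance analysis and the discussion of Section 3.3.1 can be applied to the multipath channel as well.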
3.4. Numerical examples
The performance and stability of the detectors are studied by numerical examples. Direct-sequence spread-spectrum waveforms with BPSK data and spreading modulation as well as coherent detection are considered. The number of users is 33 or 20 with a processing gain of 31, i.e., the chip duration $T_c = T/31$. A length 31 Gold sequence family is used in the examples with time-invariant signature waveforms. A random code family of length 6200 chips is used in the examples with time-varying signature waveforms so that the results are averaged over 6200/31 = 200 symbols. The carrier phases are assumed to be zero. The results are averaged over ten different, randomly selected delay combinations in the examples where time-invariant signature waveforms are applied. Two kinds of chip waveforms are considered. One is a rectangular chip waveform, the length of which is limited to one chip interval. The other chip waveform is a raised cosine waveform, the length of which is limited to two chip intervals (Fig. 3.2). Examples illustrating the detector stability are considered in Section 3.4.1, and the bit error probability as well as the power limited near-far resistance results in Section 3.4.2.
3.4.1. Detector stability
The detector stability is illustrated by simulating the convergence of the edge blocks $\mathbf{D}(P)$ and $\mathbf{D}(-P)$ of the truncated decorrelating and LMMSE detectors versus the detector memory-length. Time-invariant signature waveforms are used. The mean absolute values of the elements of $\mathbf{D}(P)$ and $\mathbf{D}(-P)$ are presented in Figs. 3.3 and 3.4 for rectangular and raised cosine chip waveforms, respectively. The received energies are equal and the SNR is set to 10 dB with the LMMSE detector.
[Figure 3.2: plot titled "Chip waveform"; horizontal axis: Time [chip interval], from −0.5 to 1.5; vertical axis: Amplitude, from −0.2 to 1.]
Fig. 3.2. Raised cosine chip waveform.

The results show that both detectors are stable in all cases, except that the decorrelator is unstable when $K = 33$, $L = 3$ and the rectangular chip waveform is applied. It should be noted that the decorrelating detector was not unstable with all delay combinations, but with only one. The one poor delay combination makes the decorrelating detector look unstable also on average. Actually, the results in Figs. 3.3 and 3.4 are somewhat too pessimistic in the sense that averaging the absolute values of the edge blocks $\mathbf{D}(P)$ and $\mathbf{D}(-P)$ gives too much emphasis to the large values obtained with a poor delay combination. This conclusion will also be confirmed by the examples in Section 3.4.2. However, the results demonstrate that detector stability appears to be a mild assumption. Furthermore, it is seen that increasing the channel load $KL$ makes the detector converge more slowly with increasing $P$. This applies in particular to the decorrelating detector, whereas the convergence of the edge blocks of the LMMSE detector suffers rather little penalty from increased channel load. The phenomenon is easy to understand by comparing (3.6) and (3.7). The matrix that needs to be inverted when computing the LMMSE detector in (3.7) is more diagonally dominant than the matrix to be inverted when computing the decorrelating detector in (3.6). Thus, the computation of the LMMSE detector is understandably numerically more robust than the computation of the decorrelating detector. Comparing Figs. 3.3 and 3.4 demonstrates that the filtering of the chip waveform has a very minor impact on the convergence speed of the edge blocks $\mathbf{D}(P)$ and $\mathbf{D}(-P)$ versus the memory-length. Since the detectors are always stable with the raised cosine chip waveform, the filtering of the chip waveform makes the system more robust.
3.4.2. Detector performance
The error probabilities are estimated for low signal-to-noise ratios by (3.22) or (3.31). Data-aided detection is assumed for the noise-whitening detector, which may not be practical. However, the effect of finite memory-length can be well illustrated by examples assuming data-aided detection. The performance at high signal-to-noise ratios is evaluated by computing AMEs using expressions (3.25) or (3.32). The results are presented versus the half memory-length $P$. All the interfering users are assumed to have the same energy per symbol, which is denoted by $E_{\max}$ in the figures. Correspondingly, the energy per symbol of the desired user is denoted by $E_{\min}$. The performance of the ideal IIR detector is estimated with the assumption that the edge symbols are zero and that the detector has a large enough block size.
The results of examples with time-invariant signature waveforms are presented in Figs. 3.5-3.6 and 3.9-3.10. Time-varying signature waveforms yield the results in Figs. 3.7-3.8. The performance of several detectors in a single-path channel is shown in Figs. 3.5-3.8. The performance of the truncated decorrelating detector for three numbers of propagation paths ($L = 1, 2, 3$) is shown in Figs. 3.9-3.10. The results in Fig. 3.10 assume that the chip waveform is the raised cosine waveform; in the other examples the chip waveform is the rectangular waveform. Only the truncated decorrelating and LMMSE detectors are considered in the examples with time-varying signature waveforms, since the analysis of the optimal FIR and noise-whitening detectors would be computationally intensive. For clarity, the bit error probabilities of the ideal LMMSE detector have not been plotted for the cases $E_{\max}/E_{\min} = 10$ dB and $E_{\max}/E_{\min} = 20$ dB in Figs. 3.6 and 3.8, since they are very close to the bit error probability of the decorrelating detector.
It can be seen from Fig. 3.5 that the asymptotic loss in signal-to-noise ratio converges relatively fast. With $P = 6$ the performance is the same as with an ideal IIR detector even in the case $E_{\max}/E_{\min} = 20$ dB. With perfect power control ($E_{\max}/E_{\min} = 0$ dB), the value $P = 4$ is required. A 10 dB increase in the MAI level implies that the value of $P$ must be incremented roughly by one to maintain the same performance. In other words, loosening the power control requirements significantly calls for only a very minor increase in the required detector memory-length. It is seen from Fig. 3.6 that at lower signal-to-noise ratios the value $P = 4$ yields the same performance as the ideal IIR detector in all cases. From Figs. 3.7 and 3.8 it is seen that similar conclusions can be drawn for a system with time-varying signature waveforms. However, the performance of the ideal IIR detector is slightly better with time-invariant than with time-varying signature waveforms. This is understandable due to the small crosscorrelations of the Gold sequences. On the other hand, a system with time-varying signature waveforms requires slightly smaller FIR detector memory-lengths than a system with time-invariant signature waveforms, particularly at high signal-to-noise ratios. In [158], a similar behavior was observed and discussed for the noise-whitening detector. An intuitive explanation can be seen from (3.16). In systems with time-varying signature waveforms the elements of the matrix $\mathbf{R}_j(1)$ are (at least approximately) random variables with zero mean and are independent for different values of $j$. Thus, the elements of the matrix product $\mathbf{T}_{j,j}(j)\mathbf{R}_j(1)$ also have zero mean. In systems with time-invariant signature waveforms, on the other hand, the matrices $\mathbf{R}_j(1)$ are the same for different values of $j$ and there is "less randomness" in the elements of $\mathbf{T}_{j,j}(j)\mathbf{R}_j(1)$. Therefore, time-varying signature waveforms introduce more averaging into the product in (3.16) and result in a faster convergence of the detector to a zero matrix as the detector memory-length $N \to \infty$.

From Figs. 3.6 and 3.8 it is seen that the optimal FIR detectors perform slightly better at low signal-to-noise ratios than the truncated ones with small values of $P$. However, with moderate values of $P$ both are equivalent to the ideal IIR detectors, as is expected by Proposition 2. From Figs. 3.5 and 3.7 it is seen that at high signal-to-noise ratios, on the other hand, the truncated decorrelating detector slightly outperforms the optimal FIR decorrelator. The reason can be understood from the expressions for the AME. Although the contribution due to the symbols outside the processing window, $\mu_k^{(n)}\big(\mathbf{b}_e^{(n)}\big)$, for the optimal FIR decorrelating detector in (3.24) is smaller than for the truncated FIR decorrelating detector in (3.26), the MAI due to the other symbols, $\mathbf{f}_k^{\top}\mathbf{A}^{(n)}\mathbf{b}^{(n)}$ in (3.24), is larger. Furthermore, the desired signal component $\sqrt{E_k}\,(\mathbf{F})_{(P+1)K+k,k}$ may be lower, and the enhanced additive white Gaussian noise $\sqrt{\big[\mathbf{D}_N^{\top}\mathbf{R}^{(n)}\mathbf{D}_N\big]_{kk}}$ in (3.24) may be larger than the corresponding quantities in (3.26), yielding a lower asymptotic multiuser efficiency. Thus, the optimal FIR detectors do not yield any universal performance improvement in comparison to the truncated detectors.
It can be seen from Figs. 3.9 and 3.10 that the AME is degraded due to increasedinterference caused by multipath propagation. However, moderate memorylengths(P ≤ 6) are sufficient to obtain performance close to the ideal decorrelating detector except in the extreme cases of a severe nearfar problem and/or high channelload (K = 33 and L = 3). The performance of the decorrelating detector withraised cosine chip waveform (Fig. 3.10) is in general better than with rectangularwaveforms. The reason is the fact that filtering smoothes out the CDMA signalsand reduces the crosscorrelations between the signature waveforms. However, forthe high channel load (K = 33 and L = 3) case, the performance with raised cosinechip waveforms is poor. This is understandable due to bandwith limitation posedby filtering. In other words, the linear decorrelating detector is close to its ultimatecapacity limit, when K = 33 and L = 3. The results in Figs. 3.9(a) and 3.10(a)confirm that the stability results in Figs. 3.3(a) and 3.4(a) can indeed be somewhatmisleading, as predicted in Section 3.4.1. The truncated decorrelating detector is inthe average unstable with the rectangular chip waveform and stable with the raisedcosine chip waveform in the threepath case according to the results in Figs. 3.3(a)and 3.4(a). However, the corresponding power limited nearfar resistance resultsin Figs. 3.9(a) and 3.10(a) give a contradicting result: the system with rectangularchip waveform outperforms the system with the raised cosine chip waveform.
The numerical examples show that moderate memory-lengths (roughly N ≤ 13) give performance close to ideal IIR detectors in most cases. Even under a severe near-far problem (Emax/Emin = 20 dB) the optimal near-far resistance is obtained with detector memory-length N = 13 except with very high channel load. The use of FIR detectors relaxes the required accuracy of the power control significantly with very moderate detector memory-lengths.
3.5. Conclusions
Linear multiuser detectors in asynchronous multiuser systems, whose signature waveforms are allowed to be time-invariant or time-varying, have been discussed. Two classes of linear FIR multiuser detectors, namely the truncated and the optimal FIR detectors, approximating the ideal IIR detectors were defined. The truncated detectors were obtained by simply truncating the corresponding IIR detector, whereas the optimal FIR detectors were derived by optimizing the detector to the finite processing window model.
Numerical examples showed that the detectors are stable under relatively mild conditions. The stability was shown to imply asymptotic uniqueness of the limiting IIR detector also in the case of time-varying signature waveforms. The truncated and the optimal FIR detectors approach asymptotically the same IIR detector under relatively mild conditions.
The performance of the finite memory-length detectors was analyzed. It was shown that the truncated decorrelating, LMMSE, and data-aided noise-whitening detectors can be made near-far resistant under a given ratio between maximum and minimum received power of users by selecting an appropriate memory-length. Numerical examples demonstrated that moderate memory-lengths of either truncated or optimal FIR detectors are sufficient to attain the performance of the ideal IIR detectors even under a severe near-far problem. At very high channel loads the decorrelating detector may become unstable, but the LMMSE detector is more robust. If the memory-lengths are small, the optimal FIR detectors outperform the truncated ones at low signal-to-noise ratios. However, at high signal-to-noise ratios the truncated detectors have better performance. The required memory-lengths tend to be smaller with time-varying than with time-invariant signature waveforms.
The use of FIR detectors instead of the IIR detectors makes linear multiuser detection possible in CDMA systems in which the number of users, their propagation delays, or the signature waveforms change over time. An example of time-varying signature waveforms is a CDMA system using spreading sequences longer than one symbol interval (an R-CDMA system). The truncated FIR detectors are easier to update to changes in a communication system than the optimal FIR detectors. Because the optimal FIR detectors do not yield any universal performance improvement, the truncated detectors with appropriate memory-length are clearly the detectors of choice in practice. The required memory-length depends on other system parameters, especially on the ratio of maximum and minimum received powers. It should also be noted that the results give insight into the design of decentralized linear (adaptive LMMSE) detectors as well. A similar dependence of the performance on the memory-length naturally holds for them, too.
[Figure 3.3: two panels titled "Detector stability", (a) K = 33 and (b) K = 20. Horizontal axis: half memory-length P (1 to 10). Vertical axis: average absolute value of last detector block (logarithmic, 10^-10 to 10^2). Curves: L = 1 (+), L = 2 (×), L = 3 (∗).]
Fig. 3.3. Mean absolute values of the edge detector blocks D(P) and D(−P) of truncated decorrelating and LMMSE detectors for different numbers of multipath components versus half memory-length P with time-invariant signature waveforms and rectangular chip waveform; (a) K = 33, (b) K = 20. The curves marked by circles denote the LMMSE detector, and curves without circles denote the decorrelating detector.
[Figure 3.4: two panels titled "Detector stability", (a) K = 33 and (b) K = 20. Horizontal axis: half memory-length P (1 to 10). Vertical axis: average absolute value of last detector block (logarithmic, 10^-10 to 10^2). Curves: L = 1 (+), L = 2 (×), L = 3 (∗).]
Fig. 3.4. Mean absolute values of the edge detector blocks D(P) and D(−P) of truncated decorrelating and LMMSE detectors for different numbers of multipath components versus half memory-length P with time-invariant signature waveforms and raised cosine chip waveform; (a) K = 33, (b) K = 20. The curves marked by circles denote the LMMSE detector, and curves without circles denote the decorrelating detector.
[Figure 3.5: two panels titled "Performance as a function of memory-length", both for K = 33, L = 1. Horizontal axis: half memory-length P (1 to 8). Vertical axis: power limited near-far resistance [dB] (−10 to 0). Line styles: IIR detector; FIR with Emax/Emin = 0 dB (dashed), 10 dB (dash-dotted), 20 dB (dotted). Markers: (a) truncated DA noise-whitening (∗); (b) truncated decorrelator (+), optimal FIR decorrelator (×).]
Fig. 3.5. Power limited near-far resistances [dB] versus half memory-length P with time-invariant signature waveforms and rectangular chip waveform; (a) DA truncated noise-whitening detector, (b) truncated and optimal FIR decorrelating detector.
[Figure 3.6: three panels titled "Performance as a function of memory-length", all for K = 33, L = 1 and Emin/σ² = 10 dB. Horizontal axis: half memory-length P (1 to 5). Vertical axis: bit error probability (logarithmic, 10^-4 to 10^-1). Curves: IIR detector and FIR detectors for the truncated decorrelator (+), optimal FIR decorrelator (×), truncated LMMSE, optimal FIR LMMSE (∗), and truncated DA noise-whitening (no marker). In panels (b) and (c) the decorrelator and LMMSE curves nearly coincide.]
Fig. 3.6. Probabilities of bit error versus half memory-length P with time-invariant signature waveforms and rectangular chip waveform; (a) equal received energies Emax/Emin = 0 dB, (b) near-far problem Emax/Emin = 10 dB, (c) near-far problem Emax/Emin = 20 dB.
[Figure 3.7: plot titled "Performance as a function of memory-length for R-CDMA", K = 33, L = 1. Horizontal axis: half memory-length P (1 to 8). Vertical axis: power limited near-far resistance [dB] (−10 to 0). Line styles: IIR detector; FIR with Emax/Emin = 0 dB (dashed), 10 dB (dash-dotted), 20 dB (dotted). Marker: truncated decorrelator (+).]
Fig. 3.7. Power limited near-far resistances [dB] of the truncated decorrelating detector versus half memory-length P with time-varying signature waveforms and rectangular chip waveform.
[Figure 3.8: plot titled "Performance as a function of memory-length for R-CDMA", K = 33, L = 1, Emin/σ² = 10 dB. Horizontal axis: half memory-length P (1 to 5). Vertical axis: bit error probability (logarithmic, 10^-3 to 10^-1). Line styles: IIR detector; FIR with Emax/Emin = 0 dB (dashed), 10 dB (dash-dotted), 20 dB (dotted). Curves: truncated decorrelator (+) and truncated LMMSE.]
Fig. 3.8. Probabilities of bit error of the truncated decorrelating detector versus half memory-length P with time-varying signature waveform and rectangular chip waveform.
[Figure 3.9: two panels titled "Performance as a function of memory-length", (a) K = 33 and (b) K = 20. Horizontal axis: half memory-length P (2 to 10). Vertical axis: power limited near-far resistance [dB] ((a) −25 to 0, (b) −10 to 0). Line styles: IIR detector; FIR with Emax/Emin = 0 dB (dashed), 10 dB (dash-dotted), 20 dB (dotted). Markers: L = 1 (+), L = 2 (×), L = 3 (∗).]
Fig. 3.9. Power limited near-far resistances [dB] of the ideal (IIR) and truncated (FIR) decorrelating detector for different numbers of multipath components versus half memory-length P with time-invariant signature waveforms and rectangular chip waveform; (a) K = 33, (b) K = 20.
[Figure 3.10: two panels titled "Performance as a function of memory-length", (a) K = 33 and (b) K = 20. Horizontal axis: half memory-length P (2 to 10). Vertical axis: power limited near-far resistance [dB] ((a) −30 to 0, (b) −10 to 0). Line styles: IIR detector; FIR with Emax/Emin = 0 dB (dashed), 10 dB (dash-dotted), 20 dB (dotted). Markers: L = 1 (+), L = 2 (×), L = 3 (∗).]
Fig. 3.10. Power limited near-far resistances [dB] of the ideal (IIR) and truncated (FIR) decorrelating detector for different numbers of multipath components versus half memory-length P with time-invariant signature waveforms and raised cosine chip waveform; (a) K = 33, (b) K = 20.
4. Multiuser demodulation in Rayleigh fading channels
Multiuser demodulation in relatively fast fading channels is analyzed in this chapter. The goal is to find efficient receivers with moderate implementation complexity for multiuser demodulation. Coherent detection is considered in order to obtain superior performance. Therefore, complex channel coefficient estimation is a major problem to solve. Multiuser receivers for fading channels have been considered in the past, as discussed in Chapter 2. However, several open problems still exist. Even a clear presentation of the optimal demodulation technique for a time-varying multipath channel is not available in the open literature. The performance of the different proposed multiuser receivers has not been compared. Furthermore, the effect of various channel estimation algorithms on receiver performance in general, and on receiver performance differences in particular, has received very limited attention so far. For the parallel interference cancellation receiver it has been shown that the effect of complex channel coefficient estimation on the overall receiver performance is substantial [57]. Thus, the overall receiver design and especially the channel estimation problem in relatively fast Rayleigh fading channels are important problems.
In this chapter, the focus will be on the following three problems. First, the performance difference between optimal and suboptimal complex channel coefficient estimation is evaluated. Second, data-aided and decision-directed complex channel coefficient estimation are compared. Third, the bit error rates of the decorrelating and the parallel interference cancellation receivers are compared. The delays of the user signals are assumed to be perfectly known throughout the chapter.
The chapter is organized as follows. The optimal multiuser detector for unknown Rayleigh fading channels is presented in Section 4.1. Suboptimal decorrelating and parallel interference cancellation receivers with either DA or decision-directed complex channel coefficient estimation are considered in Section 4.2. In Section 4.3, the performance of the suboptimal receivers is studied and numerical examples are presented. The results are summarized and discussed in Section 4.4.
4.1. Optimal receiver
The complete transmitted data block must be demodulated in the optimal MLSD multiuser receiver [47]. Therefore, the complete data block model (2.8) is considered. To find the optimal MLSD receiver for a frequency-selective Rayleigh fading channel, the derivation by Kailath [84, 78] is extended to the multiuser detection problem. The covariance matrix $\boldsymbol{\Sigma}_{\mathbf{c}}$ of the channel coefficient vector $\mathbf{c}$ is assumed to be known in the detector derivation. The maximum likelihood sequence detector (often called maximum likelihood sequence estimator [23]) makes its decision as
\[ \hat{\mathbf{b}}_{[\mathrm{MLSD}]} = \arg\max_{\mathbf{b}\in\Xi^{N_b K}} p(\mathbf{y}\,|\,\mathbf{b}), \tag{4.1} \]
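where $p(\mathbf{y}\,|\,\mathbf{b})$ is the probability density function (pdf) of the MF bank output vector $\mathbf{y}$ conditioned on the data vector $\mathbf{b}$. Since $\mathbf{c}$ and $\mathbf{w}$ are complex Gaussian random vectors independent of each other, $\mathbf{y}$ in (2.16) conditioned on $\mathbf{b}$ is a complex Gaussian random vector with zero mean and covariance matrix
\[ \boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}} = \mathbf{R}\boldsymbol{\Sigma}_{\mathbf{h}|\mathbf{b}}\mathbf{R} + \sigma^{2}\mathbf{R}, \tag{4.2} \]
where
\[ \boldsymbol{\Sigma}_{\mathbf{h}|\mathbf{b}} = \mathbf{B}\boldsymbol{\Sigma}_{\mathbf{c}}\mathbf{B}^{*} \tag{4.3} \]
is the covariance matrix of $\mathbf{h} = \mathbf{C}\mathbf{A}\mathbf{b}$ conditioned on the data vector $\mathbf{b}$, and
\[ \mathbf{B} = \mathrm{diag}\left(A_1 b_1^{(0)}\mathbf{I}_L,\ A_2 b_2^{(0)}\mathbf{I}_L,\ \ldots,\ A_K b_K^{(0)}\mathbf{I}_L,\ A_1 b_1^{(1)}\mathbf{I}_L,\ \ldots,\ A_K b_K^{(N_b-1)}\mathbf{I}_L\right) \in \mathbb{C}^{N_b K L \times N_b K L}. \tag{4.4} \]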
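As an illustration of the ordering in (4.4), the diagonal of $\mathbf{B}$ can be assembled as in the following minimal Python sketch. The function name `data_amplitude_diagonal` and the nested-list layout of the inputs are assumptions made only for this example; the entries are ordered by symbol interval first and by user second, and each entry is repeated $L$ times.

```python
def data_amplitude_diagonal(amplitudes, bits, num_paths):
    """Return the diagonal of B in (4.4) as a flat list.

    amplitudes[k] : transmitted amplitude A_k of user k
    bits[n][k]    : data symbol b_k^(n) of user k in symbol interval n
    num_paths     : number of propagation paths L per user
    (Hypothetical helper; the input layout is an assumption of this sketch.)
    """
    diagonal = []
    for symbols_n in bits:  # symbol intervals n = 0, ..., N_b - 1
        for a_k, b_kn in zip(amplitudes, symbols_n):
            # The block A_k b_k^(n) I_L contributes the same entry L times.
            diagonal.extend([a_k * b_kn] * num_paths)
    return diagonal


# Two users, two symbol intervals, two paths per user.
example = data_amplitude_diagonal([1.0, 2.0], [[1, -1], [-1, 1]], num_paths=2)
```

The resulting list has $N_b K L$ entries, matching the dimension stated in (4.4).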
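Note that the covariance matrix $\boldsymbol{\Sigma}_{\mathbf{c}}$ (2.45) of the channel coefficients (2.40) is insensitive to the data sequence $\mathbf{b}$. Thus, the pdf of $\mathbf{y}$ conditioned on $\mathbf{b}$ becomes
\[ p(\mathbf{y}\,|\,\mathbf{b}) = \frac{1}{\pi^{N_b K L}\det(\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}})}\exp\left(-\mathbf{y}^{\mathrm{H}}\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}}^{-1}\mathbf{y}\right). \tag{4.5} \]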
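By substituting (4.5) into (4.1), the MLSD rule can be expressed in the form
\[ \hat{\mathbf{b}}_{[\mathrm{MLSD}]} = \arg\min_{\mathbf{b}\in\Xi^{N_b K}} \left\{ \ln[\det(\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}})] + \mathbf{y}^{\mathrm{H}}\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}}^{-1}\mathbf{y} \right\}, \tag{4.6} \]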
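where $\ln(\cdot)$ is the natural logarithm. If constant envelope modulation is applied, $\det(\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}})$ does not depend on $\mathbf{b}$ [287] and can therefore be neglected in the minimization. In that case the MLSD rule becomes
\[ \hat{\mathbf{b}}_{[\mathrm{MLSD}]} = \arg\min_{\mathbf{b}\in\Xi^{N_b K}} \mathbf{y}^{\mathrm{H}}\boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}}^{-1}\mathbf{y}. \tag{4.7} \]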
By applying the matrix inversion lemma (A1.4) in (4.2), the inverse of the covariance matrix has the form
\[ \boldsymbol{\Sigma}_{\mathbf{y}|\mathbf{b}}^{-1} = \sigma^{-2}\mathbf{R}^{-1} - \sigma^{-2}\left(\mathbf{R} + \sigma^{2}\boldsymbol{\Sigma}_{\mathbf{h}|\mathbf{b}}^{-1}\right)^{-1}. \tag{4.8} \]
Since the first term $\sigma^{-2}\mathbf{R}^{-1}$ in (4.8) does not depend on the data $\mathbf{b}$, the MLSD rule (4.7) becomes
\[ \hat{\mathbf{b}}_{[\mathrm{MLSD}]} = \arg\max_{\mathbf{b}\in\Xi^{N_b K}} \hat{\mathbf{h}}_{[\mathrm{MMSE}]}^{\mathrm{H}}\mathbf{y}, \tag{4.9} \]
where
\[ \hat{\mathbf{h}}_{[\mathrm{MMSE}]} = \left(\mathbf{R} + \sigma^{2}\boldsymbol{\Sigma}_{\mathbf{h}|\mathbf{b}}^{-1}\right)^{-1}\mathbf{y} \tag{4.10} \]
is the minimum mean squared error estimate of the vector $\mathbf{h}$ [88] conditioned on the data. In other words, the optimal MLSD receiver estimates the received noiseless signal, and multiplies (correlates) the matched filter bank output with the estimated complex amplitude vector. The result is thus a generalization of the well-known estimator-correlator receiver [85, Chap. 2] to multiuser systems with multiple propagation paths. Since the estimation and correlation must be performed for all possible data sequences, the MLSD receiver is prohibitively complex for practical implementation. Therefore, suboptimal multiuser detectors and channel estimators are studied in the following sections.
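To give a feeling for why the exhaustive search is prohibitive, the following short Python sketch counts the candidate data sequences in the set $\Xi^{N_b K}$ over which the MLSD rule must be evaluated. The function name `mlsd_hypotheses` and the block length used in the example are hypothetical; only the user count K = 33 is taken from the numerical examples of Chapter 3.

```python
def mlsd_hypotheses(alphabet_size, num_symbols, num_users):
    """Number of candidate sequences b in the set Xi^(N_b K).

    alphabet_size : size of the symbol alphabet Xi (2 for BPSK)
    num_symbols   : block length N_b (the value below is an assumption)
    num_users     : number of users K
    """
    return alphabet_size ** (num_symbols * num_users)


# BPSK, K = 33 users and a hypothetical block of N_b = 10 symbols.
count = mlsd_hypotheses(2, 10, 33)
print(f"{len(str(count))} decimal digits")
```

Each of these hypotheses would require its own channel estimate and correlation, so the count grows exponentially in both the block length and the number of users.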
4.2. Suboptimal receivers
A natural way to approximate the optimal multiuser MLSD receiver is to base the data detection on estimates of the complex channel coefficients. The channel can be estimated from the received signal if the effect of the data symbols can be removed from the MF bank output. This is possible if known (pilot) symbols are available or if symbol decisions are utilized in channel estimation¹ [89].
By the analysis in Section 4.1, the optimal channel estimation strategy from the detection point of view is linear estimation minimizing the mean squared error of the complex channel coefficient estimate (4.10). However, the ideal LMMSE estimator is often impossible to implement, since the channel covariance matrix $\boldsymbol{\Sigma}_{\mathbf{c}}$ depends on the signal-to-noise ratios and the fade rates of all the users, and is usually unknown. Furthermore, the matrix inversion in (4.10) has high computational complexity. Adaptive versions of the LMMSE channel estimator have been proposed for single-user systems [103, 288]. For multiuser systems a multichannel adaptive receiver would be needed. Adaptation of such filters is difficult. Therefore, LMMSE complex channel coefficient estimation appears to be impractical for most applications. A simplified channel estimator is obtained by decoupling the estimation of the complex channel coefficients of the users from each other. In other words, the $N_b K L$-dimensional joint channel estimation problem can be approximated by $KL$ distinct channel estimation problems for $N_b$ symbol intervals².
¹ The received complex amplitudes CA are assumed to consist of the Rayleigh fading channel coefficients (C) and much more slowly changing transmitted complex amplitudes (A). Since the transmitted complex amplitudes are assumed to be known, the complex amplitude estimation is considered to be the estimation of the Rayleigh fading channel (C) in the sequel.
² This is the standard approach which has been used in deriving most suboptimal multiuser receivers described in Chapter 2.
The implementation complexity is reduced considerably in this way, and real-time complex channel coefficient estimation becomes possible. Therefore, the considered receiver structure first performs the interference suppression, which separates the $N_b K L$-dimensional joint channel estimation and data detection problems into $KL$ distinct channel estimation and data detection problems for $N_b$ symbol intervals by reducing both multiple-access and intersymbol interference. Then the complex channel coefficients are estimated for all multipath components separately. Finally, the MF outputs are maximal ratio combined. The receiver structure is a generalization of the receivers in [127, 57], and it is illustrated in Fig. 4.1. In the rest of the chapter, receivers with the structure of Fig. 4.1 are studied. The alternatives for the channel estimation block in Fig. 4.1 are considered in Section 4.2.1, and for the interference suppression block in Section 4.2.2.
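The final combining step can be sketched in Python as follows. This is a minimal illustration for BPSK under the assumption that the interference suppression outputs and the channel estimates of the $L$ paths of one user are available as lists of complex numbers; the function name `mrc_bpsk_decision` is hypothetical.

```python
def mrc_bpsk_decision(mud_outputs, channel_estimates):
    """Maximal ratio combining of the L paths of one user, BPSK decision.

    mud_outputs[l]       : interference suppression output of path l
    channel_estimates[l] : complex channel coefficient estimate of path l
    Returns +1 or -1. (Sketch only; the tie at zero is resolved to +1.)
    """
    # Weight each path by the conjugate of its channel estimate and sum.
    statistic = sum(c.conjugate() * y
                    for y, c in zip(mud_outputs, channel_estimates))
    return 1 if statistic.real >= 0 else -1


# Noiseless two-path example with transmitted symbol -1.
paths = [1 + 1j, 0.5 - 0.5j]
decision = mrc_bpsk_decision([-c for c in paths], paths)
```

With perfect channel estimates the conjugate weighting aligns the phases of the paths, so that their energies add coherently in the decision statistic.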
[Figure 4.1: block diagram. The received signal r(t) is fed to a bank of matched filters MF 1,1, …, MF K,L. The MF outputs enter the interference suppression block, which produces an output y_[MUD] for each user and path. Each output is delayed, weighted by the corresponding channel estimate from a channel estimation block, and the weighted paths of each user are summed to form the symbol decisions, which are also fed back.]
Fig. 4.1. A multiuser receiver structure for a Rayleigh fading channel.
4.2.1. Channel estimation
In the considered receiver structure (Fig. 4.1) the channel estimation problems of the users are separated from each other. Therefore, the channel estimation techniques of single-user receivers can be applied. As in single-user receivers, the channel estimation blocks in Fig. 4.1 are assumed to include a block that removes the effect of the data modulation and one or several channel estimation filters. The effect of the data modulation can be removed by multiplying the interference suppression output by the complex conjugate of the current data symbol in data-aided or decision-directed channel estimation. Non-data-aided or blind channel estimation will not be considered in this chapter. Finite impulse response channel estimation filters will be considered in the sequel, but infinite impulse response filters could also be applied in the receiver structure of Fig. 4.1. A general complex channel coefficient estimation filter structure is illustrated in Fig. 4.2. The symbol $J$ denotes the distance of the data or pilot symbols used in channel estimation. The symbols $P_{\mathrm{pr}}$ and $P_{\mathrm{sm}}$ denote the number of coefficients in the prediction and smoothing parts of the channel estimation filter [88, p. 400], respectively. Since the past samples are always available for channel estimation, it is assumed that $P_{\mathrm{pr}} > 0$ in all cases. If $P_{\mathrm{sm}} = 0$ and $v_{k,l}(0) = 0$, the channel estimation filter is a linear predictor, which uses only the past samples for complex channel coefficient estimation. If $P_{\mathrm{sm}} = 0$ and $v_{k,l}(0) \neq 0$, the channel estimation filter is a linear "filter", which uses the past samples and the current sample for complex channel coefficient estimation. If $P_{\mathrm{sm}} > 0$, the channel estimation filter is a linear smoother, which uses both the past and future samples for complex channel coefficient estimation. The current sample may ($v_{k,l}(0) \neq 0$) or may not ($v_{k,l}(0) = 0$) be used.
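The three filter types differ only in which sample positions they use. The following Python sketch lists the sample offsets, relative to the coefficient being estimated, for given $P_{\mathrm{pr}}$, $P_{\mathrm{sm}}$ and spacing $J$. The function name `tap_offsets` is hypothetical, and the convention that negative offsets denote past samples is an assumption of this sketch.

```python
def tap_offsets(p_pr, p_sm, use_current, spacing=1):
    """Sample offsets used by the channel estimation filter.

    p_pr        : number of prediction (past) coefficients, must be > 0
    p_sm        : number of smoothing (future) coefficients
    use_current : True if v(0) is nonzero, i.e. the current sample is used
    spacing     : distance J between the samples used
    Negative offsets are past samples (a convention of this sketch).
    """
    if p_pr <= 0:
        raise ValueError("p_pr must be positive")
    offsets = [-j * spacing for j in range(p_pr, 0, -1)]
    if use_current:
        offsets.append(0)
    offsets += [j * spacing for j in range(1, p_sm + 1)]
    return offsets


predictor = tap_offsets(3, 0, use_current=False)   # past samples only
smoother = tap_offsets(2, 2, use_current=True)     # past, current, future
```

A predictor yields only negative offsets, a "filter" adds the offset 0, and a smoother also contains positive offsets, which is why it introduces a decision delay.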
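[Figure 4.2: tapped delay line with delay elements of J symbol intervals. The interference suppression output y_[MUD]k,l is multiplied by the complex conjugate of the symbol decision, weighted by the filter coefficients v_k,l of the smoothing part, the current sample and the prediction part, and summed to give the channel estimate.]
Fig. 4.2. A general channel estimation filter structure.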
Let the channel estimation filter input vector for the $l$th path of user $k$ at time interval $n$ be denoted by $\mathbf{q}_{k,l}^{(n)}$. The corresponding channel estimation filter is denoted by the vector $\mathbf{v}_{k,l}^{(n)}$. The channel estimate can then be expressed in the form
\[ \hat{c}_{k,l}^{(n)} = \mathbf{v}_{k,l}^{\mathrm{H}(n)}\mathbf{q}_{k,l}^{(n)}. \tag{4.11} \]
The optimal channel estimation filter in the LMMSE sense for decoupled channel estimation is the Wiener filter [88, Sec. 12.4]
\[ \mathbf{v}_{k,l}^{(n)} = \boldsymbol{\Sigma}_{\mathbf{q}_{k,l}^{(n)}}^{-1}\boldsymbol{\Sigma}_{\mathbf{q}_{k,l}^{(n)},\,c_{k,l}^{(n)}}, \tag{4.12} \]
where $\boldsymbol{\Sigma}_{\mathbf{q}_{k,l}^{(n)}}$ is the covariance matrix of the channel estimation filter input vector, and $\boldsymbol{\Sigma}_{\mathbf{q}_{k,l}^{(n)},\,c_{k,l}^{(n)}}$ is the covariance vector between the channel estimation filter input vector and the desired channel coefficient $c_{k,l}^{(n)}$ to be estimated. The optimal channel estimation filter naturally depends on the interference suppression scheme, since it determines the input correlation matrix $\boldsymbol{\Sigma}_{\mathbf{q}_{k,l}^{(n)}}$.
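A small numerical instance of the Wiener solution (4.12) can be sketched in Python as below. The sketch makes several assumptions that are not part of the thesis: a real-valued exponential channel correlation $\rho^{|m|}$ with unit channel power, white observation noise of variance `noise_var` at the filter input, and a one-step predictor. The helper names `solve_linear_system` and `wiener_predictor` are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so that the inputs are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def wiener_predictor(p_pr, rho, noise_var):
    """Wiener weights v = Sigma_q^{-1} Sigma_{q,c} for a one-step predictor.

    Assumed model (illustration only): E[c(n) c(n+m)] = rho**|m| and
    q(m) = c(m) + white noise of variance noise_var.
    """
    lags = list(range(-p_pr, 0))  # inputs at lags -p_pr, ..., -1
    cov_q = [[rho ** abs(a - b) + (noise_var if a == b else 0.0)
              for b in lags] for a in lags]
    cov_qc = [rho ** abs(a) for a in lags]
    return solve_linear_system(cov_q, cov_qc)


weights = wiener_predictor(p_pr=3, rho=0.9, noise_var=0.0)
```

For this first-order correlation model without noise, all weight falls on the most recent sample; with noise the weights spread over the older samples as well. The channel correlation function actually used in the examples (Fig. 4.5) is different from this simplified model.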
It is assumed that the channel statistics are constant over the observation window, which is a relatively mild assumption. Since the channel statistics depend on the vehicle speed, which is not known at the receiver, the optimal Wiener channel estimation filters require the estimation of the channel correlation function. The LMMSE channel estimation filters may also be approximated by adaptive channel estimation filters as in single-user communications [103, 288], or in a multiuser case [57].
In data-aided channel estimation the delay in Figs. 4.1 and 4.2 is set to $J = N_p$, where $N_p$ is the distance of the pilot symbols (Fig. 4.3). That is, every $N_p$th symbol of the transmitted symbol stream is a pilot symbol known by the receiver. The channel is estimated by interpolating the samples at the pilot intervals. The channel estimation filter input vector for the $i$th ($i \in \{1, 2, \ldots, N_p - 1\}$) data symbol in the frame³ is
\[ \mathbf{q}_{k,l}^{(n)} = \left(y_{[\mathrm{MUD}]k,l}^{(n-i-P_{\mathrm{pr}}N_p)}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n-i)}, y_{[\mathrm{MUD}]k,l}^{(n+N_p-i)}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n+N_p-i+P_{\mathrm{sm}}N_p)}\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}+P_{\mathrm{sm}}}, \tag{4.13} \]
and the corresponding channel estimation filter is
\[ \mathbf{v}_{k,l}^{(n)} = \left(v_{k,l}^{(n)}(-P_{\mathrm{pr}}), \ldots, v_{k,l}^{(n)}(-1), v_{k,l}^{(n)}(1), \ldots, v_{k,l}^{(n)}(P_{\mathrm{sm}})\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}+P_{\mathrm{sm}}}. \tag{4.14} \]
The removal of the effect of the data, i.e., the multiplication of $y_{[\mathrm{MUD}]k,l}^{(n-i)}$ by $b_k^{*(n-i)}$, is omitted for notational convenience in (4.13). In other words, the channel is interpolated over past and future samples of the interference suppression output corresponding to the $P_{\mathrm{pr}}$ past and $P_{\mathrm{sm}}$ future pilot symbols. The optimal channel estimation filter is naturally different for each data symbol in the data frame.
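The selection of the pilot positions used for one data symbol can be sketched in Python as follows. The sketch assumes that exactly $P_{\mathrm{pr}}$ pilots before and $P_{\mathrm{sm}}$ pilots after the data symbol are used, so that the input vector has $P_{\mathrm{pr}} + P_{\mathrm{sm}}$ entries; the function name `pilot_indices` is hypothetical.

```python
def pilot_indices(n, i, n_p, p_pr, p_sm):
    """Symbol indices of the pilots used to interpolate the channel at n.

    n    : index of the data symbol whose channel is estimated
    i    : position of the data symbol in the frame (1, ..., n_p - 1), so
           that n - i is the closest pilot before symbol n
    n_p  : pilot spacing N_p
    p_pr : number of past pilots used (assumption: exactly p_pr)
    p_sm : number of future pilots used (assumption: exactly p_sm)
    """
    if not 1 <= i <= n_p - 1:
        raise ValueError("i must satisfy 1 <= i <= n_p - 1")
    past = [n - i - j * n_p for j in range(p_pr - 1, -1, -1)]
    future = [n + n_p - i + j * n_p for j in range(p_sm)]
    return past + future


# Data symbol 13 in a stream with a pilot at every tenth symbol.
indices = pilot_indices(n=13, i=3, n_p=10, p_pr=3, p_sm=3)
```

Negative indices in the result refer to pilots before the start of the block, which a practical implementation would have to handle separately.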
[Figure 4.3: frame layout PILOT | DATA | DATA | DATA | DATA | PILOT, where the distance between consecutive pilot symbols is N_p.]
Fig. 4.3. Data frame structure.
In decision-directed channel estimation, the delay in Figs. 4.1 and 4.2 is set to $J = 1$. Since the future data symbol decisions are not available, a linear predictor with $P_{\mathrm{sm}} = 0$ and $v_{k,l}(0) = 0$ is often applied. The channel estimation filter input vector is in that case
\[ \mathbf{q}_{k,l}^{(n)} = \left(y_{[\mathrm{MUD}]k,l}^{(n-P_{\mathrm{pr}})}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n-1)}\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}}, \tag{4.15} \]
³ In other words, the symbol $b_k^{(n-i)}$ is the closest pilot symbol before the symbol $b_k^{(n)}$, and the symbol $b_k^{(n+N_p-i)}$ is the closest pilot symbol after the symbol $b_k^{(n)}$.
and the channel estimation filter is
\[ \mathbf{v}_{k,l}^{(n)} = \left(v_{k,l}^{(n)}(-P_{\mathrm{pr}}), \ldots, v_{k,l}^{(n)}(-1)\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}}. \tag{4.16} \]
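A decision-directed predictor of this form can be sketched in Python as below. The sketch removes the data modulation by multiplying each past sample by the complex conjugate of the corresponding symbol decision and then applies (4.11). The function name `predicted_channel` and the uniform weights in the example are assumptions for illustration, not the Wiener weights of (4.12).

```python
def predicted_channel(mud_outputs, decisions, n, weights):
    """One-step channel prediction for one path of one user.

    mud_outputs[m] : interference suppression output at symbol interval m
    decisions[m]   : symbol decision (or pilot symbol) at interval m
    n              : interval whose channel coefficient is predicted
    weights        : predictor coefficients v(-P_pr), ..., v(-1)
    """
    p_pr = len(weights)
    if n < p_pr:
        raise ValueError("not enough past samples")
    # Remove the data modulation from the P_pr most recent samples.
    q = [mud_outputs[m] * complex(decisions[m]).conjugate()
         for m in range(n - p_pr, n)]
    # Channel estimate c_hat = v^H q.
    return sum(complex(v).conjugate() * x for v, x in zip(weights, q))


# Noiseless example with a constant channel and BPSK symbols.
channel = 0.5 + 0.5j
symbols = [1, -1, -1, 1, 1]
received = [channel * b for b in symbols]
estimate = predicted_channel(received, symbols, n=4, weights=[1 / 3] * 3)
```

In this noiseless constant-channel case any set of weights summing to one recovers the channel coefficient; in a fading channel the weights would be chosen according to (4.12).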
Channel estimation based on a smoother usually yields better results than estimation based on a predictor, since more information on the channel coefficients can be utilized. A smoother can be applied with DD channel estimation if tentative data decisions are available for the current and future symbols. The channel estimation filter input vector is in that case
\[ \mathbf{q}_{k,l}^{(n)} = \left(y_{[\mathrm{MUD}]k,l}^{(n-P_{\mathrm{pr}})}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n-1)}, y_{[\mathrm{MUD}]k,l}^{(n)}, y_{[\mathrm{MUD}]k,l}^{(n+1)}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n+P_{\mathrm{sm}})}\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}+P_{\mathrm{sm}}+1}, \tag{4.17} \]
and the channel estimation filter is
\[ \mathbf{v}_{k,l}^{(n)} = \left(v_{k,l}^{(n)}(-P_{\mathrm{pr}}), \ldots, v_{k,l}^{(n)}(-1), v_{k,l}^{(n)}(0), v_{k,l}^{(n)}(1), \ldots, v_{k,l}^{(n)}(P_{\mathrm{sm}})\right)^{\top} \in \mathbb{C}^{P_{\mathrm{pr}}+P_{\mathrm{sm}}+1}. \tag{4.18} \]
A technique to apply smoothers in DD channel estimation has been proposed for single-user communications in [103]. Tentative decisions can be obtained by using a linear predictor for channel estimation. The decisions can then be delayed by $P_{\mathrm{sm}}$ symbol intervals to be utilized in the smoother. In other words, the channel is estimated in two stages. The channel estimator structure is illustrated in Fig. 4.4. First, a linear predictor of length $P_{\mathrm{pr}}$ using the inputs (4.15) is applied to obtain the first stage channel estimates $\hat{c}_{k,l}^{(n)}(\mathrm{pr})$. They are then used in the maximal ratio combiner to produce tentative data decisions $\hat{b}_k^{(n)}(\mathrm{pr})$. The effect of the data symbols is removed by using the past tentative decisions made based on the predicted complex channel coefficients. Second, the final channel estimates are obtained by using a linear smoother of length $P_{\mathrm{pr}} + P_{\mathrm{sm}} + 1$. The MUD outputs $y_{[\mathrm{MUD}]k,l}^{(n-P_{\mathrm{pr}}-P_{\mathrm{sm}})}, \ldots, y_{[\mathrm{MUD}]k,l}^{(n-1)}, y_{[\mathrm{MUD}]k,l}^{(n)}$ are processed by the smoother to produce the final channel estimates $\hat{c}_{k,l}^{(n-P_{\mathrm{sm}})}(\mathrm{sm})$. The effect of the data symbols is removed by utilizing the delayed tentative decisions provided by the first stage predicted channel estimates. The channel estimates obtained from the smoother are applied in a maximal ratio combiner, and the final data decisions $\hat{b}_k^{(n-P_{\mathrm{sm}})}(\mathrm{sm})$ are made. The use of the smoother improves the channel estimation performance in comparison to the use of the predictor alone, since the memory of the fading channel can be utilized more efficiently. The smoother naturally causes an extra decision delay of $P_{\mathrm{sm}}$ symbols, which may not be acceptable in some applications.
[Figure 4.4: block diagram. For each path l = 1, …, L of user k, the interference suppression output y_[MUD]k,l feeds a predictor, giving the first stage channel estimates (pr), and, after a delay of P_sm symbol intervals, a smoother, giving the final channel estimates (sm). The predicted estimates are conjugated, applied to the path outputs and summed to give the tentative decisions; the smoothed estimates are used in the same way to give the final decisions.]
Fig. 4.4. Two-stage DD channel estimator structure.
The DD channel estimation is sensitive to decision errors, which may cause hang-up (cycle slip), i.e., locking to an incorrect carrier phase with a 180 degree offset in comparison to the correct phase (with BPSK modulation). Usually some countermeasures to protect the channel estimator from hang-up are needed. The DD channel estimators require pilot symbols to be inserted into the data frame. Some hang-up detection and correction scheme may also be necessary. A simple way to detect hang-ups is to make decisions on the pilot symbols and to check whether the decisions are correct or not. If a decision error was made, an error counter, which is set to zero at the beginning of the transmission, is incremented by one. If a correct decision was made, the error counter is decremented by one, unless the value of the counter is zero. If the error counter exceeds a predetermined value, the channel estimator is declared to be in hang-up. Then the phase of the samples of the interference suppression output in the channel estimator is rotated 180 degrees (with BPSK modulation), and the error counter is reset to zero.
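The hang-up detection rule described above can be sketched in Python as follows. The class name `HangupDetector`, its attribute names and the way the 180 degree rotation is represented (a sign that the caller applies to the channel estimator input samples) are assumptions of this sketch.

```python
class HangupDetector:
    """Error counter on pilot symbol decisions, as described in the text."""

    def __init__(self, threshold):
        self.threshold = threshold  # predetermined value to be exceeded
        self.error_count = 0        # zero at the beginning of transmission
        self.phase_sign = 1         # multiply estimator inputs by this sign

    def update(self, pilot_decision_correct):
        """Process one pilot decision; return True if hang-up is declared."""
        if pilot_decision_correct:
            # Decrement, but never below zero.
            self.error_count = max(0, self.error_count - 1)
            return False
        self.error_count += 1
        if self.error_count > self.threshold:
            # Rotate the phase by 180 degrees (BPSK) and reset the counter.
            self.phase_sign = -self.phase_sign
            self.error_count = 0
            return True
        return False


detector = HangupDetector(threshold=2)
pilot_results = [False, True, True, False, False, False]
flags = [detector.update(correct) for correct in pilot_results]
```

In this example the first isolated error is forgotten after a correct decision, and the hang-up is declared only after three consecutive erroneous pilot decisions push the counter above the threshold.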
The DA channel estimation is more robust than DD estimation, since decision errors are not a problem, and, thus, there are no hang-ups. Its drawback is that it may require a shorter pilot symbol distance $N_p$ than DD. The DA channel estimator also causes a longer decision delay than DD channel estimation. The performance of DA and DD channel estimation is studied and compared in Section 4.3.
4.2.2. Interference suppression
The interference suppression block in the receiver of Fig. 4.1 may in general apply any multiuser receiver algorithm capable of processing $KL$ input propagation paths and providing $KL$ outputs. For continuous, unpacketized, asynchronous transmission the truncated decorrelating detector and the parallel interference canceler are among the most promising alternatives in relatively fast fading channels for centralized receivers from a practical point of view, as discussed in Chapters 2 and 3. The LMMSE detector of Chapter 3 is not considered any further, since it requires continuous updates due to complex channel coefficient variations in fading channels. Thus, the decorrelating and the PIC receivers are studied and compared in this chapter.
The truncated decorrelating detector described in Chapter 3 is used in the examples. The truncated decorrelator imposes a decision delay of $P$ symbols. Therefore, the total decision delay in conjunction with a smoother for channel estimation is $P + P_{\mathrm{sm}}$ symbol intervals. Since the decorrelator needs neither complex channel coefficient estimates nor data decisions for MAI suppression, the feedback in Fig. 4.1 is not needed if the decorrelator is applied.
The PIC receiver needs both tentative complex channel coefficient estimates and data decisions to perform MAI suppression, as seen from (2.61). The interference cancellation may be performed in several stages. The complex channel coefficient estimates may be formed independently in different stages, or the complex channel coefficient of the last stage may be fed back to the former stages, for example. Therefore, there is a large variety of different PIC receiver versions available. The performance of some alternatives is studied in more detail in [289]. Based on those studies, a two-stage PIC receiver is applied in the sequel. It always uses the latest complex channel coefficient and data estimates that are available. In other words, for past symbol intervals the final symbol decisions and complex channel coefficient estimates are used in the interference cancellation. Tentative symbol decisions are used for the current and future symbol intervals. The latest final complex channel coefficient estimate is used for the current and future symbol intervals. Although this approach neglects changes in the complex channel coefficients, it has been shown to be superior to the use of tentative complex channel coefficient estimates [289].
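The cancellation principle can be illustrated with the following Python sketch of a single PIC stage. It is a strongly simplified model, assumed only for this illustration: symbol-synchronous users with one path each, so that the MAI seen by each user is a weighted sum of the other users' current symbols. The asynchronous multipath form (2.61) used in the thesis is not reproduced here, and the function name `pic_stage` is hypothetical.

```python
def pic_stage(mf_outputs, correlations, channel_estimates, decisions):
    """One parallel interference cancellation stage (simplified model).

    mf_outputs[k]        : matched filter output of user k
    correlations[k][j]   : cross-correlation between users k and j
    channel_estimates[j] : latest available channel estimate of user j
    decisions[j]         : latest available symbol decision of user j
    Returns the MF outputs with the reconstructed MAI subtracted.
    """
    num_users = len(mf_outputs)
    cleaned = []
    for user in range(num_users):
        # Reconstruct the interference from all other users in parallel.
        interference = sum(
            correlations[user][other] * channel_estimates[other] * decisions[other]
            for other in range(num_users) if other != user)
        cleaned.append(mf_outputs[user] - interference)
    return cleaned


# Noiseless two-user example with perfect estimates and decisions.
corr = [[1.0, 0.3], [0.3, 1.0]]
gains = [1.0, 2.0]
bits = [1, -1]
mf = [sum(corr[k][j] * gains[j] * bits[j] for j in range(2)) for k in range(2)]
cleaned = pic_stage(mf, corr, gains, bits)
```

With perfect estimates and decisions the cleaned outputs equal each user's own received amplitude times its symbol; decision or channel estimation errors leave residual interference, which is why the quality of the estimates fed to the canceler matters.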
4.3. Receiver performance analysis and results
The performance of multiuser receivers in Rayleigh fading channels is considered in this section. In Section 4.3.1, the performance of the linear receivers is analyzed. The ideal DA joint LMMSE channel estimator and the decorrelator combined with a decoupled channel estimator are compared. The bit error probability and channel capacity of the DA decorrelating receiver are also analyzed. Furthermore, the performance of the DA and DD decorrelating receivers is compared. The performance of the DD decorrelator and the parallel interference canceler is compared in Section 4.3.2. The sensitivity of the bit error rate of the decorrelating and PIC receivers to channel estimation errors as well as the BER in an estimated channel are studied.
Numerical examples are considered. Some of them are obtained via theoretical analysis and the others are based on Monte Carlo computer simulations. Simulations are used since the effect of decision errors on the performance of the DD channel estimation and the HD-PIC receivers is difficult to analyze. Direct-sequence spread-spectrum waveforms with BPSK data and spreading modulation are considered. A Gold sequence family with processing gain 31 is used. The delays of the users and propagation paths are assumed to be uniformly distributed in $[0, T)$ and $[0, T_m)$, respectively. The delay spread is assumed to be $T_m = T/2$. One- and two-path channel examples are considered. In the two-path channel case, equal power paths, i.e., $E(|c_{k,1}|^2) = E(|c_{k,2}|^2) = \frac{1}{2}$, are used. The vehicle speeds are equal for all users. The vehicle speed is 86 km/h (43 km/h in some examples in Section 4.3.1), the carrier frequency is 1.8 GHz, and the symbol rate is 16 kbits/s (i.e., $f_d T = 0.009$). The resulting normalized channel autocorrelation function is illustrated in Fig. 4.5. Both optimal and suboptimal channel estimation filters are considered. The optimal channel estimation filters are matched to the true channel correlation function (vehicle speed) and to the true average signal-to-noise ratio. The suboptimal channel estimation filters are Wiener filters optimized for a single-user channel and the autocorrelation function assuming a vehicle speed of 50 km/h and an average SNR of 10 dB. The fixed, suboptimal channel estimation filters are used for all users and average SNRs. The data-aided channel estimators use an interpolator of length 6 ($P_{\mathrm{sm}} = P_{\mathrm{pr}} = 3$) (except in some examples in Section 4.3.1) and sample spacing of $J = N_p$. The decision-directed channel estimators use the two-stage channel estimation described in Section 4.2.1 with a predictor of length $P_{\mathrm{pr}} = 10$ and a smoother of length 21 ($P_{\mathrm{sm}} = P_{\mathrm{pr}} = 10$). The sample spacing is $J = 1$ in both the predictor and smoother. In the simulations of the DD channel estimators every tenth symbol is a pilot symbol, i.e., $N_p = 10$. The hang-up detection scheme described in Section 4.2.1 with a threshold of 2 erroneous decisions on pilot symbols is also used. Simulations include examples with equal transmitted energies for all users, and examples with a near-far problem (about one third of the users have 10 dB larger transmitted energy per symbol than the other, desired users). The edge effect on the decorrelator due to a finite processing window (Chapter 3) is neglected in the analysis. However, the simulations include the edge effect, which was observed to be of minor importance.
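The quoted normalized Doppler frequency can be checked with a few lines of Python. The sketch uses the standard relation $f_d = v f_c / c$ for the maximum Doppler shift and $T$ equal to the inverse of the symbol rate; the function name `normalized_doppler` is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def normalized_doppler(speed_kmh, carrier_hz, symbol_rate_hz):
    """Return f_d * T for a given vehicle speed, carrier and symbol rate."""
    speed_ms = speed_kmh / 3.6
    max_doppler_hz = speed_ms * carrier_hz / SPEED_OF_LIGHT
    return max_doppler_hz / symbol_rate_hz


fd_t_fast = normalized_doppler(86.0, 1.8e9, 16e3)  # main examples
fd_t_slow = normalized_doppler(43.0, 1.8e9, 16e3)  # some examples in 4.3.1
print(round(fd_t_fast, 4), round(fd_t_slow, 4))
```

The 86 km/h case rounds to the value 0.009 given in the text, and halving the speed halves the normalized Doppler frequency.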
4.3.1. Performance of linear receivers
4.3.1.1. MSE of DA channel estimation
The channel estimation performance assuming correct decisions on the data is analyzed to find the expression for the mean squared error of the channel estimators. The optimal joint estimation of all users' channels as in (4.10) is compared to the decoupled channel estimation as described in Section 4.2.1. For simplicity and to enable a fair comparison, a DA channel estimation with a data symbol interval $J = 1$ is applied in the examples. The data packet size is assumed to equal the processing window length, i.e., $N_b = N$. The decorrelator is applied as the interference suppression scheme in the decoupled channel estimation. Therefore, the input to the joint LMMSE estimator is $\mathbf{y} \in \mathbb{C}^{NKL}$, and the inputs to the decoupled channel estimators are
\[ \mathbf{q}^{(n)}_{k,l} = \left( y^{(0)}_{[d]k,l},\, y^{(1)}_{[d]k,l},\, \ldots,\, y^{(N)}_{[d]k,l} \right)^{\top} \in \mathbb{C}^{N}, \quad \forall\, k, l. \]
[Figure 4.5: "Channel autocorrelation function". Horizontal axis: delay [symbol intervals], 0 to 100; vertical axis: normalized correlation, −0.5 to 1.]
Fig. 4.5. Channel autocorrelation function.
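The performance of the optimal joint LMMSE channel estimator of (4.10) is described by the error covariance matrix [88, p. 391]
\[ \boldsymbol{\Sigma}_{\mathbf{h}-\hat{\mathbf{h}}_{[MMSE]}|\mathbf{b}} = \left( \boldsymbol{\Sigma}_{\mathbf{h}|\mathbf{b}}^{-1} + \sigma^{-2}\mathbf{R} \right)^{-1}, \tag{4.19} \]
where each diagonal element of $\boldsymbol{\Sigma}_{\mathbf{h}-\hat{\mathbf{h}}_{[MMSE]}|\mathbf{b}}$ is equal to the mean squared error of the optimal LMMSE estimator for that particular channel coefficient. The MSE of the decoupled channel estimator can be expressed in the form [88, p. 388]
\[ MSE_{[d]k,l} = \sigma^2_{h^{(n)}_{k,l}} - \boldsymbol{\Sigma}^{H}_{\mathbf{q}^{(n)}_{k,l},\,h^{(n)}_{k,l}} \boldsymbol{\Sigma}^{-1}_{\mathbf{q}^{(n)}_{k,l}} \boldsymbol{\Sigma}_{\mathbf{q}^{(n)}_{k,l},\,h^{(n)}_{k,l}}, \tag{4.20} \]
where $\sigma^2_{h^{(n)}_{k,l}} = E\left(|h^{(n)}_{k,l}|^2\right)$.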
The optimal joint LMMSE channel estimator (4.10) utilizes the information embedded in the dependence of the channel noise components of the MF outputs of different users and multipath components, whereas the decoupled channel estimator neglects that information. If the optimal Wiener filter of (4.12) is applied at the decoupled channel estimator, the information of the fading process is utilized as efficiently as in the joint LMMSE estimator. Therefore, it can be conjectured that in several cases, the performance difference between the joint LMMSE and decoupled estimators is minor. This hypothesis is tested by evaluating the normalized MSEs of the joint LMMSE and decoupled channel estimators given in (4.19) and (4.20), respectively. The normalized MSE of the estimate $\hat{c}$ of some parameter $c$ is
\[ MSE = \frac{E\left(|c - \hat{c}|^2\right)}{E\left(|c|^2\right)}. \tag{4.21} \]
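To make the comparison between (4.19) and (4.20) concrete, the following minimal sketch evaluates both for a toy case that is not one of the thesis examples: two synchronous users, one path, a single observed symbol, unit-power channel coefficients (so the prior covariance is the identity and the normalization of (4.21) is already in place), and a real crosscorrelation `rho` between the signature waveforms. It assumes the matched filter output model $\mathbf{y} = \mathbf{R}\mathbf{h} + \mathbf{n}$ with noise covariance $\sigma^2\mathbf{R}$; all function names are illustrative.

```python
# Toy comparison of joint LMMSE (4.19) and decoupled (4.20) channel estimation.
# Assumptions (not from the thesis): K = 2 synchronous users, L = 1, N = 1,
# identity prior covariance, R = [[1, rho], [rho, 1]], noise covariance sigma2 * R.


def inv2x2(m):
    """Inverse of a real 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def joint_lmmse_mse(rho: float, sigma2: float) -> float:
    """Diagonal element of (I + R / sigma2)^-1, i.e. (4.19) with identity prior."""
    m = [[1.0 + 1.0 / sigma2, rho / sigma2],
         [rho / sigma2, 1.0 + 1.0 / sigma2]]
    return inv2x2(m)[0][0]


def decoupled_mse(rho: float, sigma2: float) -> float:
    """(4.20) for a scalar observation q = h + w taken at the decorrelator output."""
    r_inv = inv2x2([[1.0, rho], [rho, 1.0]])
    noise_var = sigma2 * r_inv[0][0]   # enhanced noise variance after decorrelation
    var_h = 1.0                        # E|h|^2
    cov_qh = var_h                     # E(q h*), noise is independent of h
    var_q = var_h + noise_var          # E|q|^2
    return var_h - cov_qh * cov_qh / var_q


if __name__ == "__main__":
    for snr_db in (0, 10, 20):
        sigma2 = 10 ** (-snr_db / 10)
        print(snr_db, joint_lmmse_mse(0.5, sigma2), decoupled_mse(0.5, sigma2))
```

With `rho = 0.5` and an SNR of 10 dB the two values are about 0.115 and 0.118, so the decoupled estimator loses little in this toy setting; with `rho = 0` the two expressions coincide.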
The results are depicted in Figs. 4.6 and 4.7 for one- and two-path channels, respectively. Two vehicle speeds (43 and 86 km/h) and processing window sizes ($N = N_b = 7$ and $N = N_b = 21$) are considered in the examples.

The channel estimation performance is worse in a two-path channel than in a one-path channel. There are two reasons for that. Firstly, an increase in the channel load $KL$ causes more linear dependence between the signals. Therefore, there is more noise enhancement in the decorrelating and joint LMMSE receivers. Secondly, in the examples the received power of a particular user is divided into two components, which both must be estimated independently, in a two-path channel. In a one-path channel, on the other hand, all the power can be utilized in the estimation of the single propagation path. In that sense, the normalization (2.41) is somewhat misleading, and the curves of Fig. 4.7 could be shifted 3 dB to the left.

It can be seen from the figures that the performance advantage of the joint LMMSE channel estimator over the decoupled channel estimator is indeed minor in most cases as predicted above. In the two-path channel with a small observation window ($N = 7$) the difference is the largest, roughly 1 dB. It can also be seen that the performance difference due to vehicle speed is very small, as long as optimal channel estimation filters are applied. Thus, it can be concluded that the decorrelating receiver is capable of providing near-optimal channel estimation performance with significantly simpler implementation than the joint LMMSE estimator. If decision-directed channel estimation is applied, the result may be even more favorable for the channel estimator using the decorrelator, since it is insensitive to the decisions of the other users, whereas the joint LMMSE estimator is not.
[Figure 4.6, panels (a) and (b): "Channel estimation performance of DA joint and decoupled estimators". Horizontal axis: average signal-to-noise ratio [dB], 0 to 25; vertical axis: mean squared error, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), joint LMMSE (dotted), single-user bound; markers distinguish vehicle speeds of 86 km/h and 43 km/h. $K = 20$; $L = 1$.]
Fig. 4.6. Mean squared errors of DA joint LMMSE and decoupled decorrelated channel estimators in a flat fading channel ($L = 1$) for two vehicle speeds; observation window is (a) $N = 7$, (b) $N = 21$.

[Figure 4.7, panels (a) and (b): "Channel estimation performance of DA joint and decoupled estimators". Horizontal axis: average signal-to-noise ratio [dB], 0 to 25; vertical axis: mean squared error, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), joint LMMSE (dotted), single-user bound; markers distinguish vehicle speeds of 86 km/h and 43 km/h. $K = 20$; $L = 2$.]
Fig. 4.7. Mean squared errors of DA joint LMMSE and decoupled decorrelated channel estimators in a frequency-selective fading channel ($L = 2$) for two vehicle speeds; observation window is (a) $N = 7$, (b) $N = 21$.

4.3.1.2. BEP of DA decorrelating receiver
The performance of the data-aided decorrelating receiver is analyzed to obtain the expression for the average bit error probability. The analysis has similarities to [56, 123], where the channel was assumed to be known, or to [127], where error-free DD channel estimation⁴ was considered. Here, DA detection with a pilot symbol distance larger than one ($N_p > 1$) is assumed. The decision variable of the decorrelating receiver for user $k$ after maximal ratio combining can be expressed in the form
\[ y^{(n)}_{[d,MRC]k} = \tilde{\mathbf{c}}^{(n)H}_{k}\, \mathbf{y}^{(n)}_{[d]k}, \tag{4.22} \]
where
\[ \tilde{\mathbf{c}}^{(n)}_{k} = \left( \tilde{c}^{(n)}_{k,1},\, \tilde{c}^{(n)}_{k,2},\, \ldots,\, \tilde{c}^{(n)}_{k,L} \right)^{\top} \in \mathbb{C}^{L} \tag{4.23} \]
is the combining vector, and
\[ \mathbf{y}^{(n)}_{[d]k} = \left( y^{(n)}_{[d]k,1},\, y^{(n)}_{[d]k,2},\, \ldots,\, y^{(n)}_{[d]k,L} \right)^{\top} \in \mathbb{C}^{L} \tag{4.24} \]
includes the decorrelator outputs for user $k$. The optimal choice for $\tilde{\mathbf{c}}^{(n)}_{k}$, given the complex channel coefficient estimate $\hat{\mathbf{c}}^{(n)}_{k}$, is $\tilde{\mathbf{c}}^{(n)}_{k} = \mathbf{D}^{-1}_{[d]k,k}(0)\, \hat{\mathbf{c}}^{(n)}_{k}$ (Section 3.3.2). Let
\[ \mathbf{Q} = \frac{1}{2} \begin{pmatrix} \mathbf{0}_L & \mathbf{I}_L \\ \mathbf{I}_L & \mathbf{0}_L \end{pmatrix} \in \left\{ 0, \tfrac{1}{2} \right\}^{2L \times 2L}, \tag{4.25} \]
and
\[ \boldsymbol{\nu} = \left( \tilde{\mathbf{c}}^{(n)\top}_{k},\, \mathbf{y}^{(n)\top}_{[d]k} \right)^{\top} \in \mathbb{C}^{2L}. \tag{4.26} \]
By rewriting (4.22) the decision variable $y^{(n)}_{[d,MRC]k}$ can be expressed in the form
\[ y^{(n)}_{[d,MRC]k} = \boldsymbol{\nu}^{H} \mathbf{Q} \boldsymbol{\nu}. \tag{4.27} \]
The decorrelator output vector $\mathbf{y}^{(n)}_{[d]k}$ conditioned on the data symbol $b^{(n)}_{k}$ is a complex Gaussian random vector. Assuming that the weight vector $\tilde{\mathbf{c}}^{(n)}_{k}$ is also Gaussian⁵, the probability of bit error for user $k$ can be expressed in the case of BPSK modulation as [287]
\[ P_k = \sum_{\substack{i=1 \\ \lambda_i < 0}}^{2L} \; \prod_{\substack{j=1 \\ j \neq i}}^{2L} \frac{1}{1 - \frac{\lambda_j}{\lambda_i}}, \tag{4.28} \]
where $\lambda_i$, $i = 1, 2, \ldots, 2L$, are the eigenvalues of the matrix $\boldsymbol{\Sigma}_{\nu}\mathbf{Q}$, and
\[ \boldsymbol{\Sigma}_{\nu} = \begin{pmatrix} \boldsymbol{\Sigma}_{\tilde{\mathbf{c}}^{(n)}_{k}} & \boldsymbol{\Sigma}_{\tilde{\mathbf{c}}^{(n)}_{k},\,\mathbf{y}^{(n)}_{[d]k}} \\ \boldsymbol{\Sigma}^{H}_{\tilde{\mathbf{c}}^{(n)}_{k},\,\mathbf{y}^{(n)}_{[d]k}} & \boldsymbol{\Sigma}_{\mathbf{y}^{(n)}_{[d]k}} \end{pmatrix} \tag{4.29} \]
is the covariance matrix of the vector $\boldsymbol{\nu}$. The covariance matrix (4.29) depends on the channel estimation filter. Although the error probability expression in (4.28) is not very intuitive, it is extremely useful in computing numerical examples. The reason is that it can express the probability of bit error for any channel estimation filter.

⁴ Error-free DD channel estimation actually means DA channel estimation with pilot symbol distance one ($N_p = 1$).
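As an illustration of how (4.28) is used numerically, the sketch below evaluates it for the simplest case and compares the result with the well-known closed form for coherent BPSK in flat Rayleigh fading. It is a minimal sketch under assumptions that are ours, not the thesis setup: $L = 1$, a perfectly known channel (so the weight equals the channel coefficient), unit channel power, transmitted symbol $+1$, and a scalar decorrelator output $y = h + n$. The $\lambda_i$ are taken as the eigenvalues of the product $\boldsymbol{\Sigma}_{\nu}\mathbf{Q}$, which is what the standard result for Hermitian quadratic forms in complex Gaussian vectors requires (a covariance matrix alone has no negative eigenvalues). All function names are illustrative.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Real parts of the eigenvalues of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return [((tr + root) / 2.0).real, ((tr - root) / 2.0).real]


def bep_from_eigenvalues(lams):
    """Eq. (4.28): sum over negative eigenvalues of prod_{j != i} 1 / (1 - lam_j / lam_i)."""
    total = 0.0
    for i, li in enumerate(lams):
        if li < 0.0:
            term = 1.0
            for j, lj in enumerate(lams):
                if j != i:
                    term /= 1.0 - lj / li
            total += term
    return total


def bep_flat_fading(snr_db, var_c=1.0, cov_cy=1.0):
    """BEP for L = 1; defaults model a perfectly known unit-power channel."""
    sigma2 = 10.0 ** (-snr_db / 10.0)             # noise variance for E|h|^2 = 1
    s11, s12 = var_c, cov_cy                       # entries of Sigma_nu, nu = (c, y)^T
    s21, s22 = cov_cy.conjugate(), 1.0 + sigma2
    sigma_q = ((s12 / 2.0, s11 / 2.0),             # Sigma_nu * Q with Q = 1/2 [[0, 1], [1, 0]]
               (s22 / 2.0, s21 / 2.0))
    return bep_from_eigenvalues(eigenvalues_2x2(sigma_q))


def bep_rayleigh_bpsk(snr_db):
    """Closed-form reference: 0.5 * (1 - sqrt(g / (1 + g))), g = average SNR."""
    g = 10.0 ** (snr_db / 10.0)
    return 0.5 * (1.0 - math.sqrt(g / (1.0 + g)))


if __name__ == "__main__":
    for snr_db in (0, 10, 20):
        print(snr_db, bep_flat_fading(snr_db), bep_rayleigh_bpsk(snr_db))
```

The two columns agree in this special case. An imperfect channel estimate would enter only through `var_c` and `cov_cy`, which is the sense in which the expression covers any linear channel estimation filter.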
⁵ The Gaussian assumption holds for any DA linear channel estimation filter.

The probability of error is computed for several values of the pilot symbol distance $N_p$ with interpolation, as described in Section 4.2.1. The length of the smoother satisfies $P_{pr} = P_{sm} = 3$, i.e., in total six consecutive pilot symbols are used to estimate the complex channel coefficients. Both optimal and suboptimal smoothers are applied. The results with the optimal channel estimation filter are presented in Fig. 4.8, and with the suboptimal channel estimation filter in Fig. 4.9.

The results show that with the optimal channel estimation filters the bit error probability performance is excellent even at large SNRs. A distance of $N_p < 30$ yields performance that is free of error probability saturation. A distance of $N_p > 45$ is so large that the error probability starts to saturate at high SNRs (SNR > 30 dB). From low to moderate SNRs the performance loss due to increased pilot symbol distance is relatively low. For example, at an error probability of $10^{-2}$ the differences in the required SNR between $N_p = 5$ and $N_p = 50$ are about 4.5 dB and 3.5 dB for one- and two-path channels, respectively. If suboptimal channel estimation filters are applied, the error floor is a significantly more severe problem. Even with a pilot symbol distance of $N_p = 5$, the bit error probability saturates at high SNRs. From low to moderate SNRs the performance loss due to increased pilot symbol distance is also larger than with optimal channel estimation filters. For example, at BER = $10^{-2}$ the differences in the required SNR between $N_p = 5$ and $N_p = 30$ are $\infty$ dB (the one-path error floor for $N_p = 30$ lies above $10^{-2}$) and about 5 dB for one- and two-path channels, respectively.

[Figure 4.8, panels (a) and (b): "Performance for varying pilot interval − optimal channel estimators". Horizontal axis: signal-to-noise ratio [dB], 0 to 40; vertical axis: bit error probability, $10^{-7}$ to $10^{0}$ (logarithmic). Curves: DA decorrelator (dotted), ideal decorrelator (dash-dotted), single-user bound; markers for $N_p = 50, 45, 30, 5$. $K = 20$; (a) $L = 1$, (b) $L = 2$.]
Fig. 4.8. Bit error probabilities of DA decorrelating receiver for different pilot symbol distances with optimal channel estimation filters; (a) $L = 1$, (b) $L = 2$.

[Figure 4.9, panels (a) and (b): "Performance for varying pilot interval − suboptimal channel estimators". Horizontal axis: signal-to-noise ratio [dB], 0 to 40; vertical axis: bit error probability, $10^{-7}$ to $10^{0}$ (logarithmic). Curves: DA decorrelator (dotted), ideal decorrelator (dash-dotted), single-user bound; markers for $N_p = 30, 10, 5, 2$. $K = 20$; (a) $L = 1$, (b) $L = 2$.]
Fig. 4.9. Bit error probabilities of DA decorrelating receiver for different pilot symbol distances with suboptimal channel estimation filters; (a) $L = 1$, (b) $L = 2$.
It can be concluded that superior performance can be obtained even with large pilot symbol distances if optimal channel estimation filters can be applied. The fixed, suboptimal channel estimation filters cause a severe performance loss at high SNRs. However, at low SNRs fairly good performance can be obtained even with a suboptimal channel estimation filter if the pilot symbol distance is small enough (roughly $N_p \le 10$ in the examples).
4.3.1.3. Channel capacity of DA decorrelating receiver
The above analysis provided results on the bit error probability of the DA decorrelating receiver for different pilot symbol distances. The use of pilot symbols improves the bit error rate performance, but reduces the effective data rate. More specifically, if every $N_p$th symbol is a pilot symbol, the effective data rate is the nominal data rate multiplied by the factor $\frac{N_p - 1}{N_p}$. Thus, the challenge is to select the pilot symbol distance so that the overall channel capacity is maximized. This problem is not easy to solve in practical examples, since the solution depends on the channel encoding and decoding schemes applied. Therefore, the pilot symbol distance should be jointly optimized with the complete signal design.
To give some insight into the pilot symbol distance optimization problem, the fundamental limit provided by the information theoretic Shannon's channel capacity is studied. For simplicity, it is assumed that the information bit stream is encoded, and transmitted through the fading multiple-access channel. It is assumed that the receiver consists of a decorrelating multiuser receiver and DA channel estimation, after which hard-decision decoding is performed. Then the bit error probability analysis above applies to these hard decisions. The complete communications system can then be modeled as a binary symmetric channel [22, pp. 186–187] from a single-user point of view. In other words, from the coding point of view the channel of user $k$ is a binary symmetric channel with error probability $P_k$ given in (4.28). Thus, the Shannon capacity of the $k$th user is [22, pp. 14, 187], [23, p. 381]
\[ CAP_k = \frac{N_p - 1}{N_p} \left[ 1 + P_k \log_2(P_k) + (1 - P_k) \log_2(1 - P_k) \right]. \tag{4.30} \]
The Shannon capacity is the data rate at which a user can obtain an asymptotically error-free transmission as the code block length approaches infinity, by applying the very best encoding scheme that can exist with optimal decoding. The results provided by the Shannon capacity analysis are optimistic in the sense that the very best encoding scheme cannot be applied in practice, since the scheme is allowed to be arbitrarily complicated and there is no design rule to find that scheme. On the other hand, the capacity results are pessimistic in the sense that by applying soft-decision decoding the performance can be improved.
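The capacity expression (4.30) is straightforward to evaluate once a bit error probability is available. The following minimal sketch writes the bracketed term as one minus the binary entropy function; the function names are illustrative, and the $(N_p, P_k)$ pairs in the demonstration loop are hypothetical placeholders, not values read from the figures.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with the convention 0 * log2(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def shannon_capacity(p_k: float, n_p: int) -> float:
    """Eq. (4.30): BSC capacity 1 - H(P_k), scaled by the pilot overhead (N_p - 1) / N_p."""
    if n_p < 2:
        raise ValueError("N_p must be at least 2 so that some symbols carry data")
    return (n_p - 1) / n_p * (1.0 - binary_entropy(p_k))


if __name__ == "__main__":
    # Hypothetical (N_p, P_k) pairs for illustration only.
    for n_p, p_k in ((5, 1e-3), (10, 2e-3), (30, 1e-2)):
        print(n_p, p_k, round(shannon_capacity(p_k, n_p), 4))
```

The trade-off is visible in the two factors: a larger $N_p$ raises the overhead factor towards one, while the accompanying increase in $P_k$ lowers the binary symmetric channel term.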
The Shannon capacity (4.30) is evaluated for the bit error probability results presented in Figs. 4.8–4.9. The results are presented in Figs. 4.10 and 4.11. From the capacity results the optimal pilot symbol distances yielding the maximal Shannon capacity were determined. The optimal pilot symbol distances are illustrated versus SNR in Fig. 4.12.

The results demonstrate that the optimal pilot symbol distance depends strongly on the SNR. At low SNRs the optimal pilot symbol distance is rather low, whereas at high SNRs very large pilot symbol distances can be tolerated for maximal channel capacity if optimal channel estimation filters can be applied. The use of suboptimal channel estimation filters degrades the capacity, especially at high SNRs. At low SNRs the differences in the capacity are significantly smaller. From Fig. 4.12, it can be seen that at SNRs of 16–20 dB and higher, the optimal pilot symbol distance depends heavily on the channel estimation.

It can be concluded based both on the bit error probability examples (Figs. 4.8–4.9) and on the capacity examples (Figs. 4.10–4.11) that the choice of the channel estimation filters is crucial in data transmission with a very low bit error rate requirement. Thus, in data transmission systems there is a clear need for optimal or near-optimal channel estimation filters. In speech transmission, on the other hand, the fixed channel estimation filters can provide a satisfactory performance if the system requirements can tolerate a moderate pilot symbol distance ($N_p \approx 10$).
[Figure 4.10, panels (a) and (b): "Capacity for varying SNR − optimal channel estimators". Horizontal axis: pilot symbol distance, 2 to 50; vertical axis: Shannon capacity, 0.01 to 1 (logarithmic).]
Fig. 4.10. Channel capacities of DA decorrelating receiver for different signal-to-noise ratios (SNR = 0, 4, 8, 12, 40 dB from bottom to top) with optimal channel estimation filters; (a) $L = 1$, (b) $L = 2$.

[Figure 4.11, panels (a) and (b): "Capacity for varying SNR − suboptimal channel estimators". Horizontal axis: pilot symbol distance, 2 to 30; vertical axis: Shannon capacity, 0.01 to 1 (logarithmic).]
Fig. 4.11. Channel capacities of DA decorrelating receiver for different signal-to-noise ratios (SNR = 0, 4, 8, 12, 40 dB from bottom to top) with suboptimal channel estimation filters; (a) $L = 1$, (b) $L = 2$.

[Figure 4.12: "Optimal pilot symbol distances". Horizontal axis: signal-to-noise ratio [dB], 0 to 40; vertical axis: optimal pilot symbol distance, 0 to 50. Curves: $L = 1$ (dotted), $L = 2$ (dashed); markers distinguish optimal and suboptimal channel estimation. $K = 20$.]
Fig. 4.12. Optimal pilot symbol distances in Shannon capacity sense.

4.3.1.4. BER of DA and DD decorrelating receivers
The bit error probability of the DA decorrelating receiver is compared to the bit error rate of the DD decorrelating receiver. Inspired by the results in Section 4.3.1.3, pilot symbol distance $N_p = 10$ is used. The results of the analysis (DA decorrelator) and Monte Carlo computer simulations (DD decorrelator) are presented in Fig. 4.13.

It can be seen from Fig. 4.13 that the DA decorrelating receiver outperforms the DD decorrelating receiver by approximately 1 dB with optimal channel estimation filters both in the cases $L = 1$ and $L = 2$. Fig. 4.13(a) ($L = 1$) shows that the DA decorrelating receiver outperforms the DD decorrelating receiver by approximately 2.5 dB at BER = $2 \times 10^{-2}$ with suboptimal channel estimation filters, and the performance difference increases with increasing SNR. Fig. 4.13(b) ($L = 2$) shows that in the two-path channel the performance difference between the DA and DD decorrelating receivers is significantly smaller than in the one-path channel. This is understandable, since the decisions are more reliable due to diversity, and the DD decorrelating receiver can also yield superior performance. At high SNRs, however, the BER of the DD decorrelating receiver saturates if suboptimal channel estimation filters are applied. It can be concluded that both the DA and DD decorrelating receivers provide relatively good performance if optimal channel estimation filters are applied. Furthermore, the DA decorrelating receiver is more robust to the channel estimation filter mismatch than the DD decorrelating receiver. In a two-path channel, where the decisions are rather reliable, the DD decorrelating receiver gives satisfactory performance also with suboptimal channel estimation filters.
[Figure 4.13, panels (a) and (b): "Performance of DA and DD decorrelators". Horizontal axis: signal-to-noise ratio [dB], 0 to 20; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: DD decorrelator (dashed), DA decorrelator (dotted), ideal decorrelator (dash-dotted), single-user bound; markers distinguish optimal and suboptimal channel estimation. $K = 20$; (a) $L = 1$, (b) $L = 2$.]
Fig. 4.13. Bit error rates of DD and DA decorrelating receivers for $N_p = 10$; (a) $L = 1$, (b) $L = 2$.

4.3.2. Performance comparisons of decorrelating and PIC receivers
4.3.2.1. Sensitivity of BER to channel estimation errors
The sensitivity of the bit error rate to channel estimation errors is studied without simulating the channel estimation process explicitly. The detection with the decorrelating and the PIC receivers is simulated assuming that the channel estimates $\hat{c}^{(n)}_{k,l}$ with mean squared error $MSE$ are given. The estimates are generated in the simulations assuming a decomposition $\hat{c}^{(n)}_{k,l} = c^{(n)}_{k,l} + \Delta c^{(n)}_{k,l}$, where $\Delta c^{(n)}_{k,l}$ is the channel estimation error. It is further decomposed in the form $\Delta c^{(n)}_{k,l} = \Delta c^{(n)}_{k,l}(\mathrm{lag}) + \Delta c^{(n)}_{k,l}(\mathrm{noise})$, where $\Delta c^{(n)}_{k,l}(\mathrm{lag})$ is the lag error due to channel variations and suboptimal channel estimation [23], and $\Delta c^{(n)}_{k,l}(\mathrm{noise})$ is the error due to the additive white Gaussian noise. In the examples, the absolute value of the lag error is assumed to be constant for one signal-to-noise ratio value, and the error due to the AWGN is assumed to be a complex Gaussian random variable with zero mean and variance $\sigma^2_{k,l}(\mathrm{noise})$, which is the variance of the AWGN at the output of the optimal channel predictor of length 10 ($P_{pr} = 10$, $J = 1$). The errors $\Delta c^{(n)}_{k,l}(\mathrm{noise})$ and $\Delta c^{(n')}_{k',l'}(\mathrm{noise})$ are assumed to be independent if $k \neq k'$ or $l \neq l'$ or $n \neq n'$. The absolute value of the lag error term is
\[ \left| \Delta c^{(n)}_{k,l}(\mathrm{lag}) \right| = \sqrt{MSE - \sigma^2_{k,l}(\mathrm{noise})}, \tag{4.31} \]
and its phase is assumed to be uniformly distributed over $[0, 2\pi)$.

An exactly known channel ($\Delta c^{(n)}_{k,l} = 0$ or $MSE = 0$), and three positive MSE levels ($MSE = MSE_{min}$, $MSE = 1.5\,MSE_{min}$, $MSE = 2\,MSE_{min}$), where $MSE_{min}$ is the mean squared error of the form (4.20) with the optimal predictor ($P_{pr} = 10$, $J = 1$) and no decision errors, are considered. The results are shown in Figs. 4.14 and 4.15 for one- and two-path channels, respectively. Both the cases of equal received energies and a near-far problem are considered.
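The error model around (4.31) can be sketched as a small generator of synthetic channel estimates. This is a minimal sketch under stated assumptions: the lag phase is drawn independently for every sample (the text only says it is uniformly distributed), the numerical values of `mse` and `noise_var` in the demonstration are hypothetical, and all function names are illustrative.

```python
import cmath
import math
import random


def lag_error_magnitude(mse: float, noise_var: float) -> float:
    """Eq. (4.31): |lag error| = sqrt(MSE - noise variance)."""
    if mse < noise_var:
        raise ValueError("MSE must be at least the noise variance")
    return math.sqrt(mse - noise_var)


def draw_estimation_error(mse: float, noise_var: float, rng: random.Random) -> complex:
    """One error sample: constant-magnitude lag term plus complex Gaussian noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)              # uniform lag phase
    lag = cmath.rect(lag_error_magnitude(mse, noise_var), phase)
    std = math.sqrt(noise_var / 2.0)                     # per real dimension
    noise = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
    return lag + noise


def noisy_estimate(c: complex, mse: float, noise_var: float, rng: random.Random) -> complex:
    """Synthetic channel estimate: true coefficient plus estimation error."""
    return c + draw_estimation_error(mse, noise_var, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    mse, noise_var = 0.02, 0.01                          # hypothetical values
    errs = [draw_estimation_error(mse, noise_var, rng) for _ in range(100_000)]
    print(sum(abs(e) ** 2 for e in errs) / len(errs))    # close to mse
```

Because the lag and noise terms are uncorrelated, the empirical mean of the squared error magnitude approaches the prescribed `mse`, which is what makes the construction convenient for sweeping MSE levels.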
It can be seen from Figs. 4.14 and 4.15 that in a perfectly known channel the PIC receiver clearly outperforms the decorrelating receiver, especially with diversity (Fig. 4.15). This is intuitive, since increasing the channel load $KL$ increases the noise enhancement in the decorrelator. Furthermore, the diversity offered by the two-path channel makes the decisions and MAI estimate in the PIC receiver more reliable. However, the PIC receiver is more sensitive to channel estimation errors than the decorrelating receiver, which is also understandable. The decorrelator completely decouples the reception of different users, whereas in the PIC receiver the channel estimation errors propagate to MAI estimates and degrade the performance for all users. In the single-path channel, the performance of the PIC and decorrelating receivers is nearly the same if the channel estimation error is large. In the two-path channel case, the PIC receiver outperforms the decorrelating receiver in the presence of channel estimation errors. An exception is the presence of a near-far problem and large channel estimation errors if the system operates at high SNRs (see Fig. 4.15(b)). It can be concluded that the PIC receiver often yields better performance than the decorrelating receiver. However, the decorrelating receiver is more robust to the channel estimation errors than the PIC receiver.
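The different ways in which channel estimation errors enter the two receivers can be seen in a noiseless toy example. The sketch below is a strongly simplified illustration under assumptions that are ours: two synchronous users, one path, a real crosscorrelation `rho`, and a single hard-decision PIC stage with matched filter tentative decisions. It is not the receiver structure simulated in the thesis, and all names are illustrative.

```python
# Noiseless toy example: decorrelator vs. one hard-decision PIC stage (K = 2, L = 1).


def mf_output(rho, c, b):
    """Matched filter outputs y = R (c * b) with R = [[1, rho], [rho, 1]]."""
    h = [c[0] * b[0], c[1] * b[1]]
    return [h[0] + rho * h[1], rho * h[0] + h[1]]


def decorrelate(rho, y):
    """z = R^-1 y; the interference removal uses no channel estimate."""
    det = 1.0 - rho * rho
    return [(y[0] - rho * y[1]) / det, (y[1] - rho * y[0]) / det]


def sign(x):
    return 1 if x >= 0 else -1


def pic_stage(rho, y, c_hat):
    """Subtract the MAI rebuilt from channel estimates and tentative decisions."""
    b_tent = [sign((c_hat[k].conjugate() * y[k]).real) for k in range(2)]
    return [y[0] - rho * c_hat[1] * b_tent[1],
            y[1] - rho * c_hat[0] * b_tent[0]]


if __name__ == "__main__":
    rho, c, b = 0.3, [1 + 0j, 0.8j], [1, -1]
    y = mf_output(rho, c, b)
    print("decorrelator:", decorrelate(rho, y))
    print("PIC, exact estimates:", pic_stage(rho, y, c))
    print("PIC, erroneous estimates:", pic_stage(rho, y, [1.1 + 0.1j, 0.7j]))
```

With exact estimates both receivers return the interference-free terms $c_k b_k$. With erroneous estimates the decorrelator output is unchanged, while the PIC output carries residual interference proportional to `rho` times the estimation error of the other user.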
[Figure 4.14, panels (a) and (b): "Sensitivity of decorrelator and PIC to channel estimation errors". Horizontal axis: average signal-to-noise ratio [dB], 0 to 16; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), PIC (dotted), single-user bound. $K = 20$; $L = 1$.]
Fig. 4.14. Sensitivity of BER to channel estimation errors in a flat fading channel ($L = 1$) for different MSE levels ($MSE = (0, 1, 1.5, 2) \times MSE_{min}$ from bottom to top); (a) equal received energies, (b) near-far problem.

4.3.2.2. BER in optimally estimated channel
The decision-directed channel estimator structure is studied. Optimal channel estimation filters are applied. The simulations are performed for different numbers of active users ($K = 8, 20, 32$). The BER results are shown in Figs. 4.16 and 4.17 for one- and two-path channels, respectively. Both the cases of equal received energies and a near-far problem are considered.
It can be seen from Fig. 4.16 that the PIC and the decorrelating receivers have nearly the same performance in a one-path channel. When comparing to Fig. 4.14, it is observed that the complex channel coefficient estimation is more challenging for the PIC receiver than for the decorrelating receiver. Fig. 4.17 demonstrates that the PIC receiver outperforms the decorrelating receiver in a two-path channel. This is understandable due to the increased noise enhancement in the decorrelating receiver caused by the larger channel load $KL$. However, in a heavily loaded CDMA system with $K = 32$ under a near-far problem ($K = 32$ in Figs. 4.16(b) and 4.17(b)), the BER of the PIC receiver saturates at high SNRs due to decision errors degrading the MAI estimates. The decorrelating receiver does not suffer from BER saturation. A similar phenomenon has been observed in an AWGN channel even with lower channel loads [248]. It can be concluded that, in general, the PIC receiver slightly outperforms the decorrelating receiver if optimal channel estimation filters are applied. However, at high SNRs and channel loads the PIC receiver suffers from BER saturation, whereas the decorrelating receiver does not.
4.3.2.3. BER in suboptimally estimated channel
The decision-directed channel estimators with suboptimal channel estimation filters are applied. The simulations are performed for $K = 20$ active users only to simplify simulations. The BER results are shown in Figs. 4.18 and 4.19 for one- and two-path channels, respectively. Both the cases of equal received energies and a near-far problem are considered.

It can be seen from Fig. 4.18 that the PIC and the decorrelating receiver have nearly the same performance in a one-path channel also with suboptimal channel estimation filters. Fig. 4.19 demonstrates that the PIC receiver outperforms the decorrelating receiver in two-path channels with suboptimal channel estimation filters at relatively low SNRs. At high SNRs the performance loss due to suboptimal channel estimation is more severe for the PIC receiver than for the decorrelating receiver, as expected. The problem is of course even more severe under a near-far problem (Figs. 4.18(b) and 4.19(b)). It can be concluded that at high SNRs the PIC receiver suffers from the BER saturation, whereas the decorrelating receiver does not. Therefore, the PIC receiver has the potential to benefit more from adaptive channel estimation filters approximating the optimal channel estimation filters than the decorrelating receiver.
4.4. Conclusions
Multiuser demodulation in relatively fast Rayleigh fading channels has been studied in this chapter. The optimal maximum likelihood sequence detector was derived. It estimates the received noiseless signal and correlates the received signal with the estimate. The estimation-correlation must be performed for all possible received data sequences. Due to the prohibitive complexity of the optimal receiver, suboptimal demodulators were considered. They decouple the data detection and complex channel coefficient estimation from each other, and estimate the channel coefficients and detect the data for all users separately. Both data-aided and decision-directed complex channel coefficient estimation with optimal and suboptimal channel estimation filters were considered. The performance of the decorrelating and parallel interference cancellation receivers was compared.

The mean squared error of DA linear channel estimators was analyzed. It was shown that the decoupled complex channel coefficient estimation in the decorrelating receiver can achieve performance that is very close to that of the joint LMMSE estimator. The bit error probability and the channel capacity of the DA decorrelating receiver were analyzed. It was shown that very large pilot symbol distances can be tolerated with optimal channel estimation filters, whereas suboptimal channel estimation requires a significantly denser pilot symbol insertion. Based on the results of the chapter, it can be concluded that adaptive channel estimation filters [103, 288, 57] capable of approximating the optimal channel estimation filters are crucial in data transmission, where very low BER is required. In speech transmission, fixed channel estimation filters give satisfactory performance in most cases. However, the PIC receivers are rather sensitive to channel estimation errors, and they may need adaptive channel estimation filters also with a relatively high BER requirement.

The DA complex channel coefficient estimation is more robust than DD complex channel coefficient estimation, which may suffer from BER saturation caused by hang-ups at high SNRs. The DA channel estimation causes a longer decision delay than DD channel estimation. The DA channel estimation needs different channel estimation filters for different symbols in the data frame, whereas one filter (two filters in two-stage DD channel estimation) is enough for DD channel estimation. The adaptation of several channel estimation filters is more difficult than that of a single filter. On the other hand, DA channel estimation is less sensitive to channel estimation filter mismatch, and fixed filters may yield satisfactory performance with DA estimation, even though that was not the case with DD channel estimation. Thus, both DA and DD complex channel coefficient estimation appear as viable methods for multiuser receivers.
The PIC receiver achieves better performance in known channels than the decorrelating receiver, but it is more sensitive to complex channel coefficient estimation errors than the decorrelating receiver, and at high channel loads it suffers from BER saturation, whereas the decorrelating receiver does not. On the other hand, the decorrelating receiver's operation relies on exact delay estimation for all users [151], and it is probably more sensitive to delay estimation errors than the PIC receiver. Furthermore, at higher channel loads the decorrelator is rather sensitive to the delay combinations of the users, as noted in some examples in Chapter 3. Therefore, a further study considering delay estimators is required to get a more realistic comparison of the performance of the PIC and the decorrelating receivers. With the existing knowledge both the decorrelating and the PIC receivers seem to be possible alternatives for multiuser receivers from the performance point of view.
[Figure 4.15, panels (a) and (b): "Sensitivity of decorrelator and PIC to channel estimation errors". Horizontal axis: average signal-to-noise ratio [dB], 0 to 16; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), PIC (dotted), single-user bound. $K = 20$; $L = 2$.]
Fig. 4.15. Sensitivity of BER to channel estimation errors in a frequency-selective fading channel ($L = 2$) for different MSE levels ($MSE = (0, 1, 1.5, 2) \times MSE_{min}$ from bottom to top); (a) equal received energies, (b) near-far problem.
[Figure 4.16, panels (a) and (b): "Performance of DD decorrelator and PIC". Horizontal axis: average signal-to-noise ratio [dB], 0 to 16; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), PIC (dotted), single-user bound. $K = 8, 20, 32$; $L = 1$.]
Fig. 4.16. Bit error rates with DD channel estimation and optimal channel estimation filters in a flat fading channel ($L = 1$) for different numbers of users ($K = 8, 20, 32$ from bottom to top); (a) equal received energies, (b) near-far problem.
[Figure 4.17, panels (a) and (b): "Performance of DD decorrelator and PIC". Horizontal axis: average signal-to-noise ratio [dB], 0 to 16; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), PIC (dotted), single-user bound. $K = 8, 20, 32$; $L = 2$.]
Fig. 4.17. Bit error rates with DD channel estimation and optimal channel estimation filters in a frequency-selective fading channel ($L = 2$) for different numbers of users ($K = 8, 20, 32$ from bottom to top); (a) equal received energies, (b) near-far problem.
[Figure 4.18, panels (a) and (b): panel title as printed, "Performance of DA and DD decorrelators". Horizontal axis: signal-to-noise ratio [dB], 0 to 20; vertical axis: bit error rate, $10^{-3}$ to $10^{0}$ (logarithmic). Curves: decorrelator (dashed), PIC (dotted), ideal decorrelator (dash-dotted), single-user bound; markers distinguish optimal and suboptimal channel estimation. $K = 20$; $L = 1$.]
Fig. 4.18. Bit error rates with DD channel estimation in a flat fading channel ($L = 1$) for different channel estimation filters; (a) equal received energies, (b) near-far problem. Ideal decorrelator refers to the decorrelating receiver in a known channel.
106
[Figure 4.19: two panels, (a) and (b), each titled "Performance of DA and DD decorrelators". Bit error rate (logarithmic axis, 10^-3 to 10^0) versus signal-to-noise ratio (0 to 20 dB). Curves: decorrelator (dashed), PIC (dotted), ideal decorrelator (dash-dotted), single-user bound; separate markers distinguish optimal and suboptimal channel estimation. K = 20; L = 2.]
Fig. 4.19. Bit error rates with DD channel estimation in a frequency-selective fading channel (L = 2) for different channel estimation filters; (a) equal received energies, (b) near-far problem. Ideal decorrelator refers to the decorrelating receiver in a known channel.
5. Multiuser detection in dynamic CDMA systems
Multiuser receiver implementation algorithms are considered in this chapter. The problem is treated at the matrix algorithm level, and detailed algorithms or architectures are not considered. The objective is to find efficient detection or detector update algorithms for dynamic CDMA systems where the detectors must be updated frequently due to changes in the number of users, in the signature waveforms, in the delays, or in the received amplitudes. The goals are to find the most efficient algorithms that exist and to analyze their implementation complexity and performance. The attention is limited to two linear receivers, namely the truncated decorrelating and LMMSE detectors, and the hard-decision parallel interference cancellation receiver. The algorithm implementation complexity is analyzed in terms of flops¹ and the number of clock cycles² required by synchronous DSP hardware³. The number of flops describes the computational burden of the algorithms, and the number of clock cycles illustrates to what extent the operations can be performed in parallel.
Fixed single-path channels (L = 1) are assumed in the algorithm derivations and numerical examples in this chapter. The choice is made for notational convenience and to make the computer simulations feasible. The generalization of the algorithms to the multipath case is straightforward. The implementation complexity expressions for the multipath case are obtained by substituting KL for K in the expressions in Sections 5.1–5.2. The complexity expressions are summarized with K replaced by KL in Section 5.3. The linear detection algorithms are presented for the truncated decorrelating detector. Due to the similarity of decorrelating and LMMSE detection, the generalization of the algorithms to LMMSE detection is obvious. Both time-invariant and time-varying signature waveforms are considered in this chapter. However, the superscript n describing the symbol interval is left out from the correlation matrices $\mathbf{R}^{(n)}(i)$ for notational convenience throughout the chapter.

¹ A floating point operation (flop) is defined to be a multiplication or an addition [156, p. 19].
² Required clock cycles are defined to be the minimum number of computation steps that are required in the sense that the results of the previous computation step are needed in the following step. A computation step is assumed to include all operations that can be performed independently of each other. A multiplication followed by an addition is assumed to require two clock cycles in total. The actual number of required clock cycles depends on implementation details, and the estimates given in this thesis are a lower bound on it.
³ Synchronous DSP refers to the existence of a global clock which paces the computation flow in the signal processing system.
The major problem is to find practical algorithms for linear multiuser detection. The obvious reason for this is the fact that linear detectors are characterized as inverses of some form of correlation matrices, and matrix inversion is a computationally intensive operation. Implementation of linear detection is considered in the first two sections. Ideal linear detection and detector update algorithms, which implement the multiuser detectors exactly (the effect of rounding errors is neglected throughout the chapter), are studied in Section 5.1. Iterative algorithms, which approximate the ideal linear detectors, are proposed in Section 5.2. In Section 5.3, the implementation complexity of the decorrelating and PIC receivers is compared. The results are summarized and discussed in Section 5.4.
5.1. Ideal linear detection
Ideal truncated linear detection is considered in this section. Detection algorithms and their complexity are analyzed in Section 5.1.1. Detector computation algorithms for synchronous and asynchronous CDMA systems are considered in Sections 5.1.2 and 5.1.3, respectively. The synchronous case is studied, since the update algorithms can later be utilized in conjunction with iterative detection as described in Section 5.2.
5.1.1. Detection algorithms
If the truncated linear detector $\mathbf{D}_N$ is known, the detection can be performed as expressed in (3.3). The detector $\mathbf{D}_N$ is an $NK \times K$ matrix, i.e., it has $NK^2$ elements. Vector $\mathbf{y}^{(n)}$ has $NK$ elements. The product $\mathbf{D}_N^\top \mathbf{y}^{(n)}$ in (3.3) is equivalent to the inner products of $K$ vectors with $NK$ elements each. One inner product requires $NK$ multiplications of a complex number by a real number (i.e., $2NK$ real multiplications) and $NK - 1$ complex (i.e., $2(NK - 1)$ real) additions. Thus, one inner product requires $O[4NK]$ flops, where $O[cx^n]$ denotes a polynomial function of $x$ with order $n$ and coefficient $c$ for the highest order term⁴. The overall computational load is $O[4NK^2]$ flops. One inner product consists of $NK$ multiplications, which can be computed in parallel if sufficient hardware is available. Thus, the multiplications need one clock cycle. If an $NK$-input summing device is available, another clock cycle is needed for the additions. The $K$ inner products are independent of each other, allowing parallel implementation. Thus, 2 clock cycles are needed in total. As mentioned above, the clock cycle estimates are definitely optimistic, but they yield lower bounds assuming that a multiplication and an addition take one clock cycle each. The following implementation complexity calculations follow the same principles as above and will not be presented.

⁴ In addition to the dominant term $x^n$, the coefficient $c$ in front of it needs to be taken into consideration in the computational complexity expressions. The constants are needed to distinguish between the complexities of some algorithms.
The detector can be computed by solving the defining matrix equation. For the truncated decorrelating detector, (3.6) needs to be solved. The detector can be solved by Cholesky factoring the correlation matrix $\mathbf{R}^{(n)}$. Then (3.6) becomes

$$\mathbf{L}^{(n)\top} \mathbf{L}^{(n)} \mathbf{D}_{[d]N} = \mathbf{U}_N. \tag{5.1}$$

The solution of (5.1) by backward and forward substitutions [156] requires $O[6NK^3]$ flops, $O[\tfrac{3}{2}NK^2]$ divisions, and $5NK$ clock cycles.
The detection can also be performed without directly computing the detector $\mathbf{D}_N$ itself, by solving a linear matrix equation instead. For the truncated decorrelating detector the equation

$$\mathbf{R}^{(n)} \bar{\mathbf{y}}^{(n)}_{[d]} = \mathbf{y}^{(n)} \;\Leftrightarrow\; \mathbf{L}^{(n)\top} \mathbf{L}^{(n)} \bar{\mathbf{y}}^{(n)}_{[d]} = \mathbf{y}^{(n)} \tag{5.2}$$

must be solved, and after that the $KL$ elements in the middle of the vector $\bar{\mathbf{y}}^{(n)}_{[d]}$ are extracted to obtain $\mathbf{y}^{(n)}_{[d]}$. The solution of (5.2) by backward and forward substitutions [156] requires $O[16NK^2]$ flops, $O[NK]$ divisions, and $5NK$ clock cycles.

The computation of the linear detector $\mathbf{D}_{[d]N}$, as well as the solution of the detector output vector $\mathbf{y}^{(n)}_{[d]}$ without explicit detector computation, requires knowledge of the Cholesky factor $\mathbf{L}^{(n)}$ of the correlation matrix $\mathbf{R}^{(n)}$. Thus, the algorithms for the Cholesky factor computation will be the core of the ideal linear multiuser detection in dynamic CDMA systems. Cholesky factor computation will be analyzed in Section 5.1.3 after considering the equivalent problem for the easier synchronous CDMA systems in Section 5.1.2.
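As an illustration of solving (5.2) without forming the detector, the following minimal Python/NumPy sketch factors a small positive definite matrix in the convention $\mathbf{R} = \mathbf{L}^\top\mathbf{L}$ with lower-triangular $\mathbf{L}$, and then applies one backward and one forward substitution. The function names, and the row/column reversal used to obtain this convention from NumPy's $\mathbf{C}\mathbf{C}^\top$ factorization, are assumptions made for the example; the sparsity of $\mathbf{R}^{(n)}$ is not exploited here.

```python
import numpy as np

def chol_lower_transposed(R):
    """Lower-triangular L with L.T @ L == R, the convention of (5.1)."""
    # np.linalg.cholesky gives C with C @ C.T == R; reversing rows and
    # columns before and after converts between the two conventions.
    C = np.linalg.cholesky(R[::-1, ::-1])
    return C.T[::-1, ::-1]

def backward_substitution(U, b):
    """Solve U x = b for an upper-triangular U."""
    n = len(b)
    x = np.zeros(n, dtype=np.result_type(U, b))
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular L."""
    n = len(b)
    x = np.zeros(n, dtype=np.result_type(L, b))
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

def decorrelator_output(R, y):
    """Solve R y_d = y through L^T L y_d = y, as in (5.2)."""
    L = chol_lower_transposed(R)
    z = backward_substitution(L.T, y)    # L^T z = y (upper triangular)
    return forward_substitution(L, z)    # L y_d = z (lower triangular)
```

In a truncated detector, the middle block of the returned vector would then be extracted as the output for the symbol interval of interest.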
5.1.2. Detector update in synchronous systems
The detection in synchronous CDMA systems is significantly simpler than that in asynchronous systems, since a one-shot detector is optimal. Because $\mathbf{R}(i) = \mathbf{0}, \forall i \geq 1$, it is sufficient to update the Cholesky factor $\mathbf{L} \in \mathbb{R}^{K \times K}$ of $\mathbf{R}(0) \in (-1, 1]^{K \times K}$ instead of the Cholesky factor of $\mathbf{R}^{(n)}$. It is also possible to update the inverse $\mathbf{T} = \mathbf{R}^{-1}(0) \in \mathbb{R}^{K \times K}$ of $\mathbf{R}(0)$. The algorithms can also be applied straightforwardly to update the inverse or the Cholesky factor of $\mathbf{R}(0)$ to correlation changes of one path of one user in asynchronous systems in multipath channels.
It is assumed that the correlations of one user change, while the other correlations remain constant⁵. In other words, if the correlations of user $k$ change, the elements of the $k$th row and the $k$th column of the matrix $\mathbf{R}(0)$ are altered. The inverse or the Cholesky factor of $\mathbf{R}(0)$ is updated by first computing the inverse or the Cholesky factor of the reduced $\mathbf{R}(0)$ (a $(K - 1) \times (K - 1)$ matrix), which is obtained by removing the $k$th row and column from the old $\mathbf{R}(0)$. Based on that tentative result, the new inverse or Cholesky factor of the true new $\mathbf{R}(0)$ is computed by "adding" the $k$th row and column of the new $\mathbf{R}(0)$ to the reduced $\mathbf{R}(0)$. Therefore, update algorithms are described below for the case of a new user entering the system (the number of active users is incremented by one from $K$ to $K + 1$), and for the case of an old user leaving the system (the number of active users is decremented by one from $K + 1$ to $K$). To make the expressions describing the changing number of users precise, a superindex indicating the number of users $K$, of the form $\mathbf{R}^{(K)}(0)$ and $\mathbf{T}^{(K)} = (\mathbf{R}^{(K)}(0))^{-1}$, is included in subsequent matrix symbols in this Section 5.1.2.

⁵ The algorithms in this Section 5.1.2 are designed for systems with time-invariant signature waveforms only.
5.1.2.1. Inverse update
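The key tools used in recursive inverse computations are the following two matrix inversion formulae, which are special cases of (A1.4) and (A1.5) [88, pp. 571–572]. Woodbury's identity is the rank-one update

$$(\mathbf{A} + \mathbf{v}\mathbf{v}^\top)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{v}\mathbf{v}^\top\mathbf{A}^{-1}}{1 + \mathbf{v}^\top\mathbf{A}^{-1}\mathbf{v}}, \tag{5.3}$$

where $\mathbf{A}$ is a nonsingular square matrix and $\mathbf{v} \neq \mathbf{0}$ is a column vector with the same dimension. Under the same conditions as above, we also have the order update formula

$$\begin{pmatrix} 1 & \mathbf{v}^\top \\ \mathbf{v} & \mathbf{A} \end{pmatrix}^{-1} = \begin{pmatrix} (1 - \mathbf{v}^\top\mathbf{A}^{-1}\mathbf{v})^{-1} & -(1 - \mathbf{v}^\top\mathbf{A}^{-1}\mathbf{v})^{-1}\mathbf{v}^\top\mathbf{A}^{-1} \\ -(\mathbf{A} - \mathbf{v}\mathbf{v}^\top)^{-1}\mathbf{v} & (\mathbf{A} - \mathbf{v}\mathbf{v}^\top)^{-1} \end{pmatrix}. \tag{5.4}$$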
As a new user enters the CDMA network, a row and a column must be added to the correlation matrix. The new user will be indexed to be user 1 and the indices of old users remaining in the network will be incremented by one. Let the new user's correlation vector $\boldsymbol{\rho}$ be

$$\boldsymbol{\rho} = (R_{12}(0), R_{13}(0), \ldots, R_{1,K+1}(0))^\top \in (-1, 1)^K. \tag{5.5}$$

Now $\mathbf{R}^{(K+1)}(0)$ can be partitioned as

$$\mathbf{R}^{(K+1)}(0) = \begin{pmatrix} 1 & \boldsymbol{\rho}^\top \\ \boldsymbol{\rho} & \mathbf{R}^{(K)}(0) \end{pmatrix} \in (-1, 1]^{(K+1) \times (K+1)}. \tag{5.6}$$

The new inverse matrix $\mathbf{T}^{(K+1)} = (\mathbf{R}^{(K+1)}(0))^{-1}$ can be found by Gaussian elimination, but an order-recursive algorithm is more efficient. The inverse of $\mathbf{R}^{(K+1)}(0)$ is partitioned as
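$$\mathbf{T}^{(K+1)} = \begin{pmatrix} t & \mathbf{t}^\top \\ \mathbf{t} & \bar{\mathbf{T}} \end{pmatrix}, \tag{5.7}$$

where $\bar{\mathbf{T}} \in \mathbb{R}^{K \times K}$, $\mathbf{t} \in \mathbb{R}^K$, and $t \in (0, \infty)$. By using (5.3) and (5.4) the following algorithm is obtained:

$$t = \frac{1}{1 - \boldsymbol{\rho}^\top \mathbf{T}^{(K)} \boldsymbol{\rho}}, \tag{5.8}$$

$$\bar{\mathbf{T}} = \mathbf{T}^{(K)} + t\,\mathbf{T}^{(K)} \boldsymbol{\rho}\boldsymbol{\rho}^\top \mathbf{T}^{(K)} = \mathbf{T}^{(K)} + \frac{1}{t}\,\mathbf{t}\mathbf{t}^\top, \tag{5.9}$$

$$\mathbf{t} = -\bar{\mathbf{T}}\boldsymbol{\rho} = -t\,\mathbf{T}^{(K)}\boldsymbol{\rho}. \tag{5.10}$$

Equation (5.8) follows directly from (5.4). From (5.4) it also follows that

$$\bar{\mathbf{T}} = (\mathbf{R}^{(K)}(0) - \boldsymbol{\rho}\boldsymbol{\rho}^\top)^{-1} = \mathbf{T}^{(K)} + \frac{\mathbf{T}^{(K)} \boldsymbol{\rho}\boldsymbol{\rho}^\top \mathbf{T}^{(K)}}{1 - \boldsymbol{\rho}^\top \mathbf{T}^{(K)} \boldsymbol{\rho}} = \mathbf{T}^{(K)} + t\,\mathbf{T}^{(K)} \boldsymbol{\rho}\boldsymbol{\rho}^\top \mathbf{T}^{(K)}, \tag{5.11}$$

where the second equality follows by applying (5.3) to the first one. The third equality follows by substituting (5.8) into the second. Both forms of (5.10) follow directly from (5.4) and the fact that $\mathbf{R}^{(K)}(0)$ and $\mathbf{T}^{(K)}$ are symmetric. By substituting $\mathbf{T}^{(K)}\boldsymbol{\rho} = -\frac{1}{t}\mathbf{t}$ from (5.10) back into (5.11), the last form of (5.9) follows.

As a new user enters the system, the operations in (5.8) require $O[2K^2]$ flops and one division. Solution of (5.9) demands $O[K^2]$ flops; (5.10) does not require extra computations after (5.9) has been calculated. In total $O[3K^2]$ flops and one division are required. The minimum number of clock cycles to compute (5.8)–(5.10) is 9.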
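The order update (5.8)–(5.10) can be sketched in a few lines of Python/NumPy. This is only an illustrative sketch: the function name `add_user_inverse` and the dense-array representation are assumptions of the example, and the new user is placed first, as in (5.6).

```python
import numpy as np

def add_user_inverse(T_K, rho):
    """Inverse of [[1, rho^T], [rho, R]] computed from T_K = R^{-1}.

    Follows (5.8)-(5.10); the new user becomes user 1 (index 0 here).
    """
    u = T_K @ rho                        # T^(K) rho, reused three times
    t = 1.0 / (1.0 - rho @ u)            # scalar block, (5.8)
    T_bar = T_K + t * np.outer(u, u)     # lower-right block, (5.9)
    t_vec = -t * u                       # off-diagonal block, (5.10)

    K = len(rho)
    T_new = np.empty((K + 1, K + 1))
    T_new[0, 0] = t
    T_new[0, 1:] = t_vec
    T_new[1:, 0] = t_vec
    T_new[1:, 1:] = T_bar
    return T_new
```

The single matrix-vector product `T_K @ rho` dominates the cost, which is consistent with the quadratic flop count quoted above.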
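As a user is leaving the CDMA network, the problem is opposite to the one discussed above. Now the detector matrix $\mathbf{T}^{(K+1)}$ is known and it has a partition as in (5.7). The detector matrix $\mathbf{T}^{(K)}$ must be computed. Assume that user 1 leaves the system. From (5.9) it follows that

$$\mathbf{T}^{(K)} = \bar{\mathbf{T}} - \frac{1}{t}\,\mathbf{t}\mathbf{t}^\top. \tag{5.12}$$

The updating algorithm for the case in which the user indexed 1 leaves the system is described in (5.12). On the other hand, if user $k \in \{2, 3, \ldots, K + 1\}$ leaves the system, the updating problem is a bit more complicated. The original correlation matrix has the following partition

$$\mathbf{R}^{(K+1)}(0) = \begin{pmatrix} \mathbf{R}_{11} & \boldsymbol{\rho}_1 & \mathbf{R}_{12} \\ \boldsymbol{\rho}_1^\top & 1 & \boldsymbol{\rho}_2^\top \\ \mathbf{R}_{12}^\top & \boldsymbol{\rho}_2 & \mathbf{R}_{22} \end{pmatrix}, \tag{5.13}$$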
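where $\mathbf{R}_{11} \in \mathbb{R}^{(k-1) \times (k-1)}$, $\mathbf{R}_{12} \in \mathbb{R}^{(k-1) \times (K-k+1)}$, $\mathbf{R}_{22} \in \mathbb{R}^{(K-k+1) \times (K-k+1)}$, $\boldsymbol{\rho}_1 \in \mathbb{R}^{k-1}$, and $\boldsymbol{\rho}_2 \in \mathbb{R}^{K-k+1}$. The new detector matrix can be found by virtually changing the indexing of users so that the leaving user is indexed to be 1. The equivalent correlation matrix is then

$$\tilde{\mathbf{R}}^{(K+1)} = \mathbf{U}_k^\top \mathbf{R}^{(K+1)}(0)\, \mathbf{U}_k, \tag{5.14}$$

where $\mathbf{U}_k$ is a unitary permutation matrix of the form

$$\mathbf{U}_k = \begin{pmatrix} \mathbf{u}_k & \mathbf{u}_1 & \cdots & \mathbf{u}_{k-1} & \mathbf{u}_{k+1} & \cdots & \mathbf{u}_{K+1} \end{pmatrix}, \tag{5.15}$$

where the elements of the column vector $\mathbf{u}_k \in \{0, 1\}^{K+1}$ are $(\mathbf{u}_k)_i = \delta_{k,i}, \forall i \in \{1, 2, \ldots, K + 1\}$, and $\delta_{k,i}$ is the discrete Kronecker delta function. Thus, the equivalent inverse becomes

$$\tilde{\mathbf{T}}^{(K+1)} = \mathbf{U}_k^\top \mathbf{T}^{(K+1)} \mathbf{U}_k. \tag{5.16}$$

The detector matrix $\mathbf{T}^{(K)}$ can be computed by using the matrix $\tilde{\mathbf{T}}^{(K+1)}$ in (5.12).

As an old user leaves the system, $O[K^2]$ flops and $K$ divisions are needed in (5.12). At least 3 clock cycles are needed.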
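The removal of an arbitrary user can be sketched as follows in Python/NumPy: the leaving user is first moved to the leading position, which plays the role of the permutation in (5.14)–(5.16), and the downdate (5.12) is then applied. The function name `remove_user_inverse` and the 0-based user index are assumptions of this example.

```python
import numpy as np

def remove_user_inverse(T_full, k):
    """Inverse of the correlation matrix after deleting user k (0-based).

    T_full is the inverse of the (K+1) x (K+1) correlation matrix.
    """
    n = T_full.shape[0]
    # Index order that puts user k first, as the permutation U_k does.
    order = [k] + [i for i in range(n) if i != k]
    T_perm = T_full[np.ix_(order, order)]      # equivalent inverse, (5.16)

    t = T_perm[0, 0]                           # scalar block of (5.7)
    t_vec = T_perm[1:, 0]                      # vector block of (5.7)
    T_bar = T_perm[1:, 1:]                     # matrix block of (5.7)
    return T_bar - np.outer(t_vec, t_vec) / t  # downdate, (5.12)
```

Index reordering is used here instead of an explicit multiplication by $\mathbf{U}_k$, so no flops are spent on the permutation itself.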
5.1.2.2. Cholesky factor update
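As a new user enters the CDMA network, the Cholesky factorization

$$\mathbf{R}^{(K)}(0) = (\mathbf{L}^{(K)})^\top \mathbf{L}^{(K)} \tag{5.17}$$

is assumed to be known and the new factorization

$$\mathbf{R}^{(K+1)}(0) = (\mathbf{L}^{(K+1)})^\top \mathbf{L}^{(K+1)} \tag{5.18}$$

should be found in a recursive form. Let

$$\mathbf{L}^{(K+1)} = \begin{pmatrix} l & \mathbf{0}_{1 \times K} \\ \mathbf{l} & \bar{\mathbf{L}} \end{pmatrix}, \tag{5.19}$$

where $\bar{\mathbf{L}} \in \mathbb{R}^{K \times K}$ is a lower triangular matrix, $\mathbf{l} \in \mathbb{R}^K$, and $l \in \mathbb{R}$. By substituting (5.19) into (5.18) we get

$$(\mathbf{L}^{(K+1)})^\top \mathbf{L}^{(K+1)} = \begin{pmatrix} l^2 + \mathbf{l}^\top\mathbf{l} & \mathbf{l}^\top\bar{\mathbf{L}} \\ \bar{\mathbf{L}}^\top\mathbf{l} & \bar{\mathbf{L}}^\top\bar{\mathbf{L}} \end{pmatrix}. \tag{5.20}$$

By equating the corresponding parts in the above equation and in (5.6) the following algorithm is obtained:

$$\bar{\mathbf{L}} = \mathbf{L}^{(K)}, \tag{5.21}$$

$$\text{solve } \bar{\mathbf{L}}^\top \mathbf{l} = \boldsymbol{\rho} \text{ for } \mathbf{l}, \tag{5.22}$$

$$l = \sqrt{1 - \mathbf{l}^\top\mathbf{l}}. \tag{5.23}$$

The operation in (5.22) is a linear matrix equation with an upper triangular coefficient matrix. Its solution demands $O[K^2]$ flops and $K$ divisions. The computation in (5.23) requires $K$ multiplications and additions, and one square root. In total $O[K^2]$ flops, $K$ divisions, and one square root are required. The number of required clock cycles is $3K$.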
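A minimal Python/NumPy sketch of the update (5.21)–(5.23) is given below. The function names are assumptions of this example, and a general solver is used for (5.22) for brevity; a dedicated triangular solver would be needed to reach the quadratic cost quoted above.

```python
import numpy as np

def chol_lower_transposed(R):
    """Lower-triangular L with L.T @ L == R, the convention of (5.17)."""
    # NumPy returns C with C @ C.T == R; reversing rows and columns
    # before and after converts between the two conventions.
    C = np.linalg.cholesky(R[::-1, ::-1])
    return C.T[::-1, ::-1]

def add_user_cholesky(L_K, rho):
    """Factor of [[1, rho^T], [rho, R]] from L_K, where L_K.T @ L_K == R."""
    l_vec = np.linalg.solve(L_K.T, rho)     # (5.22): upper-triangular system
    l = np.sqrt(1.0 - l_vec @ l_vec)        # (5.23)

    K = len(rho)
    L_new = np.zeros((K + 1, K + 1))
    L_new[0, 0] = l
    L_new[1:, 0] = l_vec                    # only the first column is new
    L_new[1:, 1:] = L_K                     # (5.21): old factor is reused
    return L_new
```

The old factor is copied unchanged into the lower-right block, which reflects the observation that only the first column of the new factor has to be computed.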
As an alternative to the above algorithm we will also consider the Cholesky factorization update based on QR factorization of the code matrix $\mathbf{S}(0)$ [156]. It is assumed that the QR factorization

$$\mathbf{S}^{(K)}(0) = \mathbf{Q}^{(K)} \mathbf{L}^{(K)} \tag{5.24}$$

is known. Now we are interested in computing the new $\mathbf{L}^{(K+1)}$ for

$$\mathbf{S}^{(K+1)}(0) = \begin{pmatrix} \mathbf{s} & \mathbf{S}^{(K)}(0) \end{pmatrix}, \tag{5.25}$$

where $\mathbf{s}$ is the new spreading sequence. By (5.21)–(5.23) it is seen that only the first column of $\mathbf{L}^{(K+1)}$ needs to be computed⁶. In other words, the QR factorization of the $N_s \times (K + 1)$ matrix

$$\begin{pmatrix} \mathbf{s} & \begin{matrix} \mathbf{0}_{(N_s-K) \times K} \\ \mathbf{L}^{(K)} \end{matrix} \end{pmatrix}$$

must be computed. The result is

$$\mathbf{Q} \begin{pmatrix} \mathbf{s} & \begin{matrix} \mathbf{0}_{(N_s-K) \times K} \\ \mathbf{L}^{(K)} \end{matrix} \end{pmatrix} = \begin{pmatrix} \mathbf{0}_{(N_s-K-1) \times 1} & \mathbf{0}_{(N_s-K-1) \times K} \\ l & \mathbf{0}_{1 \times K} \\ \mathbf{l} & \mathbf{L}^{(K)} \end{pmatrix}, \tag{5.26}$$

where $\mathbf{Q}$ is unitary. The computation can be performed efficiently by applying $(N_s - K - 1)$ Givens rotations [156], which requires $9(N_s - K - 1)$ flops, $2(N_s - K - 1)$ divisions, and $(N_s - K - 1)$ square roots. The computational complexity of the QR factorization based Cholesky update is in general significantly lower than that of the update based on the correlation matrix. This is not true, however, if the number of users is much smaller than the number of samples per symbol interval $N_s$. The QR factorization based computation does not require the correlation computation. That gives a further advantage in terms of computational complexity. It also reduces the effect of rounding errors, which are otherwise introduced when the correlation coefficients are rounded after their computation.
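⁶ The fact that only the first column changes can also be easily seen from the properties of QR factorizations [156]. This actually provides another proof for (5.21).

As a user is leaving the CDMA network, the factorization

$$\mathbf{R}^{(K+1)}(0) = (\mathbf{L}^{(K+1)})^\top \mathbf{L}^{(K+1)} \tag{5.27}$$

is assumed to be known and the factorization

$$\mathbf{R}^{(K)}(0) = (\mathbf{L}^{(K)})^\top \mathbf{L}^{(K)} \tag{5.28}$$

needs to be computed. Let

$$\mathbf{L}^{(K+1)} = \begin{pmatrix} \mathbf{L}_{11} & \mathbf{0} & \mathbf{0} \\ \mathbf{l}_1^\top & l_3 & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{l}_2 & \mathbf{L}_{22} \end{pmatrix}, \tag{5.29}$$

where $\mathbf{L}_{11} \in \mathbb{R}^{(k-1) \times (k-1)}$ and $\mathbf{L}_{22} \in \mathbb{R}^{(K-k+1) \times (K-k+1)}$ are lower triangular matrices, $\mathbf{L}_{21} \in \mathbb{R}^{(K-k+1) \times (k-1)}$, $\mathbf{l}_1 \in \mathbb{R}^{k-1}$ and $\mathbf{l}_2 \in \mathbb{R}^{K-k+1}$ are column vectors, and $l_3 \in \mathbb{R}$ is a scalar. By (5.29) and the definition $\mathbf{R}^{(K)}(0) = (\mathbf{L}^{(K)})^\top \mathbf{L}^{(K)}$ it follows that

$$\mathbf{R}^{(K)}(0) = \begin{pmatrix} \mathbf{L}_{11}^\top\mathbf{L}_{11} + \mathbf{l}_1\mathbf{l}_1^\top + \mathbf{L}_{21}^\top\mathbf{L}_{21} & \mathbf{L}_{21}^\top\mathbf{L}_{22} \\ \mathbf{L}_{22}^\top\mathbf{L}_{21} & \mathbf{L}_{22}^\top\mathbf{L}_{22} \end{pmatrix} = \begin{pmatrix} \mathbf{L}_0^\top & \mathbf{L}_{21}^\top \\ \mathbf{0} & \mathbf{L}_{22}^\top \end{pmatrix} \begin{pmatrix} \mathbf{L}_0 & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{L}_{22} \end{pmatrix},$$

where $\mathbf{L}_0 \in \mathbb{R}^{(k-1) \times (k-1)}$ is a lower triangular matrix such that $\mathbf{L}_0^\top\mathbf{L}_0 = \mathbf{L}_{11}^\top\mathbf{L}_{11} + \mathbf{l}_1\mathbf{l}_1^\top$. Thus, the matrix $\mathbf{L}^{(K)}$ is

$$\mathbf{L}^{(K)} = \begin{pmatrix} \mathbf{L}_0 & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{L}_{22} \end{pmatrix}. \tag{5.30}$$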
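The result states that the Cholesky factorization cannot be updated totally recursively, but the computation of a new factor of a $(k - 1) \times (k - 1)$ positive definite matrix is necessary. Because the matrix to be Cholesky factored has the special structure $\mathbf{L}_{11}^\top\mathbf{L}_{11} + \mathbf{l}_1\mathbf{l}_1^\top$, the computation can be performed effectively by QR factorization [290]. Let

$$\mathbf{A} = \mathbf{L}_{11}^\top\mathbf{L}_{11} + \mathbf{l}_1\mathbf{l}_1^\top = \begin{pmatrix} \mathbf{l}_1 & \mathbf{L}_{11}^\top \end{pmatrix} \begin{pmatrix} \mathbf{l}_1^\top \\ \mathbf{L}_{11} \end{pmatrix}. \tag{5.31}$$

Let the QR factorization of $(\mathbf{l}_1, \mathbf{L}_{11}^\top)^\top$ be

$$\begin{pmatrix} \mathbf{l}_1^\top \\ \mathbf{L}_{11} \end{pmatrix} = \mathbf{Q}' \begin{pmatrix} \mathbf{0}^\top \\ \mathbf{L}_0 \end{pmatrix}, \tag{5.32}$$

where $\mathbf{Q}' \in \mathbb{R}^{k \times k}$ is unitary. By substituting (5.32) into (5.31) it is seen that

$$\mathbf{A} = \begin{pmatrix} \mathbf{l}_1 & \mathbf{L}_{11}^\top \end{pmatrix} \begin{pmatrix} \mathbf{l}_1^\top \\ \mathbf{L}_{11} \end{pmatrix} = \mathbf{L}_0^\top\mathbf{L}_0. \tag{5.33}$$

Thus, $\mathbf{L}_0$ may be computed by QR factorizing the matrix $(\mathbf{l}_1, \mathbf{L}_{11}^\top)^\top$, which can be accomplished by applying $k - 1$ Givens rotations. Because $\mathbf{L}_{11}$ is already lower triangular, this requires in total $6(k - 1)^2 + 2(k - 1)$ flops, $2(k - 1)$ divisions, and $(k - 1)$ square roots [156].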
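The downdate (5.29)–(5.33) can be sketched in Python/NumPy as follows. This is an illustrative sketch under stated assumptions: the function name and the 0-based user index are choices of the example, and `np.linalg.qr` (a Householder-based routine) stands in for the Givens rotations mentioned in the text. Because NumPy returns an upper-triangular factor, the columns are reversed before the factorization and both axes are reversed afterwards to obtain the lower-triangular $\mathbf{L}_0$.

```python
import numpy as np

def remove_user_cholesky(L_full, k):
    """Factor L with L.T @ L == R after deleting user k (0-based).

    L_full is lower triangular with L_full.T @ L_full equal to the
    (K+1) x (K+1) correlation matrix, partitioned as in (5.29).
    """
    if k == 0:
        return L_full[1:, 1:].copy()     # first user: nothing to refactor

    L11 = L_full[:k, :k]
    l1 = L_full[k, :k]
    L21 = L_full[k + 1:, :k]
    L22 = L_full[k + 1:, k + 1:]

    # Stack (l1^T; L11) as in (5.32) and factor it with reversed columns.
    M = np.vstack([l1[None, :], L11])
    U = np.linalg.qr(M[:, ::-1], mode="r")      # upper triangular, k x k
    U = U * np.sign(np.diag(U))[:, None]        # make the diagonal positive
    L0 = U[::-1, ::-1]                          # lower triangular, (5.33)

    n = L_full.shape[0] - 1
    L_new = np.zeros((n, n))
    L_new[:k, :k] = L0                          # assemble (5.30)
    L_new[k:, :k] = L21
    L_new[k:, k:] = L22
    return L_new
```

Only the leading $k \times k$ block (0-based $k$) is refactored; the blocks $\mathbf{L}_{21}$ and $\mathbf{L}_{22}$ are reused unchanged.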
The Cholesky factorization could again be computed by QR factoring the code matrix $\mathbf{S}^{(K)}(0)$. This does not, however, offer any simplification to the computations. (It would stop the accumulation of rounding errors.)

The drawback of the Cholesky factorization is that computing it requires square roots, which may be a problem in some applications. The square roots can be partly avoided by using an $\mathbf{L}^\top\mathbf{D}\mathbf{L}$ factorization instead of the Cholesky factorization [156].
5.1.3. Detector computation in asynchronous systems
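As in Section 3.2, a subindex is added to $\mathbf{R}^{(n)}$ to denote its dimension, yielding $\mathbf{R}_N \in (-1, 1]^{NK \times NK}$. In the upcoming order-recursive derivations we will let the dimension take the values $m = 1, 2, \ldots, N$. In other words, $\mathbf{R}_m$ is $\mathbf{R}^{(n)}$ with dimension $mK \times mK$. Note that $\mathbf{R}_m$ can be partitioned as (compare to (A1.1))

$$\mathbf{R}_m = \begin{pmatrix} \mathbf{R}(0) & \boldsymbol{\gamma}_{m-1}^\top \\ \boldsymbol{\gamma}_{m-1} & \mathbf{R}_{m-1} \end{pmatrix} \in \mathbb{R}^{mK \times mK}, \tag{5.34}$$

where

$$\boldsymbol{\gamma}_{m-1} = \begin{pmatrix} \mathbf{R}^\top(1) & \mathbf{0}_K & \cdots & \mathbf{0}_K \end{pmatrix}^\top \in \mathbb{R}^{(m-1)K \times K}. \tag{5.35}$$

The Cholesky factor $\mathbf{L}_m$ of $\mathbf{R}_m$ can be partitioned as [156]

$$\mathbf{L}_m = \begin{pmatrix} \mathbf{L}_{11}(m) & \mathbf{0}^\top \\ \boldsymbol{\zeta}_{m-1} & \mathbf{L}_{m-1} \end{pmatrix} \in \mathbb{R}^{mK \times mK}, \tag{5.36}$$

where $\mathbf{L}_{m-1} \in \mathbb{R}^{(m-1)K \times (m-1)K}$ is the Cholesky factor of $\mathbf{R}_{m-1}$,

$$\boldsymbol{\zeta}_{m-1} = \begin{pmatrix} \mathbf{L}_{21}^\top(m) & \mathbf{0}_K & \cdots & \mathbf{0}_K \end{pmatrix}^\top \in \mathbb{R}^{(m-1)K \times K}, \tag{5.37}$$

$\mathbf{L}_{ij}(m) \in \mathbb{R}^{K \times K}$ is the $ij$th block of $\mathbf{L}_m$, and $\mathbf{0}$ denotes an $(m-1)K \times K$ matrix with zero elements. In other words, the blocks that must be computed at the $m$th step are $\mathbf{L}_{11}(m)$ and $\mathbf{L}_{21}(m)$. Note that

$$\mathbf{L}_m^\top\mathbf{L}_m = \begin{pmatrix} \mathbf{L}_{11}^\top(m)\mathbf{L}_{11}(m) + \boldsymbol{\zeta}_{m-1}^\top\boldsymbol{\zeta}_{m-1} & \boldsymbol{\zeta}_{m-1}^\top\mathbf{L}_{m-1} \\ \mathbf{L}_{m-1}^\top\boldsymbol{\zeta}_{m-1} & \mathbf{L}_{m-1}^\top\mathbf{L}_{m-1} \end{pmatrix}. \tag{5.38}$$

By equating the corresponding parts in (5.38) and in (5.34) and using the definitions of $\boldsymbol{\gamma}_{m-1}$ and $\boldsymbol{\zeta}_{m-1}$ it is found that we must solve

$$\mathbf{L}_{11}^\top(m-1)\,\mathbf{L}_{21}(m) = \mathbf{R}(1) \quad \text{for } \mathbf{L}_{21}(m), \tag{5.39}$$

$$\mathbf{L}_{11}^\top(m)\,\mathbf{L}_{11}(m) = \mathbf{R}(0) - \mathbf{L}_{21}^\top(m)\,\mathbf{L}_{21}(m) \quad \text{for } \mathbf{L}_{11}(m), \tag{5.40}$$

for all $m \in \{2, 3, \ldots, N\}$.

Solving $\mathbf{L}_{21}(m)$ in (5.39) corresponds to solving $(K - 1)$ matrix equations, each giving one column of $\mathbf{L}_{21}(m)$. The solution of $\mathbf{L}_{21}(m)$ can be parallelized to $(K - 1)$ separate linear equations. Each of the solutions requires $O[K^2]$ flops, $K$ divisions, and at least $3K$ clock cycles due to backward substitutions. Thus, in total $O[NK^3]$ flops, $NK$ divisions, and $3NK$ clock cycles are required to compute all blocks $\mathbf{L}_{21}(m)$.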
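Solving $\mathbf{L}_{11}(m)$ in (5.40) requires Cholesky factorization of the $K \times K$ matrix $\mathbf{R}(0) - \mathbf{L}_{21}^\top(m)\mathbf{L}_{21}(m)$. A QR factorization based approach to solve $\mathbf{L}_{11}(m)$ is derived below. Let $\bar{\mathbf{L}}$ be the Cholesky factor of $\mathbf{R}(0)$. Thus,

$$\mathbf{R}(0) - \mathbf{L}_{21}^\top(m)\mathbf{L}_{21}(m) = \bar{\mathbf{L}}^\top\bar{\mathbf{L}} - \mathbf{L}_{21}^\top(m)\mathbf{L}_{21}(m) = \begin{pmatrix} j\mathbf{L}_{21}^\top(m) & \bar{\mathbf{L}}^\top \end{pmatrix} \begin{pmatrix} j\mathbf{L}_{21}^\top(m) & \bar{\mathbf{L}}^\top \end{pmatrix}^\top, \tag{5.41}$$

where $j^2 = -1$. Let the QR factorization of $\begin{pmatrix} j\mathbf{L}_{21}^\top(m) & \bar{\mathbf{L}}^\top \end{pmatrix}^\top$ be

$$\begin{pmatrix} j\mathbf{L}_{21}(m) \\ \bar{\mathbf{L}} \end{pmatrix} = \mathbf{Q} \begin{pmatrix} \mathbf{0}_K \\ \tilde{\mathbf{L}} \end{pmatrix}, \tag{5.42}$$

where $\mathbf{Q}$ is unitary, and $\tilde{\mathbf{L}}$ is lower triangular. Substituting (5.42) into (5.41) yields

$$\mathbf{R}(0) - \mathbf{L}_{21}^\top(m)\mathbf{L}_{21}(m) = \tilde{\mathbf{L}}^\top\tilde{\mathbf{L}}, \tag{5.43}$$

or $\mathbf{L}_{11}(m) = \tilde{\mathbf{L}}$. In other words, $\mathbf{L}_{11}(m)$ can be computed by QR factoring $\begin{pmatrix} j\mathbf{L}_{21}^\top(m) & \bar{\mathbf{L}}^\top \end{pmatrix}^\top$. The QR factorization can be computed by Householder reflections or Givens rotations [156], for example. The Householder reflections require in total $4K$ clock cycles, $O[4K^3]$ flops, and $K$ square roots [156, 41]. In total, the computation of all blocks $\mathbf{L}_{11}(m)$ requires $O[4NK^3]$ flops, $NK$ square roots, and at least $4NK$ clock cycles.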
The computational requirements to complete the Cholesky factorization of $\mathbf{R}^{(n)}$ are found by summing the requirements to solve (5.39) and (5.40) for all values of $m$. Thus, the computation of $\mathbf{L}^{(n)}$ requires $O[5NK^3]$ flops, $NK$ square roots, $NK$ divisions, and at least $7NK$ clock cycles.

The "skinny" QR factorization of the sample matrix $\mathbf{S}$ can be represented in the form $\mathbf{S} = \mathbf{Q}\mathbf{L}^{(n)}$, where $\mathbf{Q} \in \mathbb{R}^{(N+1)N_s \times NK}$ is an orthogonal matrix [156, p. 217]. In other words, the Cholesky factor $\mathbf{L}^{(n)}$ can be computed by QR factoring $\mathbf{S}$. The resulting computational requirements are difficult to analyze exactly. However, they are lower bounded by $O[NN_sK^2]$ flops. Since $N_s$ is assumed to be greater than the number of users $K$ to guarantee the positive definiteness of $\mathbf{R}^{(n)}$, the QR factorization of the sample matrix $\mathbf{S}$ is usually computationally more intensive than the correlation matrix based approach. The QR factorization is numerically superior, since the rounding errors introduced in the correlation computation are avoided. The number of clock cycles is at least $4NK$.

The computational requirements of solving the decorrelating detector are the sum of the requirements to compute the Cholesky factor $\mathbf{L}^{(n)}$ and to solve the detector $\mathbf{D}_{[d]}$ in (5.1). Thus, $O[11NK^3]$ flops, $NK$ square roots, and $O[\tfrac{3}{2}NK^2]$ divisions are required to solve the decorrelating detector. Implementation by synchronous digital signal processing requires at least $12NK$ clock cycles. The computational requirements are linearly related to $N$, which was obtained by exploiting the sparsity of the matrices. The computational requirements are still large, having a cubic dependence on the number of users. If the number of users is large, the computational burden of the detector update is substantial, and the operations cannot be parallelized very effectively. At least in systems with time-varying signature waveforms, the proposed order-recursive algorithm may still be impractical for the current DSP hardware in many applications.
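The order recursion (5.39)–(5.40) can be sketched in Python/NumPy as below. The example is a sketch under stated assumptions: the function names are hypothetical, dense arrays are used throughout, and an ordinary real Cholesky factorization of $\mathbf{R}(0) - \mathbf{L}_{21}^\top(m)\mathbf{L}_{21}(m)$ stands in for the QR-based step (5.41)–(5.43).

```python
import numpy as np

def chol_lower_transposed(R):
    """Lower-triangular L with L.T @ L == R."""
    C = np.linalg.cholesky(R[::-1, ::-1])
    return C.T[::-1, ::-1]

def block_cholesky(R0, R1, N):
    """Factor L_N (L_N.T @ L_N == R_N) of the block-tridiagonal R_N.

    R_N has R0 = R(0) on the block diagonal, R1 = R(1) on the block
    sub-diagonal and R1.T on the block super-diagonal, as in (5.34).
    """
    K = R0.shape[0]
    L = np.zeros((N * K, N * K))

    L11_prev = chol_lower_transposed(R0)            # step m = 1
    L[(N - 1) * K:, (N - 1) * K:] = L11_prev

    for m in range(2, N + 1):
        top = (N - m) * K            # L_m occupies the last m*K rows/columns
        L21 = np.linalg.solve(L11_prev.T, R1)            # (5.39)
        L11 = chol_lower_transposed(R0 - L21.T @ L21)    # (5.40)
        L[top:top + K, top:top + K] = L11
        L[top + K:top + 2 * K, top:top + K] = L21
        L11_prev = L11
    return L
```

Each step touches only two $K \times K$ blocks, which is where the linear dependence of the cost on $N$ comes from.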
5.2. Iterative linear detection
To alleviate the implementation complexity of the ideal detector update described in Section 5.1, iterative implementation of the decorrelating and the LMMSE detectors will be studied in this section. The decorrelating detection can be represented as a linear equation (5.2), which can be solved by several iterative methods. Their advantage is the potential of offering significant savings in computational complexity, since there is no need to invert or Cholesky factorize the matrix $\mathbf{R}^{(n)}$ explicitly before $\bar{\mathbf{y}}^{(n)}_{[d]}$ is solved. Some of the iterative algorithms will be considered in Section 5.2.1. In Section 5.2.2, an efficient way to initialize an iterative algorithm for a multiuser detector is proposed. In Section 5.2.3, the performance of the iterative detectors is studied.
5.2.1. Iterative algorithms
The most popular iterative algorithms to solve (5.2) include the steepest descent (SD) and the conjugate gradient (CG) methods⁷ [156, 291, 292]. Both the SD and CG methods utilize the fact that solving (5.2) is equivalent to minimizing the function

$$\Omega(\mathbf{h}) = \frac{1}{2}\mathbf{h}^{\mathrm{H}}\mathbf{R}^{(n)}\mathbf{h} - \mathbf{h}^{\mathrm{H}}\mathbf{y}^{(n)}. \tag{5.44}$$

In other words, the decorrelator output

$$\bar{\mathbf{y}}^{(n)}_{[d]} = \arg\min_{\mathbf{h} \in \mathbb{C}^{NK}} \Omega(\mathbf{h}) \tag{5.45}$$

can be viewed as an estimate of the data-amplitude product vector $\mathbf{h}$. The algorithms start with some initial guess $\hat{\mathbf{h}}(0)$, from which the estimate of the minimum of $\Omega(\mathbf{h})$ is improved by iterative steps. The $m$th estimate is computed in the form $\hat{\mathbf{h}}(m) = \hat{\mathbf{h}}(m-1) + \alpha(m)\mathbf{p}(m)$, where $\mathbf{p}(m)$ is the new search direction, and the coefficient $\alpha(m)$ is chosen so that $\Omega(\hat{\mathbf{h}}(m))$ is minimized given $\hat{\mathbf{h}}(m-1)$ and $\mathbf{p}(m)$. Different strategies to choose the search directions $\mathbf{p}(m)$ result in different iterative algorithms. Since $\Omega(\hat{\mathbf{h}}(m))$ decreases most rapidly in the direction of the negative gradient

$$\mathbf{q}(m) = -\left.\frac{\partial \Omega(\mathbf{h})}{\partial \mathbf{h}}\right|_{\mathbf{h} = \hat{\mathbf{h}}(m)} = \mathbf{y}^{(n)} - \mathbf{R}^{(n)}\hat{\mathbf{h}}(m), \tag{5.46}$$

choosing $\mathbf{p}(m)$ as a function of $\mathbf{q}(m-1)$ has proved to be efficient, yielding a family of so-called gradient algorithms, such as the steepest descent and the conjugate gradient algorithms.

⁷ There are also several other simple iterative algorithms (e.g., the Jacobi or Gauss-Seidel methods) available. The Gauss-Seidel algorithm was also tested, but it yielded significantly worse results.
5.2.1.1. Steepest descent and conjugate gradient algorithms
In the SD method the search direction is chosen simply to be the negative gradient [156], i.e., $\mathbf{p}(m) = \mathbf{q}(m-1)$. The choice implies that the search directions may be linearly dependent, even if $m < NK$ and $\mathbf{q}(m) \neq \mathbf{0}$, resulting in redundant minimization of the function $\Omega(\mathbf{h})$. In the CG method the new search direction $\mathbf{p}(m)$ is chosen so that it satisfies [156]

$$\mathbf{p}(m) = \arg\min_{\mathbf{p} \in \mathcal{V}_{m-1}^{\perp}} \|\mathbf{p} - \mathbf{q}(m-1)\|, \tag{5.47}$$

where $\mathcal{V}_m$ is the space spanned by the vectors $\mathbf{R}^{(n)}\mathbf{p}(1), \ldots, \mathbf{R}^{(n)}\mathbf{p}(m)$, and $\mathcal{V}^{\perp}$ denotes the space orthogonal to $\mathcal{V}$. It is easy to see that $\mathbf{p}(m)$ is $\mathbf{R}^{(n)}$-conjugate or $\mathbf{R}^{(n)}$-orthogonal to the previous search directions, i.e.,

$$\mathbf{p}^{\mathrm{H}}(m)\,\mathbf{R}^{(n)}\mathbf{p}(i) = 0, \quad \forall i < m. \tag{5.48}$$

This condition guarantees that the search directions are linearly independent as long as $\mathbf{q}(m-1) \neq \mathbf{0}$, since $\mathbf{R}^{(n)}$ is positive definite [156]. Thus, the algorithm results in the exact solution (neglecting the rounding errors) in $NK$ steps or fewer. Moreover, in many cases a significantly smaller number of iterations yields solutions that are close to the ideal one.

The computationally most complex operation in both the SD and CG algorithms is a matrix-vector multiplication of the form $\mathbf{R}^{(n)}\mathbf{p}(m)$. Due to the sparsity of $\mathbf{R}^{(n)}$ this requires $O[2NK^2]$ multiplications of a complex number by a real number and $O[2NK^2]$ complex additions, i.e., in total $O[8MNK^2]$ flops are required, where $M$ is the number of iterations performed. Since the matrix-vector multiplication consists of $NK$ separate vector inner products, it lends itself to a fully parallel implementation. The required number of clock cycles is found to be $11M$ for the SD algorithm and $14M$ for the CG algorithm by [156].
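The two gradient algorithms can be sketched in Python/NumPy as below, using the textbook recurrences for a real symmetric positive definite matrix and a complex right-hand side. The function names are assumptions of this example, dense matrix-vector products are used instead of sparsity-aware ones, and the real part of $\mathbf{h}^{\mathrm{H}}\mathbf{y}$ is taken so that the cost function of (5.44) is real valued.

```python
import numpy as np

def omega(R, y, h):
    """Cost function of (5.44), with the real part of h^H y taken."""
    return 0.5 * np.vdot(h, R @ h).real - np.vdot(h, y).real

def steepest_descent(R, y, h0, num_iter):
    """SD: the search direction is the negative gradient q = y - R h."""
    h = h0.astype(complex)
    for _ in range(num_iter):
        q = y - R @ h                               # (5.46)
        qq = np.vdot(q, q).real
        if qq == 0.0:
            break
        alpha = qq / np.vdot(q, R @ q).real         # exact line search
        h = h + alpha * q
    return h

def conjugate_gradient(R, y, h0, num_iter):
    """CG: search directions are mutually R-conjugate, cf. (5.48)."""
    h = h0.astype(complex)
    q = y - R @ h
    p = q.copy()
    qq = np.vdot(q, q).real
    for _ in range(num_iter):
        if qq == 0.0:
            break
        Rp = R @ p                                  # the dominant operation
        alpha = qq / np.vdot(p, Rp).real
        h = h + alpha * p
        q = q - alpha * Rp                          # updated negative gradient
        qq_new = np.vdot(q, q).real
        p = q + (qq_new / qq) * p                   # new conjugate direction
        qq = qq_new
    return h
```

Both routines need one product with $\mathbf{R}^{(n)}$ per iteration; CG adds only a few vector operations to keep the directions conjugate.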
The correlation coefficients in the matrix $\mathbf{R}^{(n)}$ must also be computed when a change in the signature waveforms or in their timing occurs. The computation of one coefficient requires $2N_s$ flops. If the signature waveforms are time-varying, all the $K^2$ new correlation coefficients need to be computed, and $O[2N_sK^2] \geq O[2K^3]$ flops would be required on every symbol interval. In other words, the correlation computation would be significantly more complicated than the iterative detection. Thus, the whole method would lose much of its advantage. The applicability to a highly parallel implementation would be the only advantage to remain. Fortunately, there is another way to implement the CG algorithm [291, p. 610]. It is mathematically equivalent to the CG algorithm described above. However, its input is the sampled received waveform $\mathbf{r}$ instead of $\mathbf{y}^{(n)}$, and it does not require the correlation matrix computation. It solves the least-squares problem $\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} \|\mathbf{S}\mathbf{h} - \mathbf{r}\|$. Thus, it will be referred to as the CGL (conjugate gradient for solving least-squares problems) algorithm. The details of the CGL algorithm can be found in [291, p. 610]. The algorithm requires matrix-vector products of the form $\mathbf{S}\mathbf{p}(m)$ and $\mathbf{S}^\top\boldsymbol{\xi}(m)$ on each iteration, where $\boldsymbol{\xi}(m) \in \mathbb{C}^{(N+1)N_s}$ is a vector needed in the CGL iteration [291, p. 610]. Thus, it requires in total $O[8MNN_sK]$ flops. The computational complexity of the CGL algorithm is higher than that of the CG algorithm. However, since the correlation computation is not required separately, the overall complexity is lower with time-varying signature waveforms or with rapidly changing delays. The required number of clock cycles in the CGL algorithm is found to be $10M$ by [291, p. 610]. The CG algorithm can be straightforwardly applied to LMMSE detection by replacing the matrix $\mathbf{R}^{(n)}$ by $\mathbf{R}^{(n)} + \sigma^2\mathbf{E}^{-1}$. However, the application of the CGL algorithm to LMMSE detection is not possible, since LMMSE detection does not have a least-squares problem interpretation.
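A conjugate gradient iteration for the least-squares problem can be sketched in Python/NumPy as below. The sketch uses the standard CGLS recurrences and is not taken from [291]; the function name `cgl` and the use of a dense real sample matrix `S` are assumptions of this example. The correlation matrix $\mathbf{S}^\top\mathbf{S}$ is never formed.

```python
import numpy as np

def cgl(S, r, h0, num_iter):
    """Conjugate gradient for min_h ||S h - r|| without forming S^T S.

    S is the real sample matrix and r the sampled received waveform.
    """
    h = h0.astype(complex)
    xi = r - S @ h                  # residual in the sample domain
    s = S.T @ xi                    # equals y - R h, computed without R
    p = s.copy()
    gamma = np.vdot(s, s).real
    for _ in range(num_iter):
        if gamma == 0.0:
            break
        w = S @ p                               # product of the form S p(m)
        alpha = gamma / np.vdot(w, w).real
        h = h + alpha * p
        xi = xi - alpha * w
        s = S.T @ xi                            # product of the form S^T xi(m)
        gamma_new = np.vdot(s, s).real
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return h
```

Each iteration uses one product with `S` and one with `S.T`, which matches the two matrix-vector products per iteration described in the text.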
5.2.1.2. Preconditioned conjugate gradient algorithm
The convergence speed of the conjugate gradient algorithm is determined by the condition number $\kappa$ of $\mathbf{R}^{(n)}$, which is defined as the eigenvalue ratio

$$\kappa(\mathbf{R}^{(n)}) = \frac{\lambda_{\max}(\mathbf{R}^{(n)})}{\lambda_{\min}(\mathbf{R}^{(n)})}, \tag{5.49}$$

where $\lambda_{\max}(\mathbf{A})$ and $\lambda_{\min}(\mathbf{A})$ denote the eigenvalues of a matrix $\mathbf{A}$ with the largest and smallest absolute value, respectively. The larger the ratio, the slower the convergence [156]. The convergence speed can be improved by a preconditioning strategy [156]. The idea is to replace (5.2) by an equivalent equation

$$\tilde{\mathbf{R}}\tilde{\mathbf{h}}_d = \tilde{\mathbf{y}}^{(n)}, \tag{5.50}$$

where $\tilde{\mathbf{R}} = \mathbf{G}^{-1}\mathbf{R}^{(n)}\mathbf{G}^{-1}$, $\tilde{\mathbf{h}}_d = \mathbf{G}\bar{\mathbf{y}}^{(n)}_{[d]}$, $\tilde{\mathbf{y}}^{(n)} = \mathbf{G}^{-1}\mathbf{y}^{(n)}$, and $\mathbf{G} \in \mathbb{R}^{NK \times NK}$ is a symmetric, positive definite matrix used to improve the condition number. If the QR factorization of $\mathbf{G}$ is $\mathbf{G} = \mathbf{Q}\mathbf{H}$, and $\mathbf{H} \approx \mathbf{L}$ ($\mathbf{L}$ is the Cholesky factor of $\mathbf{R}^{(n)}$), we have $\tilde{\mathbf{R}} \approx \mathbf{I}_{NK}$ [156]. This means that the condition number $\kappa(\tilde{\mathbf{R}}) \approx 1$. In other words, the application of the preconditioning strategy requires finding a matrix $\mathbf{H}$ that is in some sense close to the Cholesky factor $\mathbf{L}$. The resulting algorithm does not include explicit reference to the matrix $\mathbf{G}$. The only extra complication that is introduced to the CG algorithm is the solution of a system of the form

$$\mathbf{H}^\top\mathbf{H}\mathbf{z} = \mathbf{q}(m-1) \;\Leftrightarrow\; \mathbf{z} = (\mathbf{H}^\top\mathbf{H})^{-1}\mathbf{q}(m-1), \tag{5.51}$$

see [156, pp. 527–529] for details.

The preconditioned conjugate gradient (PCG) algorithm is proposed for systems in which the signature waveforms are time-invariant. A simple preconditioning matrix is

$$\mathbf{H} = \mathrm{diag}(\bar{\mathbf{L}}, \ldots, \bar{\mathbf{L}}) \in \mathbb{R}^{NK \times NK}, \tag{5.52}$$

where $\bar{\mathbf{L}}$ is the Cholesky factor of $\mathbf{R}(0)$. The choice implies

$$(\mathbf{H}^\top\mathbf{H})^{-1} = \mathrm{diag}(\mathbf{R}^{-1}(0), \ldots, \mathbf{R}^{-1}(0)) \in \mathbb{R}^{NK \times NK}. \tag{5.53}$$

The inverse matrix $\mathbf{R}^{-1}(0)$ can be updated to correlation changes with computational complexity $O[4K^2]$ flops by applying the algorithms described in Section 5.1.2. The solution of the system (5.51) requires $O[4NK^2]$ flops, so that the overall computational complexity is $O[12MNK^2]$ flops. The PCG algorithm requires $16M$ clock cycles by [156], i.e., two more per iteration than the standard CG algorithm.
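A preconditioned CG iteration with the block-diagonal choice (5.52)–(5.53) can be sketched in Python/NumPy as below. The recurrences are the textbook PCG ones; the function name `pcg` and the dense representation of the matrices are assumptions of this example. The preconditioning step (5.51) reduces to $N$ independent multiplications by $\mathbf{R}^{-1}(0)$.

```python
import numpy as np

def pcg(R, y, R0_inv, h0, num_iter):
    """Preconditioned CG with (H^T H)^{-1} = diag(R0_inv, ..., R0_inv)."""
    K = R0_inv.shape[0]
    N = R.shape[0] // K

    def precondition(q):
        # Solve (5.51) block by block; R0_inv is symmetric, so a right
        # multiplication of the reshaped vector handles all N blocks.
        return (q.reshape(N, K) @ R0_inv).reshape(-1)

    h = h0.astype(complex)
    q = y - R @ h
    z = precondition(q)
    p = z.copy()
    qz = np.vdot(q, z).real
    for _ in range(num_iter):
        if qz == 0.0:
            break
        Rp = R @ p
        alpha = qz / np.vdot(p, Rp).real
        h = h + alpha * p
        q = q - alpha * Rp
        z = precondition(q)                     # the only extra step vs. CG
        qz_new = np.vdot(q, z).real
        p = z + (qz_new / qz) * p
        qz = qz_new
    return h
```

Compared with plain CG, the only addition per iteration is the call to `precondition`, which corresponds to the extra system (5.51).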
5.2.2. Iterative sliding window detection
Both the steepest descent and the conjugate gradient algorithms converge to the correct solutions with any initial guess under relatively mild conditions. A usual choice for the initial guess is a zero vector if no a priori information about the correct solution is available [156]. Since there is information about the vector $\mathbf{h}$, it can be expected that the zero vector is not the best possible initial guess. For example, the matched filter output vector $\mathbf{y}^{(n)}$ is clearly closer to the correct $\mathbf{h}$ than a zero vector. For the case of truncated detection, where the data symbols in the middle time interval of the observation window are detected, a sliding window algorithm is now proposed. It uses as an initial guess the values computed during the previous symbol interval. More specifically, assume that a vector $\bar{\hat{\mathbf{h}}}^{(n_0)}(M) \in \mathbb{C}^{NK}$ was computed and $\hat{\mathbf{h}}^{(n_0)}(M) \in \mathbb{C}^{K}$ was obtained from the middle components of $\bar{\hat{\mathbf{h}}}^{(n_0)}(M)$. On the following symbol interval, where $\mathbf{h}^{(n_0+1)}$ is estimated, the last $(N - 1)K$ elements of $\bar{\hat{\mathbf{h}}}^{(n_0)}(M)$ computed on the previous symbol interval are substituted to be the first $(N - 1)K$ elements of the new initial guess $\bar{\hat{\mathbf{h}}}^{(n_0+1)}(0)$. The last $K$ elements of the new initial guess are substituted to be the matched filter outputs, i.e., the last $K$ elements of $\mathbf{y}^{(n_0+1)}$.
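The initialization rule of the sliding window algorithm can be sketched in Python/NumPy as follows. The function names are assumptions of this example, and an odd window length $N = 2P + 1$ is assumed so that a unique middle interval exists.

```python
import numpy as np

def next_initial_guess(h_prev, y_new, K):
    """Initial guess for the next symbol interval.

    h_prev: final length-NK estimate from the previous symbol interval.
    y_new:  length-NK matched filter output vector of the new interval.
    """
    # Shift the window by one symbol: keep the last (N-1)K old estimates
    # and append the newest K matched filter outputs.
    return np.concatenate([h_prev[K:], y_new[-K:]])

def middle_symbols(h, K):
    """Extract the K middle components of a length-NK window vector."""
    N = len(h) // K
    P = N // 2                      # N = 2P + 1 is assumed
    return h[P * K:(P + 1) * K]
```

With $K = 2$ and $N = 3$, for instance, the four most recent estimates are carried over and two matched filter outputs are appended.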
5.2.3. Numerical performance evaluation
It is not possible to find useful analytical expressions for the average bit error probability or mean squared error of the iterative detectors. For that reason Monte Carlo computer simulations are carried out. A 31-chip Gold sequence family is used in the simulation for the system with time-invariant signature waveforms. Random signature sequences of length 6200 chips are used in the simulation for the system with time-varying signature waveforms. A rectangular chip waveform is applied. The number of users is $K = 33$. BPSK data and spreading modulation with coherent detection are used. The carrier phases of users are set to zero. The delays of the users are fixed, randomly selected, and the same in all simulations. The interfering users have equal energies (marked by $E_k$ in the illustrations). Most of the simulations use the sliding window algorithm described in Section 5.2.2 with window length $N = 2P + 1 = 7$. For purposes of comparison the SD, CG, and PCG algorithms approximating both the decorrelating and LMMSE detectors are simulated.
First, a system with data block length Nb = 1 (the processing window naturally has length N = Nb = 1) is simulated so that there is no need for the sliding window algorithm, since the whole data block can be processed in the detector. In Figs. 5.1(a) and 5.1(b) the simulated mean squared errors of the estimates $\mathbf{h}^{(n)}$ are plotted versus the number of iterations, M. The effect of the initial guess is also illustrated by using both a zero vector and the MF output vector as the initial guess. In Fig. 5.1(a) the received energies are the same, whereas in Fig. 5.1(b) the near-far problem is emulated, since the desired user's (k = 1) energy per symbol is 10 dB weaker than the energies of the interfering users. The signal-to-noise ratio in the simulations is $E_1/\sigma^2 = 8$ dB.
The results in Fig. 5.1(a) indicate that a moderate number of iterations yields a performance close to that of the ideal detectors. The convergence of the CG algorithm to the final solution is faster than that of the SD algorithm, as expected. From Fig. 5.1(b) it is seen that in the case with a severe near-far problem more iterations are required to get the ideal detector performance, and the CG algorithm is superior to the steepest descent algorithm. It is noted from Figs. 5.1(a) and 5.1(b) that the matched filter output is a better initial guess than a zero vector, as is expected. It can also be seen that the initial guess does not have an adverse effect on the convergence; however, with a poor initial guess more iterations are required to obtain a certain performance level.
The performance of the sliding window algorithms versus the number of iterations per symbol interval is depicted in Figs. 5.2–5.3 and 5.4–5.5 for time-invariant and time-varying signature waveforms, respectively. The mean squared errors are presented in Figs. 5.2 and 5.4, and the bit error rates for the same example in Figs. 5.3 and 5.5. The signal-to-noise ratio in the simulations is again $E_1/\sigma^2 = 8$ dB.
The results are similar to those described above. The number of required iterations in the sliding window algorithms is relatively small. For the CG algorithm four iterations per symbol are required to reach the minimum mean squared error performance in the examples with a near-far problem and time-invariant signature waveforms. A few more iterations are required in the examples with time-varying signature waveforms. One more iteration is required to obtain the optimal bit error rate than the optimal mean squared error, especially for the LMMSE detector. It is also noted that the CG algorithm is clearly superior to the SD algorithm, as expected. Preconditioning speeds up the convergence to the exact solution even further.
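The difference between the two iterations can be made concrete on a small system. The following is a minimal sketch, assuming a real-valued, symmetric positive definite correlation matrix R and a noiseless matched filter output y = Rh, so that the decorrelating solution solves Rh = y. The 3-by-3 matrix, the helper names, and the absence of preconditioning are illustrative assumptions and do not reproduce the simulation setup of this section.

```python
def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def steepest_descent(a, y, x0, iterations):
    """Steepest descent with exact line search for a x = y."""
    x = list(x0)
    for _ in range(iterations):
        r = [yi - axi for yi, axi in zip(y, matvec(a, x))]
        rr = dot(r, r)
        if rr == 0.0:
            break
        step = rr / dot(r, matvec(a, r))
        x = [xi + step * ri for xi, ri in zip(x, r)]
    return x


def conjugate_gradient(a, y, x0, iterations):
    """Plain (unpreconditioned) conjugate gradient for a x = y."""
    x = list(x0)
    r = [yi - axi for yi, axi in zip(y, matvec(a, x))]
    p = list(r)
    rr = dot(r, r)
    for _ in range(iterations):
        if rr == 0.0:
            break
        ap = matvec(a, p)
        step = rr / dot(p, ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        # New search direction, conjugate to the previous ones.
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def energy_error(a, x, x_true):
    """Error measured in the norm induced by the matrix a."""
    e = [xi - ti for xi, ti in zip(x, x_true)]
    return dot(e, matvec(a, e))


# Hypothetical 3-user example: unit-energy signatures with mild correlation.
R = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.1], [0.2, 0.1, 1.0]]
h_true = [1.0, -1.0, 1.0]
y = matvec(R, h_true)          # noiseless matched filter output
h_cg = conjugate_gradient(R, y, x0=y, iterations=3)   # MF output as initial guess
h_sd = steepest_descent(R, y, x0=y, iterations=3)
```

In exact arithmetic CG terminates in at most as many iterations as there are unknowns, and in the norm induced by R its error after m steps is never larger than that of steepest descent started from the same point.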
The mean squared errors versus the signal-to-noise ratio per symbol for the steepest descent, the conjugate gradient, and the preconditioned conjugate gradient algorithms with time-invariant signature waveforms are illustrated in Figs. 5.6, 5.7, and 5.8, respectively. It can be seen that at signal-to-noise ratios of practical interest (≤ 16 dB) the CG decorrelator requires only two iterations to reach the MSE of the ideal decorrelator in the examples of equal received energies. It is noted that at very high signal-to-noise ratio the CG algorithm requires more than five iterations to achieve the minimum MSE in the examples with a near-far problem. The MSE of the PCG algorithm differs only marginally from the MSE of the ideal decorrelator even if only four iterations are performed when there is a near-far problem.
An interesting result seen in the figures is that there are some iteration steps where the iterative decorrelators "perform" better than the ideal decorrelator at low signal-to-noise ratios. This is possible, since the decorrelating detector is not optimal in the minimum mean squared error sense. The solutions provided by iterative decorrelators do finally converge to the exact decorrelating solution, as expected. During the iterative process the length of the error vector $\boldsymbol{\varepsilon}_d(m)$ is reduced at each step of the CG algorithm, where
$$\boldsymbol{\varepsilon}_d(m) = \mathbf{y}^{(n)}_{[d]} - \mathbf{h}(m) \qquad (5.54)$$
is the error vector between the true decorrelating solution $\mathbf{y}^{(n)}_{[d]}$ and the estimate $\mathbf{h}(m)$ computed by the iterative algorithm. In other words, it is guaranteed that the estimates computed by the CG decorrelator converge to the correct solution $\mathbf{y}^{(n)}_{[d]}$ and on each iteration the solution $\mathbf{h}(m)$ gets closer to $\mathbf{y}^{(n)}_{[d]}$. In the simulation, on the other hand, the elementwise mean squared values of the error vector
$$\mathbf{e}(m) = \mathbf{h} - \mathbf{h}(m) \qquad (5.55)$$
are measured. Vector $\mathbf{e}(m)$ is the deviation between the true value of the received data-amplitude product vector $\mathbf{h}$ and the estimate $\mathbf{h}(m)$ computed by the iterative algorithm. Reduction of the mean squared value of $\mathbf{e}(m)$ at each step is not guaranteed. The iterative schemes also always perform worse than the LMMSE detector, as expected. The phenomenon is discussed further in Appendix 2.
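The relation between the two error vectors can be written out explicitly. The following decomposition is a sketch that uses only the definitions (5.54) and (5.55); the Hermitian inner product notation is an assumption of this note and is not taken from Appendix 2.

```latex
\begin{align*}
\mathbf{e}(m) &= \mathbf{h} - \mathbf{h}(m)
  = \bigl(\mathbf{h} - \mathbf{y}^{(n)}_{[d]}\bigr) + \boldsymbol{\varepsilon}_d(m), \\
\|\mathbf{e}(m)\|^2 &= \bigl\|\mathbf{h} - \mathbf{y}^{(n)}_{[d]}\bigr\|^2
  + \|\boldsymbol{\varepsilon}_d(m)\|^2
  + 2\,\mathrm{Re}\Bigl\{\bigl(\mathbf{h} - \mathbf{y}^{(n)}_{[d]}\bigr)^{\mathrm{H}}
    \boldsymbol{\varepsilon}_d(m)\Bigr\}.
\end{align*}
```

The first term is the squared error of the ideal decorrelator and the second term shrinks at every CG step. The cross term has no fixed sign, so an intermediate iterate can show a smaller squared error than the ideal decorrelator whenever the cross term is negative and larger in magnitude than the second term.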
In Chapter 3 it was noted that the length of the decorrelating detector and the ratio $E_{\max}/E_{\min}$ of maximum and minimum energies are design parameters, and a trade-off between them needs to be made. The same conclusion can be made between the ratio $E_{\max}/E_{\min}$ and the number of iterations in the iterative implementation of the linear multiuser detectors.
All the numerical examples indicate a convergence to the exact solution after a few iterations although there are NK = 231 unknown variables. The fast convergence of the algorithms is to a large extent due to the initial guess as described in Section 5.2.2. As predicted, the performance of the iterative detectors is highly dependent on the number of iterations performed. On the other hand, the implementation complexity is also directly proportional to the number of iterations. Thus, the iterative detectors provide a trade-off between the implementation complexity and the performance of the detectors.
5.3. Complexity comparisons
The implementation of the parallel interference cancellation receiver is relatively straightforward on the algorithm level considered in this work. The interference cancellation in (2.62) requires $O[4MK^2]$ flops and 4M clock cycles, where M is the number of cancellation stages. If the interference cancellation is performed for the received spread-spectrum signal (i.e., for the MF input), the corresponding cancellation requires $O[4MN_sK]$ flops and 4 clock cycles. The only update that is needed (in addition to amplitude estimation) is the correlation computation when a change in the communication system occurs.
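The quadratic dependence on the number of users can be read off a generic hard-decision PIC stage operating on the matched filter outputs. The sketch below is not equation (2.62), which is not reproduced here; it assumes a synchronous system with real-valued BPSK symbols, known amplitudes, and a known correlation matrix with unit diagonal, and the function name `hd_pic` is hypothetical.

```python
def sign(value):
    """Hard BPSK decision."""
    return 1 if value >= 0 else -1


def hd_pic(mf_output, correlations, amplitudes, stages):
    """Hard-decision parallel interference cancellation at the MF output.

    Stage 0 is the conventional matched filter decision. Every further
    stage subtracts, for all users in parallel, the multiple-access
    interference reconstructed from the previous stage's decisions.
    """
    num_users = len(mf_output)
    decisions = [sign(y) for y in mf_output]
    for _ in range(stages):
        updated = []
        for k in range(num_users):
            # K - 1 multiply-accumulates per user, hence about K^2 per stage.
            mai = sum(correlations[k][j] * amplitudes[j] * decisions[j]
                      for j in range(num_users) if j != k)
            updated.append(sign(mf_output[k] - mai))
        decisions = updated
    return decisions


# Hypothetical noiseless near-far example: user 0 is much weaker than the others.
R = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.1], [0.2, 0.1, 1.0]]
amplitudes = [0.3, 1.0, 1.0]
bits = [1, -1, -1]
y = [sum(R[k][j] * amplitudes[j] * bits[j] for j in range(3)) for k in range(3)]

mf_only = hd_pic(y, R, amplitudes, stages=0)
one_stage = hd_pic(y, R, amplitudes, stages=1)
```

In this toy case the matched filter decision for the weak user is wrong and a single cancellation stage corrects it, at a cost of roughly K² real multiply-accumulates per stage, which matches the order (though not necessarily the constant) of the flop count quoted above.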
5.3.1. Summary of implementation complexities
The implementation complexities of some of the algorithms of this chapter are summarized in Table 5.1. The detection given L refers to solving (5.2) by backward and forward substitutions without computing the detector $D_N$. Both the total number of flops and the number of flops per detected symbol are shown for the detection methods. Only the total number of flops is presented for ideal detector computation methods, since the detector is updated occasionally and the number of flops per symbol would not be a sensible measure of complexity. It should be noted that the computational burden of the computation of the correlations between users' signature waveforms is excluded from the comparisons. To obtain a better understanding of the complexity of different schemes an example is considered below.
Table 5.1. Summary of total implementation requirements of different algorithms.
Algorithm                        flops              flops/KL      clock cycles
Cholesky factorization of R      O[5N(KL)^3]        -             7NKL
QR factorization of S            O[N Ns (KL)^2]     -             4NKL
Detector computation given L     O[6N(KL)^3]        -             5NKL
Matched filtering                O[4 Ns KL]         O[4 Ns]       2
Lin. detection given D           O[4N(KL)^2]        O[4NKL]       2
Lin. detection given L           O[16N(KL)^2]       O[16NKL]      5NKL
Lin. SD detection                O[8MN(KL)^2]       O[8MNKL]      11M
Lin. CG detection                O[8MN(KL)^2]       O[8MNKL]      14M
Lin. CGL detection               O[8MN Ns KL]       O[8MN Ns]     10M
Lin. PCG detection               O[12MN(KL)^2]      O[12MNKL]     16M
HD-PIC MFOUT detection           O[4M(KL)^2]        O[4MKL]       4M
HD-PIC MFIN detection            O[4M Ns KL]        O[4M Ns]      4M
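The leading terms of Table 5.1 are easy to evaluate for a given parameter set. The helper below is a convenience sketch: it simply transcribes the "flops" column for the detection methods, and the function name, argument names, and dictionary keys are hypothetical.

```python
def table_5_1_flops(window_len, users, paths, samples, iterations):
    """Leading-order flop counts from the 'flops' column of Table 5.1.

    window_len: N, users: K, paths: L, samples: Ns (samples per symbol),
    iterations: M (iterations or cancellation stages).
    """
    n, ns, m = window_len, samples, iterations
    kl = users * paths
    return {
        "matched filtering": 4 * ns * kl,
        "linear detection given D": 4 * n * kl ** 2,
        "linear detection given L": 16 * n * kl ** 2,
        "SD": 8 * m * n * kl ** 2,
        "CG": 8 * m * n * kl ** 2,
        "CGL": 8 * m * n * ns * kl,
        "PCG": 12 * m * n * kl ** 2,
        "HD-PIC MFOUT": 4 * m * kl ** 2,
        "HD-PIC MFIN": 4 * m * ns * kl,
    }


# Parameter values of the kind used in the example of Section 5.3.2.
counts = table_5_1_flops(window_len=13, users=256, paths=4,
                         samples=1280, iterations=124)
cgl_over_cg = counts["CGL"] / counts["CG"]   # equals Ns / (KL)
```

For an equal number of iterations the CGL count differs from the CG count by the factor Ns/(KL), so on flop count alone CGL is cheaper only when Ns is smaller than KL.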
5.3.2. An example
The complexity of the update based on the ideal Cholesky factoring is determined by the rate of delay changes and the frequency of changes in the number of users or their signature waveforms. A mobile radio system example with the system parameters from the FRAMES project [34] is presented to provide reasonable numerical values for the parameters. Assume that the number of users is K = 256, the number of multipath components is L = 4, the number of chips per symbol (the processing gain) is Nc = 256, and the number of samples per chip is 5 to guarantee that KL < Ns. This yields a total number of samples per symbol Ns = 1280. It is assumed that the users have an average vehicle speed of 80 km/h. Assume also that the angle θ between the direction of the mobile unit movement and a line from the vehicle to the base station is uniformly distributed on [0, π). The effective speed causing distance change from the mobile to the base station is then 80 km/h · cos(θ), which results in an average value v ≈ 51 km/h.
On a time interval ∆t the delay of a user changes by $\Delta\tau = v\,\Delta t / c_{\mathrm{light}}$. A delay change is said to be significant if it exceeds a predetermined level $T_\delta$. It is assumed here that $T_\delta = T/(2N_s) = T_c/10$, which is a strict synchronization requirement, but is supported by the results in [151]. Thus, the number of symbols transmitted between significant delay changes for any of the K users is on the average $c_{\mathrm{light}} T_\delta/(vKT) = c_{\mathrm{light}}/(10\,vKN_c) \approx 32$. Assume that the symbol rate is 20.3 kbaud, i.e., the symbol interval is T = 49.261 µs. To estimate the frequency of handovers, it is assumed that mobile users are driving through a cell of diameter 1 km after which a handover occurs. Then the average number of symbols transmitted between two handovers for any of the K users is $1000\ \mathrm{m}/(KvT) = 5597$. Therefore, the delay changes are more common than handovers so that the effect of the handovers on the implementation complexity can be neglected.
Assume that it is required that the detector is updated within 10 symbol intervals (10T) from a delay change. (Thus, the detector is updated during one frame [34].) Referring to Table 5.1, the detector computation requires at least 12NKL clock cycles. Thus, the DSP hardware must have at least 12NKL clock cycles in time 10T, and the minimum clock frequency becomes $12NKL/(10T) \approx 320$ MHz, if N = 13. Since the detector update requires $O[11N(KL)^3]$ flops as seen from Table 5.1, the DSP hardware must be able to perform about $11N(KL)^3$ flops in time 10T. In other words, the DSP must have speed $11N(KL)^3/(10T) \approx 310$ Tflops/s.
If the CG detection applies M = 124 iterations⁸ per symbol interval, the clock frequency 14M/T = 35 MHz is required by Table 5.1. The required number of arithmetic operations is $8MN(KL)^2/T \approx 280$ Tflops/s. The corresponding clock frequency requirement for the CGL algorithm is 10M/T = 25 MHz, and the number of arithmetic operations $8MNN_sKL/T \approx 340$ Tflops/s, as is easy to see from Table 5.1. If the PCG method applies M = 62 iterations on a symbol interval, it requires the clock frequency 16M/T = 20 MHz, and the required number of arithmetic operations is $12MN(KL)^2/T \approx 210$ Tflops/s. Assuming two (M = 2) multistage iterations (as in the examples of Chapter 4) the HD-PIC receiver requires a speed of 4M/T = 160 kHz, and a computation power of $4M(KL)^2 = 8.4$ Mflops/s. If the HD-PIC receiver processes the received wideband signal, the DSP speed of $4MN_sKL = 10$ Mflops/s is required. Finally, to put the above numbers into perspective it is noted that the matched filtering requires the minimum clock frequency of 2/T = 41 kHz, and the number of arithmetic operations $4N_sKL \approx 5.2$ Mflops/s. The results are summarized in Table 5.2.
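The mobility and update-rate arithmetic of the example can be rechecked with a short script. This is a sketch under stated assumptions: the speed of light is taken as 299 792 458 m/s (the text gives no numerical value), the average radial speed is computed from the magnitude of cos θ (which is what reproduces the quoted 51 km/h), and all function names are hypothetical.

```python
import math

# Parameters stated in the example.
K = 256                       # users
L = 4                         # multipath components per user
NC = 256                      # chips per symbol
NS = 5 * NC                   # samples per symbol (5 samples per chip)
N = 13                        # window length used for the update estimate
T = 1.0 / 20.3e3              # symbol interval for 20.3 kbaud, in seconds

# Assumption: numerical value of c_light, not given in the text.
C_LIGHT = 299_792_458.0       # m/s


def mean_radial_speed_kmh(vehicle_speed_kmh):
    """Mean of speed * |cos(theta)| for theta uniform on [0, pi)."""
    return vehicle_speed_kmh * 2.0 / math.pi


def symbols_between_delay_changes(speed_kmh):
    """c_light * T_delta / (v K T) with T_delta = T / (2 Ns)."""
    v = speed_kmh / 3.6
    t_delta = T / (2 * NS)
    return C_LIGHT * t_delta / (v * K * T)


def symbols_between_handovers(speed_kmh, cell_diameter_m=1000.0):
    """Cell diameter divided by K v T."""
    v = speed_kmh / 3.6
    return cell_diameter_m / (K * v * T)


def detector_update_clock_hz(update_intervals=10):
    """12 N K L clock cycles spread over the allowed update time."""
    return 12 * N * K * L / (update_intervals * T)


def detector_update_flops_per_s(update_intervals=10):
    """11 N (KL)^3 flops spread over the allowed update time."""
    return 11 * N * (K * L) ** 3 / (update_intervals * T)


if __name__ == "__main__":
    print(f"mean radial speed: {mean_radial_speed_kmh(80.0):.1f} km/h")
    print(f"symbols between delay changes: {symbols_between_delay_changes(51.0):.1f}")
    print(f"symbols between handovers: {symbols_between_handovers(51.0):.0f}")
    print(f"update clock: {detector_update_clock_hz() / 1e6:.0f} MHz")
    print(f"update load: {detector_update_flops_per_s() / 1e12:.0f} Tflops/s")
```

Using the rounded speed of 51 km/h, the script gives values close to the 32 symbols, 5597 symbols, 320 MHz, and 310 Tflops/s quoted in the example.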
The example illustrates that due to the higher degree of parallelism, the DSP clock frequency requirements of the iterative algorithms are significantly less stringent than those of the ideal detection. The number of flops required is also smaller.
⁸ The number of required iterations was estimated by assuming that the number of necessary iterations divided by the number of unknown variables is the same as in the numerical examples. Since M = 4 iterations yield an excellent performance for K = 33 unknown variables in Section 5.2.3, it is assumed that $M = (4/33) \times KL \approx 124$ iterations are needed in the current example.
Table 5.2. Results of the example.
Algorithm                  flops            flops/KL         clock freq.
Matched filtering          5.2 Mflops/s     5.1 kflops/s     41 kHz
Detector computation       310 Tflops/s     300 Gflops/s     320 MHz
Lin. CG detection          280 Tflops/s     270 Gflops/s     35 MHz
Lin. CGL detection         340 Tflops/s     340 Gflops/s     25 MHz
Lin. PCG detection         210 Tflops/s     200 Gflops/s     20 MHz
HD-PIC MFOUT detection     8.4 Mflops/s     8.2 kflops/s     160 kHz
HD-PIC MFIN detection      10 Mflops/s      10 kflops/s      160 kHz
The iterative algorithms use the computation resources steadily, whereas the ideal detector computation requires high computation peaks for detector computation. The PCG algorithm is the simplest in terms of both required DSP clock frequency and number of flops required. The CGL algorithm would clearly be the simplest algorithm for linear multiuser detection in an R-CDMA system with time-varying signature waveforms. The example also quantifies the well-known fact that the parallel interference cancellation receivers are significantly simpler to implement than the linear equalizer type receivers. The number of arithmetic operations that needs to be performed in the decorrelating receiver in a unit of time is roughly $2 \times 10^7$ times the corresponding number for the HD-PIC receiver. The corresponding ratio for the clock frequency is approximately 100. The number of arithmetic operations required by the parallel interference cancellation is roughly twice the number required by the matched filter bank. The clock frequency is approximately four times that of the MF bank. In that sense the parallel interference cancellation can be considered to be a relatively simple technique to improve the performance of the CDMA systems.
The linear detector implementation requires a very large number of flops per second. The requirements for the processor clock frequency, on the other hand, are moderate. Furthermore, the number of flops per second is not a perfect measure of the implementation complexity, if an application specific integrated circuit (ASIC) is used. In other words, if all the potential for parallelism (measured by the minimum clock frequency) can be applied, the implementation of linear detectors will become feasible in the future.
5.4. Conclusions
Implementation of multiuser receivers in synchronous and asynchronous CDMA systems has been discussed. It was assumed that the delays, the signature waveforms, or the number of users may change over time. Algorithms for the ideal linear decorrelating or LMMSE detector computation were derived. The detector computation based on an order-recursive Cholesky factorization of the correlation matrix was shown to require $O[11N(KL)^3]$ flops and NKL square roots, and at least 12NKL clock cycles. The computational load is huge, since it has cubic dependence on the number of users times the number of multipath components. Iterative detectors were investigated to reduce the complexity of the linear detectors. Steepest descent and conjugate gradient algorithms were proposed for the decorrelating and the LMMSE detector implementation. The computational requirements of the algorithms are $O[8MN(KL)^2]$ flops (Table 5.1) and at least 14M clock cycles. The preconditioned conjugate gradient algorithm was studied to obtain faster convergence. It requires $O[12MN(KL)^2]$ flops, and at least 16M clock cycles. A conjugate gradient algorithm not requiring separate correlation computation was proposed to be applied in CDMA systems with time-varying signature waveforms. A sliding window algorithm utilizing the values computed on the previous symbol interval was developed to reduce the required number of iterations. Simulation results demonstrate that a moderate number of iterations with the CG or the PCG algorithm gives essentially the same performance as the ideal detectors. The results show that the preconditioned conjugate gradient algorithm yields the fastest convergence.
In the mobile communication example the preconditioned conjugate gradient algorithm was found to be the simplest linear detection scheme for a D-CDMA system with time-invariant signature waveforms. The CGL version of the conjugate gradient algorithm was found to be the simplest linear detection scheme for an R-CDMA system with time-varying signature waveforms. It was also noted that the parallel interference cancellation receivers are significantly simpler to implement than the linear receivers, as can be expected. The example demonstrated that the required clock frequency for linear receivers is roughly 100 times that for the HD-PIC receiver. The corresponding factor for the number of arithmetic operations is $2 \times 10^7$. The parallel interference cancellation requires twice as many arithmetic operations as the matched filter bank. The clock frequency requirement is four times that of the MF bank. Thus, the PIC multiuser receivers are significantly more desirable from the implementation point of view than the linear equalizer type multiuser receivers. Furthermore, the PIC receivers are only moderately more complex to implement than the conventional MF receivers.
[Figure 5.1, panels (a) and (b), each titled "Performance of iterative detection of a complete data block". Horizontal axis: number of iterations (1 to 19). Vertical axis: mean squared error (0.09 to 0.18). Curves: CG detector (dash-dotted) and SD detector (dotted), each with a zero initial guess (×) and an MF output initial guess (+), with the ideal decorrelator and ideal LMMSE levels shown for reference; K = 33.]
Fig. 5.1. Mean squared errors of the iterative decorrelating detectors with time-invariant signature waveforms, Nb = N = 1, and SNR = 8 dB; (a) equal received energies, (b) near-far problem.
[Figure 5.2, panels (a) and (b), each titled "Performance of sliding window iterative detection". Horizontal axis: number of iterations (1 to 8). Vertical axis: mean squared error (0.1 to 0.3). Curves: ideal detector, PCG detector (dashed), CG detector (dash-dotted), and SD detector (dotted), for the decorrelator (+) and the LMMSE detector; K = 33.]
Fig. 5.2. Mean squared errors of the iterative decorrelating and LMMSE sliding window detectors with time-invariant signature waveforms, Nb = 200, N = 7, and SNR = 8 dB; (a) equal received energies, (b) near-far problem.
[Figure 5.3, panels (a) and (b), each titled "Performance of sliding window iterative detection". Horizontal axis: number of iterations (1 to 8). Vertical axis: bit error rate (0.005 to 0.05). Curves: ideal detector, PCG detector (dashed), CG detector (dash-dotted), and SD detector (dotted), for the decorrelator (+) and the LMMSE detector; K = 33.]
Fig. 5.3. Bit error rates of the iterative decorrelating and LMMSE sliding window detectors with time-invariant signature waveforms, Nb = 200, N = 7, and SNR = 8 dB; (a) equal received energies, (b) near-far problem.
[Figure 5.4, panels (a) and (b), each titled "Performance of sliding window iterative detection for R-CDMA". Horizontal axis: number of iterations (1 to 8). Vertical axis: mean squared error (0.1 to 0.3). Curves: ideal detector, CG detector (dash-dotted), and SD detector (dotted), for the decorrelator (+) and the LMMSE detector; K = 33.]
Fig. 5.4. Mean squared errors of the iterative decorrelating and LMMSE sliding window detectors with time-varying signature waveforms, Nb = 200, N = 7, and SNR = 8 dB; (a) equal received energies, (b) near-far problem.
[Figure 5.5, panels (a) and (b), each titled "Performance of sliding window iterative detection for R-CDMA". Horizontal axis: number of iterations (1 to 8). Vertical axis: bit error rate (0.005 to 0.05). Curves: ideal detector, CG detector (dash-dotted), and SD detector (dotted), for the decorrelator (+) and the LMMSE detector; K = 33.]
Fig. 5.5. Bit error rates of the iterative decorrelating and LMMSE sliding window detectors with time-varying signature waveforms, Nb = 200, N = 7, and SNR = 8 dB; (a) equal received energies, (b) near-far problem.
[Figure 5.6, panels (a) and (b), each titled "Performance of sliding window iterative detection". Horizontal axis: signal-to-noise ratio per symbol in dB (0 to 20). Vertical axis: mean squared error (0.01 to 1). Curves: conventional detector (panel (a) only), ideal decorrelator, and SD detector (dotted) with 2 (+), 3 (∗), 4, and 5 (×) iterations; K = 33.]
Fig. 5.6. Mean squared errors of the decorrelating steepest descent sliding window detector for different numbers of iterations with time-invariant signature waveforms, Nb = 200, and N = 7; (a) equal received energies, (b) near-far problem.
[Figure 5.7, panels (a) and (b), each titled "Performance of sliding window iterative detection". Horizontal axis: signal-to-noise ratio per symbol in dB (0 to 20). Vertical axis: mean squared error (0.01 to 1). Curves: conventional detector (panel (a) only), ideal decorrelator, and CG decorrelator (dash-dotted) with 2 (+), 3 (∗), 4, and 5 (×) iterations; K = 33.]
Fig. 5.7. Mean squared errors of the decorrelating conjugate gradient sliding window detector for different numbers of iterations with time-invariant signature waveforms, Nb = 200, and N = 7; (a) equal received energies, (b) near-far problem.
[Figure 5.8, panels (a) and (b), each titled "Performance of sliding window iterative detection". Horizontal axis: signal-to-noise ratio per symbol in dB (0 to 20). Vertical axis: mean squared error (0.01 to 1). Curves: conventional detector (panel (a) only), ideal decorrelator, and PCG detector (dashed) with 2 (+), 3 (∗), 4, and 5 (×) iterations; K = 33.]
Fig. 5.8. Mean squared errors of the decorrelating preconditioned conjugate gradient sliding window detector for different numbers of iterations with time-invariant signature waveforms, Nb = 200, and N = 7; (a) equal received energies, (b) near-far problem.
6. Conclusions
6.1. Summary
Multiuser demodulation algorithms for centralized receivers of asynchronous DS-CDMA systems in frequency-selective fading channels were considered. The literature on single-user fading channel receivers and on multiuser demodulation was reviewed in Chapter 2. The problems to be analyzed in more detail were identified based on the review.
The approximation of ideal infinite memory-length linear multiuser detectors by finite memory-length detectors in asynchronous CDMA systems was considered in Chapter 3. The performance of the finite memory-length detectors was analyzed. It was shown that the FIR detectors can be made near-far resistant under a given ratio between maximum and minimum received power of users by selecting an appropriate memory-length. Numerical examples demonstrated the fact that moderate memory-lengths of the FIR detectors are sufficient to achieve the performance of the ideal IIR detectors even under a severe near-far problem. The required memory-length was shown to depend on other system parameters, especially on the ratio of maximum and minimum received powers.
Multiuser demodulation in relatively fast fading channels was the topic of Chapter 4. The optimal maximum likelihood sequence detector was derived. Due to the prohibitive complexity of the optimal receiver, suboptimal demodulators were considered. They decouple the data detection and complex channel coefficient estimation and estimate the channel coefficients for all users separately after suppressing the MAI. Decorrelating and parallel interference cancellation multiuser receivers were considered. The results show that the decoupled complex channel coefficient estimation yields excellent performance in comparison to joint LMMSE estimation. Furthermore, it was concluded that optimal or near-optimal channel estimation filters are crucial in data transmission where very low BER is required. In speech transmission, fixed channel estimation filters were shown to give satisfactory performance in most cases. The DA complex channel coefficient estimation was shown to be more robust than the DD complex channel coefficient estimation, which may suffer from BER saturation caused by hang-ups at high SNRs. The DD complex channel coefficient estimation was noted to be somewhat simpler to implement than the DA complex channel coefficient estimation. The PIC receiver was demonstrated to achieve better performance in known channels than the decorrelating receiver, but it was observed to be more sensitive to complex channel coefficient estimation errors than the decorrelating receiver. At high channel loads the PIC receiver was seen to suffer from BER saturation, whereas the decorrelating receiver was demonstrated to be free of the BER saturation.
The implementation issues of the multiuser receivers in dynamic CDMA systems were analyzed in Chapter 5. Implementation of linear multiuser detectors in synchronous and asynchronous CDMA systems was studied. Algorithms for ideal linear decorrelating or LMMSE detector computation were derived. The detector computation was shown to have a cubic dependence on the number of users times the number of multipath components. Iterative detectors were investigated to reduce the complexity of the linear detectors. Steepest descent, conjugate gradient, and preconditioned conjugate gradient algorithms were proposed for decorrelating and LMMSE detector implementation. The computational requirements for one iteration were shown to be a quadratic function of the number of users times the number of multipath components. Furthermore, the iterative detectors were shown to be better suited to parallel implementation than the ideal ones. A sliding window algorithm utilizing the values computed on the previous symbol interval was developed to reduce the number of required iterations. Simulation results demonstrated that a moderate number of iterations gives the same performance as the ideal detectors. In the mobile communication example the well-known fact that the parallel interference cancellation receivers are significantly simpler to implement than the linear equalizer type receivers was quantified. What is more, it was demonstrated that the PIC receiver is only moderately more complex to implement than the conventional matched filter receiver.
6.2. Discussion
The results of the thesis show that the decorrelating multiuser receiver often has a performance advantage over the hard decision parallel interference cancellation receiver. This is especially true at high signal-to-noise ratios and/or with a poor complex channel coefficient estimation accuracy. The price for the performance advantages of the decorrelating receiver over the PIC receiver is the considerably higher implementation complexity. The choice between the two receivers depends clearly on the cost of DSP circuits. As the DSP techniques develop, the implementation cost may become insignificant sometime in the future, and the choice of receiver algorithms can be based on the performance only. However, the parallel interference cancellation is clearly the choice as far as today's or the next decade's technology is concerned. It should also be noted that the superior performance of the decorrelating receiver requires very accurate estimation of the delays of the received signal, which poses another strict implementation requirement. Furthermore, the performance of the PIC algorithms can probably be improved with a moderate increase in complexity by the partial cancellation [226, 227, 228] mentioned in Section 2.2.3.2.
Although the HD-PIC receivers are more suitable for practice than the linear equalizer type receivers, the study of linear receivers has been and still is invaluable. Since the linear receivers are easy to analyze, significant information and insight about the multiuser demodulation problem can be obtained by studying them.
The attention in this thesis was limited to the linear equalizer type and hard decision parallel interference cancellation multiuser receivers, since they appeared to be among the most promising multiuser receiver techniques from the practical point of view. Therefore, the choice of the PIC receiver is based on the comparison to the linear equalizer type receivers only. Adaptive decentralized implementations of the linear equalizer type receivers are significantly simpler than the centralized ones. If the convergence problems associated with the adaptive receivers can be overcome in the future, they may appear as one alternative for practical multiuser receivers. However, the PIC receiver appears to be the most promising one for the first generation of multiuser receivers. Notably the same principle has been proposed for several evolving CDMA system standards, as mentioned in Section 1.1.
6.3. Future research directions
There are several interesting open problems in multiuser receivers requiring further study. Some of them are discussed here in short.
The performance of the parallel interference cancellation receivers can possibly be improved in some cases. As mentioned in Chapter 2, one alternative is to weight the cancellation according to the reliability of the MAI estimates [227, 228]. However, a simple and robust way to measure the reliability and to determine the cancellation weights remains to be found. Since the reliability depends on the state of the communication channel, the weights should be adapted to the changes in complex channel coefficients. That poses strict requirements on the speed of such weight determination. Thus, simple adaptive weighting, as proposed in [225], may not be fast enough in fading channels.
The thesis has concentrated on the reception of transmissions without forward error correction coding. This was reasonable, since the emphasis was on the receiver algorithms to achieve coherent detection, i.e., to estimate the fading channel coefficients reliably. From the data detection point of view channel encoding should be taken into consideration. The encoded transmission and reception for CDMA systems utilizing multiuser receivers are important research problems. The overall signal design (design of modulation and coding) for multiuser channels with some efficient low complexity joint decoding algorithms for all users would be of major interest. The goal could be to design a superior signal structure yielding the best possible performance with a given degree of decoding complexity. That would solve the spreading-coding trade-off problem for the particular scenario. The problem is probably intractable, but an iterative process towards that goal should be continued.
The impact of several system level aspects on the multiuser receiver performance would be worth investigating. The same also applies vice versa. The impact of multiuser receivers on the overall system capacity has not been analyzed thoroughly yet. For example, the effect of the existence of multiple cells is often neglected in the multiuser receiver analysis. Multiuser receivers could naturally handle the intracell MAI by exploiting some ordinary multiuser receiver, e.g., a PIC receiver. The intercell MAI, on the other hand, could be compressed by some decentralized receiver technique. Multiuser receiver design and receiver performance in CDMA systems with multiple data rates in realistic fading channels have been studied very little. The application of groupwise multiuser receivers, where grouping could be based on the data rates of the users, appears as an interesting alternative [249]. The performance of multiuser receivers with antenna arrays should also be taken into consideration in the studies.
Centralized multiuser receivers have been considered in this thesis. There are, however, several applications (e.g., the downlink receiver of a mobile communication system) where decentralized receivers need to be applied. There has been a considerable amount of interest in decentralized adaptive receivers, as discussed in Chapter 2. Several open problems still exist. The most severe one is that most adaptive receivers suffer from convergence problems due to the large number of taps required by direct-form FIR filters. Therefore, there is room for further work on dimension reduction techniques to reduce the number of filter taps needed, as well as for work on efficient adaptive algorithms to enhance the convergence.
The effect of optimal and suboptimal channel estimation filters on the multiuser receiver performance was studied in this thesis. However, there are only preliminary results on the application of adaptive channel estimation filters in conjunction with multiuser receivers [57]. More work on the performance of different adaptive algorithms is required. In general, the impact of various practical nonidealities (e.g., delay estimation errors and quantization in DSP hardware) on the performance of the receivers should be considered. The performance of the multiuser receivers with more realistic channel models and system parameters should be studied. The analysis of all real-life nonidealities is impossible, and Monte Carlo computer simulations of nonidealities are intractable due to long simulation times and incomplete models for nonidealities. Thus, it will be necessary to carry out hardware simulations and construct testbeds and trial systems to determine the practical feasibility of multiuser demodulation for future communication systems.
References
[1] Lee WCY (1991) Overview of cellular CDMA. IEEE Transactions on Vehicular Technology 40(2): p 291–302.
[2] Scholtz RA (1977) The spread spectrum concept. IEEE Transactions on Communications 25(8): p 748–755.
[3] Pickholtz RL, Schilling DL & Milstein LB (1982) Theory of spread-spectrum communications – a tutorial. IEEE Transactions on Communications 30(5): p 855–884.
[4] Simon MK, Omura JK, Scholtz RA & Levitt BK (1994) Spread Spectrum Communications Handbook. McGraw-Hill, New York City, New York, USA.
[5] Dixon RC (1994) Spread Spectrum Systems with Commercial Applications. John Wiley and Sons, New York City, New York, USA.
[6] Peterson RL, Ziemer RE & Borth DE (1995) Introduction to Spread Spectrum Systems. Prentice-Hall, Englewood Cliffs, New Jersey, USA.
[7] Viterbi AJ (1995) CDMA: Principles of Spread Spectrum Communication. Addison-Wesley, Reading, Massachusetts, USA.
[8] Scholtz RA (1982) The origins of spread-spectrum communications. IEEE Transactions on Communications 30(5): p 822–854.
[9] Scholtz RA (1983) Notes on spread-spectrum history. IEEE Transactions on Communications 31(1): p 82–84.
[10] Scholtz RA (1983) Further notes and anecdotes on spread-spectrum origins. IEEE Transactions on Communications 31(1): p 85–97.
[11] Prasad R & Hara S (1996) An overview of multicarrier CDMA. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 1: p 107–114.
[12] Learned RE, Krim H, Claus B, Willsky AS & Karl WC (1994) Wavelet-packet-based multiple access communication. Proc. SPIE International Symposium on Optics, Imaging, and Instrumentation.
[13] Medley M, Saulnier G & Das P (1994) Applications of the wavelet transform in spread spectrum communications systems. Proc. SPIE International Symposium on Optics, Imaging, and Instrumentation, Princeton, New Jersey, USA, 2242 Wavelet Applications: p 54–68.
[14] Livingston JN & Tung CC (1996) Bandwidth efficient PAM signaling using wavelets. IEEE Transactions on Communications 44(12): p 1629–1631.
[15] Learned RE, Willsky AS & Boroson DM (1997) Low complexity optimal joint detection for oversaturated multiple access communications. IEEE Transactions on Signal Processing 45(1): p 113–123.
[16] Wornell GW (1995) Spread-signature CDMA: Efficient multiuser communication in the presence of fading. IEEE Transactions on Information Theory 41(5): p 1418–1438.
[17] Wornell GW (1996) Spread-response precoding for communication over fading channels. IEEE Transactions on Information Theory 42(2): p 488–501.
[18] Bingham JAC (1990) Multicarrier modulation for data transmission: An idea whose time has come. IEEE Communications Magazine 28(5): p 5–14.
[19] Viterbi AJ (1994) The orthogonal-random waveform dichotomy for digital mobile personal communication. IEEE/ACM Personal Communications 1(1): p 18–24.
[20] Vembu S & Viterbi AJ (1996) Two different philosophies in CDMA – a comparison. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, p 869–873.
[21] Verdu S (1997) Demodulation in the presence of multiuser interference: Progress and misconceptions. In: Docampo D, Figueiras-Vidal A & Perez-Gonzalez F (eds) Intelligent Methods in Signal Processing and Communications, Birkhauser, Boston, Massachusetts, USA, p 15–44.
[22] Cover TM & Thomas JA (1991) Elements of Information Theory. John Wiley and Sons, New York City, New York, USA.
[23] Proakis JG (1995) Digital Communications. McGraw-Hill, New York City, New York, USA, 3rd edn.
[24] Pickholtz RL, Milstein LB & Schilling DL (1991) Spread spectrum for mobile communications. IEEE Transactions on Vehicular Technology 40(2): p 313–322.
[25] Gilhousen KS, Jacobs IM, Padovani R, Viterbi AJ, Weaver LA & Wheatley III CEW (1991) On the capacity of a cellular CDMA system. IEEE Transactions on Vehicular Technology 40(2): p 303–312.
[26] Schilling DL, Milstein LB, Pickholtz RL, Bruno F, Kanterakis E, Kullback M, Erceg V, Biederman W, Fishman D & Salerno D (1991) Broadband CDMA for personal communications systems. IEEE Communications Magazine 29(11): p 86–93.
[27] Jung P, Baier PW & Steil A (1993) Advantages of CDMA and spread spectrum techniques over FDMA and TDMA in cellular mobile radio applications. IEEE Transactions on Vehicular Technology 42(3): p 357–364.
[28] De Gaudenzi R, Giannetti F & Luise M (1996) Advances in satellite CDMA transmission for mobile and personal communications. Proceedings of the IEEE 84(1): p 18–39.
[29] Telecommunication Industry Association (1993) Mobile Station–Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular Systems.
[30] Ross AHM & Gilhousen KS (1996) CDMA technology and the IS-95 North American standard. In: Gibson JD (ed) The Mobile Communications Handbook, CRC Press, chap 27, p 430–447.
[31] Ojanpera T, Rikkinen K, Hakkinen H, Pehkonen K, Hottinen A & Lilleberg J (1996) Design of a 3rd generation multirate CDMA systems with multiuser detection, MUD-CDMA. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 1: p 334–338.
[32] Hottinen A & Pehkonen K (1996) A flexible multirate CDMA concept with multiuser detection. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 2: p 556–560.
[33] Ojanpera T, Castro J, Emmer D, Gudmundson M, Jung P, Klein A, Kramer G, Pirhonen R, Rademacher L, Skold J & Toskala A (1996) FRAMES – hybrid multiple access technology. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 1: p 320–324.
[34] Ovesjo F, Dahlman E, Ojanpera T, Toskala A & Klein A (1997) FRAMES multiple access mode 2 – wideband CDMA. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Helsinki, Finland, 1: p 42–46.
[35] Ohno K, Sawahashi M & Adachi F (1995) Wideband coherent DS-CDMA. Proc. IEEE Vehicular Technology Conference (VTC), Chicago, Illinois, USA, p 779–783.
[36] Adachi F, Sawahashi M, Dohi T & Ohno K (1996) Coherent DS-CDMA: Promising multiple access for wireless multimedia mobile communications. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 1: p 351–358.
[37] Fukasawa A, Sato T, Takizawa Y, Kato T, Kawabe M & Fisher RE (1996) Wideband CDMA system for personal radio communications. IEEE Communications Magazine 34(10): p 116–123.
[38] Pichna R & Wang Q (1996) Power control. In: Gibson JD (ed) The Mobile Communications Handbook, CRC Press, chap 23, p 370–380.
[39] Karkkainen K (1996) Code Families and Their Performance Measures for CDMA and Military Spread-Spectrum Systems. Acta Universitatis Ouluensis C89, University of Oulu Press, Oulu, Finland.
[40] Batra A & Barry JR (1995) Blind cancellation of cochannel interference. Proc. IEEE Global Telecommunication Conference (GLOBECOM), Singapore, 1: p 157–162.
[41] Haykin S (1991) Adaptive Filter Theory. Prentice-Hall, Englewood Cliffs, New Jersey, USA, 2nd edn.
[42] Widrow B & Stearns SD (1985) Adaptive Signal Processing. Prentice-Hall, Englewood Cliffs, New Jersey, USA.
[43] Schneider KS (1979) Optimum detection of code division multiplexed signals. IEEE Transactions on Aerospace and Electronic Systems 15(1): p 181–185.
[44] Kashihara TK (1980) Adaptive cancellation of mutual interference in spread spectrum multiple access. Proc. IEEE International Conference on Communications (ICC), p 44.4.1–44.4.5.
[45] Kohno R, Imai H & Hatori M (1983) Cancellation technique of cochannel interference in asynchronous spread-spectrum multiple-access systems. IEICE Transactions on Communications 65A: p 416–423.
[46] Verdu S (1984) Optimum multiuser signal detection. Ph.D. thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
[47] Verdu S (1986) Minimum probability of error for asynchronous Gaussian multiple-access channels. IEEE Transactions on Information Theory 32(1): p 85–96.
[48] Verdu S (1986) Optimum multiuser asymptotic efficiency. IEEE Transactions on Communications 34(9): p 890–897.
[49] Verdu S (1993) Multiuser detection. In: Advances in Statistical Signal Processing. JAI Press, Greenwich, Connecticut, USA, 2: p 369–409.
[50] Duel-Hallen A, Holtzman J & Zvonar Z (1995) Multiuser detection for CDMA systems. IEEE/ACM Personal Communications 2: p 46–58.
[51] Moshavi S (1996) Multiuser detection for DS-CDMA communications. IEEE Communications Magazine 34(10): p 124–137.
[52] Jung P & Alexander PD (1996) A unified approach to multiuser detectors for CDMA and their geometrical interpretations. IEEE Journal on Selected Areas in Communications 14(8): p 1595–1601.
[53] Pursley MB (1977) Performance evaluation for phase-coded spread-spectrum multiple-access communication–Part I: System analysis. IEEE Transactions on Communications 25(8): p 795–799.
[54] Lupas R & Verdu S (1990) Near-far resistance of multiuser detectors in asynchronous channels. IEEE Transactions on Communications 38(4): p 496–508.
[55] Varanasi MK & Aazhang B (1990) Multistage detection in asynchronous code-division multiple-access communications. IEEE Transactions on Communications 38(4): p 509–519.
[56] Zvonar Z (1993) Multiuser detection for Rayleigh fading channels. Ph.D. thesis, Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, USA.
[57] Latva-aho M & Lilleberg J (1998) Parallel interference cancellation in multiuser CDMA channel estimation. Wireless Personal Communications, Kluwer Academic Publishers, in press.
[58] Parsons JD (1992) The Mobile Radio Propagation Channel. Pentech Press, London, U.K.
[59] Ranta PA, Hottinen A & Honkasalo ZC (1995) Cochannel interference cancelling receiver for TDMA mobile systems. Proc. IEEE International Conference on Communications (ICC), Seattle, Washington, USA, 1: p 17–21.
[60] Blanz J, Klein A, Nasshan M & Steil A (1994) Performance of a cellular hybrid C/TDMA mobile radio system applying joint detection and coherent receiver antenna diversity. IEEE Journal on Selected Areas in Communications 12(4): p 568–579.
[61] Kramer G, Loher U, Ruprecht J & Jung P (1996) A comparison of demodulation techniques for code time division multiple access. Proc. IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., 1: p 525–529.
[62] Mabuchi T, Kohno R & Imai H (1994) Multiuser detection scheme based on canceling cochannel interference for MFSK/FH-SSMA system. IEEE Journal on Selected Areas in Communications 12(4): p 593–604.
[63] Halford KW & Brandt-Pearce M (1996) Performance of a multistage multiuser detector for a frequency hopping multiple-access system. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 605–610.
[64] Haimovich A & Bar-Ness Y (1996) On the performance of a stochastic gradient-based decorrelation algorithm for multiuser multicarrier CDMA. Wireless Personal Communications, Kluwer Academic Publishers 2(4): p 357–371.
[65] Rasmussen LK & Lim TJ (1996) Detection techniques for direct sequence & multicarrier variable rate broadband CDMA. Proc. IEEE International Conference on Communication Systems and IEEE International Workshop on Intelligent Signal Processing & Communication Systems (ICCS/ISPACS), Singapore, 3: p 1526–1530.
[66] Sanada Y & Nakagawa M (1996) A multiuser interference cancellation technique utilizing convolutional codes and orthogonal multicarrier modulation for wireless indoor communications. IEEE Journal on Selected Areas in Communications 14(8): p 1500–1509.
[67] Vandendorpe L & van de Wiel O (1996) Performance analysis of linear joint equalization and multiple access interference cancellation for multitone CDMA. Wireless Personal Communications, Kluwer Academic Publishers 3(12): p 17–36.
[68] Beheshi S & Wornell GW (1997) Interference cancellation and decoding in spread-signature CDMA systems. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 1: p 26–30.
[69] Lupas R (1989) Near-far resistant linear multiuser detection. Ph.D. thesis, Department of Electrical Engineering, Princeton University, Princeton, New Jersey, USA.
[70] Lupas R & Verdu S (1989) Linear multiuser detectors for synchronous code-division multiple-access channels. IEEE Transactions on Information Theory 34(1): p 123–136.
[71] Zvonar Z & Brady D (1994) Multiuser detection in single-path fading channels. IEEE Transactions on Communications 42(2/3/4): p 1729–1739.
[72] Vasudevan S & Varanasi MK (1996) Achieving near-optimum asymptotic efficiency and fading resistance over the time-varying Rayleigh-faded CDMA channel. IEEE Transactions on Communications 44(9): p 1130–1143.
[73] Varanasi MK & Vasudevan S (1994) Multiuser detectors for synchronous CDMA communication over nonselective Rician fading channels. IEEE Transactions on Communications 42(2/3/4): p 711–722.
[74] Vasudevan S & Varanasi MK (1994) Optimum diversity combiner based multiuser detection for time-dispersive Ricean fading CDMA channels. IEEE Journal on Selected Areas in Communications 12(4): p 580–592.
[75] Turin GL (1980) Introduction to spread-spectrum antimultipath techniques and their application to urban digital radio. Proceedings of the IEEE 68(3): p 328–353.
[76] Stein S (1987) Fading channel issues in system engineering. IEEE Journal on Selected Areas in Communications 5(2): p 68–89.
[77] Schwartz M, Bennett WR & Stein S (1966) Communication Systems and Techniques. McGraw-Hill, New York City, New York, USA.
[78] Mammela A (1995) Diversity Receivers in a Fast Fading Multipath Channel. VTT Publications 253, Technical Research Centre of Finland, Oulu, Finland.
[79] Price R & Green PE (1958) A communication technique for multipath channels. Proceedings of the IRE 46: p 555–570.
[80] Forney GD (1972) Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference. IEEE Transactions on Information Theory 18(3).
[81] Forney GD (1973) The Viterbi algorithm. Proceedings of the IEEE 61(3): p 268–278.
[82] Stojanovic M, Proakis JG & Catipovic JA (1995) Analysis of the impact of channel estimation errors on the performance of a decision-feedback equalizer in fading multipath channels. IEEE Transactions on Communications 43(2/3/4): p 877–886.
[83] Kam PY (1991) Optimal detection of digital data over the nonselective Rayleigh fading channel with diversity reception. IEEE Transactions on Communications 39(2): p 214–219.
[84] Kailath T (1960) Correlation detection of signals perturbed by a random channel. IRE Transactions on Information Theory 6(3): p 361–366.
[85] Van Trees HL (1971) Detection, Estimation, and Modulation Theory, Part III. John Wiley and Sons, New York City, New York, USA.
[86] Haeb R & Meyr H (1989) A systematic approach to carrier recovery and detection of digitally phase modulated signals on fading channels. IEEE Transactions on Communications 37(7): p 748–754.
[87] Tong L (1995) Blind sequence estimation. IEEE Transactions on Communications 43(12): p 2986–2994.
[88] Kay S (1993) Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice-Hall, Englewood Cliffs, New Jersey, USA.
[89] Gardner FM (1990) Demodulator reference recovery techniques suited for digital implementation. Technical Report, ESTEC Contract no. 6847/86/NL/DG, European Space Agency.
[90] Davarian F (1989) Mobile digital communications via tone calibration. IEEE Transactions on Vehicular Technology 36(2): p 55–62.
[91] Cavers JK (1991) Performance of tone calibration with frequency offset and imperfect pilot filter. IEEE Transactions on Vehicular Technology 40(2): p 426–434.
[92] Li HWH & Cavers JK (1991) An adaptive filtering technique for pilot-aided transmission systems. IEEE Transactions on Vehicular Technology 40(3): p 532–545.
[93] Moher ML & Lodge JH (1989) TCMP – a modulation and coding strategy for Ricean fading channels. IEEE Journal on Selected Areas in Communications 7(9): p 1347–1355.
[94] Aghamohammadi A, Meyr H & Ascheid G (1991) A new method for phase synchronization and automatic gain control of linearly modulated signals on frequency-flat fading channels. IEEE Transactions on Communications 39(1): p 25–29.
[95] Cavers JK (1991) An analysis of pilot symbol assisted modulation for Rayleigh fading channels. IEEE Transactions on Vehicular Technology 40(4): p 686–693.
[96] Lo NWK, Falconer DD & Sheikh AUH (1991) Adaptive equalization and diversity combining for mobile radio using interpolated channel estimates. IEEE Transactions on Vehicular Technology 40(3): p 636–645.
[97] Irvine GT & McLane PJ (1992) Symbol-aided plus decision-directed reception for PSK/TCM modulation on shadowed mobile satellite fading channels. IEEE Journal on Selected Areas in Communications 10(8): p 1289–1299.
[98] Fechtel SA & Meyr H (1994) Optimal parametric feedforward estimation of frequency-selective fading radio channels. IEEE Transactions on Communications 42(2/3/4): p 1639–1650.
[99] Mammela A & Kaasila VP (1997) Smoothing and interpolation in diversity reception. International Journal of Wireless Information Networks, in press.
[100] Kam PY & Teh CH (1983) Reception of PSK signals over fading channels via quadrature amplitude estimation. IEEE Transactions on Communications 31(8): p 1024–1027.
[101] Kam PY & Teh CH (1984) An adaptive receiver with memory for slowly fading channels. IEEE Transactions on Communications 32(6): p 654–659.
[102] Kam PY & Teh CH (1987) Adaptive diversity reception over a slow nonselective fading channel. IEEE Transactions on Communications 35(5): p 572–574.
[103] Liu Y & Blostein SD (1995) Identification of frequency nonselective fading channels using decision feedback and adaptive linear prediction. IEEE Transactions on Communications 43(2/3/4): p 1484–1492.
[104] Viterbi AJ & Viterbi AM (1983) Nonlinear estimation of PSK-modulated carrier phase with application to burst digital transmission. IEEE Transactions on Information Theory 29(4): p 543–551.
[105] Xu G, Liu H, Tong L & Kailath T (1995) A least-squares approach to blind channel identification. IEEE Transactions on Signal Processing 43(12): p 2982–2993.
[106] Zeng HH & Tong L (1995) Blind channel estimation: Comparison studies and a new algorithm. Proc. IEEE International Conference on Communications (ICC), Seattle, Washington, USA, 1: p 12–16.
[107] Tsatsanis MK (1995) Time-varying system identification and channel equalization using wavelets and higher-order statistics. In: Leondes CT (ed) Control and Dynamic Systems: Advances in Theory and Applications, Academic Press, San Diego, California, USA, 68: p 333–394.
[108] Zvonar Z (1996) Multiuser detection in asynchronous CDMA frequency-selective fading channels. Wireless Personal Communications, Kluwer Academic Publishers 3(3–4): p 373–392.
[109] Lu L & Sun W (1997) The minimal eigenvalues of a class of block-tridiagonal matrices. IEEE Transactions on Information Theory 43(2): p 787–791.
[110] Schlegel C & Wei L (1997) A simple way to compute the minimum distance in multiuser CDMA systems. IEEE Transactions on Communications 45(5): p 532–535.
[111] Fawer U & Aazhang B (1996) Multiuser receivers for code-division multiple-access systems with trellis-based modulation. IEEE Journal on Selected Areas in Communications 14(8): p 1602–1609.
[112] Giallorenzi TR & Wilson SG (1996) Multiuser ML sequence estimator for convolutionally coded asynchronous DS-CDMA systems. IEEE Transactions on Communications 44(8): p 997–1007.
[113] Gray SD, Kocic M & Brady D (1995) Multiuser detection in mismatched multiple-access channels. IEEE Transactions on Communications 43(12): p 3080–3089.
[114] Poor HV (1989) On parameter estimation in DS/SSMA formats. In: Porter WA & Kak SC (eds) Advances in Communications and Signal Processing, Springer-Verlag, Berlin–Heidelberg, Germany, vol 129 of Lecture Notes in Control and Information Sciences, p 59–70.
[115] Xie Z, Rushforth CK, Short RT & Moon TK (1993) Joint signal detection and parameter estimation in multiuser communications. IEEE Transactions on Communications 41(7): p 1208–1216.
[116] Hagmanns FJ & Hespelt V (1994) On the detection of bandlimited direct-sequence spread-spectrum signals transmitted via fading multipath channels. IEEE Journal on Selected Areas in Communications 12(5): p 891–899.
[117] Sung PA & Chen KC (1996) A linear minimum mean square error multiuser receiver in Rayleigh-fading channels. IEEE Journal on Selected Areas in Communications 14(8): p 1583–1594.
[118] Lilleberg J, Nieminen E & Latva-aho M (1996) Blind iterative multiuser delay estimator for CDMA. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Taipei, Taiwan, 2: p 565–568.
[119] Poor HV & Verdu S (1988) Single-user detectors for multiuser channels. IEEE Transactions on Communications 36(1): p 50–60.
[120] Klein A & Baier PW (1993) Linear unbiased data estimation in mobile radio systems applying CDMA. IEEE Journal on Selected Areas in Communications 11(7): p 1058–1066.
[121] Varanasi MK & Aazhang B (1991) Optimally near-far resistant multiuser detection in differentially coherent synchronous channels. IEEE Transactions on Information Theory 37(4): p 1006–1018.
[122] Varanasi MK (1993) Noncoherent detection in asynchronous multiuser channels. IEEE Transactions on Information Theory 39(1): p 157–176.
[123] Zvonar Z & Brady D (1995) Suboptimal multiuser detector for frequency-selective Rayleigh fading synchronous CDMA channels. IEEE Transactions on Communications 43(2/3/4): p 154–157.
[124] Klein A (1996) Multiuser detection of CDMA signals – algorithms and their application to cellular mobile radio. VDI-Verlag, Dusseldorf, Germany.
[125] Zvonar Z & Brady D (1995) Differentially coherent multiuser detection in asynchronous CDMA flat Rayleigh fading channels. IEEE Transactions on Communications 43(2/3/4): p 1252–1255.
[126] Stojanovic M & Zvonar Z (1996) Linear multiuser detection in time-varying multipath fading channels. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 349–354.
[127] Stojanovic M & Zvonar Z (1996) Performance of linear multiuser detectors in time-varying multipath fading CDMA channels. Proc. Communication Theory Mini-Conference (CTMC) in conjunction with IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., p 163–167.
[128] Kawahara T & Matsumoto T (1995) Joint decorrelating multiuser detection and channel estimation in asynchronous CDMA mobile communications channels. IEEE Transactions on Vehicular Technology 44(3): p 506–515.
[129] Huang HC (1996) Combined multipath processing, array processing, and multiuser detection for DS-CDMA channels. Ph.D. thesis, Department of Electrical Engineering, Princeton University, Princeton, New Jersey, USA.
[130] Miller SY (1989) Detection and estimation in multiple-access channels. Ph.D. thesis, Department of Electrical Engineering, Princeton University, Princeton, New Jersey, USA.
[131] Miller SY & Schwartz SC (1995) Integrated spatial-temporal detectors for asynchronous Gaussian multiple-access channels. IEEE Transactions on Communications 43(2/3/4): p 396–411.
[132] Jung P, Blanz J, Nasshan M & Baier PW (1994) Simulation of the uplink of JD-CDMA mobile radio systems with coherent receiver antenna diversity. Wireless Personal Communications, Kluwer Academic Publishers 1(2): p 61–89.
[133] Jung P & Blanz J (1995) Joint detection with coherent receiver antenna diversity in CDMA mobile radio systems. IEEE Transactions on Vehicular Technology 44(1): p 76–88.
[134] Brown T & Kaveh M (1995) A decorrelating detector for use with antenna arrays. International Journal of Wireless Information Networks 2(4): p 239–246.
[135] Zvonar Z (1996) Combined multiuser detection and diversity reception for wireless CDMA systems. IEEE Transactions on Vehicular Technology 45(1): p 205–211.
[136] Kandala S, Sousa ES & Pasupathy S (1995) Multiuser multisensor detectors for CDMA networks. IEEE Transactions on Communications 43(2/3/4): p 946–957.
[137] Kandala S, Sousa ES & Pasupathy S (1995) Decorrelators for multisensor systems in CDMA networks. European Transactions on Telecommunications 6(1): p 29–40.
[138] Juntti MJ & Lilleberg JO (1997) Comparative analysis of conventional and multiuser detectors in multisensor receivers. Proc. IEEE Military Communications Conference (MILCOM), Monterey, California, USA.
[139] Saquib M, Yates R & Mandayam N (1996) Decorrelating detectors for a dual-rate synchronous DS/CDMA system. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 1: p 377–381.
[140] Juntti MJ & Lilleberg JO (1997) Linear FIR multiuser detection for multiple data rate CDMA systems. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 2: p 455–459.
[141] Chen DS & Roy S (1994) An adaptive multiuser receiver for CDMA systems. IEEE Journal on Selected Areas in Communications 12(6): p 808–816.
[142] Mitra U & Poor HV (1996) Analysis of an adaptive decorrelating detector for synchronous CDMA channels. IEEE Transactions on Communications 44(2): p 257–268.
[143] Mitra U & Poor HV (1996) Adaptive decorrelating detectors for CDMA systems. Wireless Personal Communications, Kluwer Academic Publishers 2(4): p 415–440.
[144] Giallorenzi TR & Wilson SG (1996) Suboptimum multiuser receivers for convolutionally coded asynchronous DS-CDMA systems. IEEE Transactions on Communications 44(9): p 1183–1196.
[145] Kajiwara A & Nakagawa M (1994) Microcellular CDMA system with a linear multiuser interference canceler. IEEE Journal on Selected Areas in Communications 12(4): p 605–611.
[146] Van Heeswyk F, Falconer DD & Sheikh AUH (1996) Decorrelating detectors for quasi-synchronous CDMA. Wireless Personal Communications, Kluwer Academic Publishers 3(12): p 129–147.
[147] Van Heeswyk F, Falconer DD & Sheikh AUH (1996) A delay independent decorrelating detector for quasi-synchronous CDMA. IEEE Journal on Selected Areas in Communications 14(8): p 1619–1626.
[148] Iltis RA & Mailaender L (1996) Multiuser detection of quasi-synchronous CDMA signals using linear decorrelators. IEEE Transactions on Communications 44(11): p 1561–1571.
[149] Iltis RA (1996) Demodulation and code acquisition using decorrelator detectors for QS-CDMA. IEEE Transactions on Communications 44(11): p 1553–1560.
[150] Zheng FC & Barton SK (1995) On the performance of near-far resistant CDMA detectors in the presence of synchronization errors. IEEE Transactions on Communications 43(12): p 3037–3045.
[151] Parkvall S, Strom E & Ottersten B (1996) The impact of timing errors on the performance of linear DS-CDMA receivers. IEEE Journal on Selected Areas in Communications 14(8): p 1660–1668.
[152] Paris BP (1994) Finite precision decorrelating receiver for multiuser CDMA communication systems. IEEE Transactions on Communications 44(4): p 496–507.
[153] Bulumulla S & Venkatesh SS (1996) On the quantized input decorrelating detector. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 595–598.
[154] Duel-Hallen A (1995) A family of multiuser decision-feedback detectors for asynchronous code-division multiple-access channels. IEEE Transactions on Communications 43(2/3/4): p 421–434.
[155] Duel-Hallen A (1993) Decorrelating decision-feedback multiuser detector for synchronous code-division multiple-access channel. IEEE Transactions on Communications 41(2): p 285–290.
[156] Golub GH & van Loan CF (1989) Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, USA, 2nd edn.
[157] Wei L & Schlegel C (1994) Synchronous DS-SSMA system with improved decorrelating decision-feedback multiuser detection. IEEE Transactions on Vehicular Technology 43(3): p 767–772.
[158] Wei L & Rasmussen LK (1996) A near ideal noise whitening filter for an asynchronous time-varying CDMA system. IEEE Transactions on Communications 44(10): p 1355–1361.
[159] Schlegel C, Roy S, Alexander PD & Xiang ZJ (1996) Multiuser projection receivers. IEEE Journal on Selected Areas in Communications 14(8): p 1610–1618.
[160] Madhow U & Honig ML (1994) MMSE interference suppression for direct-sequence spread-spectrum CDMA. IEEE Transactions on Communications 42(12): p 3178–3188.
[161] Xie Z, Short RT & Rushforth CK (1990) A family of suboptimum detectors for coherent multiuser communications. IEEE Journal on Selected Areas in Communications 8(4): p 683–690.
[162] Klein A, Kaleh GK & Baier PW (1996) Zero forcing and minimum mean-square-error equalization for multiuser detection in code-division multiple access channels. IEEE Transactions on Vehicular Technology 45(2): p 276–287.
[163] Wu WC & Chen KC (1996) Linear multiuser detectors for synchronous CDMA communication over Rayleigh fading channels. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Taipei, Taiwan, p 578–582.
[164] Bernstein X & Haimovich AM (1996) Space-time optimum combining for CDMA communications. Wireless Personal Communications, Kluwer Academic Publishers 3(12): p 73–89.
[165] Gray SD, Preisig JC & Brady D (1997) Multiuser detection in a horizontal underwater acoustic channel using array observations. IEEE Transactions on Signal Processing 45(1): p 148–160.
[166] Honig ML & Veerakachen W (1996) Performance variability of linear multiuser detection for DS/CDMA. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 1: p 372–376.
[167] Poor HV & Verdu S (1997) Probability of error in MMSE multiuser detection. IEEE Transactions on Information Theory 43(3): p 858–871.
[168] Rapajic PB & Vucetic BS (1994) Adaptive receiver structures for asynchronous CDMA systems. IEEE Journal on Selected Areas in Communications 12(4): p 685–697.
[169] Rapajic PB & Vucetic BS (1995) Linear adaptive transmitter-receiver structures for asynchronous CDMA systems. European Transactions on Telecommunications 6(1): p 21–27.
[170] Miller SL (1995) An adaptive direct-sequence code-division multiple-access receiver for multiuser interference rejection. IEEE Transactions on Communications 43(2/3/4): p 1746–1755.
[171] Miller SL (1996) Training analysis of adaptive interference suppression for direct-sequence code-division multiple-access systems. IEEE Transactions on Communications 44(4): p 488–495.
[172] Lee KB (1996) Orthogonalization based adaptive interference suppression for direct-sequence code-division multiple-access systems. IEEE Transactions on Communications 44(9): p 1082–1085.
[173] Woodward G, Rapajic P & Vucetic BS (1996) Adaptive algorithms for asynchronous DS-CDMA receivers. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Taipei, Taiwan, p 583–587.
[174] Latva-aho M & Juntti M (1997) Modified adaptive LMMSE receiver for DS-CDMA systems in fading channels. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Helsinki, Finland, 2: p 554–558.
[175] Veeravalli VV & Aazhang B (1996) On the coding-spreading tradeoff in CDMA systems. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 2: p 1136–1141.
[176] Oppermann I & Vucetic BS (1996) Capacity of a coded direct sequence spread spectrum system over fading satellite channels using an adaptive LMS-MMSE receiver. IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences E79-A(12): p 2043–2049.
[177] Chu LC & Mitra U (1996) Improved MMSE-based multiuser detectors for mismatched delay channels. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 326–331.
[178] Honig ML, Madhow U & Verdu S (1995) Blind adaptive multiuser detection. IEEE Transactions on Information Theory 41(3): p 944–960.
[179] Schodorf JB & Williams DB (1997) A constrained optimization approach to multiuser detection. IEEE Transactions on Signal Processing 45(1): p 258–262.
[180] Tsatsanis MK (1997) Inverse filtering criteria for CDMA systems. IEEE Transactions on Signal Processing 45(1): p 102–112.
[181] Wang X & Poor HV (1997) Multiuser diversity receivers for frequency-selective Rayleigh fading CDMA channels. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 1: p 198–202.
[182] Madhow U (1997) Blind adaptive interference suppression for the near-far resistant acquisition and demodulation of direct-sequence CDMA. IEEE Transactions on Signal Processing 45(1): p 124–136.
[183] Mangalvedhe NR & Reed JH (1997) Blind CDMA interference rejection in multipath channels. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 1: p 21–25.
[184] Rupf M, Tarkoy F & Massey JL (1994) User-separating demodulation for code-division multiple-access systems. IEEE Journal on Selected Areas in Communications 12(6): p 786–795.
[185] Monk AM, Davis M, Milstein LB & Helstrom CW (1994) A noise-whitening approach to multiple access noise rejection–Part I: Theory and background. IEEE Journal on Selected Areas in Communications 12(5): p 817–827.
[186] Davis M, Monk A & Milstein LB (1996) A noise whitening approach to multiple access noise rejection–Part II: Implementation issues. IEEE Journal on Selected Areas in Communications 14(8): p 1488–1499.
[187] Yoon YC & Leib H (1996) Matched filters with interference suppression capabilities for DS-CDMA. IEEE Journal on Selected Areas in Communications 14(8): p 1510–1521.
[188] Mailander L & Iltis R (1996) Single user CDMA detection using the whitened matched filter. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 2: p 846–851.
[189] Zheng FC & Barton SK (1995) Near-far resistant detection of CDMA signals via isolation bit insertion. IEEE Transactions on Communications 43(2/3/4): p 1313–1317.
[190] Moshavi S, Kanterakis EG & Schilling DL (1996) Multistage linear receivers for DS-CDMA systems. International Journal of Wireless Information Networks 3(1): p 1–17.
[191] Wijayasuriya SSH, Norton GH & McGeehan JP (1996) A sliding window decorrelating receiver for multiuser DS-CDMA mobile radio networks. IEEE Transactions on Vehicular Technology 45(3): p 503–521.
[192] Kajiwara A & Nakagawa M (1991) Crosscorrelation cancellation in SS/DS block demodulator. IEICE Transactions on Communications E 74(9): p 2596–2602.
[193] Tsatsanis MK & Giannakis GB (1996) Optimal decorrelating receivers for DS-CDMA systems: A signal processing framework. IEEE Transactions on Signal Processing 44(12): p 3044–3055.
[194] Wijayasuriya SSH, Norton GH & McGeehan JP (1993) A novel algorithm for dynamic updating of decorrelator coefficients in mobile DS-CDMA. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Yokohama, Japan, p 292–296.
[195] Yang LL & Scholtz RA (1996) δ-adjusted mth order multiuser detector. Proc. IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., 3: p 1555–1560.
[196] Varanasi MK (1989) Multiuser detection in code-division multiple-access communications. Ph.D. thesis, Department of Electrical and Computer Engineering, Rice University, Houston, Texas, USA.
[197] Varanasi MK & Aazhang B (1991) Near-optimum detection in synchronous code-division multiple-access systems. IEEE Transactions on Communications 39(5): p 725–736.
[198] Mowbray RS, Pringle RD & Grant PM (1992) Increased CDMA system capacity through adaptive cochannel interference regeneration and cancellation. IEE Proceedings I 139(5): p 515–524.
[199] Tachikawa S (1992) Characteristics of M-ary/spread spectrum multiple access communication systems using cochannel interference cancellation techniques. IEICE Transactions on Communications E76-B(8): p 941–946.
[200] Abrams BS, Zeger AE & Jones TE (1995) Efficiently structured CDMA receiver with near-far immunity. IEEE Transactions on Vehicular Technology 44(1): p 1–13.
[201] Sanada Y & Wang Q (1996) A cochannel interference cancellation technique using orthogonal convolutional codes. IEEE Transactions on Communications 44(5): p 549–556.
[202] Kohno R, Imai H, Hatori M & Pasupathy S (1990) Combination of an adaptive array antenna and a canceller of interference for direct-sequence spread-spectrum multiple-access system. IEEE Journal on Selected Areas in Communications 8(4): p 675–682.
[203] Yoon YC, Kohno R & Imai H (1993) Cascaded cochannel interference cancellating and diversity combining for spread-spectrum multiaccess system over multipath fading channels. IEICE Transactions on Communications E76-B(2): p 163–168.
[204] Yoon YC, Kohno R & Imai H (1993) A spread-spectrum multiaccess system with cochannel interference cancellation for multipath fading channels. IEEE Journal on Selected Areas in Communications 11(7): p 1067–1075.
[205] Saifuddin A, Kohno R & Imai H (1995) Integrated receiver structures of staged decoder and CCI canceller for CDMA with multilevel coded modulation. European Transactions on Telecommunications 6(1): p 9–19.
[206] Saifuddin A & Kohno R (1995) Performance evaluation of near-far resistant receiver DS/CDMA cellular system over fading multipath channel. IEICE Transactions on Communications E78-B(8): p 1136–1144.
[207] Fawer U & Aazhang B (1995) A multiuser receiver for code division multiple access communications over multipath channels. IEEE Transactions on Communications 43(2/3/4): p 1556–1565.
[208] Sanada Y & Wang Q (1997) A cochannel interference cancellation technique using orthogonal convolutional codes on multipath Rayleigh fading channel. IEEE Transactions on Vehicular Technology 46(1): p 114–128.
[209] Hottinen A, Holma H & Toskala A (1995) Performance of multistage multiuser detection in a fading multipath channel. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Toronto, Ontario, Canada, 3: p 960–964.
[210] Holma H, Toskala A & Hottinen A (1996) Performance of CDMA multiuser detection with antenna diversity and closed loop power control. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 1: p 362–366.
[211] Latva-aho M & Lilleberg J (1996) Parallel interference cancellation in multiuser detection. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 3: p 1151–1155.
[212] Saifuddin A & Kohno R (1996) Performance evaluation of DS/CDMA scheme with diversity coding and MUI cancellation over fading multipath channel. IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences E79-A(12): p 1994–2001.
[213] Hottinen A, Holma H & Toskala A (1996) Multiuser detection for multirate CDMA communications. Proc. IEEE International Conference on Communications (ICC), Dallas, Texas, USA.
[214] Latva-aho M & Lilleberg J (1996) Parallel interference cancellation based delay tracker for CDMA receivers. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 2: p 852–857.
[215] Latva-aho M & Lilleberg J (1996) Delay trackers for multiuser CDMA receivers. Proc. IEEE International Conference on Universal Personal Communications (ICUPC), Boston, Massachusetts, USA, 1: p 326–330.
[216] Buehrer RM, Kaul A, Striglis S & Woerner BD (1996) Analysis of DS-CDMA parallel interference cancellation with phase and timing errors. IEEE Journal on Selected Areas in Communications 14(8): p 1522–1535.
[217] Buehrer RM, Correal NS & Woerner BD (1996) A comparison of multiuser receivers for cellular CDMA. Proc. IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., 3: p 1571–1577.
[218] Agashe P & Woerner B (1996) Interference cancellation for a multicellular CDMA environment. Wireless Personal Communications, Kluwer Academic Publishers 3(1–2): p 1–15.
[219] Chen DW, Siveski Z & Bar-Ness Y (1994) Synchronous multiuser CDMA detector with soft decision adaptive canceler. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 139–143.
[220] Bar-Ness Y & Sezgin N (1995) Adaptive threshold setting for multiuser CDMA signal separators with soft tentative decisions. Proc. Conference on Information Sciences and Systems (CISS), The Johns Hopkins University, Baltimore, Maryland, USA, p 174–179.
[221] Vanghi V & Vojcic B (1996) Soft interference cancellation in multiuser communications. Wireless Personal Communications, Kluwer Academic Publishers 3(1–2): p 111–128.
[222] Bar-Ness Y, Siveski Z & Chen DW (1994) Bootstrapped decorrelating algorithm for adaptive interference cancelation in synchronous CDMA communications systems. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Oulu, Finland, 1: p 162–166.
[223] Zhu B, Ansari N & Siveski Z (1995) Convergence and stability analysis of a synchronous adaptive CDMA receiver. IEEE Transactions on Communications 43(12): p 3073–3079.
[224] Bar-Ness Y & Punt JB (1996) Adaptive ‘bootstrap’ CDMA multiuser detector. Wireless Personal Communications, Kluwer Academic Publishers 3(1–2): p 55–71.
[225] Elders-Boll H, Herper M & Busboom A (1997) Adaptive receivers for mobile DS-CDMA communication systems. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 3: p 2128–2132.
[226] Divsalar D & Simon M (1995) Improved CDMA performance using parallel interference cancellation. Technical Report, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California, USA.
[227] Divsalar D & Simon M (1996) A new approach to parallel interference cancellation for CDMA. Proc. IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., 3: p 1452–1457.
[228] Correal NS, Buehrer RM & Woerner BD (1997) Improved CDMA performance through bias reduction for parallel interference cancellation. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Helsinki, Finland, 2: p 565–569.
[229] Radovic A & Aazhang B (1993) Iterative algorithms for joint data detection and delay estimation for code division multiple access communication systems. Proc. Annual Allerton Conference on Communications, Control, and Computing, Allerton House, Monticello, Illinois, USA.
[230] Nelson LB & Poor HV (1996) Iterative multiuser receivers for CDMA channels: An EM-based approach. IEEE Transactions on Communications 44(12): p 1700–1710.
[231] Dahlhaus D, Fleury H & Radovic A (1998) A sequential algorithm for joint parameter estimation and multiuser detection in DS/CDMA systems with multipath propagation. Wireless Personal Communications, Kluwer Academic Publishers, in press.
[232] Dahlhaus D, Jarosch A, Fleury H & Heddergott R (1997) Joint demodulation in DS/CDMA systems exploiting the space and time diversity of the mobile radio channel. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Helsinki, Finland, 1: p 47–52.
[233] Viterbi AJ (1990) Very low rate convolutional codes for maximum theoretical performance of spread-spectrum multiple-access channels. IEEE Journal on Selected Areas in Communications 8(4): p 641–649.
[234] Patel P & Holtzman J (1994) Analysis of a simple successive interference cancellation scheme in a DS/CDMA system. IEEE Journal on Selected Areas in Communications 12(10): p 796–807.
[235] Soong ACK & Krzymien WA (1995) Performance of a reference symbol assisted multistage successive interference cancelling receiver in a multicell CDMA wireless systems. Proc. IEEE Global Telecommunication Conference (GLOBECOM), Singapore, 1: p 152–156.
[236] Soong ACK & Krzymien WA (1996) A novel CDMA multiuser interference cancellation receiver with reference symbol aided estimation of channel parameters. IEEE Journal on Selected Areas in Communications 14(8): p 1536–1547.
[237] Nesper O & Ho P (1996) A reference symbol assisted interference cancelling hybrid receiver for an asynchronous DS/CDMA system. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Taipei, Taiwan, 1: p 108–112.
[238] Nesper O & Ho P (1996) A pilot symbol assisted interference cancellation scheme for an asynchronous DS/CDMA system. Proc. IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., 3: p 1447–1451.
[239] Soong ACK & Krzymien WA (1997) Performance of a reference symbol assisted multistage successive interference cancelling receiver with quadriphase spreading. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 2: p 460–464.
[240] Johansson AL & Svensson A (1995) Successive interference cancellation in multiple data rate DS/CDMA systems. Proc. IEEE Vehicular Technology Conference (VTC), Chicago, Illinois, USA, p 704–708.
[241] Johansson AL & Svensson A (1996) Multistage interference cancellation in multirate DS/CDMA on a mobile radio channel. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 2: p 666–670.
[242] Cheng FC & Holtzman JM (1994) Effect of tracking error on DS/CDMA successive interference cancellation. Technical Report, WINLABTR90 WINLAB, Rutgers University, Piscataway, New Jersey, USA.
[243] Soong ACK & Krzymien WA (1996) Robustness of the reference symbol assisted multistage successive interference cancelling receiver with imperfect parameter estimates. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 2: p 676–680.
[244] Oon TB, Steele R & Li Y (1997) Performance of an adaptive successive serial-parallel CDMA cancellation scheme in flat Rayleigh fading channels. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 1: p 193–197.
[245] Van der Wijk F, Janssen GMJ & Prasad R (1995) Groupwise successive interference cancellation in a DS/CDMA system. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Toronto, Ontario, Canada, 2: p 742–746.
[246] Haifeng W, Lilleberg J & Rikkinen K (1997) A new suboptimal multiuser detection approach for CDMA systems in Rayleigh fading channel. Proc. Conference on Information Sciences and Systems (CISS), The Johns Hopkins University, Baltimore, Maryland, USA, 1: p 276–280.
[247] Alexander PD, Rasmussen LK & Schlegel C (1997) A linear receiver for coded multiuser CDMA. IEEE Transactions on Communications 45(5): p 605–610.
[248] Juntti MJ (1998) Multiuser detector performance comparisons in multirate CDMA systems. Proc. IEEE Vehicular Technology Conference (VTC), Ottawa, Canada, submitted.
[249] Juntti MJ (1997) Performance of multiuser detection in multirate CDMA systems. Wireless Personal Communications, Kluwer Academic Publishers, submitted.
[250] Varanasi MK (1995) Group detection in synchronous Gaussian code-division multiple-access channels. IEEE Transactions on Information Theory 41(3): p 1083–1096.
[251] Varanasi MK (1995) Parallel group detection for synchronous CDMA communication over frequency-selective Rayleigh fading channels. IEEE Transactions on Information Theory 41(3): p 1083–1096.
[252] Abdulrahman M, Sheikh AUH & Falconer DD (1994) Decision feedback equalization for CDMA in indoor wireless communications. IEEE Journal on Selected Areas in Communications 12(4): p 698–706.
[253] Hafeez A & Stark WE (1996) Combined decision-feedback multiuser detection/soft-decision decoding for CDMA channels. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 1: p 382–386.
[254] Xie Z, Rushforth CK & Short R (1990) Multiuser signal detection using sequential decoding. IEEE Transactions on Communications 38(5): p 578–583.
[255] Wu B & Wang Q (1996) New suboptimal multiuser detectors for synchronous CDMA systems. IEEE Transactions on Communications 44(7): p 782–785.
[256] Juntti MJ, Schlosser T & Lilleberg JO (1997) Genetic algorithms for multiuser detection in synchronous CDMA. Proc. IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p 492.
[257] Iltis RA & Mailaender L (1994) Multiuser code acquisition using parallel decorrelators. Proc. Conference on Information Sciences and Systems (CISS), Princeton, New Jersey, USA, 1: p 109–114.
[258] Strom EG, Parkvall S, Miller SL & Ottersten BE (1996) Propagation delay estimation in asynchronous direct-sequence code-division multiple access systems. IEEE Transactions on Communications 44(1): p 84–93.
[259] Steiner B & Jung P (1994) Optimum and suboptimum channel estimation for the uplink of CDMA mobile radio systems with joint detection. European Transactions on Telecommunications 5(1): p 39–50.
[260] Strom EG, Parkvall S, Miller SL & Ottersten BE (1996) DS-CDMA synchronization in time-varying fading channels. IEEE Journal on Selected Areas in Communications 14(8): p 1636–1642.
[261] Bensley SE & Aazhang B (1996) Subspace-based channel estimation for code division multiple access communication systems. IEEE Transactions on Communications 44(8): p 1009–1019.
[262] Parkvall S, Strom EG & Milstein LB (1996) Coded asynchronous near-far resistant DS-CDMA receivers operating without synchronization. Proc. Communication Theory Mini-Conference (CTMC) in conjunction with IEEE Global Telecommunication Conference (GLOBECOM), London, U.K., p 183–187.
[263] Torlak M & Xu G (1997) Blind multiuser channel estimation in asynchronous CDMA systems. IEEE Transactions on Signal Processing 45(1): p 137–147.
[264] Joutsensalo J, Lilleberg J, Hottinen A & Karhunen J (1996) A hierarchic maximum likelihood method for delay estimation in CDMA. Proc. IEEE Vehicular Technology Conference (VTC), Atlanta, Georgia, USA, 1: p 188–192.
[265] Zheng D, Li J, Miller SL & Strom EG (1997) An efficient code-timing estimator for DS-CDMA signals. IEEE Transactions on Signal Processing 45(1): p 82–89.
[266] Iltis RA & Mailaender L (1996) An adaptive multiuser detector with joint amplitude and delay estimation. IEEE Transactions on Communications 44(11): p 1561–1571.
[267] Lim TJ & Rasmussen LK (1997) Adaptive symbol and parameter estimation in asynchronous multiuser CDMA detectors. IEEE Transactions on Communications 45(2): p 213–220.
[268] Steinberg Y & Poor HV (1994) Sequential amplitude estimation in multiuser communications. IEEE Transactions on Information Theory 40(1): p 11–20.
[269] Steinberg Y & Poor HV (1994) On sequential delay estimation in wideband digital communication systems. IEEE Transactions on Information Theory 40(5): p 1327–1333.
[270] Moon TK, Xie Z, Rushforth CK & Short RT (1994) Parameter estimation in a multiuser communication system. IEEE Transactions on Communications 42(8): p 2553–2560.
[271] Halford KW & Brandt-Pearce M (1994) User identification and multiuser detection of l out of k users in a CDMA system. Proc. Conference on Information Sciences and Systems (CISS), Princeton University, Princeton, New Jersey, USA, 1: p 115–120.
[272] Halford KW & Brandt-Pearce M (1995) Maximum likelihood detection and estimation for new users in CDMA. Proc. Conference on Information Sciences and Systems (CISS), The Johns Hopkins University, Baltimore, Maryland, USA, 1: p 193–198.
[273] Mitra U & Poor HV (1996) Activity detection in a multiuser environment. Wireless Personal Communications, Kluwer Academic Publishers 3(1–2): p 149–174.
[274] Joutsensalo J (1996) A subspace method for model order estimation in CDMA. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Mainz, Germany, 2: p 688–692.
[275] Liu H & Xu G (1996) A subspace method for signature waveform estimation in synchronous CDMA systems. IEEE Transactions on Communications 44(10): p 1346–1354.
[276] Johnson DH, Lee YK, Kelly OE & Pistole JL (1996) Type-based detection for unknown channels. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Atlanta, Georgia, USA, p 2475–2478.
[277] Djuric PM & Guo M (1997) A novel approach to multiuser detection for CDMA systems. Proc. IEEE Vehicular Technology Conference (VTC), Phoenix, Arizona, USA, 2: p 563–566.
[278] Lee YK, Johnson DH & Kelly OE (1997) Type-based detection for spread spectrum. Proc. IEEE International Conference on Communications (ICC), Montreal, Canada.
[279] Aazhang B, Paris BP & Orsak G (1992) Neural networks for multiuser detection in code-division multiple access systems. IEEE Transactions on Communications 40(7): p 1212–1222.
[280] Mitra U & Poor HV (1995) Adaptive receiver algorithms for near-far resistant CDMA. IEEE Transactions on Communications 43(2/3/4): p 1713–1724.
[281] Hottinen A (1994) Self-organizing multiuser detection. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Oulu, Finland, 1: p 152–156.
[282] Mitra U & Poor HV (1994) Neural network techniques for adaptive multiuser demodulation. IEEE Journal on Selected Areas in Communications 12(9): p 1460–1470.
[283] Miyajima T, Hasegawa T & Haneishi M (1993) On the multiuser detection using a neural network in code-division multiple-access communications. IEICE Transactions on Communications E76-B(9): p 961–968.
[284] Miyajima T & Hasegawa T (1996) Multiuser detection using a Hopfield network for asynchronous code-division multiple-access systems. IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences E79-A(12): p 1963–1971.
[285] Proakis JG & Manolakis DG (1992) Digital Signal Processing: Principles, Algorithms and Applications. Macmillan, New York City, New York, USA, 2nd edn.
[286] Ludyk G (1985) Stability of Time-Variant Discrete-Time Systems. In: Hartmann I (ed) Advances in Control Systems and Signal Processing, vol 5, Friedr. Vieweg & Sohn, Braunschweig, Germany.
[287] Barrett MJ (1987) Error probability for optimal and suboptimal quadratic receivers in rapid Rayleigh fading channels. IEEE Journal on Selected Areas in Communications 5(2): p 302–304.
[288] Mammela A & Kaasila VP (1994) Prediction, smoothing and interpolation in adaptive diversity reception. Proc. IEEE International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), Oulu, Finland, p 475–478.
[289] Latva-aho M, Juntti M & Heikkila M (1997) Parallel interference cancellation receiver for DS-CDMA systems in fading channels. Proc. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Helsinki, Finland, 2: p 559–564.
[290] Pan CT & Plemmons RJ (1989) Least squares modifications with inverse factorizations: parallel implications. Journal of Computational and Applied Mathematics 27: p 109–127.
[291] Stoer J & Bulirsch R (1993) Introduction to Numerical Analysis. Springer-Verlag, New York City, New York, USA.
[292] Hestenes MR & Stiefel E (1952) Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards 49(6): p 409–436.
APPENDIX 1/1
Stability analysis of linear detectors
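Proof of equations (3.14) and (3.15): We define partitions

\[
\mathbf{R}_N = \begin{pmatrix} \mathbf{R}_{N-1} & \boldsymbol{\gamma}_{N-1} \\ \boldsymbol{\gamma}_{N-1}^{\top} & \mathbf{R}_N(0) \end{pmatrix} \in \mathbb{R}^{NK \times NK}, \tag{A1.1}
\]

where $\mathbf{R}_{N-1} \in \mathbb{R}^{(N-1)K \times (N-1)K}$,

\[
\boldsymbol{\gamma}_{N-1} = \begin{pmatrix} \mathbf{0}_K & \cdots & \mathbf{0}_K & \mathbf{R}_N(1) \end{pmatrix}^{\top} \in \mathbb{R}^{(N-1)K \times K}, \tag{A1.2}
\]

and

\[
\mathbf{T}_N = \mathbf{R}_N^{-1} = \begin{pmatrix} \mathbf{C}_{N-1} & \boldsymbol{\alpha}_{N-1} \\ \boldsymbol{\alpha}_{N-1}^{\top} & \mathbf{T}_{N,N}(N) \end{pmatrix} \in \mathbb{R}^{NK \times NK}, \tag{A1.3}
\]

where $\mathbf{C}_{N-1} \in \mathbb{R}^{(N-1)K \times (N-1)K}$ and $\boldsymbol{\alpha}_{N-1} \in \mathbb{R}^{(N-1)K \times K}$. By applying the matrix inversion formulae [88, pp. 571–572]

\[
(\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D})^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}\,(\mathbf{D}\mathbf{A}^{-1}\mathbf{B} + \mathbf{C}^{-1})^{-1}\,\mathbf{D}\mathbf{A}^{-1}, \tag{A1.4}
\]

\[
\begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{pmatrix}^{-1} = \begin{pmatrix} (\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1} & -(\mathbf{A} - \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1}\mathbf{B}\mathbf{D}^{-1} \\ -(\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1}\mathbf{C}\mathbf{A}^{-1} & (\mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1} \end{pmatrix}, \tag{A1.5}
\]

and the fact that $\mathbf{R}_N$ and $\mathbf{T}_N$ are symmetric, we obtain the following recursion formulae:

\[
\mathbf{C}_{N-1} = \mathbf{T}_{N-1} + \mathbf{T}_{N-1}\boldsymbol{\gamma}_{N-1}\mathbf{T}_{N,N}(N)\boldsymbol{\gamma}_{N-1}^{\top}\mathbf{T}_{N-1}, \tag{A1.6}
\]

\[
\boldsymbol{\alpha}_{N-1}^{\top} = -\mathbf{T}_{N,N}(N)\boldsymbol{\gamma}_{N-1}^{\top}\mathbf{T}_{N-1}. \tag{A1.7}
\]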
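The recursion (A1.6)–(A1.7) can be checked numerically on a small example. The sketch below is only an illustration under assumed values: the block size K = 2, the block count N = 3, and the entries of the blocks `R0` and `R1` are hypothetical and are not taken from the thesis. It uses a plain Gauss-Jordan inverse so that it runs with the Python standard library alone.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def block(a, rows, cols):
    """Sub-matrix of a with the given row and column indices."""
    return [[a[i][j] for j in cols] for i in rows]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def check_recursion():
    K, N = 2, 3                              # hypothetical sizes
    R0 = [[1.0, 0.2], [0.2, 1.0]]            # assumed diagonal block R(0)
    R1 = [[0.1, 0.3], [0.0, 0.1]]            # assumed off-diagonal block R(1)
    n = N * K
    R = [[0.0] * n for _ in range(n)]
    for b in range(N):
        for i in range(K):
            for j in range(K):
                R[b * K + i][b * K + j] = R0[i][j]
                if b + 1 < N:
                    R[(b + 1) * K + i][b * K + j] = R1[i][j]   # block below the diagonal
                    R[b * K + j][(b + 1) * K + i] = R1[i][j]   # its transpose above
    head, tail = range((N - 1) * K), range((N - 1) * K, n)
    T_prev = inverse(block(R, head, head))   # T_{N-1}
    gamma = block(R, head, tail)             # gamma_{N-1}
    T = inverse(R)                           # T_N
    T_NN = block(T, tail, tail)              # T_{N,N}(N)
    gt = transpose(gamma)
    # (A1.6): C = T_prev + T_prev gamma T_NN gamma^T T_prev
    update = matmul(matmul(matmul(matmul(T_prev, gamma), T_NN), gt), T_prev)
    C = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(T_prev, update)]
    # (A1.7): alpha^T = -T_NN gamma^T T_prev
    alpha_t = [[-v for v in row] for row in matmul(matmul(T_NN, gt), T_prev)]
    return (max_abs_diff(C, block(T, head, head)),
            max_abs_diff(alpha_t, block(T, tail, head)))

if __name__ == "__main__":
    print(check_recursion())
```

Both returned values are the largest entrywise deviations between the recursion and the directly inverted matrix, so they should be at the level of floating-point rounding.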
Now (3.14) and (3.15) follow from the definitions of $\mathbf{T}_{N-1}$, $\boldsymbol{\gamma}_{N-1}$, and $\boldsymbol{\alpha}_{N-1}$.

Proof of Proposition 1: Assume that the decorrelating detector is stable, and assume that the integer $j$ is such that $N \to \infty$ implies $(N - j) \to \infty$.¹ Then, by the stability of the decorrelating detector, $\mathbf{T}_{N-1,j}(N-1) \to \mathbf{0}_K$ as $N \to \infty$. Assume that both $i$ and $j$ are such that $N \to \infty$ implies $(N - i) \to \infty$ and $(N - j) \to \infty$. Then the increment part in (3.14) approaches the zero matrix as $N \to \infty$, since both $\mathbf{T}_{N-1,i}(N-1) \to \mathbf{0}_K$ and $\mathbf{T}_{N-1,j}(N-1) \to \mathbf{0}_K$ as $N \to \infty$. This guarantees the existence of a unique asymptotic limit for $\mathbf{T}_{i,j}(N-1)$. Thus, the uniqueness of the blocks $\mathbf{T}_{i,P+1}(N-1)$, $i = -P, \ldots, P$, follows. Since the decorrelating detector $\mathbf{D}_N$ consists of the blocks $\mathbf{T}_{i,P+1}(N-1)$, $i = -P, \ldots, P$, the uniqueness of the IIR decorrelating detector has been shown. The uniqueness of the LMMSE detector follows by exactly the same arguments. The uniqueness of the noise-whitening detector follows easily from the uniqueness of the decorrelating detector by analyzing the Cholesky factor of $\mathbf{T}_N$ as $N \to \infty$.

¹ The condition states that $j$ is such that its distance to $N$ approaches infinity as $N$ approaches infinity. The condition is satisfied, e.g., if $N = cj$, where $c > 1$ is a constant. For example, since $N = 2P + 1$, it follows that $N \to \infty \Rightarrow P \to \infty$.
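The decay that the stability assumption provides can be illustrated with a toy case. The sketch below is not the thesis's system model: it assumes a single user (K = 1), a tridiagonal correlation matrix with unit diagonal, and a hypothetical neighbouring-symbol correlation `rho = 0.3`. It computes the corner entry of the inverse, which plays the role of $\mathbf{T}_{N,1}(N)$, for growing N.

```python
def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def corner_of_inverse(N, rho=0.3):
    """|T_{N,1}| for a tridiagonal R_N with unit diagonal and off-diagonal rho.

    rho is an assumed, illustrative correlation value.
    """
    R = [[1.0 if i == j else (rho if abs(i - j) == 1 else 0.0)
          for j in range(N)] for i in range(N)]
    T = inverse(R)
    return abs(T[N - 1][0])

if __name__ == "__main__":
    for N in (4, 8, 12):
        print(N, corner_of_inverse(N))
```

In this toy case the corner entry shrinks geometrically with N, which is the behaviour the proof relies on when it lets the increment part of (3.14) vanish.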
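Proof of Proposition 2: The result follows by manipulation of (3.11). By the definition of $\bar{\mathbf{R}}^{(n)}$, i.e., by $\bar{\mathbf{R}}^{(n)} = (\boldsymbol{\zeta}_1, \mathbf{R}^{(n)}, \boldsymbol{\zeta}_2) \in \mathbb{R}^{NK \times (N+2)K}$, we can write

\[
\bar{\mathbf{R}}^{(n)\top}(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)} =
\begin{pmatrix}
\boldsymbol{\zeta}_1^{\top}(\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_1 & \boldsymbol{\zeta}_1^{\top} & \boldsymbol{\zeta}_1^{\top}(\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_2 \\
\boldsymbol{\zeta}_1 & \mathbf{R}^{(n)} & \boldsymbol{\zeta}_2 \\
\boldsymbol{\zeta}_2^{\top}(\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_1 & \boldsymbol{\zeta}_2^{\top} & \boldsymbol{\zeta}_2^{\top}(\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_2
\end{pmatrix}. \tag{A1.8}
\]

Thus, except for the edges (the first and last K columns and rows), the matrix

\[
\bar{\mathbf{H}}^{(n)} = \bar{\mathbf{R}}^{(n)\top}(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)} + \sigma^2\bar{\mathbf{E}}^{-1} \tag{A1.9}
\]

is the same as

\[
\mathbf{H} = \mathbf{R}^{(n)} + \sigma^2(\mathbf{E}^{(n)})^{-1}. \tag{A1.10}
\]

If the assumptions of Proposition 2 are valid, $\mathbf{T}_{N,1}(N) \to \mathbf{0}$ as $N \to \infty$, and

\[
\boldsymbol{\zeta}_1^{\top}(\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_2 = \mathbf{R}^{(n-P-1)}(1)\,\mathbf{T}_{N,1}^{\top}\,\mathbf{R}^{(n+P+1)\top}(1) \to \mathbf{0}, \quad \text{as } N \to \infty. \tag{A1.11}
\]

By utilizing the fact that the mathematical structure of the decorrelating and the LMMSE detectors is similar, it can be seen from (3.14) that the effect of the first and the last diagonal blocks of $\bar{\mathbf{H}}^{(n)}$ on the middle block column of $(\bar{\mathbf{H}}^{(n)})^{-1}$ goes to zero as $N \to \infty$. Thus, we have shown that

\[
\operatorname{mbc}\{(\bar{\mathbf{H}}^{(n)})^{-1}\} \to \operatorname{mbc}\{\mathbf{H}^{-1}\}, \quad \text{as } N \to \infty. \tag{A1.12}
\]

Recall that the optimal FIR LMMSE detector is the middle block column of $(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)}(\bar{\mathbf{H}}^{(n)})^{-1}$ by (3.11). It is easy to verify from the definitions that

\[
(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)} = \begin{pmatrix} (\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_1 & \mathbf{I}_{NK} & (\mathbf{R}^{(n)})^{-1}\boldsymbol{\zeta}_2 \end{pmatrix}. \tag{A1.13}
\]

The matrix $(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)}$ is thus an identity except for its first and last block columns. By the stability assumptions, the first and last blocks of the middle block column of $(\bar{\mathbf{H}}^{(n)})^{-1}$ approach zero, so that the effect of the first and last block columns of $(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)}$ vanishes asymptotically. Thus, $(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)}$ acts asymptotically ($N \to \infty$) as an identity on the middle block column of the matrix $(\mathbf{R}^{(n)})^{-1}\bar{\mathbf{R}}^{(n)}(\bar{\mathbf{H}}^{(n)})^{-1}$. Proposition 2 is now proved by the above and (A1.12).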
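Discussion on the validity of expression (3.19): First note that by the definition of $\bar{\mathbf{R}}^{(n)}$, we have

\[
\bar{\mathbf{R}}^{(n)}\bar{\mathbf{R}}^{(n)\top} = (\mathbf{R}^{(n)})^{2} + \boldsymbol{\zeta}_1\boldsymbol{\zeta}_1^{\top} + \boldsymbol{\zeta}_2\boldsymbol{\zeta}_2^{\top}. \tag{A1.14}
\]

The structure of the matrix $\bar{\mathbf{R}}^{(n)}\bar{\mathbf{R}}^{(n)\top}$ is similar to that of $\mathbf{R}^{(n)}$. Furthermore, the matrix is the same as $(\mathbf{R}^{(n)})^{2}$ except for the perturbations caused by $\boldsymbol{\zeta}_1\boldsymbol{\zeta}_1^{\top}$ and $\boldsymbol{\zeta}_2\boldsymbol{\zeta}_2^{\top}$ to the first and last diagonal blocks. Now it is easy to understand that, under conditions similar to (3.17), the middle block row (or column) of $(\bar{\mathbf{R}}^{(n)}\bar{\mathbf{R}}^{(n)\top})^{-1}$ approaches the middle block row (or column) of $(\mathbf{R}^{(n)})^{-2}$. If that is the case, it is easy to see from (3.10) that $\bar{\mathbf{D}}^{[d]}_N \to \mathbf{D}^{[d]}_N$ as $N \to \infty$, because the zero blocks of $\bar{\mathbf{R}}^{(n)}$ remove the effect of the perturbations caused by $\boldsymbol{\zeta}_1\boldsymbol{\zeta}_1^{\top}$ and $\boldsymbol{\zeta}_2\boldsymbol{\zeta}_2^{\top}$ to the first and last diagonal blocks of $\bar{\mathbf{R}}^{(n)}\bar{\mathbf{R}}^{(n)\top}$ when $N$ is sufficiently large.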
APPENDIX 2/1
Eigenanalysis of detection based on the conjugate-gradient algorithm
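Let the correlation matrix (2.37) be represented by its eigenvalue decomposition [291, Chap. 6]

\[
\mathbf{R}^{(n)} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\top}, \tag{A2.1}
\]

where

\[
\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{NK}), \tag{A2.2}
\]

$\lambda_i$, $i = 1, 2, \ldots, NK$, are the eigenvalues of $\mathbf{R}^{(n)}$, and the matrix $\mathbf{U} \in \mathbb{R}^{NK \times NK}$ includes the corresponding orthonormal eigenvectors in its columns. Let $\mathbf{u}_i$ be the $i$th column of $\mathbf{U}$. Let the estimation error vector of the decorrelating detector output be

\[
\boldsymbol{\varepsilon}_d = \mathbf{h}_d - \mathbf{y} = \left[(\mathbf{R}^{(n)})^{-1} - \mathbf{I}_{NK}\right]\mathbf{y} = \sum_{i=1}^{NK}\left(\frac{1}{\lambda_i} - 1\right)(\mathbf{u}_i^{\top}\mathbf{y})\,\mathbf{u}_i, \tag{A2.3}
\]

and similarly the estimation error vector of the LMMSE detector output

\[
\boldsymbol{\varepsilon}_{ms} = \mathbf{h}_{ms} - \mathbf{y}. \tag{A2.4}
\]

To explain the better performance of the CG method in comparison to the ideal decorrelating detector, it is first justified why

\[
\|\boldsymbol{\varepsilon}_{ms}\| < \|\boldsymbol{\varepsilon}_d\| \tag{A2.5}
\]

tends to be true when there is no near-far problem. The result in (A2.5) states that the LMMSE detector is a compromise between the conventional single-user matched filter detector and the decorrelating detector. Assume that the energies satisfy $E_k = E_l$, $\forall k, l$, and

\[
\gamma = \frac{E_k}{\sigma^2}. \tag{A2.6}
\]

Then it follows from (A2.3) that

\[
\|\boldsymbol{\varepsilon}_d\|^2 = \sum_{i=1}^{NK}\left(\frac{1}{\lambda_i} - 1\right)^2(\mathbf{u}_i^{\top}\mathbf{y})^2 = \sum_{i=1}^{NK} f(\lambda_i)(\mathbf{u}_i^{\top}\mathbf{y})^2, \tag{A2.7}
\]

where $f(x) = \left(\frac{1}{x} - 1\right)^2$. Since the LMMSE detector is similar to the decorrelating detector with $(\mathbf{R}^{(n)})^{-1}$ replaced by $(\mathbf{R}^{(n)} + \gamma^{-1}\mathbf{I}_{NK})^{-1}$, it follows that

\[
\|\boldsymbol{\varepsilon}_{ms}\|^2 = \sum_{i=1}^{NK}\left(\frac{1}{\lambda_i + \gamma^{-1}} - 1\right)^2(\mathbf{u}_i^{\top}\mathbf{y})^2 = \sum_{i=1}^{NK} f(\lambda_i + \gamma^{-1})(\mathbf{u}_i^{\top}\mathbf{y})^2. \tag{A2.8}
\]

From the derivative

\[
\frac{df(x)}{dx} = 2\left(\frac{1}{x^2} - \frac{1}{x^3}\right) \tag{A2.9}
\]

it is easy to see that $f(x)$ decreases very fast when $x < 1$ and increases slowly when $x > 1$. Usually some of the eigenvalues $\lambda_i$ are greater than 1 and some are smaller than 1. This implies that the values $f(\lambda_i + \gamma^{-1})$ tend to be much smaller than the values $f(\lambda_i)$ when $\lambda_i < 1$. Similarly, the values $f(\lambda_i + \gamma^{-1})$ tend to be only slightly larger than the values $f(\lambda_i)$ when $\lambda_i > 1$, justifying

\[
\|\boldsymbol{\varepsilon}_{ms}\| < \|\boldsymbol{\varepsilon}_d\|. \tag{A2.10}
\]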
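The argument behind (A2.10) can be made concrete by evaluating the sums in (A2.7) and (A2.8) directly. The sketch below assumes an illustrative eigenvalue spectrum, equal projections $(\mathbf{u}_i^{\top}\mathbf{y})^2 = 1$, and a hypothetical signal-to-noise ratio `gamma = 10`; none of these numbers come from the thesis, and the inequality is a tendency rather than a guarantee for every spectrum.

```python
def f(x):
    """f(x) = (1/x - 1)^2 from (A2.7)."""
    return (1.0 / x - 1.0) ** 2

def squared_error_norms(eigs, proj_sq, gamma):
    """Return (||eps_d||^2, ||eps_ms||^2) following (A2.7) and (A2.8).

    eigs    : eigenvalues lambda_i of the correlation matrix (assumed values)
    proj_sq : squared projections (u_i^T y)^2 (assumed values)
    gamma   : E_k / sigma^2 as in (A2.6) (assumed value)
    """
    d = sum(f(lam) * w for lam, w in zip(eigs, proj_sq))
    ms = sum(f(lam + 1.0 / gamma) * w for lam, w in zip(eigs, proj_sq))
    return d, ms

if __name__ == "__main__":
    eigs = [0.2, 0.6, 1.0, 1.4, 1.8]     # some below 1, some above 1
    proj_sq = [1.0] * len(eigs)
    gamma = 10.0
    d, ms = squared_error_norms(eigs, proj_sq, gamma)
    print("decorrelator:", d, "LMMSE:", ms)
```

For this spectrum the small eigenvalues dominate the decorrelator sum, and shifting them by $\gamma^{-1}$ reduces their contribution far more than the shift increases the terms with $\lambda_i > 1$.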
The CG method used for decorrelating detection reduces the distance

\[
\boldsymbol{\varepsilon}(m) = \mathbf{h}(m) - \mathbf{h}_d \tag{A2.11}
\]

at each iteration by smoothly increasing the subspace in which $\Omega(\mathbf{h})$ is minimized. Assume that the initial guess is

\[
\mathbf{h}(0) = \mathbf{y}. \tag{A2.12}
\]

The LMMSE detector output is geometrically between the MF bank and decorrelating detector output vectors. Therefore, it is understandable that there exists $m$ such that for the $m$th iteration the estimate $\mathbf{h}(m)$ is closer to $\mathbf{h}_{ms}$ than to $\mathbf{h}_d$. The CG estimate in a way sweeps through the LMMSE estimate.
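The sweep described above can be observed by listing, for each CG iterate started from (A2.12), its distance to the decorrelator output and to the LMMSE output. The sketch below is a minimal illustration under assumptions: the 3×3 matrix `R`, the vector `y`, and `gamma = 10` are hypothetical, and the textbook conjugate-gradient recursion of [292] is used for the system $\mathbf{R}\mathbf{h} = \mathbf{y}$.

```python
def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

def cg_iterates(R, y, iters):
    """Conjugate-gradient iterates for R h = y, started from h(0) = y."""
    h = list(y)
    r = [yi - qi for yi, qi in zip(y, matvec(R, h))]   # residual y - R h
    p = list(r)                                        # search direction
    out = [list(h)]
    for _ in range(iters):
        rr = dot(r, r)
        if rr == 0.0:
            break
        Rp = matvec(R, p)
        alpha = rr / dot(p, Rp)
        h = [hi + alpha * pi for hi, pi in zip(h, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Rp)]
        beta = dot(r, r) / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        out.append(list(h))
    return out

# Assumed toy data (not from the thesis).
R = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]]
y = [1.0, -1.0, 1.0]
gamma = 10.0
R_ms = [[R[i][j] + (1.0 / gamma if i == j else 0.0) for j in range(3)]
        for i in range(3)]

iterates = cg_iterates(R, y, 3)
h_d = iterates[-1]                       # decorrelator output (CG run to completion)
h_ms = cg_iterates(R_ms, y, 3)[-1]       # LMMSE-type output for the same y

if __name__ == "__main__":
    for m, h in enumerate(iterates):
        print(m, "to h_d:", dist(h, h_d), "to h_ms:", dist(h, h_ms))
```

Comparing the two printed columns shows at which iterations, if any, the estimate is nearer to the LMMSE-type solution than to the decorrelator solution for the chosen data.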