978-1-4244-2241-8/08/$25.00 ©2008 IEEE

[IEEE 2008 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM) - Darmstadt, Germany (2008.07.21-2008.07.23)] 2008 5th IEEE Sensor Array and Multichannel Signal Processing


ROBUST ML ESTIMATION FOR UNKNOWN NUMBERS OF SIGNALS: PERFORMANCE STUDY

Pei-Jung Chung

Institute for Digital Communications
School of Engineering and Electronics, The University of Edinburgh, UK

[email protected]

ABSTRACT

We study the performance of a recently proposed robust ML estimation procedure for unknown numbers of signals. This approach finds the ML estimate for the maximum number of signals and selects relevant components associated with the true parameters from the estimated parameter vector. Its computational cost is significantly lower than that of conventional methods based on information theoretic criteria or multiple hypothesis tests. We show that the covariance matrix of the relevant estimates is upper and lower bounded by two covariance matrices. These bounds are easy to compute from existing results for standard ML estimation. Our analysis is confirmed by numerical experiments over a wide range of SNRs.

1. INTRODUCTION

The problem of estimating direction of arrival (DOA) plays a key role in array processing, radar, communications and geophysics. The maximum likelihood (ML) approach is characterized by excellent statistical properties and robustness against small sample numbers, signal coherence and closely located sources.

The standard ML method assumes the number of signals, $m$, to be known and maximizes the concentrated likelihood function over an $m$-dimensional parameter space. In the case of unknown numbers of signals, conventional approaches based on information theoretic criteria [7] or multiple hypothesis testing procedures [3] compute the ML estimates for a sequence of model orders and select the best estimate according to the underlying criterion. The total computational cost can be very high due to the multi-dimensional search for each candidate model order.

In contrast to the aforementioned methods, the recently proposed ML estimation procedure [2] carries out optimization for the maximum possible number of signals. The resulting parameter vector contains relevant components that are associated with the true parameters. They can be recognized by their relevance value, which measures each component's contribution to the likelihood function. Further, the number of relevant components provides an estimate of the number of signals.

P.-J. Chung acknowledges support of her position from the Scottish Funding Council and their support of the Joint Research Institute with the Heriot-Watt University as a component part of the Edinburgh Research Partnership.

The purpose of this contribution is to gain more insight into the performance of this algorithm. In particular, we shall investigate the estimation errors of relevant estimates. The main difficulty here is that an overparameterized model is not uniquely identified. We overcome this problem by introducing pseudo sources with negligible powers. Based on this modified model, an upper bound and a lower bound on the covariance matrix of relevant estimates are derived.

This paper is outlined as follows. Section 2 gives a brief description of the data model. The robust ML direction finding algorithm for unknown numbers of signals [2] is summarized in Section 3. We develop the main result, upper and lower bounds on the covariance matrix, in Section 4. Numerical results are presented in Section 5. Concluding remarks are given in Section 6.

2. SIGNAL MODEL

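Consider an array of $n$ sensors receiving $m$ narrow band signals emitted by far-field sources located at $\boldsymbol{\theta}_m = [\theta_1, \ldots, \theta_m]^T$. The array output $\mathbf{x}(t)$ is described as

$$\mathbf{x}(t) = \mathbf{H}_m(\boldsymbol{\theta}_m)\,\mathbf{s}_m(t) + \mathbf{n}(t), \quad t = 1, \ldots, T, \tag{1}$$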

where the $i$th column $\mathbf{d}(\theta_i)$ of

$$\mathbf{H}_m(\boldsymbol{\theta}_m) = [\mathbf{d}(\theta_1) \cdots \mathbf{d}(\theta_m)] \tag{2}$$

represents the steering vector associated with the signal arriving from $\theta_i$. The signal vector $\mathbf{s}_m(t)$ is considered as a stationary, temporally uncorrelated complex normal process with zero mean and covariance matrix $\mathbf{C}_{s_m} = E[\mathbf{s}_m(t)\mathbf{s}_m(t)']$, where $(\cdot)'$ denotes the Hermitian transpose. The noise vector $\mathbf{n}(t)$ is a spatially and temporally uncorrelated complex normal process with zero mean and covariance matrix $\nu\mathbf{I}_n$, where $\nu$ is the noise spectral parameter and $\mathbf{I}_n$ is an $n \times n$ identity matrix. Under these assumptions, the array output $\mathbf{x}(t)$ is complex normally distributed with zero mean and covariance matrix

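$$\mathbf{C}_x = \mathbf{H}_m(\boldsymbol{\theta}_m)\,\mathbf{C}_{s_m}\,\mathbf{H}_m(\boldsymbol{\theta}_m)' + \nu\mathbf{I}_n. \tag{3}$$

Based on array observations $\{\mathbf{x}(t)\}_{t=1}^{T}$ and a pre-specified number of signals, $m$, the ML estimate $\hat{\boldsymbol{\theta}}_m$ is obtained by minimizing the negative concentrated likelihood function [1]

$$l_T(\boldsymbol{\theta}_m) = \log\det\left(\mathbf{P}_m(\boldsymbol{\theta}_m)\,\hat{\mathbf{C}}_x\,\mathbf{P}_m(\boldsymbol{\theta}_m) + \hat{\nu}\,\mathbf{P}_m^{\perp}(\boldsymbol{\theta}_m)\right), \tag{4}$$

$$\hat{\nu} = \frac{1}{n-m}\,\mathrm{tr}\left(\mathbf{P}_m^{\perp}(\boldsymbol{\theta}_m)\,\hat{\mathbf{C}}_x\right), \tag{5}$$

where $\mathbf{P}_m(\boldsymbol{\theta}_m)$ represents the projection matrix onto the column space of $\mathbf{H}_m(\boldsymbol{\theta}_m)$, $\mathbf{P}_m^{\perp}(\boldsymbol{\theta}_m) = \mathbf{I}_n - \mathbf{P}_m(\boldsymbol{\theta}_m)$, and $\hat{\mathbf{C}}_x = \frac{1}{T}\sum_{t=1}^{T}\mathbf{x}(t)\mathbf{x}(t)'$ denotes the sample covariance matrix. The problem of central interest is to estimate the DOA parameters when the true number of signals, $m_0$, is unknown.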
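The data model of this section can be illustrated with a small numerical sketch. The code below draws snapshots according to (1) and forms the sample covariance matrix. The half-wavelength uniform linear array, the broadside angle convention, the unit source powers and the random seed are assumptions made only for this illustration, and the function names are hypothetical.

```python
import numpy as np

def steering_matrix(theta_deg, n):
    """n x m steering matrix of a half-wavelength ULA (assumed geometry).

    Assumption: angles are measured from broadside, so sensor k sees the
    phase factor exp(-j*pi*k*sin(theta)).
    """
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(n)[:, None]                      # sensor index as a column
    return np.exp(-1j * np.pi * k * np.sin(theta)[None, :])

def complex_normal(shape, rng):
    """Circular complex normal samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_snapshots(theta_deg, powers, nu, n, T, rng):
    """Draw T snapshots x(t) = H s(t) + n(t) as in (1), uncorrelated sources."""
    H = steering_matrix(theta_deg, n)
    m = H.shape[1]
    s = np.sqrt(np.asarray(powers))[:, None] * complex_normal((m, T), rng)
    noise = np.sqrt(nu) * complex_normal((n, T), rng)
    return H @ s + noise                           # n x T data matrix

def sample_covariance(X):
    """(1/T) * sum_t x(t) x(t)' for an n x T data matrix X."""
    return X @ X.conj().T / X.shape[1]

rng = np.random.default_rng(0)
n, T, nu = 10, 100, 1.0
theta0, powers = [28.0, 36.0], [1.0, 1.0]
X = simulate_snapshots(theta0, powers, nu, n, T, rng)
C_hat = sample_covariance(X)

# True covariance (3), for comparison with the sample covariance.
H0 = steering_matrix(theta0, n)
C_true = H0 @ np.diag(powers) @ H0.conj().T + nu * np.eye(n)
```

For growing $T$ the matrix `C_hat` approaches `C_true`, which is the property the asymptotic analysis relies on.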

3. ROBUST ML ESTIMATION FOR UNKNOWN NUMBERS OF SIGNALS

From the asymptotic property of ML estimation under misspecified numbers of signals, we know that for $m > m_0$, the ML estimate $\hat{\boldsymbol{\theta}}_m$ contains $m_0$ components corresponding to the true parameters in $\boldsymbol{\theta}_0$ [4]. Moreover, the value of the negative likelihood function (4) attains its minimum at $m_0$ and remains constant for $m \ge m_0$. Motivated by this observation, we proposed a robust ML estimation procedure in [2] that requires only an upper bound $M$ on the number of signals. The likelihood function is maximized over an $M$-dimensional parameter space. From the resulting estimate, we select the relevant components according to their contribution to the likelihood function.

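More precisely, let $\hat{\boldsymbol{\theta}}_M$ denote the ML estimate obtained by minimizing the negative log-likelihood function (4) with $m = M$:

$$\hat{\boldsymbol{\theta}}_M = \arg\min_{\boldsymbol{\theta}_M} l_T(\boldsymbol{\theta}_M). \tag{6}$$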

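Since $M \ge m_0$, the $M \times 1$ vector $\hat{\boldsymbol{\theta}}_M = [\hat{\theta}_1, \ldots, \hat{\theta}_M]^T$ contains at least as many elements as the true parameter vector $\boldsymbol{\theta}_0$. For large $T$, a subset of the elements in $\hat{\boldsymbol{\theta}}_M$ coincides with those of $\boldsymbol{\theta}_0$. The elements of $\hat{\boldsymbol{\theta}}_M$ that are associated with $\boldsymbol{\theta}_0$ are referred to as relevant estimates.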

To select the relevant components, we compute the relevance value for each component $\hat{\theta}_i$ as follows

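where $\tilde{\boldsymbol{\theta}}_i$ contains all elements of $\hat{\boldsymbol{\theta}}_M$ except the $i$th component $\hat{\theta}_i$:

$$\tilde{\boldsymbol{\theta}}_i = [\hat{\theta}_1, \ldots, \hat{\theta}_{i-1}, \hat{\theta}_{i+1}, \ldots, \hat{\theta}_M]^T. \tag{8}$$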

Note that the log-likelihood $l_T(\tilde{\boldsymbol{\theta}}_i)$ is computed by (4) using the $(M-1)$-dimensional vector $\tilde{\boldsymbol{\theta}}_i$. The value of $R(\hat{\theta}_i)$ indicates the contribution of the $i$th element $\hat{\theta}_i$ to the log-likelihood function. The normalizing factor $1/l_T(\hat{\boldsymbol{\theta}}_M)$ is used to improve numerical stability. As pointed out in [2], a high relevance value $R(\hat{\theta}_i)$ implies that $\hat{\theta}_i$ is associated with one of the components of $\boldsymbol{\theta}_0$. A low relevance value (usually close to zero) indicates that $\hat{\theta}_i$ does not correspond to any true parameter. Based on this observation, we decide $\hat{\theta}_i$ to be a relevant estimate if $R(\hat{\theta}_i)$ exceeds a given threshold $\alpha$.
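The relevance computation described above can be sketched numerically. Since the exact expression of the relevance value is not reproduced here, the code assumes the form $|l_T(\tilde{\boldsymbol{\theta}}_i) - l_T(\hat{\boldsymbol{\theta}}_M)| / |l_T(\hat{\boldsymbol{\theta}}_M)|$, which matches the verbal description (change in the criterion when one component is removed, normalized by $1/l_T(\hat{\boldsymbol{\theta}}_M)$) but may differ in detail from the definition in [2]. The array geometry and the function names are likewise assumptions for illustration.

```python
import numpy as np

def steering_matrix(theta_deg, n):
    """Half-wavelength ULA steering matrix (assumed geometry, broadside angles)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(n)[:, None]
    return np.exp(-1j * np.pi * k * np.sin(theta)[None, :])

def neg_loglik(theta_deg, C_hat):
    """Negative concentrated likelihood (4) with the noise estimate (5)."""
    n = C_hat.shape[0]
    H = steering_matrix(theta_deg, n)
    m = H.shape[1]
    P = H @ np.linalg.pinv(H)                      # projector onto span(H)
    P_perp = np.eye(n) - P
    nu_hat = np.trace(P_perp @ C_hat).real / (n - m)
    _, logdet = np.linalg.slogdet(P @ C_hat @ P + nu_hat * P_perp)
    return logdet

def relevance_values(theta_hat_deg, C_hat):
    """Assumed relevance: |l_T(theta without i) - l_T(theta_hat)| / |l_T(theta_hat)|."""
    theta_hat = np.asarray(theta_hat_deg, dtype=float)
    l_full = neg_loglik(theta_hat, C_hat)
    R = []
    for i in range(theta_hat.size):
        # (M-1)-dimensional vector of (8): drop the ith component.
        l_reduced = neg_loglik(np.delete(theta_hat, i), C_hat)
        R.append(abs(l_reduced - l_full) / abs(l_full))
    return np.array(R)

# Idealised check: the exact covariance (3) replaces the sample covariance.
n, nu = 10, 1.0
H0 = steering_matrix([28.0, 36.0], n)
C_x = H0 @ H0.conj().T + nu * np.eye(n)            # two unit-power sources
R = relevance_values([28.0, 36.0, 60.0], C_x)      # the third angle is spurious
```

In this idealised setting the spurious component at 60° leaves the criterion unchanged when removed, so its relevance is numerically zero, while removing either true direction increases the criterion noticeably.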

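Let $R_{(1)} \ge R_{(2)} \ge \cdots \ge R_{(M)}$ denote the ordered relevance values with corresponding estimates $\{\hat{\theta}_{(1)}, \hat{\theta}_{(2)}, \ldots, \hat{\theta}_{(M)}\}$. The set of relevant estimates is given by

$$S = \{\hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(k)}\}, \tag{9}$$

where

$$R_{(1)} \ge R_{(2)} \ge \cdots \ge R_{(k)} \ge \alpha. \tag{10}$$

We also define the $k \times 1$ relevant vector as $\hat{\boldsymbol{\theta}}_0 = [\hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(k)}]^T$. Asymptotically, the number of components in $S$ or $\hat{\boldsymbol{\theta}}_0$ coincides with the true number of signals $m_0$.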
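The ordering and thresholding step can be written in a few lines. The sketch below takes relevance values as given; the numbers used are hypothetical and serve only to show the mechanics, and the function name is not taken from [2].

```python
def select_relevant(theta_hat, relevance, alpha):
    """Return (relevant estimates ordered by decreasing relevance, k)."""
    # Sort component indices so that R_(1) >= R_(2) >= ... >= R_(M).
    order = sorted(range(len(theta_hat)), key=lambda i: relevance[i], reverse=True)
    # Keep the leading components whose relevance is at least alpha, as in (10).
    selected = [theta_hat[i] for i in order if relevance[i] >= alpha]
    return selected, len(selected)

theta_hat = [35.8, 61.3, 28.1]          # hypothetical ML estimate for M = 3
relevance = [0.64, 0.002, 0.71]         # hypothetical relevance values
theta_rel, k = select_relevant(theta_hat, relevance, alpha=0.015)
```

Here `k` plays the role of the estimated number of signals and `theta_rel` that of the relevant vector.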

4. COVARIANCE MATRIX OF RELEVANT ESTIMATES

The consistency property of ML estimation has been established previously [4]. In the variance study, the main challenge is that an overparameterized model is not uniquely identified. The asymptotic covariance matrix represents a class of matrices involving a generalized inverse, rather than a uniquely defined matrix [5]. In the following, rather than applying the elegant results of [5] directly, we shall take a more practical approach to assess estimation errors. More specifically, we show that the asymptotic covariance matrix of relevant estimates is lower and upper bounded by two covariance matrices associated with $m_0$ and $(M+1)$ signals in standard ML estimation.

To tackle the identification problem, we consider a modified version of the signal model (1):

$$\mathbf{x}(t) = \mathbf{H}_0(\boldsymbol{\theta}_0)\,\mathbf{s}_0(t) + \mathbf{H}_\epsilon(\boldsymbol{\theta}_\epsilon)\,\mathbf{s}_\epsilon(t) + \mathbf{n}(t), \tag{11}$$

where the subscript $(\cdot)_0$ is used to emphasize that the steering matrix $\mathbf{H}_0(\boldsymbol{\theta}_0)$ and the signal vector $\mathbf{s}_0(t)$ are associated with the real signal sources. The $n \times (M - m_0)$ matrix $\mathbf{H}_\epsilon(\boldsymbol{\theta}_\epsilon)$ denotes the steering matrix associated with the pseudo sources. The $(M - m_0) \times 1$ signal vector $\mathbf{s}_\epsilon(t)$ corresponding to the pseudo sources is independent of $\mathbf{s}_0(t)$ with the covariance matrix

$$\mathbf{C}_\epsilon = E[\mathbf{s}_\epsilon(t)\mathbf{s}_\epsilon(t)'] = \epsilon\,\mathbf{I}_{M-m_0}, \tag{12}$$

where $\epsilon > 0$ is small compared to the real signal powers, i.e. $\epsilon \ll \mathrm{tr}(\mathbf{C}_{s_0})$. The covariance matrix of array outputs (3) becomes

$$\mathbf{C}_x = \mathbf{H}_0\mathbf{C}_{s_0}\mathbf{H}_0' + \mathbf{H}_\epsilon\mathbf{C}_\epsilon\mathbf{H}_\epsilon' + \nu\mathbf{I}_n, \tag{13}$$

where $\mathbf{H}_0 = \mathbf{H}_0(\boldsymbol{\theta}_0)$, $\mathbf{H}_\epsilon = \mathbf{H}_\epsilon(\boldsymbol{\theta}_\epsilon)$ and $\mathbf{C}_{s_0} = E[\mathbf{s}_0(t)\mathbf{s}_0(t)']$. Note that for $\epsilon \to 0$, (11) and (13) approach the true signal model and covariance matrix, respectively. Without loss of generality, we assume that $\mathbf{H} = [\mathbf{H}_0\ \mathbf{H}_\epsilon]$ is of full rank.

Based on the model (11) and results from standard ML estimation [6], we shall derive upper and lower bounds on the asymptotic covariance matrix of relevant estimates

$$\mathbf{K}_0 = E[(\hat{\boldsymbol{\theta}}_0 - \boldsymbol{\theta}_0)(\hat{\boldsymbol{\theta}}_0 - \boldsymbol{\theta}_0)^T]. \tag{14}$$

4.1. Existing Results for Standard ML Estimation

The following properties of standard ML estimation [6] are essential to the proof of Theorem 1.

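Result 1 (Explicit formula for asymptotic covariance matrix) Suppose the signal model (1) holds and the number of signals, $m$, is known. The asymptotic covariance matrix of the ML estimator, $\mathbf{K}(m) = E[(\hat{\boldsymbol{\theta}}_m - \boldsymbol{\theta}_m)(\hat{\boldsymbol{\theta}}_m - \boldsymbol{\theta}_m)^T]$, is given by

$$\mathbf{K}(m) = \frac{\nu}{2T}\left\{\mathrm{Re}\left[\left(\mathbf{D}_m^H\mathbf{P}_m^{\perp}\mathbf{D}_m\right)\odot\left(\mathbf{C}_{s_m}\mathbf{H}_m^H\mathbf{C}_x^{-1}\mathbf{H}_m\mathbf{C}_{s_m}\right)^T\right]\right\}^{-1}, \tag{15}$$

where $\odot$ denotes the Hadamard (elementwise) product and $(\cdot)^H$ the Hermitian transpose. The $n \times m$ matrix $\mathbf{D}_m = [\mathbf{d}'(\theta_1) \cdots \mathbf{d}'(\theta_m)]$ contains the first derivatives of the steering vectors, $\mathbf{d}'(\theta_i) = \partial\mathbf{d}(\theta_i)/\partial\theta_i$, where the prime here denotes differentiation.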
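As a quick sanity check of (15), consider the single-source case $m = 1$ with signal power $p$. The sketch below additionally assumes a steering vector with unit-modulus entries, so that $\mathbf{d}^H\mathbf{d} = n$; the symbol $p$ is introduced only for this illustration.

```latex
\begin{aligned}
\mathbf{C}_x\,\mathbf{d} &= \left(p\,\mathbf{d}\mathbf{d}^H + \nu\mathbf{I}_n\right)\mathbf{d} = (np + \nu)\,\mathbf{d}
\quad\Rightarrow\quad \mathbf{C}_x^{-1}\mathbf{d} = \frac{\mathbf{d}}{np + \nu}, \\
p\,\mathbf{d}^H\mathbf{C}_x^{-1}\mathbf{d}\,p &= \frac{n p^2}{np + \nu}, \\
K(1) &= \frac{\nu}{2T}\left[(\mathbf{d}')^H\mathbf{P}_1^{\perp}\mathbf{d}' \cdot \frac{n p^2}{np + \nu}\right]^{-1}
= \frac{\nu\,(np + \nu)}{2T\,n\,p^2\,(\mathbf{d}')^H\mathbf{P}_1^{\perp}\mathbf{d}'} .
\end{aligned}
```

The variance thus decays as $1/T$, and for $np \gg \nu$ it behaves like $\nu / \big(2T\,p\,(\mathbf{d}')^H\mathbf{P}_1^{\perp}\mathbf{d}'\big)$, i.e. inversely proportional to the SNR $p/\nu$.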

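Result 2 (R7 of [6]) If the $(m+1)$st source is uncorrelated with the other $m$, then $\mathbf{K}(m)$ increases monotonically with increasing $m$. More specifically,

$$[\,\mathbf{I}_m\ \ \mathbf{0}\,]\,\mathbf{K}(m+1)\begin{bmatrix}\mathbf{I}_m \\ \mathbf{0}^T\end{bmatrix} \ge \mathbf{K}(m), \tag{16}$$

where $\mathbf{I}_m$ is an $m \times m$ identity matrix, $\mathbf{0}$ is an $m \times 1$ zero vector, and the $(m+1) \times (m+1)$ matrix $\mathbf{K}(m+1)$ denotes the covariance matrix when $(m+1)$ signals are present.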

4.2. Upper bound and lower bound on $\mathbf{K}_0$

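We introduced $(M - m_0)$ pseudo signals in the approximate signal model (11) so that the parameter $\boldsymbol{\theta}_M = [\boldsymbol{\theta}_0^T, \boldsymbol{\theta}_\epsilon^T]^T$ is uniquely identified. The normalized estimator $\sqrt{T}(\hat{\boldsymbol{\theta}}_M - \boldsymbol{\theta}_M)$ has an asymptotic normal distribution with zero mean and covariance matrix $\mathbf{K}(M)$. This matrix varies with the values of $\boldsymbol{\theta}_\epsilon$ and $\epsilon$. However, for any choice of the $(M+1)$st uncorrelated signal, $\mathbf{K}(M)$ is upper bounded by $\mathbf{K}(M+1)$. According to Result 2,

$$[\,\mathbf{I}_M\ \ \mathbf{0}\,]\,\mathbf{K}(M+1)\begin{bmatrix}\mathbf{I}_M \\ \mathbf{0}^T\end{bmatrix} \ge \mathbf{K}(M), \tag{17}$$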

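where $\mathbf{K}(M+1)$ is the covariance matrix when $m = M + 1$. Multiplying both sides of (17) from the left with $[\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{M-m_0}\,]$ and from the right with $[\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{M-m_0}\,]^T$, where $\mathbf{0}_{M-m_0}$ denotes an $m_0 \times (M - m_0)$ zero matrix, we obtain an upper bound for the covariance matrix $\mathbf{K}_0$ as follows,

$$[\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{M+1-m_0}\,]\,\mathbf{K}(M+1)\begin{bmatrix}\mathbf{I}_{m_0} \\ \mathbf{0}_{M+1-m_0}^T\end{bmatrix} \ge \mathbf{K}_0. \tag{18}$$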
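The step from (17) to (18) rests on two elementary facts, spelled out here as a sketch: a congruence with a fixed matrix preserves the matrix ordering, and the two selection matrices compose into one. The shorthand $\mathbf{S}$ is introduced only for this illustration.

```latex
\begin{aligned}
\mathbf{S} &= [\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{m_0 \times (M-m_0)}\,], \\
\mathbf{A} \ge \mathbf{B} \ &\Rightarrow\ \mathbf{S}\mathbf{A}\mathbf{S}^T - \mathbf{S}\mathbf{B}\mathbf{S}^T
 = \mathbf{S}(\mathbf{A} - \mathbf{B})\mathbf{S}^T \ge \mathbf{0}, \\
\mathbf{S}\,[\,\mathbf{I}_M\ \ \mathbf{0}\,] &= [\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{m_0 \times (M+1-m_0)}\,], \\
\mathbf{S}\,\mathbf{K}(M)\,\mathbf{S}^T &= \mathbf{K}_0 .
\end{aligned}
```

Applying the second line to (17) with the third and fourth lines gives the bound on $\mathbf{K}_0$: the leading $m_0 \times m_0$ block of $\mathbf{K}(M+1)$ dominates the leading $m_0 \times m_0$ block of $\mathbf{K}(M)$.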

The lower bound is derived by applying Result 2 again. More specifically,

$$\mathbf{K}_0 = [\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{M-m_0}\,]\,\mathbf{K}(M)\begin{bmatrix}\mathbf{I}_{m_0} \\ \mathbf{0}_{M-m_0}^T\end{bmatrix} \ge \mathbf{K}(m_0), \tag{19}$$

where $\mathbf{K}(m_0)$ is the covariance matrix for the standard ML estimator with $m = m_0$. Eq. (19) shows that the estimation accuracy is affected by overparameterization. Eq. (18) shows that the impact of model order mismatch on estimation performance is no larger than that caused by adding $(M + 1 - m_0)$ additional signals to the original $m_0$ signals.

A great advantage of the above analysis is that $\mathbf{K}(M+1)$ and $\mathbf{K}(m_0)$ can be easily computed by the existing formula (15). The parameter vector $\boldsymbol{\theta}_{M+1} = [\boldsymbol{\theta}_0^T, \theta_{m_0+1}, \ldots, \theta_{M+1}]^T$ at which $\mathbf{K}(M+1)$ is evaluated has the true parameter $\boldsymbol{\theta}_0$ as its first $m_0$ elements. The remaining parameters can be chosen arbitrarily, provided that the corresponding steering matrix $\mathbf{H}_{M+1}$ has full rank. The matrix $\mathbf{K}(m_0)$ is evaluated at the true parameter $\boldsymbol{\theta}_0$. The lower bound can only be achieved when $M = m_0$, i.e. when no model order mismatch is present. Although these bounds are not necessarily tight, they provide a convenient method to evaluate the performance of an overparameterized model. The following theorem summarizes the above discussion.

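Theorem 1 Consider the algorithm described in Section 3. The asymptotic covariance matrix $\mathbf{K}_0 = E[(\hat{\boldsymbol{\theta}}_0 - \boldsymbol{\theta}_0)(\hat{\boldsymbol{\theta}}_0 - \boldsymbol{\theta}_0)^T]$, where $\hat{\boldsymbol{\theta}}_0$ consists of relevant estimates, is upper and lower bounded by $\mathbf{K}(M+1)$ and $\mathbf{K}(m_0)$, respectively. More precisely,

$$[\,\mathbf{I}_{m_0}\ \ \mathbf{0}_{M+1-m_0}\,]\,\mathbf{K}(M+1)\begin{bmatrix}\mathbf{I}_{m_0} \\ \mathbf{0}_{M+1-m_0}^T\end{bmatrix} \ge \mathbf{K}_0 \ge \mathbf{K}(m_0). \tag{20}$$

The covariance matrices $\mathbf{K}(M+1)$ and $\mathbf{K}(m_0)$ can be computed by the closed form expression for the covariance matrix of standard ML estimation (15).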
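The two bounds of Theorem 1 can be evaluated numerically from (15). The following sketch does so for $m_0 = 2$ and $M = 3$. It assumes a half-wavelength uniform linear array with angles measured from broadside, equal unit source powers, unit noise level and uncorrelated sources; these choices and the function names are illustrative assumptions rather than the exact configuration behind the reported figures.

```python
import numpy as np

def steering_and_derivative(theta_deg, n):
    """ULA steering matrix H and derivative matrix D (assumed geometry)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))[None, :]
    k = np.arange(n)[:, None]
    H = np.exp(-1j * np.pi * k * np.sin(theta))
    D = (-1j * np.pi * k * np.cos(theta)) * H      # derivative w.r.t. theta in radians
    return H, D

def ml_covariance(theta_deg, powers, nu, n, T):
    """Asymptotic covariance (15) for uncorrelated sources, in rad^2."""
    H, D = steering_and_derivative(theta_deg, n)
    Cs = np.diag(np.asarray(powers, dtype=float))
    Cx = H @ Cs @ H.conj().T + nu * np.eye(n)
    P_perp = np.eye(n) - H @ np.linalg.pinv(H)
    A = D.conj().T @ P_perp @ D
    B = Cs @ H.conj().T @ np.linalg.solve(Cx, H) @ Cs
    return nu / (2 * T) * np.linalg.inv(np.real(A * B.T))   # A * B.T is the Hadamard product

n, T, nu, m0 = 10, 100, 1.0, 2
K_lower = ml_covariance([28, 36], [1, 1], nu, n, T)                # K(m0)
K_full = ml_covariance([28, 36, 20, 50], [1, 1, 1, 1], nu, n, T)   # K(M+1) with M = 3
K_upper = K_full[:m0, :m0]                                         # leading block, as in (20)

# Standard deviations in degrees for the two true directions.
lb_deg = np.rad2deg(np.sqrt(np.diag(K_lower)))
ub_deg = np.rad2deg(np.sqrt(np.diag(K_upper)))
```

Taking the leading $m_0 \times m_0$ block of `K_full` is exactly the multiplication by the selection matrices in (20); the empirical standard deviations of the relevant estimates are then expected to lie between `lb_deg` and `ub_deg` for large $T$.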


5. SIMULATION


In the simulation, a uniform linear array of 10 sensors with inter-element spacings of half a wavelength is employed. The narrow band signals are generated by $m_0 = 2$ uncorrelated sources of different strengths located at $\boldsymbol{\theta}_0 = [28°\ 36°]$. The data length is $T = 100$. The signal strengths are [1 0] dB, where 0 dB corresponds to the reference signal. The signal to noise ratio (SNR) varies from −10 to 10 dB in 2 dB steps. We consider two upper bounds on the number of signals: $M = 3$ and $M = 4$. The latter represents a larger mismatch in the model order. The threshold $\alpha$ is chosen to be 0.015. Each experiment performs 200 Monte Carlo trials.
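The settings above translate into a small configuration sketch. The conversion from dB to linear powers assumes that the SNR is defined as the power of the 0 dB reference signal divided by the noise level $\nu$; the text does not state this convention explicitly, so it is an assumption, as are the variable and function names.

```python
# Simulation settings quoted in the text.
n_sensors, T, n_trials, alpha = 10, 100, 200, 0.015
theta0_deg = [28.0, 36.0]
upper_bounds_M = [3, 4]

snr_grid_db = list(range(-10, 11, 2))             # -10, -8, ..., 10 dB
strength_offsets_db = [1.0, 0.0]                  # relative to the 0 dB reference signal

def db_to_linear(db):
    """Convert a power ratio in dB to linear scale."""
    return 10.0 ** (db / 10.0)

def signal_powers(snr_db, nu=1.0):
    """Assumed convention: SNR is the reference-signal power over the noise level nu."""
    reference = nu * db_to_linear(snr_db)
    return [reference * db_to_linear(offset) for offset in strength_offsets_db]
```

With this convention, `signal_powers(0.0)` gives a reference power equal to the noise level and a first source that is 1 dB stronger.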


Fig. 1 shows the square root of the empirical variances for $\hat{\theta}_1$, $\hat{\theta}_2$. The lower bound and upper bound computed from Theorem 1 are denoted by LBi and UBi, $i = 1, 2$, respectively. We observe that the variances decrease with increasing SNR. As predicted by our analysis, $\mathrm{var}(\hat{\theta}_1)$ and $\mathrm{var}(\hat{\theta}_2)$ are lower bounded by the variances for the optimal case, $\mathbf{K}(2)$, and upper bounded by the worst case, $\mathbf{K}(4)$. Note that $\mathbf{K}(2)$ is calculated at $\boldsymbol{\theta}_0 = [28°\ 36°]$ and $\mathbf{K}(4)$ is calculated at $[28°\ 36°\ 20°\ 50°]$ by eq. (15). The relative strength of the additional two signals is 0 dB.

[Figure: plot titled "Variance, M=3"; horizontal axis: SNR.]

Fig. 1. Square root of empirical variances of relevant estimates. $m_0 = 2$, $M = 3$.

In Fig. 2 we observe that the empirical variances for $M = 4$ increase due to a larger mismatch in numbers of signals. However, they are still well bounded by the two theoretical bounds obtained from $\mathbf{K}(2)$ and $\mathbf{K}(5)$. The matrix $\mathbf{K}(5)$ is calculated at $[28°\ 36°\ 20°\ 50°\ 80°]$. The relative strength of the additional three signals is 0 dB.

In summary, the variance of relevant estimates is larger than that in the optimal case where no mismatch is present, i.e. $m = m_0$. Furthermore, it is also upper bounded by the corresponding diagonal elements of $\mathbf{K}(M+1)$, as predicted by Theorem 1.

6. CONCLUSION

We studied the performance of a recently proposed robust ML direction finding algorithm for unknown numbers of signals. The underlying algorithm computes ML estimates for the maximally hypothesized model and selects DOA estimates by the relevance value associated with each component. Due to overparameterization, the signal model and the asymptotic covariance matrix are not uniquely identifiable. We overcome this difficulty by considering an approximate signal model with pseudo signals of negligible strengths and derive an upper and a lower bound on the variance of relevant estimates. The impact of overparameterization on variance is similar to that of introducing additional signals. Simulation results under various settings show good agreement with the theoretical analysis.

[Figure: plot titled "Variance, M=4"; horizontal axis: SNR.]

Fig. 2. Square root of empirical variances of relevant estimates. $m_0 = 2$, $M = 4$.


7. REFERENCES

[1] J. F. Böhme. Statistical array processing of measured sonar and seismic data. In Proc. SPIE 2563 Advanced Signal Processing Algorithms, pp. 2-20, San Diego, July 1995.

[2] P.-J. Chung. Robust ML estimation for unknown numbers of signals. In Proc. EUSIPCO, Poznan, Poland, September 2007.

[3] P.-J. Chung, J. F. Böhme, C. F. Mecklenbräuker and A. O. Hero. Detection of the number of signals using the Benjamini-Hochberg procedure. IEEE Trans. Signal Processing, 55(6), pp. 2497-2508, June 2007.

[4] P.-J. Chung. Stochastic maximum likelihood estimation under misspecified numbers of signals. IEEE Trans. Signal Processing, 55(9), pp. 4726-4731, September 2007.

[5] A. Shapiro. Asymptotic theory of overparameterized structural models. Journal of the American Statistical Association, 81(393), pp. 142-149, March 1986.

[6] P. Stoica and A. Nehorai. Performance study of conditional and unconditional direction-of-arrival estimation. IEEE Trans. Acoustics, Speech, and Signal Processing, 38(10), pp. 1783-1795, October 1990.

[7] M. Wax and I. Ziskind. Detection of the number of coherent signals by the MDL principle. IEEE Trans. Acoustics, Speech, and Signal Processing, 37(8), pp. 1190-1196, August 1989.
