ROBUST ML ESTIMATION FOR UNKNOWN NUMBERS OF SIGNALS: PERFORMANCE STUDY
Pei-Jung Chung
Institute for Digital Communications, School of Engineering and Electronics, The University of Edinburgh, UK
ABSTRACT
We study the performance of a recently proposed robust ML estimation procedure for unknown numbers of signals. This approach finds the ML estimate for the maximum number of signals and selects relevant components associated with the true parameters from the estimated parameter vector. Its computational cost is significantly lower than that of conventional methods based on information theoretic criteria or multiple hypothesis tests. We show that the covariance matrix of relevant estimates is upper and lower bounded by two covariance matrices. These bounds are easy to compute using existing results for standard ML estimation. Our analysis is further confirmed by numerical experiments over a wide range of SNRs.
1. INTRODUCTION
The problem of estimating direction of arrival (DOA) plays a key role in array processing, radar, communications and geophysics. The maximum likelihood (ML) approach is characterized by excellent statistical properties and robustness against small sample numbers, signal coherence and closely located sources.
The standard ML method assumes the number of signals, $m$, to be known and maximizes the concentrated likelihood function over an $m$-dimensional parameter space. In the case of unknown numbers of signals, conventional approaches based on information theoretic criteria [7] or multiple hypothesis testing procedures [3] compute the ML estimates for a sequence of model orders and select the best estimate according to the underlying criterion. The total computational cost can be very high due to the multi-dimensional search required for each candidate model order.
In contrast to the aforementioned methods, the recently proposed ML estimation procedure [2] carries out optimization for the maximum possible number of signals. The resulting parameter vector contains relevant components that are associated with the true parameters. They can be recognized by the relevance value that measures each component's contribution to the likelihood function. Further, the number of relevant components provides an estimate for the number of signals.

P.-J. Chung acknowledges support of her position from the Scottish Funding Council and its support of the Joint Research Institute with Heriot-Watt University as a component part of the Edinburgh Research Partnership.
The purpose of this contribution is to gain more insight into the performance of this algorithm. In particular, we shall investigate the estimation errors of relevant estimates. The main difficulty here is that an overparameterized model is not uniquely identified. We overcome this problem by introducing pseudo sources with negligible powers. Based on this modified model, an upper bound and a lower bound on the covariance matrix of relevant estimates are derived.
This paper is outlined as follows. Section 2 gives a brief description of the data model. The robust ML direction finding algorithm for unknown numbers of signals [2] is summarized in Section 3. We develop the main result, upper and lower bounds on the covariance matrix, in Section 4. Numerical results are presented in Section 5. Concluding remarks are given in Section 6.
2. SIGNAL MODEL
Consider an array of $n$ sensors receiving $m$ narrow band signals emitted by far-field sources located at $\theta_m = [\theta_1, \ldots, \theta_m]^T$. The array output $x(t)$ is described as

$x(t) = H_m(\theta_m) s_m(t) + n(t), \quad t = 1, \ldots, T,$  (1)

where the $i$th column $d(\theta_i)$ of the steering matrix

$H_m(\theta_m) = [d(\theta_1), \ldots, d(\theta_m)]$  (2)

represents the steering vector associated with the signal arriving from $\theta_i$. The signal vector $s_m(t)$ is considered as a stationary, temporally uncorrelated complex normal process with zero mean and covariance matrix $C_{s_m} = E[s_m(t) s_m(t)']$, where $(\cdot)'$ denotes the Hermitian transpose. The noise vector
978-1-4244-2241-8/08/$25.00 ©2008 IEEE
$n(t)$ is a spatially and temporally uncorrelated complex normal process with zero mean and covariance matrix $\nu I_n$, where $\nu$ is the noise spectral parameter and $I_n$ is an $n \times n$ identity matrix. Under these assumptions, the array output $x(t)$ is complex normally distributed with zero mean and covariance matrix

$C_x = H_m(\theta_m) C_{s_m} H_m(\theta_m)' + \nu I_n.$  (3)
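As an illustrative sketch (not part of the paper), the model (1) and covariance (3) can be simulated numerically. The half-wavelength uniform linear array, source powers, and random seed below are assumptions chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def steering(n, thetas):
    # Steering matrix for a uniform linear array with half-wavelength
    # spacing; DOAs in radians, measured from broadside (assumed geometry).
    k = np.arange(n)[:, None]
    return np.exp(-1j * np.pi * k * np.sin(np.atleast_1d(thetas))[None, :])

n, T = 10, 100                        # sensors, snapshots
theta_true = np.deg2rad([28.0, 36.0])
H = steering(n, theta_true)
Cs = np.diag([10.0, 10.0])            # uncorrelated source powers (assumed)
nu = 1.0                              # noise spectral parameter

# Complex normal signals and noise; x(t) = H s(t) + n(t) as in (1).
s = np.sqrt(np.diag(Cs) / 2)[:, None] * (
    rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T)))
w = np.sqrt(nu / 2) * (rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T)))
X = H @ s + w

# The sample covariance approaches the model covariance (3) as T grows.
C_hat = X @ X.conj().T / T
C_model = H @ Cs @ H.conj().T + nu * np.eye(n)
```

With $T = 100$ snapshots the sample covariance already tracks (3) to within a modest relative error.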
Based on array observations $\{x(t)\}_{t=1}^{T}$ and a pre-specified number of signals, $m$, the ML estimate $\hat{\theta}_m$ is obtained by minimizing the negative concentrated likelihood function [1]

$l_T(\theta_m) = \log\det\big( P_m(\theta_m) \hat{C}_x P_m(\theta_m) + \hat{\nu} P_m^{\perp}(\theta_m) \big),$  (4)

$\hat{\nu} = \frac{1}{n-m} \mathrm{tr}\big( P_m^{\perp}(\theta_m) \hat{C}_x \big),$  (5)

where $P_m(\theta_m)$ represents the projection matrix onto the column space of $H_m(\theta_m)$, $P_m^{\perp}(\theta_m) = I_n - P_m(\theta_m)$, and $\hat{C}_x = \frac{1}{T}\sum_{t=1}^{T} x(t)x(t)'$ denotes the sample covariance matrix. The problem of central interest is to estimate the DOA parameters when the true number of signals, $m_0$, is unknown.
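The concentrated objective (4)-(5) is straightforward to evaluate numerically. The sketch below (array geometry, source powers, and seed are assumptions, not from the paper) checks that the negative concentrated likelihood is smaller at the true DOAs than at a badly mismatched pair:

```python
import numpy as np

def steering(n, thetas):
    # Half-wavelength ULA steering matrix (assumed geometry).
    k = np.arange(n)[:, None]
    return np.exp(-1j * np.pi * k * np.sin(np.atleast_1d(thetas))[None, :])

def neg_loglike(thetas, C_hat):
    # Negative concentrated likelihood (4) with the noise estimate (5).
    n, m = C_hat.shape[0], len(thetas)
    H = steering(n, thetas)
    P = H @ np.linalg.pinv(H)            # projector onto the column space of H
    P_perp = np.eye(n) - P
    nu_hat = np.trace(P_perp @ C_hat).real / (n - m)
    return np.linalg.slogdet(P @ C_hat @ P + nu_hat * P_perp)[1]

# Synthetic snapshots from two sources at 28 and 36 degrees (assumed setup).
rng = np.random.default_rng(1)
n, T = 10, 200
theta_true = np.deg2rad([28.0, 36.0])
H0 = steering(n, theta_true)
s = np.sqrt(5.0) * (rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T)))
w = (rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T))) / np.sqrt(2)
C_hat = (H0 @ s + w) @ (H0 @ s + w).conj().T / T

l_true = neg_loglike(theta_true, C_hat)
l_off = neg_loglike(np.deg2rad([0.0, 70.0]), C_hat)   # mismatched DOAs
```

With strong, well-separated sources, `l_true` comes out smaller than `l_off`, consistent with (4) being minimized near the true parameters.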
3. ROBUST ML ESTIMATION FOR UNKNOWN NUMBERS OF SIGNALS
From the asymptotic properties of ML estimation under misspecified numbers of signals, we know that for $m > m_0$, the ML estimate $\hat{\theta}_m$ contains $m_0$ components corresponding to the true parameters in $\theta_0$ [4]. Moreover, the value of the likelihood function attains its minimum at $m_0$ and remains constant for $m \ge m_0$. Motivated by this observation, we proposed a robust ML estimation procedure in [2] that requires only an upper bound $M$ on the number of signals. The likelihood function is maximized over an $M$-dimensional parameter space. From the resulting estimate, we select the relevant components according to their contribution to the likelihood function.
More precisely, let $\hat{\theta}_M$ denote the ML estimate obtained by minimizing the negative log-likelihood function (4) with $m = M$:

$\hat{\theta}_M = \arg\min_{\theta_M} l_T(\theta_M).$  (6)

Since $M \ge m_0$, the $M \times 1$ vector $\hat{\theta}_M = [\hat{\theta}_1, \ldots, \hat{\theta}_M]^T$ contains more elements than the true parameter vector $\theta_0$ does. For large $T$, a subset of the elements of $\hat{\theta}_M$ coincides with those of $\theta_0$. The elements of $\hat{\theta}_M$ that are associated with $\theta_0$ are referred to as relevant estimates.
To select the relevant components, we compute the relevance value for each component $\hat{\theta}_i$ as follows:

$R(\hat{\theta}_i) = \frac{l_T(\bar{\theta}_i) - l_T(\hat{\theta}_M)}{l_T(\hat{\theta}_M)},$  (7)
where $\bar{\theta}_i$ contains all elements of $\hat{\theta}_M$ except the $i$th component $\hat{\theta}_i$:

$\bar{\theta}_i = [\hat{\theta}_1, \ldots, \hat{\theta}_{i-1}, \hat{\theta}_{i+1}, \ldots, \hat{\theta}_M]^T.$  (8)
Note that the log-likelihood $l_T(\bar{\theta}_i)$ is computed by (4) using the $(M-1)$-dimensional vector $\bar{\theta}_i$. The value of $R(\hat{\theta}_i)$ indicates the contribution of the $i$th element $\hat{\theta}_i$ to the log-likelihood function. The normalizing factor $1/l_T(\hat{\theta}_M)$ is used to improve numerical stability. As pointed out in [2], a high relevance value $R(\hat{\theta}_i)$ implies that $\hat{\theta}_i$ is associated with one of $\theta_0$'s components. A low relevance value (usually close to zero) indicates that $\hat{\theta}_i$ does not correspond to any true parameter. Based on this observation, we declare $\hat{\theta}_i$ to be a relevant estimate if $R(\hat{\theta}_i)$ exceeds a given threshold $\alpha$.
Let $R_{(1)} \ge R_{(2)} \ge \cdots \ge R_{(M)}$ denote the ordered relevance values with corresponding estimates $\{\hat{\theta}_{(1)}, \hat{\theta}_{(2)}, \ldots, \hat{\theta}_{(M)}\}$. The set of relevant estimates is given by

$S = \{\hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(k)}\},$  (9)

where

$R_{(1)} \ge R_{(2)} \ge \cdots \ge R_{(k)} \ge \alpha.$  (10)

We also define the $k \times 1$ relevant vector as $\hat{\theta}_0 = [\hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(k)}]^T$. Asymptotically the number of components in $S$ or $\hat{\theta}_0$ coincides with the true number of signals $m_0$.
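To make the selection step concrete, the sketch below evaluates relevance values for a stand-in $\hat{\theta}_M$: instead of running the $M$-dimensional optimization, we use a fixed vector holding the two true DOAs plus one spurious component. The normalized form $(l_T(\bar{\theta}_i) - l_T(\hat{\theta}_M))/l_T(\hat{\theta}_M)$ is our reading of the relevance value, inferred from the normalizing factor $1/l_T(\hat{\theta}_M)$ mentioned in the text; all scenario numbers are assumptions.

```python
import numpy as np

def steering(n, thetas):
    # Half-wavelength ULA steering matrix (assumed geometry).
    k = np.arange(n)[:, None]
    return np.exp(-1j * np.pi * k * np.sin(np.atleast_1d(thetas))[None, :])

def neg_loglike(thetas, C_hat):
    # Negative concentrated likelihood (4)-(5).
    n, m = C_hat.shape[0], len(thetas)
    H = steering(n, thetas)
    P = H @ np.linalg.pinv(H)
    P_perp = np.eye(n) - P
    nu_hat = np.trace(P_perp @ C_hat).real / (n - m)
    return np.linalg.slogdet(P @ C_hat @ P + nu_hat * P_perp)[1]

rng = np.random.default_rng(2)
n, T = 10, 200
theta0 = np.deg2rad([28.0, 36.0])
H0 = steering(n, theta0)
s = np.sqrt(5.0) * (rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T)))
w = (rng.standard_normal((n, T)) + 1j * rng.standard_normal((n, T))) / np.sqrt(2)
C_hat = (H0 @ s + w) @ (H0 @ s + w).conj().T / T

# Stand-in for the M = 3 ML estimate: two near-true components, one spurious.
theta_M = np.deg2rad([28.0, 36.0, 60.0])
l_full = neg_loglike(theta_M, C_hat)

# Relevance values: drop the ith component as in (8) and re-evaluate (4).
R = np.array([(neg_loglike(np.delete(theta_M, i), C_hat) - l_full) / l_full
              for i in range(len(theta_M))])

alpha = 0.015                      # threshold value used in Section 5
relevant = np.where(R >= alpha)[0]
```

Dropping a component tied to a true DOA degrades the fit sharply, so its relevance value dwarfs that of the spurious component, and thresholding recovers the two true directions.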
4. COVARIANCE MATRIX OF RELEVANT ESTIMATES
The consistency property of ML estimation has been established previously [4]. In the variance study, the main challenge is that an overparameterized model is not uniquely identified. The asymptotic covariance matrix represents a class of matrices involving generalized inverses, rather than a uniquely defined matrix [5]. In the following, rather than applying the elegant results of [5] directly, we shall take a more practical approach to assess estimation errors. More specifically, we show that the asymptotic covariance matrix of relevant estimates is lower and upper bounded by two covariance matrices associated with $m_0$ and $(M+1)$ signals in standard ML estimation.
To tackle the identification problem, we consider a modified version of the signal model (1):

$x(t) = H_0(\theta_0) s_0(t) + H_f(\theta_f) s_f(t) + n(t),$  (11)

where the subscript $(\cdot)_0$ is used to emphasize that the steering matrix $H_0(\theta_0)$ and the signal vector $s_0(t)$ are associated with the real signal sources. The $n \times (M - m_0)$ matrix $H_f(\theta_f)$ denotes the steering matrix associated with the pseudo sources. The $(M - m_0) \times 1$ signal vector $s_f(t)$ corresponding to the
pseudo sources is independent of $s_0(t)$, with covariance matrix

$C_f = \epsilon I_{M-m_0},$  (12)

where $\epsilon > 0$ is positive with small magnitude compared to the real signal powers, i.e. $\epsilon \ll \mathrm{tr}(C_{s_0})$. The covariance matrix of the array outputs (3) becomes
$C_x = H_0 C_{s_0} H_0' + H_f C_f H_f' + \nu I_n,$  (13)

where $H_0 = H_0(\theta_0)$, $H_f = H_f(\theta_f)$ and $C_{s_0} = E[s_0(t) s_0(t)']$. Note that for $\epsilon \to 0$, (11) and (13) approach the true signal model and covariance matrix, respectively. Without loss of generality, we assume $H = [H_0 \; H_f]$ is of full rank.
Based on the model (11) and results from standard ML estimation [6], we shall derive the upper and lower bounds on the asymptotic covariance matrix of relevant estimates

$K_0 = E[(\hat{\theta}_0 - \theta_0)(\hat{\theta}_0 - \theta_0)^T].$  (14)
4.1. Existing Results for Standard ML Estimation
The following properties of standard ML estimation [6] are essential to the proof of Theorem 1.
Result 1 (Explicit formula for the asymptotic covariance matrix). Suppose the signal model (1) holds and the number of signals, $m$, is known. The asymptotic covariance matrix of the ML estimator, $K(m) = E[(\hat{\theta}_m - \theta_m)(\hat{\theta}_m - \theta_m)^T]$, is given by

$K(m) = \frac{\nu}{2T} \left\{ \mathrm{Re}\left[ \big( D_m' P_m^{\perp} D_m \big) \odot \big( C_{s_m} H_m' C_x^{-1} H_m C_{s_m} \big)^T \right] \right\}^{-1},$  (15)

where $\odot$ denotes the Hadamard elementwise product. The $n \times m$ matrix $D_m = [d'(\theta_1) \cdots d'(\theta_m)]$ contains the first derivatives of the steering vectors, $d'(\theta_i) = \partial d(\theta_i)/\partial \theta_i$.
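A direct implementation of (15) is short. The sketch below assumes a half-wavelength uniform linear array and equal source powers; these are illustrative choices, not values fixed by the result itself:

```python
import numpy as np

def steering_and_deriv(n, thetas):
    # d(theta) and its derivative d'(theta) for a half-wavelength ULA
    # (assumed geometry; the formula itself is geometry-agnostic).
    k = np.arange(n)[:, None]
    th = np.atleast_1d(thetas)[None, :]
    D0 = np.exp(-1j * np.pi * k * np.sin(th))
    return D0, (-1j * np.pi * k * np.cos(th)) * D0

def asymptotic_cov(n, T, thetas, Cs, nu):
    # Asymptotic ML covariance K(m), eq. (15).
    H, D = steering_and_deriv(n, thetas)
    P_perp = np.eye(n) - H @ np.linalg.pinv(H)
    Cx = H @ Cs @ H.conj().T + nu * np.eye(n)
    A = D.conj().T @ P_perp @ D                       # D' P_perp D
    B = Cs @ H.conj().T @ np.linalg.inv(Cx) @ H @ Cs  # Cs H' Cx^{-1} H Cs
    F = np.real(A * B.T)                              # Hadamard product, real part
    return nu / (2 * T) * np.linalg.inv(F)

# K(2) at the two-source scenario of Section 5 (equal powers assumed).
K2 = asymptotic_cov(10, 100, np.deg2rad([28.0, 36.0]), 10.0 * np.eye(2), 1.0)
```

The resulting matrix is symmetric positive definite, as a covariance matrix must be.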
Result 2 (R7 of [6]). If the $(m+1)$st source is uncorrelated with the other $m$, then $K(m)$ increases monotonically with increasing $m$. More specifically,

$[\,I_m \;\; 0\,]\, K(m+1)\, [\,I_m \;\; 0\,]^T \ge K(m),$  (16)

where $I_m$ is an $m \times m$ identity matrix and the $(m+1) \times (m+1)$ matrix $K(m+1)$ denotes the covariance matrix when $(m+1)$ signals are present.
4.2. Upper bound and lower bound on $K_0$
We introduced $(M - m_0)$ pseudo signals in the approximate signal model (11) so that the parameter $\theta_M = [\theta_0^T \;\; \theta_f^T]^T$ is uniquely identified. The normalized estimator $\sqrt{T}(\hat{\theta}_M - \theta_M)$ has an asymptotic normal distribution with zero mean and covariance matrix $K(M)$. This matrix varies with different values of $\theta_f$ and $\epsilon$. However, for any choice of the $(M+1)$st uncorrelated signal, $K(M)$ is upper bounded by $K(M+1)$. According to Result 2,

$[\,I_M \;\; 0\,]\, K(M+1)\, [\,I_M \;\; 0\,]^T \ge K(M),$  (17)
where $K(M+1)$ is the covariance matrix when $m = M+1$. Multiplying both sides of (17) by $[\,I_{m_0} \;\; 0_{m_0 \times (M-m_0)}\,]$ and $[\,I_{m_0} \;\; 0_{m_0 \times (M-m_0)}\,]^T$, where $0_{m_0 \times (M-m_0)}$ denotes an $m_0 \times (M-m_0)$ zero matrix, we obtain an upper bound for the covariance matrix $K_0$ as follows:

$[\,I_{m_0} \;\; 0_{m_0 \times (M+1-m_0)}\,]\, K(M+1)\, [\,I_{m_0} \;\; 0_{m_0 \times (M+1-m_0)}\,]^T \ge K_0.$  (18)
The lower bound is derived by applying Result 2 again. More specifically,

$K_0 = [\,I_{m_0} \;\; 0_{m_0 \times (M-m_0)}\,]\, K(M)\, [\,I_{m_0} \;\; 0_{m_0 \times (M-m_0)}\,]^T \ge K(m_0),$  (19)

where $K(m_0)$ is the covariance matrix for the standard ML estimator with $m = m_0$. Eq. (19) shows that the estimation accuracy is affected by overparameterization. Eq. (18) shows that the impact of model order mismatch on estimation performance is bounded by that caused by adding $(M+1-m_0)$ additional signals to the original $m_0$ signals.
A great advantage of the above analysis is that $K(M+1)$ and $K(m_0)$ can be easily computed by the existing formula (15). The parameter vector $\theta_{M+1} = [\theta_0^T, \theta_{m_0+1}, \ldots, \theta_{M+1}]^T$ at which $K(M+1)$ is evaluated has the true parameter $\theta_0$ as its first $m_0$ elements. The remaining parameters can be chosen arbitrarily but must ensure that the corresponding steering matrix $H_{M+1}$ has full rank. The matrix $K(m_0)$ is evaluated at the true parameter $\theta_0$. The lower bound can only be achieved when $M = m_0$, i.e. when no model order mismatch is present. Although these bounds are not necessarily tight, they provide a convenient method to evaluate the performance of an overparameterized model. The following theorem summarizes the above discussion.
Theorem 1. Consider the algorithm described in Section 3. The asymptotic covariance matrix $K_0 = E[(\hat{\theta}_0 - \theta_0)(\hat{\theta}_0 - \theta_0)^T]$, where $\hat{\theta}_0$ consists of relevant estimates, is upper and lower bounded by $K(M+1)$ and $K(m_0)$, respectively. More precisely,

$[\,I_{m_0} \;\; 0_{m_0 \times (M+1-m_0)}\,]\, K(M+1)\, [\,I_{m_0} \;\; 0_{m_0 \times (M+1-m_0)}\,]^T \ge K_0 \ge K(m_0).$  (20)
The covariance matrices $K(M+1)$ and $K(m_0)$ can be computed by the closed-form expression (15) for the covariance matrix of standard ML estimation.
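Following Theorem 1, both bounds can be evaluated with the closed form (15). The sketch below compares the diagonal of the lower bound $K(m_0)$ with the relevant block of $K(M+1)$, using the DOAs from Section 5; the half-wavelength ULA and equal source powers are simplifying assumptions:

```python
import numpy as np

def steering_and_deriv(n, thetas):
    # Half-wavelength ULA steering vectors and derivatives (assumed geometry).
    k = np.arange(n)[:, None]
    th = np.atleast_1d(thetas)[None, :]
    D0 = np.exp(-1j * np.pi * k * np.sin(th))
    return D0, (-1j * np.pi * k * np.cos(th)) * D0

def asymptotic_cov(n, T, thetas, Cs, nu):
    # Asymptotic ML covariance K(m), eq. (15).
    H, D = steering_and_deriv(n, thetas)
    P_perp = np.eye(n) - H @ np.linalg.pinv(H)
    Cx = H @ Cs @ H.conj().T + nu * np.eye(n)
    F = np.real((D.conj().T @ P_perp @ D) *
                (Cs @ H.conj().T @ np.linalg.inv(Cx) @ H @ Cs).T)
    return nu / (2 * T) * np.linalg.inv(F)

n, T, nu, m0, M = 10, 100, 1.0, 2, 3
theta_true = [28.0, 36.0]

# Lower bound: K(m0), evaluated at the true DOAs.
K_lo = asymptotic_cov(n, T, np.deg2rad(theta_true), 10.0 * np.eye(m0), nu)

# Upper bound: the m0 x m0 relevant block of K(M+1). The extra DOAs are
# arbitrary as long as the steering matrix keeps full rank (20, 50 as in Sec. 5).
theta_ext = np.deg2rad(theta_true + [20.0, 50.0])
K_full = asymptotic_cov(n, T, theta_ext, 10.0 * np.eye(M + 1), nu)
K_hi = K_full[:m0, :m0]
```

Per (20), each diagonal entry of `K_hi` dominates the corresponding entry of `K_lo`, i.e. the per-DOA variance bounds are ordered as the theorem predicts.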
5. SIMULATION
In the simulation, a uniform linear array of 10 sensors with inter-element spacing of half a wavelength is employed. The narrow band signals are generated by $m_0 = 2$ uncorrelated sources located at $\theta_0 = [28^\circ \; 36^\circ]$ of various strengths. The data length is given by $T = 100$. The difference of signal strengths is $[1 \; 0]$ dB, where 0 dB corresponds to the reference signal. The signal to noise ratio (SNR) varies from $-10$ to 10 dB in 2 dB steps. We consider two upper bounds on the number of signals: $M = 3$ and $M = 4$. The latter represents a larger mismatch in the model order. The threshold $\alpha$ is chosen to be 0.015. Each experiment performs 200 Monte Carlo trials.
Fig. 1 shows the square root of the empirical variances for $\hat{\theta}_1, \hat{\theta}_2$. The lower and upper bounds computed from Theorem 1 are denoted by LB$_i$ and UB$_i$, $i = 1, 2$, respectively. We can observe that the variances decrease with increasing SNR. As predicted by our analysis, $\mathrm{var}(\hat{\theta}_1)$ and $\mathrm{var}(\hat{\theta}_2)$ are lower bounded by the variances for the optimal case, $K(2)$, and upper bounded by the worst case, $K(4)$. Note that $K(2)$ is calculated at $\theta_0 = [28^\circ \; 36^\circ]$ and $K(4)$ is calculated at $[28^\circ \; 36^\circ \; 20^\circ \; 50^\circ]$ by eq. (15). The relative strength of the two additional signals is 0 dB.
Fig. 1. Square root of empirical variances of relevant estimates. $m_0 = 2$, $M = 3$.
In Fig. 2 we observe that the empirical variances for $M = 4$ increase due to a larger mismatch in the number of signals. However, they are still well bounded by the two theoretical bounds obtained from $K(2)$ and $K(5)$. The matrix $K(5)$ is calculated at $[28^\circ \; 36^\circ \; 20^\circ \; 50^\circ \; 80^\circ]$. The relative strength of the three additional signals is 0 dB.
In summary, the variance of relevant estimates is larger than that in the optimal case where no mismatch is present, i.e. $m = m_0$. Furthermore, it is also upper bounded by the corresponding diagonal elements of $K(M+1)$, as predicted by Theorem 1.
6. CONCLUSION
We studied the performance of a recently proposed robust ML direction finding algorithm for unknown numbers of signals. The underlying algorithm computes ML estimates for the maximally hypothesized model and selects DOA estimates by the relevance value associated with each component. Due to overparameterization, the signal model and the asymptotic covariance matrix are not uniquely identifiable. We overcome this difficulty by considering an approximate signal model with pseudo signals of negligible strengths and derive an upper and a lower bound on the variance of relevant estimates. The impact of overparameterization on variance is similar to that of introducing additional signals. Simulation results under various settings show good agreement with the theoretical analysis.
Fig. 2. Square root of empirical variances of relevant estimates. $m_0 = 2$, $M = 4$.
7. REFERENCES
[1] J. F. Böhme. Statistical array processing of measured sonar and seismic data. In Proc. SPIE 2563, Advanced Signal Processing Algorithms, pp. 2-20, San Diego, July 1995.

[2] P.-J. Chung. Robust ML estimation for unknown numbers of signals. In Proc. EUSIPCO, Poznań, Poland, September 2007.

[3] P.-J. Chung, J. F. Böhme, C. F. Mecklenbräuker, and A. O. Hero. Detection of the number of signals using the Benjamini-Hochberg procedure. IEEE Trans. Signal Processing, 55(6):2497-2508, June 2007.

[4] P.-J. Chung. Stochastic maximum likelihood estimation under misspecified numbers of signals. IEEE Trans. Signal Processing, 55(9):4726-4731, September 2007.

[5] A. Shapiro. Asymptotic theory of overparameterized structural models. Journal of the American Statistical Association, 81(393):142-149, March 1986.

[6] P. Stoica and A. Nehorai. Performance study of conditional and unconditional direction-of-arrival estimation. IEEE Trans. Acoustics, Speech, and Signal Processing, 38(10):1783-1795, October 1990.

[7] M. Wax and I. Ziskind. Detection of the number of coherent signals by the MDL principle. IEEE Trans. Acoust., Speech, Signal Processing, 37(8):1190-1196, August 1989.