
IEEE SIGNAL PROCESSING LETTERS, VOL. 21, NO. 6, JUNE 2014

Super-Resolution Compressed Sensing: An Iterative Reweighted Algorithm for Joint Parameter Learning and Sparse Signal Recovery

Jun Fang, Jing Li, Yanning Shen, Hongbin Li, Senior Member, IEEE, and Shaoqian Li

Abstract—In many practical applications such as direction-of-arrival (DOA) estimation and line spectral estimation, the sparsifying dictionary is usually characterized by a set of unknown parameters in a continuous domain. To apply conventional compressed sensing to such applications, the continuous parameter space has to be discretized to a finite set of grid points. Discretization, however, incurs errors and leads to deteriorated recovery performance. To address this issue, we propose an iterative reweighted method which jointly estimates the unknown parameters and the sparse signals. Specifically, the proposed algorithm is developed by iteratively decreasing a surrogate function majorizing a given objective function, which results in a gradual and interweaved iterative process to refine the unknown parameters and the sparse signal. Numerical results show that the algorithm provides superior performance in resolving closely-spaced frequency components.

Index Terms—Compressed sensing, parameter learning, sparse signal recovery, super-resolution.

I. INTRODUCTION

THE compressed sensing technique finds a variety of applications in practice as many natural signals admit a sparse or approximately sparse representation in a certain basis. Nevertheless, accurate reconstruction of the sparse signal relies on knowledge of the sparsifying dictionary. In many applications, however, it is impractical to preset a dictionary that can sparsely represent the signal. For example, for the line spectral estimation problem, using a preset discrete Fourier transform (DFT) matrix suffers from considerable performance degradation because the true frequency components may not lie on the pre-specified frequency grid [1], [2]. This discretization error is also referred to as the grid mismatch.
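To make the degradation concrete, the toy example below shows how a single off-grid sinusoid spreads its energy across many DFT bins, so no truly sparse representation exists on the preset grid. This is an illustrative sketch; the signal length and frequency are arbitrary choices, not values from this paper.

```python
import numpy as np

# A single complex sinusoid whose frequency falls halfway between two DFT grid points.
M = 64
omega_true = 2 * np.pi * (10.5 / M)            # halfway between bins 10 and 11
y = np.exp(1j * omega_true * np.arange(M))

coeffs = np.fft.fft(y) / M                     # expansion coefficients on the DFT grid
sorted_mags = np.sort(np.abs(coeffs))[::-1]
# An on-grid sinusoid would yield a single nonzero coefficient; this off-grid
# one leaks energy across all bins, so a few-term DFT approximation is poor.
print("largest 5 coefficient magnitudes:", np.round(sorted_mags[:5], 3))
print("energy captured by 5 largest bins: "
      f"{np.sum(sorted_mags[:5]**2) / np.sum(sorted_mags**2):.2%}")
```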

Manuscript received January 15, 2014; revised March 17, 2014; accepted April 02, 2014. Date of publication April 07, 2014; date of current version April 16, 2014. This work was supported in part by the National Science Foundation of China under Grant 61172114, and by the National Science Foundation under Grant ECCS-0901066. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Mathew Magimai Doss.

J. Fang, J. Li, Y. Shen, and S. Li are with the National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

H. Li is with the Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LSP.2014.2316004

The grid mismatch problem has attracted a lot of attention over the past few years, e.g., [1]–[8]. Specifically, in [4], [5], to deal with the grid mismatch, the true dictionary is approximated as a summation of a presumed dictionary and a structured parameterized matrix via the Taylor expansion. The recovery performance of this method, however, depends on the accuracy of the Taylor expansion in approximating the true dictionary. The grid mismatch problem was also examined in [6], [7], where a highly coherent dictionary (very fine grids) is used to mitigate the discretization error, and the technique of band exclusion (coherence-inhibiting) was proposed for sparse signal recovery. Besides these efforts, another line of work [1], [2], [8] studied the problem of grid mismatch in an indirect but more fundamental way: they circumvent the discretization issue by working directly on the continuous parameter space (this approach is also referred to as super-resolution). In [1], [2], an atomic norm minimization approach was proposed to handle the infinite dictionary with continuous atoms. Nevertheless, finding a solution to the atomic norm problem is challenging. Although the atomic norm problem can be cast as a convex semidefinite program for the complex sinusoid mixture problem, it remains unclear how this reformulation generalizes to other scenarios. In [8], by treating the sparse signal as hidden variables, a Bayesian approach was proposed to iteratively refine the dictionary, and was shown to achieve super-resolution accuracy.

In this paper, we propose an iterative reweighted method for joint parameter learning and sparse signal recovery. The algorithm is developed by iteratively decreasing a surrogate function that majorizes the original objective function. Our experiments show that the proposed algorithm achieves a significant performance improvement over existing methods in distinguishing and recovering complex sinusoids whose frequencies are very closely separated.

II. PROBLEM FORMULATION

In many practical applications such as direction-of-arrival (DOA) estimation and line spectral estimation, the sparsifying dictionary is usually characterized by a set of unknown parameters in a continuous domain. For example, consider the line spectral estimation problem where the observed signal is a summation of a number of complex sinusoids:

$$y_m = \sum_{k=1}^{K} \alpha_k e^{j\omega_k m}, \quad m = 1, \ldots, M \quad (1)$$



where $\omega_k \in [0, 2\pi)$ and $\alpha_k$ denote the frequency and the complex amplitude of the $k$-th component, respectively. Define $\boldsymbol{a}(\omega_k) \triangleq [e^{j\omega_k}\ e^{j2\omega_k}\ \cdots\ e^{jM\omega_k}]^T$; the model (1) can then be rewritten in vector-matrix form as

$$\boldsymbol{y} = \boldsymbol{A}(\boldsymbol{\omega})\boldsymbol{\alpha} \quad (2)$$

where $\boldsymbol{y} \triangleq [y_1\ \cdots\ y_M]^T$, $\boldsymbol{A}(\boldsymbol{\omega}) \triangleq [\boldsymbol{a}(\omega_1)\ \cdots\ \boldsymbol{a}(\omega_K)]$, and $\boldsymbol{\alpha} \triangleq [\alpha_1\ \cdots\ \alpha_K]^T$. We see that the dictionary $\boldsymbol{A}(\boldsymbol{\omega})$ is characterized by a set of unknown parameters $\boldsymbol{\omega} \triangleq \{\omega_k\}$ which need to be estimated along with the unknown complex amplitudes $\{\alpha_k\}$. To deal with this problem, conventional compressed sensing techniques discretize the continuous parameter space into a finite set of grid points, assuming that the unknown frequency components lie on the discretized grid. Estimating $\boldsymbol{\omega}$ and $\boldsymbol{\alpha}$ can then be formulated as a sparse signal recovery problem $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x}$, where $\boldsymbol{A} \in \mathbb{C}^{M \times N}$ ($N > M$) is an overcomplete dictionary constructed from the discretized grid points. Discretization, however, inevitably incurs errors since the true parameters do not necessarily lie on the discretized grid. This error, also referred to as the grid mismatch, leads to deteriorated performance or even failure in recovering the sparse signals.

To circumvent this issue, we treat the overcomplete dictionary as an unknown parameterized matrix $\boldsymbol{A}(\boldsymbol{\theta}) \triangleq [\boldsymbol{a}(\theta_1)\ \cdots\ \boldsymbol{a}(\theta_N)]$, with each atom $\boldsymbol{a}(\theta_n)$ determined by an unknown frequency parameter $\theta_n$. Estimating $\boldsymbol{\theta} \triangleq \{\theta_n\}$ and $\boldsymbol{x}$ can still be formulated as a sparse signal recovery problem. Nevertheless, in this framework, the frequency parameters $\boldsymbol{\theta}$ need to be optimized along with the sparse signal $\boldsymbol{x}$ such that the parametric dictionary approaches the true sparsifying dictionary. Specifically, the problem can be stated as follows: we search for a set of unknown parameters $\boldsymbol{\theta}$ with which the observed signal can be represented by as few atoms as possible. Such a problem can be readily formulated as

$$\min_{\boldsymbol{\theta},\,\boldsymbol{x}} \ \|\boldsymbol{x}\|_0 \quad \text{s.t.} \quad \boldsymbol{y} = \boldsymbol{A}(\boldsymbol{\theta})\boldsymbol{x} \quad (3)$$

where $\|\boldsymbol{x}\|_0$ stands for the number of nonzero components of $\boldsymbol{x}$. The optimization (3), however, is an NP-hard problem. Thus, alternative sparsity-promoting functionals which are more computationally efficient in finding the sparse solution are desirable. In this paper, we consider the use of the log-sum sparsity-encouraging functional for sparse signal recovery. The log-sum penalty function has been extensively used for sparse signal recovery, e.g., [9]–[13]. It was proved theoretically [14] and shown in a series of experiments [11] that log-sum based methods present uniform superiority over conventional $\ell_1$-type methods. Replacing the $\ell_0$-norm in (3) with the log-sum functional leads to

$$\min_{\boldsymbol{\theta},\,\boldsymbol{x}} \ L(\boldsymbol{x}) \triangleq \sum_{i=1}^{N} \log\big(|x_i|^2 + \epsilon\big) \quad \text{s.t.} \quad \boldsymbol{y} = \boldsymbol{A}(\boldsymbol{\theta})\boldsymbol{x} \quad (4)$$

where $x_i$ denotes the $i$th component of the vector $\boldsymbol{x}$, and $\epsilon > 0$ is a parameter to ensure that the function is well-defined. Note that the above optimization (4) can be converted into an unconstrained optimization problem by removing the constraint and adding a penalty term, $\lambda\|\boldsymbol{y} - \boldsymbol{A}(\boldsymbol{\theta})\boldsymbol{x}\|_2^2$, to the objective functional. A two-stage iterative algorithm [15] can then be applied: given an estimate of $\boldsymbol{\theta}$, recover the sparse signal $\boldsymbol{x}$ using conventional compressed sensing techniques; then estimate $\boldsymbol{\theta}$ based on the estimated $\boldsymbol{x}$. The trade-off parameter $\lambda$ for this scheme, however, is difficult to determine due to the non-convexity of the objective function. In addition, the two-stage algorithm is very likely to be trapped in undesirable local minima, possibly because the estimated signal, instead of being optimized in a gradual manner, undergoes an abrupt change from one iteration to another and thus easily deviates from the correct basin of attraction. In the following, we develop an iterative reweighted algorithm which is less likely to suffer from this local convergence issue.

III. PROPOSED ALGORITHM

The proposed algorithm is developed based on a bound optimization approach, also known as the majorization-minimization approach [11], [16]. The idea is to iteratively minimize a simple surrogate function majorizing a given objective function. Note that although the use of this approach for sparse signal recovery is not new (e.g., [11]–[13]), previous works concern only recovery of the sparse signal. In this work, we generalize the approach to joint parameter learning and sparse signal recovery. A surrogate function, usually written as $S(\boldsymbol{x}\,|\,\hat{\boldsymbol{x}}^{(t)})$, is an upper bound for the objective function $L(\boldsymbol{x})$ defined in (4). Precisely, we have

$$L(\boldsymbol{x}) \leq S\big(\boldsymbol{x}\,\big|\,\hat{\boldsymbol{x}}^{(t)}\big) \quad \forall \boldsymbol{x} \quad (5)$$

with equality attained when $\boldsymbol{x} = \hat{\boldsymbol{x}}^{(t)}$. We will show that through iteratively decreasing (not necessarily minimizing) the surrogate function, the iterative process yields a non-increasing objective function value and eventually converges to a stationary point of $L(\boldsymbol{x})$. We first find a suitable surrogate function for the objective function defined in (4). Ideally, we hope that the surrogate function is differentiable and convex. It can be readily verified that an appropriate choice of such a surrogate function is given by

$$S\big(\boldsymbol{x}\,\big|\,\hat{\boldsymbol{x}}^{(t)}\big) \triangleq \sum_{i=1}^{N} \left[ \frac{|x_i|^2 + \epsilon}{|\hat{x}_i^{(t)}|^2 + \epsilon} + \log\big(|\hat{x}_i^{(t)}|^2 + \epsilon\big) - 1 \right] \quad (6)$$
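That (6) majorizes the log-sum objective follows from the concavity of the logarithm; a one-line check, using the notation reconstructed above: for any $u, u_0 > 0$,

$$\log u \;\leq\; \log u_0 + \frac{u - u_0}{u_0},$$

and substituting $u = |x_i|^2 + \epsilon$ and $u_0 = |\hat{x}_i^{(t)}|^2 + \epsilon$, then summing over $i$, gives $L(\boldsymbol{x}) \leq S(\boldsymbol{x}\,|\,\hat{\boldsymbol{x}}^{(t)})$ with equality at $\boldsymbol{x} = \hat{\boldsymbol{x}}^{(t)}$.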

Replacing the log-sum functional in (4) with (6) (and dropping terms independent of $\boldsymbol{x}$), we arrive at the following optimization

$$\min_{\boldsymbol{\theta},\,\boldsymbol{x}} \ \boldsymbol{x}^H \boldsymbol{D}^{(t)} \boldsymbol{x} \quad \text{s.t.} \quad \boldsymbol{y} = \boldsymbol{A}(\boldsymbol{\theta})\boldsymbol{x} \quad (7)$$

where $(\cdot)^H$ denotes the conjugate transpose, and $\boldsymbol{D}^{(t)}$ is a diagonal matrix given as

$$\boldsymbol{D}^{(t)} \triangleq \mathrm{diag}\left\{ \frac{1}{|\hat{x}_1^{(t)}|^2 + \epsilon},\ \ldots,\ \frac{1}{|\hat{x}_N^{(t)}|^2 + \epsilon} \right\}$$


Given $\boldsymbol{\theta}$ fixed, the optimal $\boldsymbol{x}$ of (7) can be obtained by resorting to the Lagrangian multiplier method, and is given as

$$\boldsymbol{x}^{\star} = \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H(\boldsymbol{\theta}) \Big( \boldsymbol{A}(\boldsymbol{\theta}) \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H(\boldsymbol{\theta}) \Big)^{-1} \boldsymbol{y} \quad (8)$$
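For completeness, the step behind (8) is the standard weighted minimum-norm argument (reconstructed here from the form of (7), not quoted verbatim from the paper). Form the Lagrangian

$$\mathcal{L}(\boldsymbol{x}, \boldsymbol{\lambda}) = \boldsymbol{x}^H \boldsymbol{D}^{(t)} \boldsymbol{x} + 2\,\Re\big\{ \boldsymbol{\lambda}^H \big(\boldsymbol{y} - \boldsymbol{A}(\boldsymbol{\theta})\boldsymbol{x}\big) \big\}.$$

Setting the derivative with respect to $\boldsymbol{x}^{*}$ to zero gives $\boldsymbol{x} = (\boldsymbol{D}^{(t)})^{-1}\boldsymbol{A}^H\boldsymbol{\lambda}$; enforcing the constraint $\boldsymbol{A}\boldsymbol{x} = \boldsymbol{y}$ yields $\boldsymbol{\lambda} = \big(\boldsymbol{A}(\boldsymbol{D}^{(t)})^{-1}\boldsymbol{A}^H\big)^{-1}\boldsymbol{y}$, and substituting back gives (8).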

Substituting (8) back into (7), the optimization simply becomes a search for the unknown parameter $\boldsymbol{\theta}$:

$$\min_{\boldsymbol{\theta}} \ f(\boldsymbol{\theta}) \triangleq \boldsymbol{y}^H \Big( \boldsymbol{A}(\boldsymbol{\theta}) \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H(\boldsymbol{\theta}) \Big)^{-1} \boldsymbol{y} \quad (9)$$

An analytical solution of the above optimization (9) is difficult to obtain. Nevertheless, in our algorithm, we only need to search for a new estimate $\boldsymbol{\theta}^{(t+1)}$ such that the following inequality holds:

$$f\big(\boldsymbol{\theta}^{(t+1)}\big) \leq f\big(\boldsymbol{\theta}^{(t)}\big) \quad (10)$$

Such an estimate can be found by using the gradient descent method. Note that since the optimizations (9) and (7) attain the same minimum objective function value, we can always find an estimate $\boldsymbol{\theta}^{(t+1)}$ to meet (10). In fact, our experiments suggest that finding such an estimate is much easier than searching for a local or global minimum of (9). Given $\boldsymbol{\theta}^{(t+1)}$, the new signal estimate $\hat{\boldsymbol{x}}^{(t+1)}$ can be obtained via (8), with $\boldsymbol{\theta}$ replaced by $\boldsymbol{\theta}^{(t+1)}$, i.e.,

$$\hat{\boldsymbol{x}}^{(t+1)} = \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H\big(\boldsymbol{\theta}^{(t+1)}\big) \Big( \boldsymbol{A}\big(\boldsymbol{\theta}^{(t+1)}\big) \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H\big(\boldsymbol{\theta}^{(t+1)}\big) \Big)^{-1} \boldsymbol{y} \quad (11)$$

In the following, we show that the newly obtained estimate $\{\boldsymbol{\theta}^{(t+1)}, \hat{\boldsymbol{x}}^{(t+1)}\}$ results in a non-increasing objective function value, that is, $L(\hat{\boldsymbol{x}}^{(t+1)}) \leq L(\hat{\boldsymbol{x}}^{(t)})$. Firstly, we have

$$\big(\hat{\boldsymbol{x}}^{(t+1)}\big)^H \boldsymbol{D}^{(t)} \hat{\boldsymbol{x}}^{(t+1)} = f\big(\boldsymbol{\theta}^{(t+1)}\big) \overset{(a)}{\leq} f\big(\boldsymbol{\theta}^{(t)}\big) \leq \big(\hat{\boldsymbol{x}}^{(t)}\big)^H \boldsymbol{D}^{(t)} \hat{\boldsymbol{x}}^{(t)} \quad (12)$$

where $(a)$ comes from the inequality (10), and the last inequality holds because $\hat{\boldsymbol{x}}^{(t)}$ is feasible for the constraint $\boldsymbol{y} = \boldsymbol{A}(\boldsymbol{\theta}^{(t)})\boldsymbol{x}$ while $f(\boldsymbol{\theta}^{(t)})$ is the minimum of (7) over that constraint set. Recalling (6), we can quickly reach $S(\hat{\boldsymbol{x}}^{(t+1)}\,|\,\hat{\boldsymbol{x}}^{(t)}) \leq S(\hat{\boldsymbol{x}}^{(t)}\,|\,\hat{\boldsymbol{x}}^{(t)})$ from (12). Hence we arrive at

$$L\big(\hat{\boldsymbol{x}}^{(t+1)}\big) \leq S\big(\hat{\boldsymbol{x}}^{(t+1)}\,\big|\,\hat{\boldsymbol{x}}^{(t)}\big) + L\big(\hat{\boldsymbol{x}}^{(t)}\big) - S\big(\hat{\boldsymbol{x}}^{(t)}\,\big|\,\hat{\boldsymbol{x}}^{(t)}\big) \leq L\big(\hat{\boldsymbol{x}}^{(t)}\big) \quad (13)$$

where the first inequality follows from the fact that $S(\boldsymbol{x}\,|\,\hat{\boldsymbol{x}}^{(t)}) - L(\boldsymbol{x})$ attains its minimum (of zero) when $\boldsymbol{x} = \hat{\boldsymbol{x}}^{(t)}$, and the second inequality follows from $S(\hat{\boldsymbol{x}}^{(t+1)}\,|\,\hat{\boldsymbol{x}}^{(t)}) \leq S(\hat{\boldsymbol{x}}^{(t)}\,|\,\hat{\boldsymbol{x}}^{(t)})$. We see that through iteratively decreasing (not necessarily minimizing) the surrogate function, the objective function $L(\boldsymbol{x})$ is guaranteed to be non-increasing at each iteration. Since $L(\boldsymbol{x})$ is bounded below (each term is at least $\log\epsilon$), the sequence of objective values converges.

For clarity, we summarize our algorithm as follows; a code sketch of these iterations is given below the list.
1) Given an initialization $\{\boldsymbol{\theta}^{(0)}, \hat{\boldsymbol{x}}^{(0)}\}$.
2) At iteration $t = 0, 1, \ldots$: based on the estimate $\hat{\boldsymbol{x}}^{(t)}$, construct the surrogate function as depicted in (6). Search for a new estimate of the unknown parameter vector, denoted as $\boldsymbol{\theta}^{(t+1)}$, by using the gradient descent method such that the inequality (10) is satisfied. Compute a new estimate of the sparse signal, denoted as $\hat{\boldsymbol{x}}^{(t+1)}$, via (11).
3) Go to Step 2 if $\|\hat{\boldsymbol{x}}^{(t+1)} - \hat{\boldsymbol{x}}^{(t)}\|_2 > \tau$, where $\tau$ is a prescribed tolerance value; otherwise stop.
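The following NumPy sketch illustrates one way to implement these iterations. It is a minimal illustration, not the authors' released implementation (their MATLAB code is linked in Section IV): the sinusoid atoms, the uniform-grid initialization, the finite-difference gradient with backtracking used to satisfy (10), and the halving schedule for $\epsilon$ are all illustrative assumptions.

```python
import numpy as np

def build_dict(theta, M):
    """Parametric dictionary A(theta); column n is a(theta_n) = [e^{j theta_n}, ..., e^{j M theta_n}]^T."""
    return np.exp(1j * np.outer(np.arange(1, M + 1), theta))

def f_obj(theta, d_inv, y):
    """f(theta) = y^H (A D^{-1} A^H)^{-1} y, cf. (9); d_inv holds the diagonal of D^{-1}."""
    A = build_dict(theta, len(y))
    C = (A * d_inv) @ A.conj().T
    return np.real(y.conj() @ np.linalg.solve(C, y))

def solve_x(theta, d_inv, y):
    """Weighted minimum-norm solution of (7), cf. (8): x = D^{-1} A^H (A D^{-1} A^H)^{-1} y."""
    A = build_dict(theta, len(y))
    C = (A * d_inv) @ A.conj().T
    return d_inv * (A.conj().T @ np.linalg.solve(C, y))

def grad_fd(theta, d_inv, y, h=1e-7):
    """Finite-difference gradient of f(theta); the paper derives an analytic gradient instead."""
    f0 = f_obj(theta, d_inv, y)
    g = np.zeros_like(theta)
    for n in range(len(theta)):
        tp = theta.copy()
        tp[n] += h
        g[n] = (f_obj(tp, d_inv, y) - f0) / h
    return g

def super_res_irw(y, N=64, eps=1.0, eps_min=1e-8, tol=1e-6, max_iter=200):
    """Iterative reweighted joint parameter learning and sparse recovery (sketch)."""
    theta = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)  # initial uniform grid
    x = solve_x(theta, np.ones(N), y)                         # minimum-norm initialization
    for _ in range(max_iter):
        d_inv = np.abs(x) ** 2 + eps              # diagonal of (D^{(t)})^{-1}, from (6)-(7)
        f0 = f_obj(theta, d_inv, y)
        g = grad_fd(theta, d_inv, y)
        step, theta_new = 1e-2, theta
        while step > 1e-12:                       # backtrack until (10) holds
            cand = theta - step * g
            if f_obj(cand, d_inv, y) <= f0:
                theta_new = cand
                break
            step /= 2.0
        x_new = solve_x(theta_new, d_inv, y)      # signal update via (11)
        done = np.linalg.norm(x_new - x) < tol
        theta, x = theta_new, x_new
        if done:
            break
        eps = max(eps / 2.0, eps_min)             # monotonically decreasing epsilon sequence
    return theta, x
```

Note that the backtracking loop only needs to decrease $f(\boldsymbol{\theta})$, not minimize it, which mirrors the observation above that satisfying (10) is much cheaper than solving (9).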

The second step of the proposed algorithm involves searching for a new estimate of the unknown parameter vector to meet the condition (10). As mentioned earlier, this can be accomplished via a gradient-based search algorithm. Define

$$\boldsymbol{C} \triangleq \boldsymbol{A}(\boldsymbol{\theta}) \big(\boldsymbol{D}^{(t)}\big)^{-1} \boldsymbol{A}^H(\boldsymbol{\theta})$$

so that $f(\boldsymbol{\theta}) = \boldsymbol{y}^H \boldsymbol{C}^{-1} \boldsymbol{y}$. Using the chain rule, the first derivative of $f(\boldsymbol{\theta})$ with respect to $\theta_n$ can be computed as

$$\frac{\partial f}{\partial \theta_n} = -\boldsymbol{y}^H \boldsymbol{C}^{-1} \frac{\partial \boldsymbol{C}}{\partial \theta_n} \boldsymbol{C}^{-1} \boldsymbol{y} = -2 w_n \Re\Big\{ \big(\boldsymbol{a}^H(\theta_n)\boldsymbol{b}\big)^{*} \big(\boldsymbol{a}'^{H}(\theta_n)\boldsymbol{b}\big) \Big\}$$

where $(\cdot)^{*}$ denotes the conjugate, $\boldsymbol{b} \triangleq \boldsymbol{C}^{-1}\boldsymbol{y}$, $w_n \triangleq |\hat{x}_n^{(t)}|^2 + \epsilon$ is the $n$-th diagonal element of $(\boldsymbol{D}^{(t)})^{-1}$, and

$$\boldsymbol{a}'(\theta_n) \triangleq \frac{\mathrm{d}\boldsymbol{a}(\theta_n)}{\mathrm{d}\theta_n}.$$

The current estimate $\boldsymbol{\theta}^{(t)}$ can be used as an initialization point to search for the new estimate $\boldsymbol{\theta}^{(t+1)}$. Our experiments suggest that a new estimate which satisfies (10) can usually be obtained within only a few iterations. When the iterations reach a steady state, the estimates of $\{\theta_n\}$ can be refined in a sequential manner to help achieve a better reconstruction accuracy, and only those parameters whose coefficients are relatively large need to be updated at every iteration.
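A direct NumPy transcription of this derivative, matching the reconstruction above, can serve as a drop-in replacement for `grad_fd` in the sketch after the algorithm summary; the atom definition repeated here is the same illustrative assumption used there.

```python
import numpy as np

def grad_analytic(theta, d_inv, y):
    """Chain-rule gradient of f(theta) = y^H C^{-1} y with C = A D^{-1} A^H.
    Atoms are a(theta_n) = exp(j*m*theta_n), m = 1..M, so a'(theta_n) = j*m*a(theta_n)."""
    M = len(y)
    m = np.arange(1, M + 1)
    A = np.exp(1j * np.outer(m, theta))               # A(theta)
    dA = (1j * m)[:, None] * A                        # columns a'(theta_n)
    C = (A * d_inv) @ A.conj().T                      # A D^{-1} A^H
    b = np.linalg.solve(C, y)                         # b = C^{-1} y
    # df/dtheta_n = -2 w_n Re{ conj(a_n^H b) * (a'_n^H b) },  w_n = (D^{-1})_{nn}
    return -2.0 * d_inv * np.real(np.conj(A.conj().T @ b) * (dA.conj().T @ b))
```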

We see that in our algorithm, the unknown parameters and the signal are refined in a gradual and interweaved manner. This interweaved and gradual refinement enables the algorithm, with high probability, to arrive at a reasonably nearby point during the first few iterations, and eventually to converge to the correct basin of attraction. In addition, like [12], we can improve the ability to avoid undesirable local minima by using a monotonically decreasing sequence $\{\epsilon^{(t)}\}$, instead of a constant $\epsilon$, in updating the weighting parameters in (6). For example, at the beginning, $\epsilon$ can be set to a relatively large value, say 1, in order to provide a stable coefficient estimate. We then gradually reduce the value of $\epsilon$ in the subsequent iterations until it attains a prescribed small value.


IV. SIMULATION RESULTS

We now carry out experiments to illustrate the performance of our proposed algorithm¹ and compare it with other existing methods. We assume that the signal $\boldsymbol{y} \in \mathbb{C}^{N}$ is a mixture of $K$ complex sinusoids as in (1), with the frequencies $\{\omega_k\}$ uniformly generated over $[0, 2\pi)$ and the amplitudes $\{\alpha_k\}$ uniformly distributed on the unit circle. The measurements are obtained by randomly selecting $M$ entries from the $N$ elements of $\boldsymbol{y}$. We first consider recovering the original signal $\boldsymbol{y}$ from the partial observations. The reconstruction accuracy is measured by the "reconstruction signal-to-noise ratio" (RSNR), defined as

$$\mathrm{RSNR} \triangleq 20 \log_{10} \frac{\|\boldsymbol{y}\|_2}{\|\boldsymbol{y} - \hat{\boldsymbol{y}}\|_2}$$

where $\hat{\boldsymbol{y}}$ denotes the reconstructed signal.
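As a hypothetical usage of the sketch from Section III (`build_dict` and `super_res_irw` are the illustrative functions defined there; for brevity the run below uses the fully observed signal, since handling partial observations would additionally require restricting the dictionary rows to the sampled indices):

```python
import numpy as np

rng = np.random.default_rng(0)
N_sig, K = 64, 3
omega = rng.uniform(0.0, 2.0 * np.pi, K)               # true frequencies
alpha = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, K))  # amplitudes on the unit circle
y = build_dict(omega, N_sig) @ alpha                   # signal from model (1)-(2)

theta_hat, x_hat = super_res_irw(y)                    # joint learning / recovery
y_hat = build_dict(theta_hat, N_sig) @ x_hat
rsnr = 20.0 * np.log10(np.linalg.norm(y) / np.linalg.norm(y - y_hat))
print(f"RSNR: {rsnr:.1f} dB")
```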

We compare our proposed algorithm with the Bayesian dictionary refinement compressed sensing algorithm (denoted as DicRefCS) [8], the root-MUSIC based spectral iterative hard thresholding (SIHT) [7], and the atomic norm minimization via the semi-definite programming (SDP) approach [2]. Fig. 1(a) depicts the average RSNRs of the respective algorithms as a function of the number of measurements $M$, for fixed $N$ and $K$. Results are averaged over independent runs, where the frequencies and the sampling indices (used to obtain the partial observations) are randomly generated for each run. We observe that our proposed algorithm outperforms the other three methods in the region of small $M$, where a gain of more than 15 dB is achieved as compared with the DicRefCS and SDP methods. Note that the log-sum penalty functional adopted by our algorithm could be more sparsity-encouraging than the atomic norm, which is considered the continuous analog of the $\ell_1$ norm for discrete signals. This might be the reason why our proposed algorithm outperforms the SDP method for small $M$. Our algorithm is surpassed by the SIHT and SDP methods as $M$ increases. Nevertheless, this performance difference is of less significance since all methods provide quite decent recovery performance when $M$ is large.

The recovery performance is also evaluated in terms of the success rate, computed as the ratio of the number of successful trials to the total number of independent runs, where the frequencies $\{\omega_k\}$ and the sampling indices are randomly generated for each run. Note that our algorithm and the DicRefCS method do not require knowledge of the number of complex sinusoids, $K$. A trial is considered successful if the number of frequency components is estimated correctly² and the estimation error between the estimated frequencies $\{\hat{\omega}_k\}$ and the true parameters $\{\omega_k\}$ is smaller than a prescribed threshold. Fig. 1(b) depicts the success rates of the respective algorithms vs. the number of measurements. This result again demonstrates the superiority of the proposed algorithm over other existing methods, particularly when $M$ is small.

¹Matlab codes are available at http://www.junfang-uestc.net/codes/SRCS.rar

²For our algorithm, some of the coefficients of the estimated signal keep decreasing at each iteration but never become exactly zero. We assume that a frequency is identified if the magnitude of its coefficient is greater than a small threshold.

Fig. 1. (a) RSNRs of respective algorithms vs. $M$; (b) Success rates of respective algorithms vs. $M$.

Fig. 2. (a) RSNRs of respective algorithms vs. the frequency spacing coefficient $\mu$; (b) Success rates of respective algorithms vs. $\mu$.

TABLE I
RUN TIMES OF RESPECTIVE ALGORITHMS

We examine the ability of our algorithm to resolve closely-spaced frequency components. The signal is assumed to be a mixture of two complex sinusoids, with the frequency spacing controlled by a frequency spacing coefficient $\mu$ ranging from 0.1 to 2. Fig. 2 shows the RSNRs and success rates of the respective algorithms vs. the frequency spacing coefficient $\mu$, for fixed $M$ and $N$. Results are averaged over independent runs, with one of the two frequencies (the other frequency is determined by the frequency spacing) and the set of sampling indices randomly generated for each run. We see that our algorithm can accurately identify closely-spaced frequencies with a high success rate, and it presents a significant performance advantage over the other methods when the two frequencies are very closely separated. The average run times of the respective algorithms are also provided (Table I), where $\mu$ is fixed at 0.6.

V. CONCLUSIONS

We proposed an iterative reweighted algorithm for joint parametric dictionary learning and sparse signal recovery. The proposed algorithm was developed by iteratively decreasing a surrogate function majorizing the original objective function. Simulation results show that the proposed algorithm presents superiority over other existing methods in resolving closely-spaced frequency components.


REFERENCES

[1] E. Candès and C. Fernandez-Granda, "Towards a mathematical theory of super-resolution," Commun. Pure Appl. Math., to be published.

[2] G. Tang, B. N. Bhaskar, B. Recht, and P. Shah, "Compressed sensing off the grid," 2012 [Online]. Available: http://arxiv.org/abs/1207.6053

[3] Y. Chi, L. L. Scharf, A. Pezeshki, and A. R. Calderbank, "Sensitivity to basis mismatch in compressed sensing," IEEE Trans. Signal Process., vol. 59, no. 5, pp. 2182–2195, May 2011.

[4] Z. Yang, L. Xie, and C. Zhang, "Off-grid direction of arrival estimation using sparse Bayesian inference," IEEE Trans. Signal Process., vol. 61, no. 1, pp. 38–42, Jan. 2013.

[5] L. Hu, J. Zhou, Z. Shi, and Q. Fu, "A fast and accurate reconstruction algorithm for compressed sensing of complex sinusoids," IEEE Trans. Signal Process., to appear.

[6] A. Fannjiang and W. Liao, "Coherence pattern-guided compressive sensing with unresolved grids," SIAM J. Imag. Sci., vol. 5, no. 1, pp. 179–202, 2012.

[7] M. F. Duarte and R. G. Baraniuk, "Spectral compressive sensing," Appl. Comput. Harmon. Anal., vol. 35, pp. 111–129, 2013.

[8] L. Hu, Z. Shi, J. Zhou, and Q. Fu, "Compressed sensing of complex sinusoids: An approach based on dictionary refinement," IEEE Trans. Signal Process., vol. 60, no. 7, pp. 3809–3822, 2012.

[9] B. A. Olshausen and D. J. Field, "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature, vol. 381, pp. 607–609, Jun. 1996.

[10] I. F. Gorodnitsky and B. D. Rao, "Sparse signal reconstructions from limited data using FOCUSS: A re-weighted minimum norm algorithm," IEEE Trans. Signal Process., vol. 45, no. 3, pp. 600–616, Mar. 1997.

[11] E. Candès, M. Wakin, and S. Boyd, "Enhancing sparsity by reweighted $\ell_1$ minimization," J. Fourier Anal. Applicat., vol. 14, pp. 877–905, Dec. 2008.

[12] R. Chartrand and W. Yin, "Iteratively reweighted algorithms for compressive sensing," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Las Vegas, NV, USA, 2008.

[13] D. Wipf and S. Nagarajan, "Iterative reweighted $\ell_1$ and $\ell_2$ methods for finding sparse solutions," IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 317–329, Apr. 2010.

[14] Y. Shen, J. Fang, and H. Li, "Exact reconstruction analysis of log-sum minimization for compressed sensing," IEEE Signal Process. Lett., vol. 20, pp. 1223–1226, Dec. 2013.

[15] M. Ataee, H. Zayyani, M. Babaie-Zadeh, and C. Jutten, "Parametric dictionary learning using steepest descent," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Dallas, TX, USA, 2010.

[16] K. Lange, D. Hunter, and I. Yang, "Optimization transfer using surrogate objective functions," J. Comput. Graph. Statist., vol. 9, no. 1, pp. 1–20, Mar. 2000.