Compressed Sensing Algorithms for Electromagnetic Imaging Applications

A Thesis Presented

by

Richard Obermeier

to

The Department of Electrical and Computer Engineering

in partial fulfillment of the requirements

for the degree of

Master of Science

in

Electrical and Computer Engineering

Northeastern University

Boston, Massachusetts

December 2016

Contents

List of Figures
List of Acronyms
Acknowledgments
Abstract of the Thesis

1 Introduction
    1.1 Contributions
    1.2 Outline

2 Hybrid DBT / NRI System for Breast Cancer Detection
    2.1 Motivation
    2.2 System Overview
    2.3 DBT Segmentation
    2.4 NRI Modeling
    2.5 Image Reconstruction

3 Compressed Sensing in Electromagnetic Imaging Applications
    3.1 Motivation
    3.2 Compressed Sensing
    3.3 Physicality Constrained Compressed Sensing (PCCS)
        3.3.1 Theoretical Considerations for the PCCS Problems
    3.4 Solving the PCCS Programs using Nesterov’s Method
        3.4.1 Nesterov’s Accelerated Gradient Method for Non-smooth Convex Optimization
        3.4.2 Nesterov’s Method for Traditional CS Problems
        3.4.3 Nesterov’s Method for PCCS Problems
    3.5 Solving the PCCS Programs using the Alternating Direction Method of Multipliers
        3.5.1 The Alternating Direction Method of Multipliers (ADMM)
        3.5.2 ADMM for Traditional CS Problems
        3.5.3 ADMM for PCCS Problems
    3.6 Solving the PCCS Programs using an Accelerated Gradient Augmented Lagrangian (AGAL) Method
        3.6.1 General Formulation of the AGAL Method
        3.6.2 AGAL for Traditional CS Problems
        3.6.3 AGAL for PCCS Problems
    3.7 Numerical Comparison of CS and PCCS Problems
    3.8 PCCS for the Hybrid DBT / NRI System

4 Model-based Design Method for Compressive Antennas
    4.1 Introduction
    4.2 Motivation
    4.3 A General Design Approach
    4.4 A Simplified Design Approach
    4.5 Reflection Mode Results
    4.6 Transmission Mode Results
    4.7 Capacity Maximization in MIMO Communication Systems
    4.8 Antenna Design using ELC Metamaterials

5 Conclusions

Bibliography

List of Figures

2.1 Comparison of dielectric constant of various breast tissues as a function of frequency.
2.2 Comparison of conductivities of various breast tissues as a function of frequency.
2.3 Conceptual diagram of the DBT measurement process.
2.4 Conceptual diagram of the NRI measurement process.
2.5 Overview of the Hybrid DBT / NRI system processing.
2.6 Overview of the DBT segmentation process.
2.7 Comparison of the dielectric constant composite model to the measurements presented in [10].
2.8 Comparison of the conductivity composite model to the measurements presented in [10].

3.1 Depiction of the basis pursuit problem of Eq. 3.18.
3.2 Depiction of the PCCS basis pursuit problem of Eq. 3.19.
3.3 Reconstruction performance of CS and PCCS programs in electromagnetic imaging example as a function of sparsity level.
3.4 Real and imaginary parts of true contrast variable χε obtained when the DBT image is segmented perfectly.
3.5 Real and imaginary parts of reconstructed contrast variable χε obtained when the DBT image is segmented perfectly and there is no measurement noise.
3.6 Real and imaginary parts of true contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error.
3.7 Real and imaginary parts of reconstructed contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error and there is no measurement noise.
3.8 Real and imaginary parts of reconstructed contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error and the measurement SNR = 49 dB.
3.9 Real and imaginary parts of reconstructed contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error and the measurement SNR = 43 dB.

4.1 Configuration for the compressive antenna operating in reflection mode. White = Transmitter locations, Orange = Imaging region, Green = Scatterer locations, Red = PEC.
4.2 Permittivity distribution of the optimized reflection mode antenna.
4.3 log2 of the singular values of the sensing matrices obtained using the optimized reflection mode antenna (blue) and original reflection mode antenna (red) in a multi-static configuration.
4.4 Numerical comparison of the reconstruction accuracies of Eq. 4.24 using the optimized reflection mode design (blue) and baseline reflection mode design (red).
4.5 Configuration for the compressive antenna operating in transmission mode. White = Transmitter locations, Orange = Imaging region, Green = Scatterer locations, Red = PEC.
4.6 Permittivity distribution of the optimized transmission mode antenna.
4.7 log2 of the singular values of the sensing matrices obtained using the optimized transmission mode antenna (blue) and original transmission mode antenna (red) in a multi-static configuration.
4.8 Numerical comparison of the reconstruction accuracies of Eq. 4.24 using the optimized transmission mode design (blue) and baseline transmission mode design (red).
4.9 Configuration for communications design.
4.10 Optimized dielectric constant ε
4.11 Comparison of the log2 of the singular values of the channel matrix.
4.12 Relative permittivity of ELC resonator for γ = 1
4.13 Relative permittivity of ELC resonator for γ = 0.05
4.14 log2 of the singular values of the sensing matrices obtained using the optimized reflection mode antenna (blue) and original reflection mode antenna (red) in a multi-static configuration.
4.15 Numerical comparison of the reconstruction accuracies of Eq. 4.24 using the optimized ELC reflection mode design (blue) and baseline transmission mode design (red).

List of Acronyms

ADMM Alternating Direction Method of Multipliers

AGAL Accelerated Gradient Augmented Lagrangian

BA Born Approximation

CBE Clinical Breast Exam

CM Conventional Mammography

CS Compressed Sensing

CSI Contrast Source Inversion

CT Computed Tomography

DBT Digital Breast Tomosynthesis

ELC Electric-LC

FISTA Fast Iterative Shrinkage Thresholding Algorithm

FDFD Finite Differences in the Frequency Domain

GMRES Generalized Minimum Residual

HWC High Water Content

LWC Low Water Content

MAP Maximum a-posteriori

MIMO Multiple Input Multiple Output

MRI Magnetic Resonance Imaging

NRI Nearfield Radar Imaging

PEC Perfect Electric Conductor

PCCS Physicality Constrained Compressed Sensing

RIP Restricted Isometry Property

Acknowledgments

Let me start off by saying that working on this thesis was no easy task for me. It was extremely difficult at times to balance my research workload with my commitments to my employer, Raytheon BBN Technologies, as the two were completely orthogonal to each other. Luckily, there were many people by my side who supported me in my endeavor. First and foremost, I would like to thank my employer, Raytheon BBN Technologies. Without Raytheon BBN’s financial support and flexible work hours, it would have been even more challenging for me to complete this thesis. I would like to thank my coworkers Steve Weeks, Paul Dryer, and Rob McGurrin for supporting me on this journey and for always encouraging me to better myself. I would like to thank my adviser, Prof. Jose Angel Martinez Lorenzo, for his support and encouragement over the past few years, and for being understanding of my occasional “breaks” from research whenever I became too overwhelmed by the combined work and research workload. I would like to thank my good friend, Fernando Quivira, for directly and indirectly pushing me to achieve this goal. Last, but certainly not least, I would like to thank my parents, who have encouraged me to pursue excellence my entire life, and who raised me to be the man that I am today.

Abstract of the Thesis

Compressed Sensing Algorithms for Electromagnetic Imaging Applications

by

Richard Obermeier

Master of Science in Electrical and Computer Engineering

Northeastern University, December 2016

Prof. Jose Angel Martinez-Lorenzo, Advisor

Compressed Sensing (CS) theory is a novel signal processing paradigm, which states that sparse signals of interest can be accurately recovered from a small set of linear measurements using efficient ℓ1-norm minimization techniques. CS theory has been successfully applied to many sensing applications, such as optical imaging, X-ray CT, and Magnetic Resonance Imaging (MRI). However, there are two critical deficiencies in how CS theory is applied to these practical sensing applications. First, the most common reconstruction algorithms ignore the constraints placed on the recovered variable by the laws of physics. Second, the measurement system must be constructed deterministically, and so it is not possible to utilize random matrix theory to assess the CS reconstruction capabilities of the sensing matrix.

In this thesis, we propose solutions to these two deficiencies in the context of electromagnetic imaging applications, in which the unknown variables are related to the dielectric constant and conductivity of the scatterers. First, we introduce a set of novel Physicality Constrained Compressed Sensing (PCCS) optimization programs, which augment the standard CS optimization programs to force the resulting variables to obey the laws of physics. The PCCS problems are investigated from both theoretical and practical standpoints, as well as in the context of a hybrid Digital Breast Tomosynthesis (DBT) / Nearfield Radar Imaging (NRI) system for breast cancer detection. Our analysis shows how the PCCS problems provide enhanced recovery capabilities over the standard CS problems. We also describe three efficient algorithms for solving the PCCS optimization programs.

Second, we present a novel numerical optimization method for designing so-called “compressive antennas” with enhanced CS recovery capabilities. In this method, the constitutive parameters of scatterers placed along a traditional antenna are designed in order to maximize the capacity of the sensing matrix. Through a theoretical analysis and a series of numerical examples, we demonstrate the ability of the optimization method to design antenna configurations with enhanced CS recovery capabilities. Finally, we briefly discuss an extension of the design method to Multiple Input Multiple Output (MIMO) communication systems.

Chapter 1

Introduction

Sensing systems attempt to extract as much information as possible about an object under

test by recording a set of independent measurements. The number of measurements and the degree of

their independence, as well as the physical limitations of the sensing modality, determine how much

information can be extracted by the sensing system. In general, the reconstruction accuracy of an

imaging system can be improved by adding more measurements. However, great care must be taken

when adding these measurements. In addition to exacerbating a number of practical issues such as

cost and processing power requirements, naively adding measurements often leads to diminishing

returns in the reconstruction accuracy.

Electromagnetic imaging systems, as the name suggests, attempt to reconstruct an image

of the object under test using electromagnetic field measurements. In general, these systems use

multiple transmitting antennas, which are distributed throughout the imaging domain, in order to

excite the object under test with broadband electromagnetic waveforms. These signals interact with

the objects in the imaging region in order to produce the scattered fields that are measured by a set

of receiving antennas. Using a model for the electromagnetic field propagation, these systems can

create an image of the objects within the imaging region. The physical meaning of the image and the

suite of reconstruction algorithms available for use depend upon the specific model that is employed.

For example, radar imaging systems often utilize linear models, which only consider the phase of the

electric field vector as it radiates in the background medium, typically free space. This allows fast and

computationally efficient inversion methods to be used in order to generate images in quasi-real-time.

Unfortunately, the simplified linear model comes with the drawback that the reconstructed image

only recovers the scatterers’ so-called scalar reflectivity, which cannot easily be traced back to the

constitutive parameters, dielectric constant and conductivity, that govern electromagnetic radiation.

More accurate methods, such as the Contrast Source Inversion (CSI) algorithm [1, 2, 3], use the full

non-linear model for electromagnetic radiation in order to reconstruct the constitutive parameters of the

scatterers. Unfortunately, these methods tend to be slow and computationally expensive. The Born

Approximation (BA) provides a middle ground between the phase-based models of radar imaging

and the accurate, but expensive non-linear methods. In the BA, the non-linear model defined by

Maxwell’s equations is linearized about some starting point, such that the resulting unknown quantity

is intimately related to the constitutive parameters of the scatterers. From here, one can simply use

any number of linear inversion techniques in order to estimate the constitutive parameters of the

scatterers.

Compressed Sensing (CS) theory is a novel signal processing paradigm, which states that

sparse signals of interest can be accurately recovered from a small set of linear measurements, even

when the number of measurements is less than the number of unknowns. In order for CS theory

to be exploited by a sensing system, several conditions must be met. First, as the definition of CS

theory implies, the unknown object of interest must have a sparse representation in some known

domain. Second, the measurement matrix, also called the sensing matrix, must be sufficiently “well-behaved” such that it

obeys a Restricted Isometry Property (RIP). Third, the imaging system must utilize a reconstruction

algorithm that exploits the sparsity priors. Efficient techniques based upon minimizing the ℓ1-norm

are by far the most common techniques used in the field of CS. If the sensing system satisfies these

conditions, then CS theory can be applied in order to recover images of the object under test that are

super-resolved relative to those produced by alternative methods.
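
As a minimal illustration of the recovery claim above, the following Python sketch solves an ℓ1-regularized least-squares problem with FISTA for a sparse vector measured with fewer measurements than unknowns, and compares the result against the minimum-norm least-squares solution. The real-valued random Gaussian sensing matrix, the problem sizes, the regularization weight, and all function names are illustrative assumptions and are not taken from this thesis; the electromagnetic problems considered here are complex-valued and use deterministic sensing matrices.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_l1(A, y, lam, n_iter=2000):
    """Minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1 with FISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

def demo(m=40, n=100, k=5, seed=0):
    """Recover a k-sparse vector of length n from m < n measurements."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # hypothetical sensing matrix
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))
    y = A @ x_true                                 # noiseless measurements
    x_l1 = fista_l1(A, y, lam=0.01 * np.max(np.abs(A.T @ y)))
    x_l2 = np.linalg.pinv(A) @ y                   # minimum-norm baseline
    rel = lambda x: np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    return rel(x_l1), rel(x_l2)

if __name__ == "__main__":
    err_l1, err_l2 = demo()
    print(f"relative error: l1 = {err_l1:.3f}, minimum-norm = {err_l2:.3f}")
```

In this setting the ℓ1 solution is expected to be far closer to the true sparse vector than the minimum-norm solution, which spreads energy over all unknowns.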

CS theory has been successfully applied in many electromagnetic applications [4, 5, 6].

However, there are two critical deficiencies in how CS theory is applied to these applications. First,

standard CS theory does not always properly exploit all of the prior knowledge that is available in

electromagnetic imaging applications. In particular, when the BA is applied to form a linearized

scattering model, CS theory does not enforce the physical limitations placed upon the dielectric

constant and conductivity by the laws of physics. Intuitively, one expects the reconstruction accuracy

to improve if these so-called physicality constraints are enforced. Unfortunately, the most common

solvers used in industry and academia are specialized for the standard ℓ1-norm minimization programs

of CS theory, such that the physicality constraints cannot easily be enforced. Second, there is no

straightforward way to design measurement configurations such that the resulting sensing matrix

obeys the RIP. For reasons that will become clear to the reader in Chapter 3, it is NP-hard to

determine whether or not a given sensing matrix obeys the RIP. To overcome this, researchers have

resorted to using random matrix theory in order to construct sensing matrices that obey the RIP with

high probability. Unfortunately, this approach cannot be employed in electromagnetic imaging

applications, where the measurement system must be constructed deterministically.

1.1 Contributions

This thesis has two main contributions to the application of CS theory in electromagnetic

imaging applications. First, we introduce the concept of Physicality Constrained Compressed Sensing

(PCCS). In PCCS, the standard ℓ1-norm optimization programs of CS are augmented to force the

resulting variables to obey the laws of physics. This thesis considers PCCS from a purely theoretical

perspective, using the same tool sets that are often employed in standard CS theory, as well as from a

practical perspective. With regard to the latter, PCCS is investigated in the context of a hybrid Digital

Breast Tomosynthesis (DBT) / Nearfield Radar Imaging (NRI) system for breast cancer detection. In

general, a standalone NRI system cannot enforce sparsity to detect cancerous lesions in the breast.

However, by fusing NRI with DBT, the hybrid system is able to generate an appropriate reference

distribution for the BA so that, in theory, the breast cancer detection problem can be posed as a

sparse recovery problem. This application is investigated using a 2D full-wave model based on Finite

Differences in the Frequency Domain (FDFD) [7] in order to accurately model electromagnetic wave

propagation within the breast. PCCS is also investigated in the context of general electromagnetic

imaging applications. This analysis shows how PCCS enhances the image reconstruction capabilities

of standard CS theory. This thesis also describes in great detail three efficient algorithms for solving

the PCCS optimization programs. Each of the algorithms excels in different applications, depending

upon the size of the problem and the computational resources available.
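
The idea of augmenting an ℓ1-norm program with a constraint can be sketched as follows. This is not the PCCS formulation or any of the three algorithms of this thesis, which are developed in Chapter 3; it is a hedged, real-valued stand-in in which a simple lower bound on each unknown plays the role of a physicality constraint. The bound, the random sensing matrix, the problem sizes, and the function names are all hypothetical.

```python
import numpy as np

def prox_l1_lower_bound(v, t, lower):
    """Prox of t*||x||_1 plus the indicator of {x >= lower}, element-wise.

    For a separable one-dimensional convex penalty, this equals
    soft-thresholding followed by projection onto the feasible interval.
    """
    shrunk = np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    return np.maximum(shrunk, lower)

def bounded_l1_fista(A, y, lam, lower=-np.inf, n_iter=2000):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 subject to x >= lower.

    With lower = -inf this reduces to the unconstrained l1 program.
    """
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        step = z - A.T @ (A @ z - y) / L
        x_new = prox_l1_lower_bound(step, lam / L, lower)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

def demo(m=30, n=100, k=8, seed=0):
    """Compare the unconstrained and bound-constrained programs."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # hypothetical sensing matrix
    x_true = np.zeros(n)
    x_true[rng.choice(n, size=k, replace=False)] = 1.0 + rng.random(k)
    y = A @ x_true
    lam = 0.01 * np.max(np.abs(A.T @ y))
    x_cs = bounded_l1_fista(A, y, lam)                 # standard l1 program
    x_con = bounded_l1_fista(A, y, lam, lower=0.0)     # with the lower bound
    rel = lambda x: np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    return rel(x_cs), rel(x_con), float(x_con.min())

if __name__ == "__main__":
    err_cs, err_con, min_val = demo()
    print(f"unconstrained: {err_cs:.3f}, constrained: {err_con:.3f}, min = {min_val:.3f}")
```

The only change relative to a standard ℓ1 solver is the proximal step, which is why solvers specialized for the unconstrained program cannot enforce such bounds without modification.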

In the second contribution, we describe a novel numerical optimization method for designing

so-called “compressive antennas” with enhanced CS recovery capabilities. Through a theoretical

analysis, we demonstrate how enhancing the capacity of the sensing matrix improves the lower bound

on CS reconstruction performance, as measured by the RIP. In the design method, the constitutive

parameters of scatterers placed along a traditional antenna are optimized in order to maximize the

capacity of the antenna. The design method is briefly described in its most general form, before it is

discussed in detail in simplified forms that are specialized to scatterers that are pure dielectrics and

to scatterers that consist of Electric-LC (ELC) metamaterial elements. We also briefly discuss an

extension of the design method to Multiple Input Multiple Output (MIMO) communication systems.

Using several numerical examples, which again utilize the 2D FDFD model in order to accurately model

electromagnetic wave propagation, we demonstrate the ability of the optimization method to design

antenna configurations with enhanced capacity and enhanced CS recovery capabilities.
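
To give a feel for why capacity is a reasonable design objective, the sketch below evaluates a Shannon-style capacity from the singular values of a matrix and compares two matrices with the same total energy but different singular value spectra. The capacity expression, the SNR value, and the matrix sizes are assumptions made for illustration only; the exact definition used in this thesis is given in Chapter 4.

```python
import numpy as np

def capacity(A, snr=100.0):
    """Shannon-style capacity sum_i log2(1 + snr * s_i^2) of a matrix.

    s_i are the singular values of A; snr is a hypothetical scaling.
    """
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(np.log2(1.0 + snr * s ** 2)))

def demo(m=8, n=32, seed=0):
    """Compare a flat and a decaying singular value spectrum of equal energy."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, m)))   # m x m orthogonal
    V, _ = np.linalg.qr(rng.standard_normal((n, m)))   # n x m, orthonormal columns
    flat = np.ones(m)
    decaying = 2.0 ** (-np.arange(m))
    decaying *= np.linalg.norm(flat) / np.linalg.norm(decaying)  # equal Frobenius norm
    A_flat = U @ np.diag(flat) @ V.T
    A_decay = U @ np.diag(decaying) @ V.T
    return capacity(A_flat), capacity(A_decay)

if __name__ == "__main__":
    c_flat, c_decay = demo()
    print(f"flat spectrum: {c_flat:.2f} bits, decaying spectrum: {c_decay:.2f} bits")
```

Because the logarithm is concave, the flat spectrum attains the larger capacity for a fixed total energy, which is consistent with the intuition that a well-conditioned sensing matrix carries more independent information.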

1.2 Outline

The remainder of this thesis is organized as follows. In Chapter 2, we introduce the concept

of the hybrid DBT / NRI system for breast cancer detection. Within this chapter, we describe the

linearized model for the electromagnetic sensing problem using the BA. This linearized model serves

as the basis for the CS imaging algorithms, which are described in Chapter 3. We begin that chapter

with a brief introduction to standard CS techniques, before transitioning to the specialized PCCS

techniques. In Chapter 4, we describe the novel compressive antenna design method, which is based

upon the maximization of the channel capacity, and assess its performance with a set of numerical

examples. Finally, in Chapter 5 we conclude the thesis by describing some interesting extensions and

improvements to the work presented herein that will be topics of future research.

Chapter 2

Hybrid DBT / NRI System for Breast Cancer Detection

2.1 Motivation

A recent report by the Centers for Disease Control and Prevention [8] states that breast

cancer is the most common type of cancer among women, with a rate of 118.7 cases per 100,000

women, and that it is the second deadliest type of cancer among women, with a mortality rate of 21.9

deaths per 100,000 women. It is well known that the detection of breast cancer in its early stages

can greatly improve a woman’s chance for survival, as the lesions tend to be smaller and are less

likely to have spread from the breast than more developed cancer. Although small cancers near the

surface of the breast can be detected by means of a clinical breast exam (CBE), cancers deep within

the breast can only be detected through non-invasive imaging.

Conventional Mammography (CM) is a widely used X-ray-based technology, which creates

a two-dimensional image of the breast. Because CM only creates a two-dimensional image of the

three-dimensional breast, overlapping tissue from different cross-sections of the breast can degrade

the quality of the images. Digital Breast Tomosynthesis (DBT) improves upon CM by generating a

three-dimensional image of the breast [9], thereby mitigating the effects of tissue overlap. Unfortunately,

CM and DBT both suffer from the small radiological contrast between healthy tissue and cancerous

tissue, which is on the order of 1%. As a result, these technologies tend to produce a large number of

false positives when used for early detection.

Nearfield Radar Imaging (NRI) is a less common technology for breast cancer detection.

Unlike CM and DBT, NRI excites the breast using non-ionizing microwave radiation. NRI is an

appealing technology for breast cancer detection because the contrast between healthy breast tissue

and cancerous tissue is on the order of 10% at microwave frequencies [10]. This result can be seen in

Figures 2.1 and 2.2, which display the dielectric constant and conductivity of various breast tissues

as a function of frequency. Unfortunately, the improved contrast between healthy breast tissue and

cancerous tissue comes at a cost: at microwave frequencies, the mutual coupling between the different

tissue types cannot be ignored, such that it is difficult to accurately model wave propagation within

the heterogeneous distribution of tissues within the breast. Without an accurate wave propagation

model, NRI systems fail to detect cancerous lesions within the breast.

Figure 2.1: Comparison of dielectric constant of various breast tissues as a function of frequency.

2.2 System Overview

Recent papers [11, 12, 13] have introduced the concept of a Hybrid DBT / NRI system for

breast cancer detection. The basic idea behind the hybrid system is that, by combining the strengths

of both DBT and NRI at microwave frequencies, the detection rate of cancerous lesions can be

improved. In this section, we provide an overview of how the hybrid system could be used in a

clinical setting. The hybrid system operates in a similar manner to a conventional mammogram

system. To start, the breast is placed under clinical compression in order to ensure that there is

minimal movement throughout the sensing process. Once the breast has been compressed, it is

excited by an X-ray source that is mechanically scanned over multiple view angles, and the radiation

that passes through the breast is measured by a set of detectors on the opposite side of the breast.

This process is depicted in Figure 2.3.

Figure 2.2: Comparison of conductivities of various breast tissues as a function of frequency.

Figure 2.3: Conceptual diagram of the DBT measurement process.

At this point, the measurement process is complete in a conventional DBT system. In the

hybrid DBT / NRI system, however, the NRI measurements are recorded immediately after the DBT

measurements have been completed. This ensures that the measurements between the two systems

are co-registered. Any differences in the relative position of the breast between the two measurement

periods can only inhibit the ability to successfully fuse the two systems; the sequential measurement

process minimizes the probability of this occurrence. In the NRI measurement process, one or more

transmitting and receiving antennas are mechanically scanned over the breast, as depicted in

Figure 2.4. In this figure, a single transmitting antenna and an array of receiving antennas on the

opposite side of the breast are used, although other configurations, e.g., multiple monostatic, can also

be used. In order to minimize the reflections from the surface of the breast, the transmitting and

receiving antennas are placed in a plastic container filled with a bolus matching liquid. This liquid

has minimal effect at X-ray frequencies, and so the hybrid DBT / NRI system can utilize a modified

compression paddle configuration that can be used for both the DBT and NRI measurements.

Figure 2.4: Conceptual diagram of the NRI measurement process.

The remainder of this chapter describes the data processing methodology of the hybrid

DBT / NRI system. An overview of this process is presented in Figure 2.5. The data processing

can be separated into three primary components: 1) DBT Segmentation, 2) NRI Modeling, and 3)

Image Reconstruction. The basic premise is to use the DBT measurements and the resulting DBT

reconstruction in order to establish suitable priors for the NRI imaging process, so that the enhanced

contrast at microwave frequencies can be maximally exploited. These processing components are

described in detail in the next three sections.

Figure 2.5: Overview of the Hybrid DBT / NRI system processing.

2.3 DBT Segmentation

The DBT segmentation process can be divided into three primary components, as depicted

in Figure 2.5. The first component, the DBT measurement process, was discussed in the previous

section. In the second component, the DBT measurements are used to create a high-resolution

three-dimensional image of the X-ray attenuation coefficients of the compressed breast. DBT imaging

techniques have been established in the literature and are outside the scope of this thesis; see [9]

for details. This image is in turn used to segment the breast into three types of tissue: skin, muscle

(pectoralis major), and breast tissue. It is assumed that the latter only contains healthy tissue.

Each voxel of breast tissue is further characterized by its percentage of fatty tissue and fibroglandular

tissue based upon the intensity of the DBT image, as is shown in Figure 2.6. This is possible because

the X-ray attenuation coefficient decreases as the fat content of the tissue increases; high-fat tissues

absorb fewer X-rays than tissues with low fat content and high water content.

Figure 2.6: Overview of the DBT segmentation process.

The third and final component of the DBT segmentation process establishes the priors for the NRI imaging process. The priors are described in terms of the frequency-dependent dielectric constant $\varepsilon_b(\mathbf{r}, \omega)$ and conductivity $\sigma_b(\mathbf{r}, \omega)$ of the breast tissues. These constitutive parameters are extracted directly from the fat content segmentation using the composite model developed in [14]. This composite model was created based upon the work of Lazebnik et al. in [10]. In their work, Lazebnik et al. experimentally measured the dielectric constant and conductivity of breast tissue samples of various fat and fibroglandular percentages, and fit the frequency dependence of the parameters to a Cole-Cole model. From this data, the composite model in [14] was developed in order to establish the dielectric constant and conductivity of breast tissue compositions that were not directly measured in the study. The results of this composite model are displayed in Figures 2.7 and 2.8 for the dielectric constant and conductivity, respectively, at a frequency of 5 GHz. The black curves display interpolated sample points measured in [10], and the green curves display the results of the composite model. Overall, the composite model fits the measurements well.
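
For readers unfamiliar with the Cole-Cole model mentioned above, the following sketch evaluates a single-pole Cole-Cole dispersion and returns the dielectric constant and effective conductivity as functions of frequency. The parameter values are hypothetical placeholders, not the fitted values of [10] or the composite model of [14], and the expression is written in the $e^{+j\omega t}$ time convention, which is an assumption of this sketch; only the real-valued dielectric constant and conductivity are returned, so the choice of convention does not affect the outputs.

```python
import numpy as np

EPS0 = 8.8541878128e-12   # vacuum permittivity [F/m]

def cole_cole(freq_hz, eps_inf, delta_eps, tau, alpha, sigma_s):
    """Single-pole Cole-Cole model.

    eps_hat = eps_inf + delta_eps / (1 + (j w tau)^(1 - alpha)) + sigma_s / (j w eps0)
    Returns the dielectric constant Re(eps_hat) and the effective
    conductivity -Im(eps_hat) * w * eps0 in S/m.
    """
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    eps_hat = (eps_inf
               + delta_eps / (1.0 + (1j * omega * tau) ** (1.0 - alpha))
               + sigma_s / (1j * omega * EPS0))
    eps_r = eps_hat.real
    sigma = -eps_hat.imag * omega * EPS0
    return eps_r, sigma

if __name__ == "__main__":
    freqs = np.array([1e9, 5e9, 10e9])
    # Hypothetical parameters chosen only to exercise the function.
    eps_r, sigma = cole_cole(freqs, eps_inf=7.0, delta_eps=40.0,
                             tau=10e-12, alpha=0.1, sigma_s=0.7)
    for f, e, s in zip(freqs, eps_r, sigma):
        print(f"{f / 1e9:5.1f} GHz: dielectric constant = {e:.2f}, conductivity = {s:.3f} S/m")
```

The dielectric constant falls and the effective conductivity rises with frequency, which is the qualitative behavior expected of dispersive, water-bearing tissue.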

2.4 NRI Modeling

The NRI modeling process consists of two main components, which can be performed

simultaneously. Given the dielectric constants and conductivities segmented from the DBT image,

the goal is to model the NRI measurement process of the assumed healthy breast. This process

can be described using electromagnetic theory: the electric fields Eb(r, ω) produced when the NRI

source distribution I(r, ω) excites the complex permittivity εb(r, ω) = εb(r, ω) + σb(r,ω)ωε0of the

breast tissues must satisfy the vector Helmholtz equation:

∇×∇×Eb(r, ω)− k2b (r, ω)Eb(r, ω) = ωµ0I(r, ω) (2.1)

10


Figure 2.7: Comparison of the dielectric constant composite model to the measurements presented in

[10].

Figure 2.8: Comparison of the conductivity composite model to the measurements presented in [10].

where kb(r, ω) = ω√µ0ε0εb(r, ω) is the wavenumber. The solution Eb(r, ω) can be explicitly

written in terms of the dyadic Green’s functions Gb(r, r′, ω) of the heterogeneous breast as follows:

$$E_b(r,\omega) = j\omega\mu_0 \int G_b(r,r',\omega)\, I(r',\omega)\, dr' \tag{2.2}$$

where Gb(r, r′, ω) is the solution to:

$$\nabla \times \nabla \times G_b(r,r',\omega) - k_b^2(r,\omega)\, G_b(r,r',\omega) = I\,\delta(r - r') \tag{2.3}$$

and I is the unit dyad. The dyadic Green’s functions Gb(r, r′, ω) and the electric fields Eb(r, ω) are both required

for the NRI reconstruction process, which is described in the next section. For complicated

inhomogeneous media such as the human breast, the wave equations of Eq. 2.1 and 2.3 do not have

closed-form solutions, and so these quantities must be computed numerically. The NRI system

accomplishes this using a three-dimensional numerical model based on Finite Differences in the

Frequency Domain (FDFD) [7]. The FDFD discretizes the computational region into cubic voxels,

so that the derivatives within the curl operators are approximated by finite differences of adjacent

voxels. The resulting discretization leads to a linear system Ax = b with a very sparse matrix A; this

system must be solved in order to compute the electric fields.
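The discretization step can be illustrated with a deliberately simplified one-dimensional analogue. The sketch below is not the three-dimensional FDFD of [7]: it assumes a single transverse field component, a uniform grid, zero-field (Dirichlet) boundaries, and illustrative tissue permittivities, all of which are assumptions made only for this example. It shows how finite differences of adjacent cells turn the wave equation into a linear system Ax = b whose matrix is almost entirely zeros.

```python
import numpy as np

MU0 = 4e-7 * np.pi        # vacuum permeability [H/m]
EPS0 = 8.8541878128e-12   # vacuum permittivity [F/m]


def assemble_helmholtz_1d(eps_r, freq_hz, h):
    """Assemble the finite-difference operator (-d^2/dx^2 - k^2) on a uniform grid.

    eps_r   : complex relative permittivity of each cell (hypothetical values)
    freq_hz : frequency in Hz
    h       : cell size in meters
    Zero-field (Dirichlet) boundaries are assumed at both ends.
    """
    n = eps_r.size
    omega = 2.0 * np.pi * freq_hz
    k2 = omega**2 * MU0 * EPS0 * eps_r            # squared wavenumber per cell
    main = 2.0 / h**2 - k2                        # diagonal entries
    off = -np.ones(n - 1) / h**2                  # coupling to adjacent cells
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return A, omega


# Illustrative geometry: 200 cells of 1 mm at 1 GHz, lossy background with a
# higher-permittivity inclusion (values chosen for illustration only).
n, h, freq = 200, 1e-3, 1e9
eps_r = np.full(n, 9.0 + 1.0j)
eps_r[80:120] = 50.0 + 15.0j

A, omega = assemble_helmholtz_1d(eps_r, freq, h)

current = np.zeros(n, dtype=complex)
current[10] = 1.0                                 # point source in one cell
b = 1j * omega * MU0 * current                    # right-hand side, as in Eq. 2.1

E = np.linalg.solve(A, b)                         # direct solve is fine at this size
fill = np.count_nonzero(A) / A.size               # fraction of non-zero entries
rel_res = np.linalg.norm(A @ E - b) / np.linalg.norm(b)
```

A dense direct solve is only viable here because the toy problem has 200 unknowns; in three dimensions the same construction produces millions of unknowns, which is why sparse storage and iterative solvers become necessary.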

In order to accurately model wave propagation, the voxel dimension h must be chosen

sufficiently small relative to the smallest wavelength of the inhomogeneous medium. Due to the large

dielectric constant of high water content breast tissues, the FDFD must use a grid size on the order of

1 millimeter at low microwave frequencies near 1 GHz in order to accurately model wave propagation.

As a result, the computational geometry is sufficiently large such that the discretized linear system

cannot be solved using direct methods (e.g. LU decomposition). Instead, the three-dimensional FDFD

utilizes the generalized minimal residual (GMRES) algorithm [15]. GMRES is an iterative method

that requires a suitable pre-conditioner in order to produce accurate results. The large dimensionality

and iterative nature of the FDFD therefore make the NRI modeling a computationally expensive and

time consuming process.

2.5 Image Reconstruction

The final component of the hybrid DBT / NRI system processing combines the NRI

measurements with the modeled Green’s functions and electric fields in order to recover the dielectric

constant and conductivity of each voxel within the breast. In order to accomplish this, the total

electric field vector can be decomposed into two components as follows:

E(r, ω) = Eb(r, ω) + Es(r, ω) (2.4)

where Eb(r, ω) are the modeled electric fields within the assumed healthy breast, and Es(r, ω) are

the “scattered” electric fields, that is, the fields produced by any differences between the modeled

complex permittivity εb(r, ω) and unknown true complex permittivity ε(r, ω). The electric field

E(r, ω) satisfies the Helmholtz equation of Eq. 2.1 for k(r, ω) = ω√(µ0ε0ε(r, ω)) instead of kb(r, ω).

Combining this with Eq. 2.4 and simplifying leads to the following expression relating the scattered

fields, total fields, and complex permittivity:

$$E_s(r,\omega) = \int G_b(r,r',\omega)\, k_b^2(r',\omega)\, E(r',\omega)\, \chi(r',\omega)\, dr' \tag{2.5}$$

where χ(r, ω) = (ε(r, ω) − εb(r, ω))/εb(r, ω) is called the contrast variable. This inversion model is often used in

the literature; see, for example, the Contrast Source Inversion (CSI) algorithm [1, 2, 3].

Eq. 2.5 is a nonlinear function of the contrast variable χ(r, ω) and total electric field

E(r, ω), and so nonlinear programming techniques such as the CSI must be applied in order to

recover χ(r, ω). These types of nonlinear algorithms typically require several calls to a forward

model solver such as the FDFD in each iteration. As was stated in the previous section, it is

computationally expensive and time consuming to model the NRI sensing process with the FDFD.

Exceptionally long processing times severely inhibit the usefulness of these algorithms in widespread

clinical settings, so it is desirable to make some simplifying assumptions in order to reduce the

computation time. This work makes two such assumptions. First, the Born Approximation (BA) is

applied, E(r, ω) ≈ Eb(r, ω), in order to linearize Eq. 2.5. Second, the complex permittivities are

assumed to be approximately constant over the frequency range of the NRI system, i.e. ε(r, ω) ≈ ε(r)

and εb(r, ω) ≈ εb(r), so that the contrast variable is also approximately constant over frequency.

With these two modifications, Eq. 2.5 can be rewritten in the following form:

$$E_s(r,\omega) = \int G_b(r,r',\omega)\, k_b^2(r',\omega)\, E_b(r',\omega)\, \chi(r')\, dr' + e_s(r,\omega) \tag{2.6}$$

where es(r, ω) is the error introduced by the approximating assumptions.

Eq. 2.6 can be discretized as y = Ax + e + ν, where x ∈ C^N are the contrast variables,

y ∈ C^M are the measured fields, A ∈ C^(M×N) is the sensing matrix constructed from the incident

fields and Green’s functions of the background medium, e ∈ C^M is the error vector, and ν ∈ C^M is

the random noise introduced by the measurement system. In most applications, M < N, and so this

system has an infinite number of solutions satisfying y = Ax + e + ν. When ‖e‖ℓ2 ≪ ‖ν‖ℓ2, the

performance of linear inverse techniques only depends upon the vector ν. When ‖e‖ℓ2 ∼ ‖ν‖ℓ2, then

the performance of linear inverse techniques depends upon both e and ν. In practice, the statistics of

the measurement noise ν can be estimated in order to tune the proposed inverse algorithm accordingly.

However, it is difficult to estimate e, since it requires a-priori knowledge of the unknown contrast

variable x.

The unknown, unmeasurable error vector e in the linearized model explains why stand-alone

NRI systems tend to perform poorly in breast cancer imaging applications. Without any prior



knowledge, stand-alone NRI systems select a homogeneous background medium εb(r, ω) whose

dielectric constant and conductivity are derived from averaging that of low-water-content (LWC)

fatty tissue and high-water-content (HWC) fibroglandular tissue. Choosing a homogeneous dielectric

constant leads to a contrast variable x that has a significant number of large, non-zero elements. This

violates the assumptions made by the BA and produces an error vector e with a large norm, which

severely impacts the ability of all reconstruction algorithms to accurately invert the linear system of

equations.

At this point, any number of linear inverse techniques can be used in order to recover the

contrast variable x, and from it the complex permittivity ε(r). The following question naturally arises

from this: what inverse technique should be used? In the next chapter, we discuss the motivation for

applying Compressed Sensing (CS) techniques to the NRI inversion process.


Chapter 3

Compressed Sensing in Electromagnetic

Imaging Applications

3.1 Motivation

The previous chapter established the linearized NRI sensing process y = Ax+ e+ ν using

the Born Approximation. Since the number of measurements M is much smaller than the number

of variables N that are to be recovered, i.e. M ≪ N, there are an infinite number of solutions x

satisfying y = Ax. When such ill-posed systems are encountered, additional information must be

introduced in the form of regularization terms in order to recover a meaningful estimate x̂ of the true

vector xt. For example, in some applications, it is desirable to solve for the particular solution that

has the minimum energy, i.e.:

$$\underset{x}{\text{minimize}} \;\; \|x\|_{\ell_2} \quad \text{subject to} \;\; Ax = y \tag{3.1}$$

This is a convex optimization problem and has the closed-form solution:

$$\hat{x} = A^H (A A^H)^{-1} y = A^{\dagger} y \tag{3.2}$$

In other applications, when it is known that the measurement vector y is corrupted by additive noise,

it is more appropriate to solve the quadratically constrained quadratic program:

$$\underset{x}{\text{minimize}} \;\; \|x\|_{\ell_2} \quad \text{subject to} \;\; \|Ax - y\|_{\ell_2} \le \eta \tag{3.3}$$


This is equivalent to the Tikhonov regularization problem for some value of λ:

$$\underset{x}{\text{minimize}} \;\; \lambda \|x\|_{\ell_2}^2 + \|Ax - y\|_{\ell_2}^2 \tag{3.4}$$

which has the closed-form solution:

$$\hat{x} = (\lambda I + A^H A)^{-1} A^H y \tag{3.5}$$
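The two closed-form estimates of Eq. 3.2 and Eq. 3.5 are simple to evaluate numerically. The following minimal sketch uses a small random complex matrix as a stand-in for the sensing matrix (the sizes, the seed, and the function names are assumptions made for illustration) and shows that the Tikhonov estimate approaches the minimum-energy solution as λ becomes small.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 20                                   # fewer measurements than unknowns
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
y = rng.standard_normal(M) + 1j * rng.standard_normal(M)


def min_norm_solution(A, y):
    """Minimum-energy solution of Ax = y: A^H (A A^H)^{-1} y (Eq. 3.2)."""
    AH = A.conj().T
    return AH @ np.linalg.solve(A @ AH, y)


def tikhonov_solution(A, y, lam):
    """Tikhonov-regularized estimate: (lam I + A^H A)^{-1} A^H y (Eq. 3.5)."""
    AH = A.conj().T
    return np.linalg.solve(lam * np.eye(A.shape[1]) + AH @ A, AH @ y)


x_mn = min_norm_solution(A, y)
x_tik_small = tikhonov_solution(A, y, lam=1e-6)   # nearly the min-norm solution
x_tik_large = tikhonov_solution(A, y, lam=1.0)    # trades data fit for a smaller norm
```

Increasing λ shrinks the estimate toward zero at the cost of a larger data residual, which is the trade-off that η controls in Eq. 3.3.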

The approaches outlined above are convenient in that they exhibit closed-form solutions.

However, they do not properly consider the prior knowledge available in the NRI sensing problem.

Ideally, the hybrid DBT / NRI system would segment the healthy breast tissue perfectly and would

classify any cancerous tissue as HWC fibroglandular tissue, so that the true contrast variable xt

is non-zero only at the locations of cancerous lesions. Since cancerous lesions make up only a

small percentage of the breast tissues, the contrast variable xt only has a small number of non-zero

elements. One might then consider finding the sparsest solution that satisfies some error constraints

on the measured data, i.e. the solution to the problem:

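$$(P_0) \quad \underset{x}{\text{minimize}} \;\; \|x\|_{\ell_0} \quad \text{subject to} \;\; \|Ax - y\|_{\ell_2} \le \eta \tag{3.6}$$

where ‖x‖ℓ0 is known as the “ℓ0-norm” and simply counts the number of non-zero elements in

the vector x. Unfortunately, Eq. 3.6 is a non-convex, NP-hard optimization problem that, in general, can only be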

solved by exhaustively searching all of the possible sparsity patterns of x. For realistic values of M

and N , this problem simply cannot be solved in a reasonable amount of time.
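The combinatorial cost can be made concrete with a brute-force search over supports. The sketch below is only an illustration for a tiny real-valued problem: the function name, the problem sizes, and the tolerance are assumptions, and the approach is not meant as a practical solver.

```python
import itertools
import math

import numpy as np


def l0_exhaustive(A, y, eta):
    """Return the sparsest x with ||Ax - y||_2 <= eta by testing every support.

    Supports are tried in order of increasing size, so the first feasible
    least-squares fit is a sparsest solution. Also returns the number of
    supports that were tested.
    """
    M, N = A.shape
    tested = 1
    if np.linalg.norm(y) <= eta:                 # the all-zero vector is feasible
        return np.zeros(N), tested
    for s in range(1, N + 1):
        for support in itertools.combinations(range(N), s):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
            tested += 1
            if np.linalg.norm(A[:, cols] @ coef - y) <= eta:
                x = np.zeros(N)
                x[cols] = coef
                return x, tested
    return None, tested


rng = np.random.default_rng(0)
M, N = 6, 12
A = rng.standard_normal((M, N))
x_true = np.zeros(N)
x_true[[3, 9]] = [1.5, -2.0]                     # 2-sparse ground truth
y = A @ x_true

x_hat, tested = l0_exhaustive(A, y, eta=1e-9)

# Number of candidate supports of size S grows as "N choose S":
supports_small = math.comb(12, 2)                # 66
supports_large = math.comb(1000, 10)             # already astronomically large
```

Even for N = 1000 unknowns and only 10 non-zero entries, the number of supports exceeds 10^23, which is why the exhaustive approach is abandoned in favor of the relaxations discussed next.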

3.2 Compressed Sensing

Researchers have encountered sparsity problems similar to Eq. 3.6 for many years now. In

order to overcome the NP-hard nature of these problems, it is common practice to simply replace the

ℓ0-norm with the ℓ1-norm. This can be interpreted as a convex relaxation of the original non-convex

problem, as the ℓ1-norm is the convex envelope of the ℓ0-norm [16]. When this heuristic is applied



to Eq. 3.6, sparse vectors x can be recovered by solving one of three equivalent convex programs:

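$$(P_1) \quad \underset{x}{\text{minimize}} \;\; \|x\|_{\ell_1} \quad \text{subject to} \;\; \|Ax - y\|_{\ell_2} \le \eta \tag{3.7}$$

$$(P_2) \quad \underset{x}{\text{minimize}} \;\; \lambda \|x\|_{\ell_1} + \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \tag{3.8}$$

$$(P_3) \quad \underset{x}{\text{minimize}} \;\; \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \quad \text{subject to} \;\; \|x\|_{\ell_1} \le \tau \tag{3.9}$$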

Historically, the ℓ1-norm has been employed as a heuristic; although the ℓ1-norm tends to produce

sparse solutions, in general there is no guarantee that the recovered vector x̂ represents the desired

sparse vector xt in any meaningful way. Recent breakthroughs in Compressed Sensing (CS) theory

establish conditions under which the ℓ1-norm heuristic recovers a sparse solution that is within a

finite bound, determined only by the matrix A and measurement error η, of the desired sparse vector

[17, 18, 19, 20]. These conditions rely on the notion of a restricted isometry constant: for a given

sparsity level S, the restricted isometry constant δS is defined as the smallest positive constant such

that:

$$(1 - \delta_S)\|x\|_{\ell_2}^2 \le \|Ax\|_{\ell_2}^2 \le (1 + \delta_S)\|x\|_{\ell_2}^2 \tag{3.10}$$

for all x satisfying ‖x‖ℓ0 ≤ S [17]. Using this definition, CS theory establishes that a solution x̂ to

Eq. 3.7 satisfies the following condition:

$$\|\hat{x} - x_t\|_{\ell_2} \le C_S\, \eta \tag{3.11}$$

provided that the matrix A satisfies the restricted isometry property (RIP) [18]:

$$\delta_{3S} + 3\delta_{4S} < 2 \tag{3.12}$$

CS theory further extends this condition to “compressible” vectors, i.e. vectors that are approximately

sparse; in this case, the solution x̂ satisfies:

$$\|\hat{x} - x_t\|_{\ell_2} \le C_{1,S}\, \eta + C_{2,S}\, \frac{\|x_t - x_{t,S}\|_{\ell_1}}{\sqrt{S}} \tag{3.13}$$

where xt,S is the S-sparse vector containing the S largest-magnitude entries of xt, provided that A satisfies the

RIP [18]. Note that both of these conditions hold for Eq. 3.8 and 3.9 for the appropriate values of λ

and τ.
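For very small matrices the restricted isometry constant of Eq. 3.10 can be evaluated directly from its definition. The sketch below is an illustration under stated assumptions (the function name, the matrix sizes, and the column scaling are hypothetical): it checks the extreme singular values of every column subset of size S, which is exactly the exhaustive computation that becomes intractable at realistic sizes.

```python
import itertools

import numpy as np


def restricted_isometry_constant(A, S):
    """Brute-force delta_S: the worst deviation of ||Ax||^2 / ||x||^2 from 1
    over all supports of size S (smaller supports are covered by interlacing)."""
    N = A.shape[1]
    delta = 0.0
    for support in itertools.combinations(range(N), S):
        sv = np.linalg.svd(A[:, list(support)], compute_uv=False)
        delta = max(delta, sv[0] ** 2 - 1.0, 1.0 - sv[-1] ** 2)
    return delta


rng = np.random.default_rng(0)
M, N = 10, 12
A = rng.standard_normal((M, N)) / np.sqrt(M)     # i.i.d. Gaussian, scaled columns

d1 = restricted_isometry_constant(A, 1)
d2 = restricted_isometry_constant(A, 2)
d3 = restricted_isometry_constant(A, 3)
d4 = restricted_isometry_constant(A, 4)

# Left-hand side of Eq. 3.12 for S = 1; the condition holds if this is below 2.
rip_lhs = d3 + 3.0 * d4

# A matrix with orthonormal columns is a perfect isometry on every support.
d_identity = restricted_isometry_constant(np.eye(5), 2)
```

The number of subsets examined grows combinatorially with N and S, so this direct evaluation is feasible only for toy problems.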



While the RIP establishes sufficient conditions for the stable recovery of sparse and

compressible vectors, it does not specify how one designs sensing matrices that satisfy the RIP. It

is difficult to verify whether a given matrix A satisfies the RIP for some sparsity level S, as two

NP hard problems must be solved in order to compute the restricted isometry constants δ3S and

δ4S. Researchers have used tools from random matrix theory in order to find matrices that satisfy

the RIP with overwhelmingly high probability [19, 21, 18]. These matrices can be divided into two

categories: 1) matrices whose elements are i.i.d. sub-gaussian random variables, and 2) matrices

whose rows are randomly drawn from an orthonormal matrix. In the Hybrid DBT / NRI system, and

electromagnetic imaging applications in general, we do not have the flexibility to design the sensing

matrix in this manner in order to guarantee that the RIP is satisfied. In Chapter 4, we introduce an

antenna design method that seeks to improve the imaging capabilities of the sensing matrix A by

maximizing the channel capacity. This design approach is a heuristic for improving the sensing

properties of a matrix in the same way that the ℓ1-norm is a heuristic for generating sparse solution

vectors, and by no means does it guarantee that the RIP is satisfied. For the time being, we proceed

with the ℓ1-norm as a heuristic, and use the conditions defined in CS theory to aid in the measurement

selection process.

3.3 Physicality Constrained Compressed Sensing (PCCS)

Researchers have applied the standard CS programs of Eq. 3.7, 3.8, and 3.9 in many

electromagnetic applications [4, 5, 6]. In electromagnetic-based tomographic imaging, one seeks

to reconstruct the constitutive parameters, dielectric constant and conductivity, of the object under

test. These parameters are bound by the fundamental constraints placed on them by the laws of

physics, namely εr ≥ 1 and σ ≥ 0. The standard CS programs do not consider these fundamental

limitations, so it is very much possible that the optimal solution recovered by these algorithms is

physically unrealizable. To overcome these pitfalls, we propose the following Physicality Constrained

Compressed Sensing (PCCS) optimization programs for electromagnetic-based tomographic imaging



applications:

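$$\begin{aligned} (P_1) \quad \underset{x}{\text{minimize}} \quad & \|x\|_{\ell_1} \\ \text{subject to} \quad & \|Ax - y\|_{\ell_2} \le \eta \\ & \mathrm{Re}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 1 \\ & \mathrm{Im}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 0 \end{aligned} \tag{3.14}$$

$$\begin{aligned} (P_2) \quad \underset{x}{\text{minimize}} \quad & \lambda \|x\|_{\ell_1} + \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \\ \text{subject to} \quad & \mathrm{Re}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 1 \\ & \mathrm{Im}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 0 \end{aligned} \tag{3.15}$$

$$\begin{aligned} (P_3) \quad \underset{x}{\text{minimize}} \quad & \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \\ \text{subject to} \quad & \|x\|_{\ell_1} \le \tau \\ & \mathrm{Re}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 1 \\ & \mathrm{Im}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 0 \end{aligned} \tag{3.16}$$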

where εb is the vector containing the background complex permittivity at each point in the imaging

region. The PCCS programs are convex, despite the odd-looking box constraints on the real

and imaginary components of the complex vector. To make this explicit, the problems can be

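written in an equivalent form in terms of three variables: the contrast x ∈ C^N, the real part of the

permittivity εR ∈ R^N, and the imaginary part of the permittivity εI ∈ R^N. For example, the equivalent

representation of Eq. 3.14 is:

$$\begin{aligned} (P_1) \quad \underset{x,\,\epsilon_R,\,\epsilon_I}{\text{minimize}} \quad & \|x\|_{\ell_1} \\ \text{subject to} \quad & \|Ax - y\|_{\ell_2} \le \eta \\ & \epsilon_R \succeq 1 \\ & \epsilon_I \succeq 0 \\ & x = \mathrm{diag}(\epsilon_b)^{-1}(\epsilon_R + j\epsilon_I - \epsilon_b) \end{aligned} \tag{3.17}$$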



3.3.1 Theoretical Considerations for the PCCS Problems

The motivation for using the PCCS programs over the traditional CS versions is straightforward.

Not only do they produce solutions that are physically realizable, but they should also

produce more accurate solutions than the traditional programs because they consider more prior

knowledge. In this section, we address several theoretical considerations for using the PCCS programs

in electromagnetic imaging problems. Consider the following related physicality constrained

problem. Suppose that we wish to find the sparse solution to the linear equation y = Ax, where

y ∈ R^M, x ∈ R^N, given the constraint that the elements of x are non-negative, i.e. x ⪰ 0.

Traditional CS recovers x using basis pursuit:

$$\text{minimize} \;\; \|x\|_{\ell_1} \quad \text{subject to} \;\; Ax = y \tag{3.18}$$

Clearly, Eq. 3.18 does not enforce the physicality constraint on the variable x. Therefore, it is

possible that basis pursuit will compute a solution that violates the physicality constraint. One

example of this is displayed in Figure 3.1. In this figure, the blue line represents the values of x that

satisfy the equality constraint, the green shaded region represents the values of x that satisfy the

physicality constraint x ⪰ 0, and the red diamond represents the ℓ1-ball of norm ‖x‖ℓ1 = 1. Figures

of this form are often presented in order to describe why the ℓ1-norm produces sparse solutions

as a heuristic. The optimal solution to Eq. 3.18 is the intersection of the ℓ1-ball and the equality

constraint, x = [−1, 0]^T. Clearly, this solution vector is infeasible, as it violates the physicality

constraint x ⪰ 0.

In contrast, PCCS recovers x using the modified basis pursuit problem:

$$\text{minimize} \;\; \|x\|_{\ell_1} \quad \text{subject to} \;\; Ax = y, \;\; x \succeq 0 \tag{3.19}$$

Intuitively, one expects this program to produce sparse solutions, given the ℓ1-norm heuristic applied

to basis pursuit. However, one also expects that the additional physicality constraint should improve

the accuracy of the solution compared to basis pursuit. This can be seen in Figure 3.2, which is

similar to Figure 3.1 except that the ℓ1-ball intersects the equality constraint at the solution to Eq.

3.19, x = [0, 1.5]^T, which is the true sparse solution to this problem.
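The two-dimensional example can be reproduced numerically. The sketch below assumes a single measurement with A = [−1.5, 1] and y = 1.5, which is one line passing through the two points quoted above; the actual values used to draw the figures are not given in the text, so this choice is an assumption. Because both programs are linear programs, an optimum is attained at a solution with at most M non-zero entries, so for this toy problem it is enough to enumerate those candidates. The function names are hypothetical.

```python
import itertools

import numpy as np


def basic_solutions(A, y):
    """All solutions of Ax = y supported on exactly M columns (basic solutions)."""
    M, N = A.shape
    solutions = []
    for support in itertools.combinations(range(N), M):
        cols = list(support)
        sub = A[:, cols]
        if abs(np.linalg.det(sub)) < 1e-12:      # skip singular column subsets
            continue
        x = np.zeros(N)
        x[cols] = np.linalg.solve(sub, y)
        solutions.append(x)
    return solutions


def l1_minimizer(A, y, nonnegative=False):
    """Smallest-l1 basic solution, optionally restricted to x >= 0."""
    candidates = basic_solutions(A, y)
    if nonnegative:
        candidates = [x for x in candidates if np.all(x >= -1e-12)]
    return min(candidates, key=lambda x: np.abs(x).sum())


A = np.array([[-1.5, 1.0]])   # assumed line through [-1, 0] and [0, 1.5]
y = np.array([1.5])

x_bp = l1_minimizer(A, y)                        # basis pursuit, Eq. 3.18
x_pccs = l1_minimizer(A, y, nonnegative=True)    # constrained version, Eq. 3.19
```

Basis pursuit selects the point with the smaller ℓ1-norm even though it has a negative entry, while the constrained program discards it and returns the non-negative sparse solution.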



Figure 3.1: Depiction of the basis pursuit problem Eq. 3.18.

Figure 3.2: Depiction of the PCCS basis pursuit problem of Eq. 3.19.

Additional insight for the PCCS programs can be obtained from a statistical perspective.

Consider a scenario where noisy measurements y = Ax + n of the vector x ∈ R^N are obtained,

where the elements of n ∈ R^M are i.i.d. Gaussian with zero mean and variance σ². If the elements

of x are i.i.d. Laplacian random variables with zero mean and scale parameter λ, then the maximum

a-posteriori (MAP) estimation technique computes the value of x that maximizes:

$$\begin{aligned} x_{\mathrm{MAP}} &= \arg\max_{x} \; p(x|y) = \arg\max_{x} \; \frac{p(y|x)\,p(x)}{p(y)} \\ &= \arg\max_{x} \; p(y)^{-1} (2\pi\sigma^2)^{-M/2}\, e^{-\frac{1}{2\sigma^2}\|y - Ax\|_{\ell_2}^2} \left(\frac{\lambda}{2}\right)^{N} e^{-\lambda\|x\|_{\ell_1}} \end{aligned} \tag{3.20}$$

Maximizing instead over log p(x|y) leads to the following optimization program for xMAP:

$$x_{\mathrm{MAP}} = \arg\min_{x} \; \frac{1}{2\sigma^2}\|y - Ax\|_{\ell_2}^2 + \lambda\|x\|_{\ell_1} \tag{3.21}$$

which has the same form as the basis pursuit denoising problem of Eq. 3.8. Given the equivalence

of Eq. 3.7 - 3.9 for appropriate values of η, λ, and τ , we can say that the traditional CS programs

compute the MAP estimate of the unknown vector x when its elements are distributed as i.i.d.

Laplacian random variables.

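Consider now the same scenario, except that the elements of x are i.i.d. exponential

random variables with scale parameter λ. In this case, the MAP estimation technique selects the

value of x that maximizes:

$$\begin{aligned} x_{\mathrm{MAP}} &= \arg\max_{x} \; p(x|y) = \arg\max_{x} \; \frac{p(y|x)\,p(x)}{p(y)} \\ &= \arg\max_{x \succeq 0} \; p(y)^{-1} (2\pi\sigma^2)^{-M/2}\, e^{-\frac{1}{2\sigma^2}\|y - Ax\|_{\ell_2}^2}\, \lambda^{N} e^{-\lambda \mathbf{1}^T x} \end{aligned} \tag{3.22}$$

Once again, maximizing over log p(x|y) leads to an alternative expression for xMAP:

$$\begin{aligned} x_{\mathrm{MAP}} &= \arg\min_{x \succeq 0} \; \frac{1}{2\sigma^2}\|y - Ax\|_{\ell_2}^2 + \lambda \mathbf{1}^T x \\ &= \arg\min_{x \succeq 0} \; \frac{1}{2\sigma^2}\|y - Ax\|_{\ell_2}^2 + \lambda\|x\|_{\ell_1} \end{aligned} \tag{3.23}$$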

which has a form similar to Eq. 3.15. Indeed, Eq. 3.15 reduces to Eq. 3.23 for electromagnetic

imaging problems in a free-space background when the scattering elements have negligible conductivity,

so that Im(ε) = 0. This result obviously does not hold for all values of εb, but it does provide

a statistical motivation for enforcing the physicality constraints. Considering the ℓ1-norm regularizer

as a term from the prior probability distribution on x, we find that the traditional CS programs assign

non-zero probability to values that are not physically realizable. Using the PCCS programs to enforce

physicality of the solution resolves this issue.

Our final example provides firm theoretical justification for the reconstruction capabilities

of the PCCS problems in the context of a related CS problem. Consider a scenario where noiseless



measurements y = Ax of the sparse vector x ∈ R^N are obtained. Ideally, one would solve the

following ℓ0-norm minimization problem to recover x:

$$(P_0) \quad \underset{x}{\text{minimize}} \;\; \|x\|_{\ell_0} \quad \text{subject to} \;\; Ax = y \tag{3.24}$$

Suppose that we have obtained a candidate solution z to Eq. 3.24 with ‖z‖ℓ0 = S. It is easy to show

that z is the unique minimizer of Eq. 3.24 if and only if δ2S < 1, that is, there are no 2S-sparse

vectors in the nullspace of A. To prove this, suppose that there exists another solution w satisfying

‖w‖ℓ0 = S and Aw = y. It follows from this that the vector z − w lies within the nullspace of

A, since Aw = Az → A(z − w) = 0. Since ‖z − w‖ℓ0 ≤ 2S, a non-trivial solution z ≠ w can

exist only if δ2S ≥ 1. Therefore, if δ2S < 1, then z is guaranteed to be the unique minimizer

of Eq. 3.24.

Consider now the problem in which the elements of x are constrained to be non-negative.

In this case, the sparsest vector can be recovered by solving the following ℓ0-norm minimization

problem:

$$(P_0) \quad \underset{x}{\text{minimize}} \;\; \|x\|_{\ell_0} \quad \text{subject to} \;\; Ax = y \tag{3.25}$$

$$x \succeq 0 \tag{3.26}$$

How can we guarantee that a candidate solution z to Eq. 3.25 with ‖z‖ℓ0 = S is the unique solution?

The requirement that δ2S < 1 derived for Eq. 3.24 is sufficient, but not necessary, for guaranteeing

unique solutions to Eq. 3.25. To see this, once again suppose that there exists another S-sparse

solution w to this problem, which leads to the necessary condition that A(z − w) = 0 as before. The

difference this time, however, is that z − w is restricted in its sign pattern. The clearest way to see this

is to analyze the case S = Smin, where Smin is the smallest integer such that the condition δ2S < 1

is violated. This implies that z and w are supported on disjoint sets, i.e. w_i z_i = 0 ∀i = 1, . . . , N.

Since z and w must both be physically realizable, i.e. z ⪰ 0 and w ⪰ 0, in order for them to be

solutions of Eq. 3.25, z − w must have exactly S positive values and S negative values. Therefore,

if the nullspace of A does not have any vectors with exactly S positive elements and S negative

elements, then z must be the unique solution to Eq. 3.25.

This result can easily be generalized to higher sparsity levels, where S > Smin. For these

sparsity levels, we cannot assume that w and z are necessarily disjoint. Indeed, if S = Smin + L for

L > 0, there exist nontrivial vectors w − z in the nullspace of A satisfying ‖w − z‖ℓ0 = 2Smin ≤ 2S = 2Smin + 2L.

We now consider the signs of the K ≤ L values in z − w where z and w overlap.

If zk > wk for all k in the overlapping set, then w − z has S negative values and S − L positive

values. However, if zk ≤ wk for any k, then w − z has fewer than S negative values. Therefore, the

sufficient condition for Eq. 3.25 to produce unique S-sparse solutions is that all vectors in the nullspace of

A with at least 2S elements have at least S + 1 negative elements.

At this point, the following question has not been answered: how do we solve the PCCS

programs defined in Eq. 3.14, 3.15, and 3.16? Indeed, the physicality constraints prevent traditional

CS solvers from being used to solve the PCCS programs. For small-scale problems, the PCCS

programs can of course be solved using a general purpose solver such as CVX [22, 23], which is an

interpreter for an interior point method solver chosen by the user. For large scale problems, however,

it has been shown that specialized algorithms outperform general purpose solvers in traditional CS

applications. To this end, we describe three efficient algorithms for solving the PCCS programs in

the following sections. Each of these algorithms has its own set of strengths and weaknesses that

make them more or less appropriate to use depending upon the specific conditions of the problem

being solved.

3.4 Solving the PCCS Programs using Nesterov’s Method

In this section, we describe a first-order algorithm for solving Eq. 3.15. This algorithm utilizes

Nesterov’s accelerated gradient method for non-smooth convex optimization [24]. Nesterov’s method

has already been applied to the standard CS programs of Eq. 3.7 and 3.8, in the form of the NESTA

toolbox [25]. The following sub-sections describe how the flexibility of Nesterov’s algorithm allowed

us to naturally incorporate the additional constraints on the contrast variables into the NESTA

algorithm.

3.4.1 Nesterov’s Accelerated Gradient Method for Non-smooth Convex Optimization

Nesterov’s accelerated gradient method is an optimal first-order method for solving smooth

convex optimization problems [26]. Suppose that we have a smooth convex function, which we seek



to minimize over some convex set, i.e.:

$$\underset{x}{\text{minimize}} \;\; f(x) \quad \text{subject to} \;\; x \in Q_p \tag{3.27}$$

If the function is continuously differentiable over the feasible set Qp, and its gradient is Lipschitz

continuous with constant L, that is, ∇f(x) obeys:

$$\|\nabla f(x) - \nabla f(y)\|_{\ell_2} \le L\, \|x - y\|_{\ell_2} \tag{3.28}$$

then Nesterov’s method can be used to solve Eq. 3.27.

Nesterov’s method is summarized in Alg. 1. The algorithm involves iteratively computing

three vectors, x^(k), y^(k), and z^(k). The y^(k) vector is updated by taking a projected gradient step away

from the current iterate x^(k). The z^(k) vector is updated by minimizing over x a linear combination

of p_p(x) and the inner product of x with the weighted sum of the gradients evaluated at all points x^(i) up to

and including the current iteration k, where p_p(x) is a proximal function for the convex set Qp, i.e.

any function that is continuous and strongly convex on Qp [27]. It is assumed that p_p(x) vanishes at

some point xc in the set Qp and that it satisfies the following condition for some value σp:

$$p_p(x) \ge \frac{\sigma_p}{2}\|x - x_c\|_{\ell_2}^2 \tag{3.29}$$

The next iterate x^(k+1) is computed by averaging the latest iterates y^(k) and z^(k). Nesterov proved that

the weighting sequences α^(k) = (k + 1)/2 for the gradient averaging and τ^(k) = 2/(k + 3) for the x^(k)

updates are optimal and guarantee a convergence rate of [26]:

$$f(y^{(k)}) - f(x^*) \le \frac{4 L\, p_p(x^*)}{(k + 1)^2 \sigma_p} \tag{3.30}$$

In a more recent work [24], Nesterov generalized his accelerated gradient method for

non-smooth convex functions. Suppose that the non-smooth function f(x) can be written in the

following form:

$$f(x) = \underset{u}{\text{maximize}} \;\; \langle u, Wx \rangle \quad \text{subject to} \;\; u \in Q_d \tag{3.31}$$



Algorithm 1: Overview of Nesterov’s accelerated gradient method.

Given x^(0)

for k = 0, 1, 2, . . . do

    Compute ∇f(x^(k))

    Compute y^(k):

        y^(k) = argmin_{x ∈ Qp} (L/2)‖x − x^(k)‖²ℓ2 + ⟨∇f(x^(k)), x − x^(k)⟩

    Compute z^(k), with α_i = (i + 1)/2:

        z^(k) = argmin_{x ∈ Qp} (L/σp) p_p(x) + Σ_{i=0}^{k} α_i ⟨∇f(x^(i)), x − x^(i)⟩

    Compute x^(k+1), with τ^(k) = 2/(k + 3):

        x^(k+1) = τ^(k) z^(k) + (1 − τ^(k)) y^(k)

end

where Qd is a convex set. Nesterov proposed that f(x) be replaced by the following smooth

approximation:

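$$f_\mu(x) = \underset{u}{\text{maximize}} \;\; \langle u, Wx \rangle - \mu\, p_d(u) \quad \text{subject to} \;\; u \in Q_d \tag{3.32}$$

where p_d(u) is a proximal function for the convex set Qd with convexity parameter σd. Nesterov

showed that this smoothed approximation is continuously differentiable with its gradient and Lipschitz

constant satisfying:

$$\nabla f_\mu(x) = W^H u_\mu(x) \tag{3.33}$$

$$L = \frac{1}{\mu \sigma_d}\|W\|_{\ell_2}^2 \tag{3.34}$$

where u_µ(x) denotes the maximizer in Eq. 3.32.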

Nesterov’s method can now be applied to fµ(x) and will achieve the convergence rate of Eq. 3.30.
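The structure of Alg. 1 is easiest to see in code. The following is a minimal sketch, not the NESTA implementation: it assumes the prox-function p_p(x) = ½‖x − x^(0)‖² (so σp = 1), in which case both the y^(k) and z^(k) updates reduce to Euclidean projections onto Qp, and it applies the iteration to a small non-negative least-squares problem chosen only for illustration. The function and variable names are hypothetical.

```python
import numpy as np


def nesterov_method(grad, project, L, x0, iters):
    """Alg. 1 with p_p(x) = 0.5 * ||x - x0||^2, so that sigma_p = 1.

    grad    : callable returning the gradient of the smooth objective
    project : callable returning the Euclidean projection onto Q_p
    L       : Lipschitz constant of the gradient
    """
    x = x0.copy()
    y = x0.copy()
    weighted_grad_sum = np.zeros_like(x0)
    for k in range(iters):
        g = grad(x)
        y = project(x - g / L)                      # projected gradient step
        weighted_grad_sum += 0.5 * (k + 1) * g      # alpha_k = (k + 1) / 2
        z = project(x0 - weighted_grad_sum / L)     # uses all past gradients
        tau = 2.0 / (k + 3)
        x = tau * z + (1.0 - tau) * y               # next evaluation point
    return y


# Illustrative problem: minimize 0.5 * ||Ax - b||^2 subject to x >= 0.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
x_true = np.abs(rng.standard_normal(10))
x_true[::3] = 0.0                                   # some entries on the boundary
b = A @ x_true                                      # consistent data, so f(x_true) = 0


def objective(x):
    return 0.5 * np.linalg.norm(A @ x - b) ** 2


L = np.linalg.norm(A, 2) ** 2                       # Lipschitz constant of the gradient
iters = 500
y_hat = nesterov_method(
    grad=lambda x: A.T @ (A @ x - b),
    project=lambda v: np.maximum(v, 0.0),
    L=L,
    x0=np.zeros(10),
    iters=iters,
)

# Right-hand side of Eq. 3.30 at the last iterate (k + 1 = iters, x0 = 0).
bound = 4.0 * L * 0.5 * np.linalg.norm(x_true) ** 2 / iters**2
```

With this choice of prox-function the only problem-specific ingredients are the gradient, the Lipschitz constant, and the projection onto the feasible set.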



3.4.2 Nesterov’s Method for Traditional CS Problems

Recently, Becker et al. applied Nesterov’s method to solve the two standard CS programs

of Eq. 3.7 and 3.8 [25]. In their work, they proposed the following smooth approximation to the

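ℓ1-norm:

$$f_\mu(x) = \underset{u}{\text{maximize}} \;\; \langle u, x \rangle - \frac{\mu}{2}\|u\|_{\ell_2}^2 \quad \text{subject to} \;\; \|u\|_{\ell_\infty} \le 1 \tag{3.35}$$

In this case, fµ(x) is the Huber function, whose gradient and Lipschitz constant satisfy:

$$\frac{\partial}{\partial x_i} f_\mu(x) = \begin{cases} \frac{1}{\mu} x_i, & |x_i| < \mu \\ \mathrm{sgn}(x_i), & \text{otherwise} \end{cases} \tag{3.36}$$

$$L = \frac{1}{\mu} \tag{3.37}$$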
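The smoothed ℓ1-norm and its gradient take only a few lines to evaluate. The sketch below is an illustration with hypothetical function names; it writes sgn(x_i) as x_i/|x_i| so that the same expression also covers complex entries, and it uses an arbitrary test vector.

```python
import numpy as np


def huber(x, mu):
    """Smoothed l1-norm of Eq. 3.35: quadratic near zero, linear elsewhere."""
    a = np.abs(x)
    return float(np.sum(np.where(a < mu, a**2 / (2.0 * mu), a - mu / 2.0)))


def huber_grad(x, mu):
    """Gradient of Eq. 3.36: x_i / mu if |x_i| < mu, and x_i / |x_i| otherwise."""
    return x / np.maximum(np.abs(x), mu)


x_demo = np.array([0.03, -0.5, 0.2, -0.01])
mu = 0.1
value = huber(x_demo, mu)
gradient = huber_grad(x_demo, mu)                 # [0.3, -1.0, 1.0, -0.1]
l1_norm = np.abs(x_demo).sum()
```

The smoothed value never exceeds the true ℓ1-norm and falls short of it by at most Nµ/2, so µ controls the trade-off between approximation accuracy and the Lipschitz constant 1/µ.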

Nesterov’s algorithm can now be applied to the smoothed problems:

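$$(P_1) \quad \underset{x}{\text{minimize}} \;\; f_\mu(x) \quad \text{subject to} \;\; \|Ax - y\|_{\ell_2} \le \eta \tag{3.38}$$

$$(P_2) \quad \underset{x}{\text{minimize}} \;\; \lambda f_\mu(x) + \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \tag{3.39}$$

It is straightforward to solve Eq. 3.39 using Nesterov’s method, as it is an unconstrained convex

program with Lipschitz constant L = λ/µ + ‖A‖²ℓ2. It is more difficult to solve Eq. 3.38 for a general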

matrix A. In their preliminary work, Becker et al. focused on the particular case where the rows

of A are orthonormal, which leads to closed-form solutions for the y^(k) and z^(k) updates when the

prox-function is chosen to be:

$$p_p(x) = \frac{1}{2}\|x - x_0\|_{\ell_2}^2 \tag{3.40}$$

This solution is not particularly useful in electromagnetic imaging applications, since we cannot

arbitrarily design our sensing matrix to have orthonormal rows. In later work [28], Becker expanded

the NESTA algorithm to support any sensing matrix A using the proximal operator for the quadratic

constraint. In general, the proximal operator for a convex set Qp is defined as:

$$P_{Q_p}(z) = \underset{x}{\arg\min} \;\; \frac{1}{2}\|x - z\|_{\ell_2}^2 \quad \text{subject to} \;\; x \in Q_p \tag{3.41}$$



To see how the proximal operator was introduced to Nesterov’s method, the update steps for y^(k)

and z^(k) defined in Alg. 1 can be rewritten as follows, assuming that the prox-function for the z^(k)

update takes the form of Eq. 3.40:

$$y^{(k)} = \mathrm{prox}_{I_{Q_p}}\!\left(x^{(k)} - \frac{1}{L}\nabla f(x^{(k)})\right) \tag{3.42}$$

$$z^{(k)} = \mathrm{prox}_{I_{Q_p}}\!\left(x_0 - \frac{1}{L}\sum_{i=0}^{k} \alpha_i \nabla f(x^{(i)})\right) \tag{3.43}$$

That is, each iteration of Nesterov’s method requires two first-order steps and two calls to the

proximal operator of the feasible set. Therefore, for a general sensing matrix A, the following convex

optimization problem is solved at each step of NESTA:

$$\underset{x}{\text{minimize}} \;\; \|x - z\|_{\ell_2}^2 \quad \text{subject to} \;\; \|Ax - y\|_{\ell_2} \le \eta \tag{3.44}$$


Eq. 3.44 is a quadratically constrained quadratic program, which can be solved using many different

algorithms [29]. In [28], Becker developed an efficient method for solving this problem using the

singular value decomposition A = UΣV^H. In this case, we can exploit the fact that the ℓ2-norm of

a vector is preserved when that vector is multiplied by a unitary matrix in order to simplify

the problem. Specifically, by introducing the change of variables z̃ = V^H z, x̃ = V^H x, ỹ = U^H y, the

problem can be re-expressed as:

$$\underset{\tilde{x}}{\text{minimize}} \;\; \|\tilde{x} - \tilde{z}\|_{\ell_2}^2 \quad \text{subject to} \;\; \|\Sigma\tilde{x} - \tilde{y}\|_{\ell_2} \le \eta \tag{3.45}$$

The stationarity condition for this problem dictates that the optimal Lagrange multiplier λ∗ yields the

following optimal value x̃∗:

$$\tilde{x}^* = (I + \lambda^* \Sigma^T \Sigma)^{-1}(\tilde{z} + \lambda^* \Sigma^T \tilde{y}) \tag{3.46}$$

If the initial point z̃ is feasible (i.e. ‖Σz̃ − ỹ‖²ℓ2 ≤ η²), then λ∗ is necessarily zero and so the problem

is already solved. If the initial point z̃ is infeasible, then we must find the value of λ such that:

$$\|\Sigma (I + \lambda \Sigma^T \Sigma)^{-1}(\tilde{z} + \lambda \Sigma^T \tilde{y}) - \tilde{y}\|_{\ell_2}^2 = \eta^2 \tag{3.47}$$

NESTA solves this problem using an efficient Newton method (note that Σ(I + λΣ^TΣ)^(−1) is a

diagonal matrix).
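The SVD-based projection can be sketched as follows. This is an illustration under stated assumptions rather than the NESTA code: it assumes M ≤ N with A of full row rank, and it replaces the Newton iteration with a plain bisection on λ, which is slower but simpler to verify. It relies on the fact that, in the rotated coordinates, the residual of the candidate solution has entries (s_i z̃_i − ỹ_i)/(1 + λ s_i²), so its norm decreases monotonically with λ. The function name and test sizes are hypothetical.

```python
import numpy as np


def project_onto_residual_ball(z, A, y, eta, bisection_steps=200):
    """Euclidean projection of z onto {x : ||Ax - y||_2 <= eta} via the SVD of A.

    Assumes M <= N and full row rank, so every singular value is positive.
    """
    M = A.shape[0]
    U, s, Vh = np.linalg.svd(A, full_matrices=True)   # A = U diag(s) Vh[:M, :]
    z_t = Vh @ z                                      # rotated starting point
    y_t = U.conj().T @ y                              # rotated measurements
    r0 = s * z_t[:M] - y_t                            # residual at lambda = 0

    def residual_norm(lam):
        return np.linalg.norm(r0 / (1.0 + lam * s**2))

    if residual_norm(0.0) <= eta:                     # already feasible
        return z.copy()

    lo, hi = 0.0, 1.0
    while residual_norm(hi) > eta:                    # bracket the root
        hi *= 2.0
    for _ in range(bisection_steps):                  # bisection on lambda
        mid = 0.5 * (lo + hi)
        if residual_norm(mid) > eta:
            lo = mid
        else:
            hi = mid
    lam = hi

    x_t = z_t.copy()                                  # components beyond M are unchanged
    x_t[:M] = (z_t[:M] + lam * s * y_t) / (1.0 + lam * s**2)   # Eq. 3.46, entrywise
    return Vh.conj().T @ x_t                          # rotate back


rng = np.random.default_rng(2)
A = rng.standard_normal((5, 12))
y = rng.standard_normal(5)
z = rng.standard_normal(12)
eta = 0.1

x_proj = project_onto_residual_ball(z, A, y, eta)
x_same = project_onto_residual_ball(z, A, y, eta=1e3)  # z is feasible here
```

When z is infeasible the projection lands on the boundary of the constraint set, and the displacement z − x is a positive multiple of A^H(Ax − y), which is the stationarity condition behind Eq. 3.46.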



3.4.3 Nesterov’s Method for PCCS Problems

The update steps of Eq. 3.42 and 3.43 show that Nesterov’s method can be used to solve

any smooth convex optimization problem, provided that the gradient of the objective function, the

Lipschitz constant, and the proximal operator for the feasible set are known. It is straightforward

then to extend NESTA to the PCCS programs of Eq. 3.14 and 3.15: we simply need to solve for

the respective proximal operators. The proximal operator for the physicality set of Eq. 3.15 is the

simpler of the two, and can be written as the solution to the convex problem:

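$$\begin{aligned} \underset{x}{\text{minimize}} \quad & \|x - z\|_{\ell_2}^2 \\ \text{subject to} \quad & \mathrm{Re}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 1 \\ & \mathrm{Im}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 0 \end{aligned} \tag{3.48}$$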


This problem is separable in the components of x, and so it simplifies to N scalar optimization

problems. The scalar problem can be further simplified by expressing the contrast variables in terms

of the complex permittivity, i.e. x = (εx − εb)/εb and z = (εz − εb)/εb. In this case, the problem

can be rewritten as follows:

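$$\begin{aligned} \underset{\epsilon_x}{\text{minimize}} \quad & \|\epsilon_x - \epsilon_z\|_{\ell_2}^2 \\ \text{subject to} \quad & \mathrm{Re}(\epsilon_x) \succeq 1 \\ & \mathrm{Im}(\epsilon_x) \succeq 0 \end{aligned} \tag{3.49}$$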

Note that the εb term in the denominator of the contrast variable does not affect the projection problem,

since it appears in both x and z. Eq. 3.49 can be further separated into two independent problems for

the real and imaginary components of the complex permittivity; however, this formulation is omitted

for brevity. From this separation, it follows that the proximal operator for the physicality

set can be expressed in closed form as:

$$P_\epsilon(\epsilon_x) = \max(\mathrm{Re}(\epsilon_x), 1) + j\,\max(\mathrm{Im}(\epsilon_x), 0) \tag{3.50}$$

$$P_{Q_p}(x) = \frac{P_\epsilon(\epsilon_b x + \epsilon_b) - \epsilon_b}{\epsilon_b} \tag{3.51}$$
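The closed-form projection amounts to clipping the real and imaginary parts of the permittivity and mapping the result back to the contrast variable. The sketch below illustrates this for a three-voxel example; the function names and the background permittivity values are hypothetical and chosen only so that each type of constraint violation appears once.

```python
import numpy as np


def project_permittivity(eps):
    """Eq. 3.50: clip the real part at 1 and the imaginary part at 0."""
    return np.maximum(eps.real, 1.0) + 1j * np.maximum(eps.imag, 0.0)


def project_contrast(x, eps_b):
    """Eq. 3.51: projection of the contrast x onto the physicality set."""
    eps = eps_b * x + eps_b                      # permittivity implied by x
    return (project_permittivity(eps) - eps_b) / eps_b


eps_b = np.array([4.0 + 0.5j, 9.0 + 1.0j, 2.0 + 0.1j])   # illustrative background
x = np.array([-0.9 + 0.0j, 0.2 - 0.3j, 0.0 + 0.0j])      # first two entries are unphysical

x_proj = project_contrast(x, eps_b)
eps_proj = eps_b * x_proj + eps_b                # permittivity after projection
```

The first voxel violates the bound on the real part, the second violates the bound on the imaginary part, and the third is already feasible and is left unchanged. Each voxel is handled independently, so the cost of the projection is linear in N.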

The proximal operator for the joint quadratic constraint and physicality set of Eq. 3.14

is more difficult to compute. Like the proximal operator for the quadratic constraint by itself, this

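operator does not have a closed-form solution, and instead must be written as the solution to the

convex problem:

$$\begin{aligned} \underset{x}{\text{minimize}} \quad & \|x - z\|_{\ell_2}^2 \\ \text{subject to} \quad & \|Ax - y\|_{\ell_2} \le \eta \\ & \mathrm{Re}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 1 \\ & \mathrm{Im}(\mathrm{diag}(\epsilon_b)x + \epsilon_b) \succeq 0 \end{aligned} \tag{3.52}$$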




Eq. 3.52 cannot be solved in an efficient manner using the singular value decomposition like Eq.

3.44. Indeed, making the change of variables to simplify the quadratic constraint will increase

the complexity of the physicality box constraints by removing the separability. This problem

can be solved using a general purpose solver such as CVX for small-scale problems, or by using variable-splitting techniques such as the Alternating Direction Method of Multipliers (ADMM), which is discussed in the next section, for large-scale problems. Doing so, however, discards the computational benefits gained by using Nesterov’s algorithm in the first place. A similar argument can be made for the projection problem of Eq. 3.16, which can be expressed as the following convex optimization problem:

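$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \|x - z\|_{\ell_2}^2 \\
\text{subject to} \quad & \|x\|_{\ell_1} \le \tau \\
& \operatorname{Re}(\operatorname{diag}(\varepsilon_b)\,x + \varepsilon_b) \succeq 1 \\
& \operatorname{Im}(\operatorname{diag}(\varepsilon_b)\,x + \varepsilon_b) \succeq 0
\end{aligned}
\tag{3.53}
$$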


Therefore, although it can be done, Nesterov’s method is not recommended for solving the PCCS

programs of Eq. 3.14 and Eq. 3.16.

3.5 Solving the PCCS Programs using the Alternating Direction Method of Multipliers

In this section, we describe three algorithms for solving Eq. 3.14, 3.15, and 3.16 using

the Alternating Direction Method of Multipliers (ADMM). The ADMM is an attractive choice for

solving these problems due to its simplicity, convergence guarantees, and its ability to be parallelized.

The ADMM also addresses the shortcomings of Nesterov’s algorithm when applied to Eq. 3.14 and 3.16. Instead of requiring the latest update at each iteration to lie within the feasible set, as Nesterov’s method does, the ADMM drives the iterates toward the optimal solution while allowing the intermediate iterates to be infeasible. As a result, the ADMM completely avoids having to solve the complicated proximal operators of Eq. 3.52 and 3.53, thereby reducing the computational complexity of each iteration compared to Nesterov’s method. This result will be clear by the end of this section.

3.5.1 The Alternating Direction Method of Multipliers (ADMM)

The Alternating Direction Method of Multipliers (ADMM) [30] is a simple, yet elegant

algorithm for solving convex optimization problems with separable objective functions. Suppose that

an optimization problem can be written in the following form:

$$
\begin{aligned}
\underset{x,\,z}{\text{minimize}} \quad & f(x) + g(z) \\
\text{subject to} \quad & Ax + Bz = c
\end{aligned}
\tag{3.54}
$$

This is a simple convex optimization problem with linear equality constraints. The Augmented

Lagrangian Method [29, 30], also known as the Method of Multipliers, solves this problem by

forming the Augmented Lagrangian:

$$
\begin{aligned}
L_A(x, z, \nu) = {} & f(x) + g(z) + (1/2)\,\nu^H (Ax + Bz - c) \\
& + (1/2)\,(Ax + Bz - c)^H \nu + (\rho/2)\,\|Ax + Bz - c\|_{\ell_2}^2
\end{aligned}
\tag{3.55}
$$

for some positive constant ρ. The optimal solution is then found by solving the unconstrained

problem:

$$
\underset{\nu}{\text{maximize}} \;\; \underset{x,\,z}{\text{minimize}} \quad L_A(x, z, \nu) \tag{3.56}
$$

The Augmented Lagrangian Method iteratively solves Eq. 3.56: at step k, $L_A(x, z, \nu^{(k)})$ is minimized w.r.t. x and z for the latest value of the dual variable $\nu^{(k)}$; following this, the dual variable ν is updated using the steepest ascent method. ADMM takes this process one step further by exploiting the separability in the objective function: at step k, $L_A(x, z^{(k)}, \nu^{(k)})$ is minimized over x to compute $x^{(k+1)}$, $L_A(x^{(k+1)}, z, \nu^{(k)})$ is minimized over z to compute $z^{(k+1)}$, and then the dual variable ν is updated using the steepest ascent method to compute $\nu^{(k+1)} = \nu^{(k)} + \rho\,(Ax^{(k+1)} + Bz^{(k+1)} - c)$.

In many cases, it is convenient to use the scaled form of ADMM, which includes the dual

variable terms directly within the quadratic [30]. Introducing the scaled dual variable u = ν/ρ

leads to the ADMM formulation of Alg. 2. This form is particularly useful when A and B are multiples of the identity matrix, in which case x and z are updated using the proximal operators of

the functions f(x) and g(z) respectively. As was seen in the previous section, many functions have

proximal operators that are inexpensive to compute, and so we are inclined to use those operators in

the ADMM whenever it is appropriate.

Algorithm 2: Overview of scaled ADMM.

Given $x^{(0)}, z^{(0)}, u^{(0)}$

for $k = 0, 1, 2, \ldots$ do

    Compute $x^{(k+1)}$:

    $$x^{(k+1)} = \arg\min_x \; f(x) + (\rho/2)\,\|Ax + Bz^{(k)} - c + u^{(k)}\|_{\ell_2}^2$$

    Compute $z^{(k+1)}$:

    $$z^{(k+1)} = \arg\min_z \; g(z) + (\rho/2)\,\|Ax^{(k+1)} + Bz - c + u^{(k)}\|_{\ell_2}^2$$

    Compute $u^{(k+1)}$:

    $$u^{(k+1)} = u^{(k)} + Ax^{(k+1)} + Bz^{(k+1)} - c$$

end

3.5.2 ADMM for Traditional CS Problems

In this section, we describe how the ADMM can be used to solve the three traditional CS

problems of Eq. 3.7, 3.8, and 3.9. Since ADMM is used to solve Eq. 3.8 more often than it is used to

solve Eq. 3.7 and 3.9, we will begin with it. We seek the optimal solution to the problem:

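$$
\begin{aligned}
\underset{x,\,z}{\text{minimize}} \quad & \lambda\|x\|_{\ell_1} + (1/2)\,\|Az - y\|_{\ell_2}^2 \\
\text{subject to} \quad & x = z
\end{aligned}
\tag{3.57}
$$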

which is obviously equivalent to Eq. 3.8 due to the linear equality constraint. This process of adding

additional variables and equality constraints to the problem is known as variable splitting in the

literature. Since this equality constraint only contains identity matrices, the x and z update steps can

both be expressed in terms of proximal operators. The Augmented Lagrangian for this problem, in

terms of the scaled dual variable u, is:

$$
L_A(x, z, u) = \lambda\|x\|_{\ell_1} + (1/2)\,\|Az - y\|_{\ell_2}^2 + (\rho/2)\,\|x - z + u\|_{\ell_2}^2 \tag{3.58}
$$


The x variable is updated by solving the unconstrained problem:

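$$
\begin{aligned}
x^{(k+1)} &= \arg\min_x \; \lambda\|x\|_{\ell_1} + (\rho/2)\,\|x - z^{(k)} + u^{(k)}\|_{\ell_2}^2 \\
&= \operatorname{prox}_{(\lambda/\rho)\|\cdot\|_{\ell_1}}\!\left(z^{(k)} - u^{(k)}\right) \\
&= S_{\lambda/\rho}\!\left(z^{(k)} - u^{(k)}\right)
\end{aligned}
\tag{3.59}
$$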

where S executes the soft-thresholding operator on each element of the input vector; for a scalar x,

the soft-threshold is defined as [30, 31]:

$$
S_\lambda(x) =
\begin{cases}
\operatorname{sign}(x)\,(|x| - \lambda), & |x| > \lambda \\
0, & |x| \le \lambda
\end{cases}
\tag{3.60}
$$

The z variable is updated by solving the unconstrained problem:

$$
\begin{aligned}
z^{(k+1)} &= \arg\min_z \; (1/2)\,\|Az - y\|_{\ell_2}^2 + (\rho/2)\,\|x^{(k+1)} - z + u^{(k)}\|_{\ell_2}^2 \\
&= \left(\rho I + A^H A\right)^{-1}\left(A^H y + \rho x^{(k+1)} + \rho u^{(k)}\right)
\end{aligned}
\tag{3.61}
$$

In practice, the closed-form solution of Eq. 3.61 can be executed efficiently using the matrix inversion lemma or by computing the singular value decomposition of A. The final step, updating the dual variable u, is trivial:

u(k+1) = u(k) + x(k+1) − z(k+1) (3.62)
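The three update steps of Eq. 3.59, 3.61, and 3.62 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in this thesis: the function names, the fixed iteration count, and the dense linear solve (used here in place of the matrix inversion lemma or the singular value decomposition) are choices made for brevity.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Element-wise soft-thresholding of Eq. 3.60 (real or complex input)."""
    mag = np.abs(v)
    # The inner maximum avoids division by zero; the outer one zeroes
    # every entry whose magnitude does not exceed kappa.
    return np.maximum(1.0 - kappa / np.maximum(mag, 1e-300), 0.0) * v


def admm_bpdn(A, y, lam, rho=1.0, iters=500):
    """Scaled ADMM for lam*||x||_1 + 0.5*||Ax - y||^2 (hypothetical helper)."""
    n = A.shape[1]
    AH = A.conj().T
    K = rho * np.eye(n) + AH @ A      # system matrix of Eq. 3.61
    AHy = AH @ y
    x = np.zeros(n, dtype=complex)
    z = x.copy()
    u = x.copy()
    for _ in range(iters):
        x = soft_threshold(z - u, lam / rho)           # Eq. 3.59
        z = np.linalg.solve(K, AHy + rho * (x + u))    # Eq. 3.61
        u = u + x - z                                  # Eq. 3.62
    return x


# With A = I the minimizer is the soft-thresholded measurement vector.
x_hat = admm_bpdn(np.eye(3), np.array([3.0, 0.5, -2.0]), lam=1.0)
```

For large problems one would factor the system matrix once outside the loop, since it does not change between iterations.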

We now turn our attention to solving Eq. 3.7 using ADMM. Expressing the quadratic

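constraint in terms of its indicator function, which we denote as $I_{\ell_2}$, Eq. 3.7 can be expressed in its ADMM form as:

$$
\begin{aligned}
\underset{x,\,z}{\text{minimize}} \quad & \|x\|_{\ell_1} + I_{\ell_2}(z) \\
\text{subject to} \quad & x = z
\end{aligned}
\tag{3.63}
$$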

The Augmented Lagrangian for this problem, in terms of the scaled dual variable u, is:

$$
L_A(x, z, u) = \|x\|_{\ell_1} + I_{\ell_2}(z) + (\rho/2)\,\|x - z + u\|_{\ell_2}^2 \tag{3.64}
$$

The x variable is once again updated using the soft-thresholding operator:

$$
x^{(k+1)} = S_{1/\rho}\!\left(z^{(k)} - u^{(k)}\right) \tag{3.65}
$$

And the z variable is updated by evaluating the proximal operator for the indicator function $I_{\ell_2}$:

$$
z^{(k+1)} = \operatorname{prox}_{I_{\ell_2}}\!\left(x^{(k+1)} + u^{(k)}\right) \tag{3.66}
$$

This is the exact same problem as Eq. 3.44, so the method developed by Becker for the NESTA

algorithm in [28] can be applied here. Finally, the scaled dual variable is again updated using Eq.

3.62.

Finally, we address the traditional CS problem of Eq. 3.9. Expressing the $\ell_1$-norm constraint in terms of its indicator function, which we denote as $I_{\ell_1}$, Eq. 3.9 can be expressed in its ADMM form as:

$$
\begin{aligned}
\underset{x,\,z}{\text{minimize}} \quad & \tfrac{1}{2}\|Az - y\|_{\ell_2}^2 + I_{\ell_1}(x) \\
\text{subject to} \quad & x = z
\end{aligned}
\tag{3.67}
$$

The Augmented Lagrangian for this problem, in terms of the scaled dual variable u, is:

$$
L_A(x, z, u) = \tfrac{1}{2}\|Az - y\|_{\ell_2}^2 + I_{\ell_1}(x) + (\rho/2)\,\|x - z + u\|_{\ell_2}^2 \tag{3.68}
$$

The ADMM update equation for z is the same as that displayed in Eq. 3.61. To update x, we need to evaluate the proximal operator for $I_{\ell_1}$:

$$
x^{(k+1)} = \operatorname{prox}_{I_{\ell_1}}\!\left(z^{(k)} - u^{(k)}\right) \tag{3.69}
$$

This proximal operator can be efficiently computed as the solution to the following convex optimization problem:

$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \tfrac{1}{2}\|x - z\|_{\ell_2}^2 \\
\text{subject to} \quad & \|x\|_{\ell_1} \le \tau
\end{aligned}
\tag{3.70}
$$

There are a few observations that can be made of Eq. 3.70 that can greatly simplify the problem. First, we note that the optimal solution $x^*$ to this problem has the same sign pattern as z. Indeed, among all vectors x satisfying $\|x\|_{\ell_1} \le \tau$ and $|x| = |x^*|$, the vector $x = \operatorname{diag}(\operatorname{sign}(z))\,|x^*|$ minimizes $\|x - z\|_{\ell_2}^2$. Therefore, Eq. 3.70 can be solved using strictly real and non-negative variables by making the change of variables w = |z| and q = |x|. With this change of variables, the problem can be recast as follows:

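$$
\begin{aligned}
\underset{q}{\text{minimize}} \quad & \tfrac{1}{2}\|q - w\|_{\ell_2}^2 \\
\text{subject to} \quad & 1_N^T q \le \tau \\
& q \succeq 0
\end{aligned}
\tag{3.71}
$$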


This problem could be solved using a general convex program solver, but more efficient methods

can be found through analysis of the optimality conditions. The Lagrangian for this problem can be

written in terms of two dual variables, a scalar α and a vector $\beta \in \mathbb{R}^N$, as follows:

$$
L(q, \alpha, \beta) = \tfrac{1}{2}\|q - w\|_{\ell_2}^2 + \alpha\left(1_N^T q - \tau\right) - \beta^T q \tag{3.72}
$$

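The Karush-Kuhn-Tucker (KKT) conditions [32] mandate that the following conditions are satisfied at the optimal point $q^*, \alpha^*, \beta^*$:

$$
\begin{aligned}
q^* &= w + \beta^* - \alpha^* 1_N && (3.73) \\
\alpha^*\left(1_N^T q^* - \tau\right) &= 0 && (3.74) \\
\beta_i^*\, q_i^* &= 0 \quad \forall i && (3.75) \\
\alpha^* &\ge 0 && (3.76) \\
\beta^* &\succeq 0_N && (3.77)
\end{aligned}
$$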

Assuming that $1_N^T w > \tau$, so that w is not the optimal solution, then $1_N^T q^* = \tau$. Combining this result with Eq. 3.73, we can express the optimal dual variable $\alpha^*$ as follows:

$$
\alpha^* = \frac{1}{N}\left(1_N^T w + 1_N^T \beta^* - \tau\right) \tag{3.78}
$$

Eq. 3.78 is not very useful without an expression for β∗, so we turn our attention to Eq. 3.75. When

combined with Eq. 3.73, Eq. 3.75 can be written as:

$$
\beta_i^*\left(w_i + \beta_i^* - \alpha^*\right) = 0 \tag{3.79}
$$

If $w_i < \alpha^*$, then $\beta_i^* = \alpha^* - w_i$ and $q_i^* = 0$, otherwise the positivity constraint will be violated. If $w_i > \alpha^*$, then $\beta_i^* = 0$ and $q_i^* = w_i - \alpha^*$ according to Eq. 3.79 and 3.77. These results give rise to the following approach for finding the optimal Lagrange multipliers. Suppose that w is sorted in descending order, such that $w_1 \ge w_2 \ge \ldots \ge w_N$, and that $\alpha^*$ is selected such that only the first M elements of w satisfy $w_i > \alpha^*$; then only the first M values in $q^*$ are non-zero. Combining this result with the relations for $\beta^*$ and Eq. 3.78, we can express $\alpha^*$ as follows:

$$
\alpha^* = \frac{1}{M}\left(-\tau + \sum_{i=1}^{M} w_i\right) \tag{3.80}
$$

One need only find the smallest M such that the value of $\alpha^*$ according to Eq. 3.80 satisfies $w_i \ge \alpha^*$ for $i \le M$ and $w_i < \alpha^*$ for $i > M$.
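The sorting procedure described above can be sketched as follows. This is an illustrative implementation under a few assumptions: the function name is hypothetical, the search over M stops at the first sorted entry that would be thresholded to zero, and the case τ ≤ 0 is handled separately by returning the zero vector.

```python
def project_l1_ball(z, tau):
    """Project z (real or complex entries) onto {x : ||x||_1 <= tau}."""
    if tau <= 0.0:
        return [z_i * 0.0 for z_i in z]
    w = [abs(z_i) for z_i in z]              # change of variables w = |z|
    if sum(w) <= tau:
        return list(z)                       # z is already feasible
    alpha, running = 0.0, 0.0
    for M, w_M in enumerate(sorted(w, reverse=True), start=1):
        running += w_M
        candidate = (running - tau) / M      # Eq. 3.80 evaluated for this M
        if w_M <= candidate:
            break                            # w_M would be set to zero
        alpha = candidate
    # q_i = max(w_i - alpha, 0); rescaling z_i restores its sign or phase.
    return [z_i * max(1.0 - alpha / w_i, 0.0) if w_i > 0.0 else z_i
            for z_i, w_i in zip(z, w)]


x_proj = project_l1_ball([3.0, -1.0, 0.5], tau=2.0)
```

The sort dominates the cost, so one projection takes O(N log N) operations.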



3.5.3 ADMM for PCCS Problems

The variable splitting process of the ADMM that was applied to the traditional CS problems

can also be applied to the PCCS problems for electromagnetic applications. Since the ADMM

formulations for Eq. 3.14, 3.15, and 3.16 are very similar, we will only discuss the solution for Eq.

3.14 in this section. By using three variables in the optimization, one for the objective function and

one for each of the constraints, the ADMM formulation of Eq. 3.14 can be written as:

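$$
\begin{aligned}
\underset{x,\,w,\,z}{\text{minimize}} \quad & \|x\|_{\ell_1} + I_{\ell_2}(w) + I_{Q_p}(z) \\
\text{subject to} \quad & x = w \\
& x = z
\end{aligned}
\tag{3.81}
$$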

where IQp(·) is the indicator function for the physicality set. The Augmented Lagrangian for this

problem can be written as:

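$$
\begin{aligned}
L_A(x, w, z, u, v) = {} & \|x\|_{\ell_1} + I_{\ell_2}(w) + I_{Q_p}(z) \\
& + (\rho/2)\,\|x - w + u\|_{\ell_2}^2 + (\rho/2)\,\|x - z + v\|_{\ell_2}^2
\end{aligned}
\tag{3.82}
$$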

Similar to the two-variable case, the ADMM updates the variables x,w, and z independently in the

following manner:

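$$
\begin{aligned}
\bar{x}^{(k)} &= \left(w^{(k)} - u^{(k)} + z^{(k)} - v^{(k)}\right)/2 && (3.83) \\
x^{(k+1)} &= \arg\min_x \; \|x\|_{\ell_1} + \rho\,\|x - \bar{x}^{(k)}\|_{\ell_2}^2 && (3.84) \\
w^{(k+1)} &= \arg\min_x \; I_{\ell_2}(x) + (\rho/2)\,\|x - x^{(k+1)} - u^{(k)}\|_{\ell_2}^2 && (3.85) \\
z^{(k+1)} &= \arg\min_x \; I_{Q_p}(x) + (\rho/2)\,\|x - x^{(k+1)} - v^{(k)}\|_{\ell_2}^2 && (3.86) \\
u^{(k+1)} &= u^{(k)} + x^{(k+1)} - w^{(k+1)} && (3.87) \\
v^{(k+1)} &= v^{(k)} + x^{(k+1)} - z^{(k+1)} && (3.88)
\end{aligned}
$$

The factor ρ in Eq. 3.84, rather than ρ/2, arises because the two quadratic penalty terms of Eq. 3.82 sum to $\rho\,\|x - \bar{x}^{(k)}\|_{\ell_2}^2$ plus a constant.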

The update steps for x, w, and z reduce to calls to the proximal operators for the $\ell_1$-norm, quadratic error constraint, and physicality constraint respectively. This highlights the computational simplicity of the ADMM: unlike Nesterov’s method, the ADMM does not need to project onto the joint quadratic error and physicality set; it simply needs to project onto each one separately. The averaging step of Eq. 3.83 ensures that the three variables x, w, and z converge to equality. This form of the problem is often referred to as the “consensus” formulation of the ADMM. It should be apparent that this formulation can also be used to solve both Eq. 3.15 and 3.16: for the former, simply replace Eq. 3.85 with Eq. 3.61; for the latter, simply replace Eq. 3.84 with Eq. 3.69 and replace Eq. 3.85 with Eq. 3.61.
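As a hedged illustration of the consensus formulation, the sketch below applies it to the physicality constrained basis pursuit denoising problem of Eq. 3.15, i.e. with the w-update replaced by the least-squares step of Eq. 3.61. The function names, the dense solve, and the fixed iteration count are assumptions of this example. The soft-threshold level λ/(2ρ) follows from summing the two quadratic penalty terms of the Augmented Lagrangian, which together weight the distance to the averaged point by ρ.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Element-wise soft-thresholding (real or complex input)."""
    mag = np.abs(v)
    return np.maximum(1.0 - kappa / np.maximum(mag, 1e-300), 0.0) * v


def project_physical(x, eps_b):
    """Projection onto the physicality set, following Eq. 3.50 and 3.51."""
    eps = eps_b * x + eps_b
    eps = np.maximum(eps.real, 1.0) + 1j * np.maximum(eps.imag, 0.0)
    return (eps - eps_b) / eps_b


def consensus_admm_pccs(A, y, eps_b, lam, rho=1.0, iters=300):
    """Consensus ADMM sketch for Eq. 3.15 (hypothetical helper)."""
    n = A.shape[1]
    AH = A.conj().T
    K = rho * np.eye(n) + AH @ A
    AHy = AH @ y
    x = np.zeros(n, dtype=complex)
    w, z, u, v = x.copy(), x.copy(), x.copy(), x.copy()
    for _ in range(iters):
        x_bar = 0.5 * (w - u + z - v)                  # averaging step, Eq. 3.83
        x = soft_threshold(x_bar, lam / (2.0 * rho))   # l1 proximal step
        w = np.linalg.solve(K, AHy + rho * (x + u))    # data-fit step (cf. Eq. 3.61)
        z = project_physical(x + v, eps_b)             # physicality step, Eq. 3.86
        u = u + x - w                                  # Eq. 3.87
        v = v + x - z                                  # Eq. 3.88
    return z                                           # z is always physical


# Free-space background and A = I: the negative entry is clipped to zero.
x_hat = consensus_admm_pccs(np.eye(3), np.array([3.0, 0.5, -2.0], dtype=complex),
                            np.ones(3, dtype=complex), lam=1.0)
```

Returning z rather than x guarantees that the reported estimate satisfies the physicality constraints even if the iteration is stopped early.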

3.6 Solving the PCCS Programs using an Accelerated Gradient Augmented Lagrangian (AGAL) Method

In this section, we describe a third method for solving the PCCS programs, which can

be viewed as a combination of the previous two methods. At this point, the following question

naturally arises: why do we need yet another method for solving these problems? Recall that the

algorithms presented thus far for solving Eq. 3.7 and Eq. 3.14 required the singular value decomposition of

the sensing matrix A. If the singular value decomposition cannot be computed, then these algorithms

cannot be used. This can occur for a number of reasons. For example, if the matrix A is known but is

very large, then it can be prohibitively expensive to compute the singular value decomposition. As

another example, if the matrix A is unknown or cannot be stored in memory, and instead is described

by function handles in software that compute $Ax$ and $A^H z$, then the singular value decomposition cannot be computed. Several algorithms exist that solve the basis pursuit denoising problem of Eq. 3.8 using only operators for computing $Ax$ and $A^H z$ - see for example the Fast Iterative Shrinkage

Thresholding Algorithm (FISTA) [33] or the previously mentioned NESTA [25, 28] - but these

methods cannot solve Eq. 3.7. Most certainly, there are no specialized algorithms in the literature for

solving the PCCS program of Eq. 3.14 in this circumstance. This is unfortunate because, even though

the formulations of Eq. 3.7, 3.8, and 3.9 - and by extension Eq. 3.14, 3.15, and 3.16 - are equivalent

for appropriate values of η, λ, and τ , it is more “natural” to solve the quadratically constrained

problems due to the simple fact that the expected error η can be estimated from the measurement

system and model errors. It is very difficult to tune the parameters λ and τ in order to achieve the

desired performance because the mapping of λ and τ to the equivalent η is data-dependent.

3.6.1 General Formulation of the AGAL Method

In this section, we describe the general formulation of the Accelerated Gradient Augmented

Lagrangian (AGAL) method. As its name suggests, this algorithm can simply be described as an

application of the accelerated proximal gradient method to the Augmented Lagrangian. Specialized



methods for solving the CS problems of Eq. 3.7, 3.8, and 3.9, and the PCCS problems of Eq. 3.14,

Eq. 3.15, and Eq. 3.16, are described in the succeeding sections. The general problem instance seeks

to minimize the following convex function:

$$
\underset{x}{\text{minimize}} \quad f(x) + \sum_{q=1}^{Q} g_q(A_q x) \tag{3.89}
$$

In this problem, $x \in \mathbb{C}^N$, f(·) is a differentiable convex function whose gradient is Lipschitz continuous with constant L and whose proximal operator is not necessarily known or is difficult to compute, $A_q \in \mathbb{C}^{M_q \times N}$ is not necessarily known, but methods for computing $A_q x$ and $A_q^H z$ are available, and $g_q(\cdot)$ are possibly non-smooth convex functions with closed-form or easy to compute proximal operators, such as the $\ell_1$-norm or indicator functions for simple convex sets. If a function

satisfies the conditions for both f(·) and gq(·), it can be placed in either category. The objective

function for this problem is the summation of Q+ 1 separable functions, which can be conveniently

separated by introducing auxiliary variables and equality constraints as follows:

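$$
\begin{aligned}
\underset{x,\,z_1,\ldots,z_Q}{\text{minimize}} \quad & f(x) + \sum_{q=1}^{Q} g_q(z_q) \\
\text{subject to} \quad & z_q = A_q x, \quad q = 1, 2, \ldots, Q
\end{aligned}
\tag{3.90}
$$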

This formulation is a particular instance of Eq. 3.54, in which g(z) is the summation of Q separable

functions. Following the same process that was done in that problem, the Augmented Lagrangian

can be formed using scaled dual variables in the following manner:

$$
L_A(x, z_1, \ldots, z_Q, u_1, \ldots, u_Q) = f(x) + \sum_{q=1}^{Q} \left[ g_q(z_q) + (\rho/2)\,\|z_q - A_q x + u_q\|_{\ell_2}^2 \right] \tag{3.91}
$$

If f(·) had a known or easy to compute proximal operator, and $A_q$ were known exactly, then

this problem could be solved using the ADMM; however, the problem’s assumptions do not mandate

these characteristics. Instead of using the ADMM, we can solve this problem using the traditional

Augmented Lagrangian method, which completely minimizes Eq. 3.91 over x, z1, . . . , zQ for fixed

u1, . . . , uQ before updating the Lagrange multipliers using gradient ascent. The unconstrained

subproblem of Eq. 3.91 can be solved efficiently, given the problem assumptions and the available

information, using an accelerated proximal gradient method similar to Nesterov’s method described

in Section 3.4. In particular, we recommend the method used by Beck and Teboulle for FISTA [33],

which was further refined by Tseng [34]. This method can be described concisely as follows. Given



an optimization problem of the form:

$$
\underset{x}{\text{minimize}} \quad f(x) + g(x) \tag{3.92}
$$

for a convex function f(x) with Lipschitz continuous gradient and a convex function g(x) with a

known or easy to compute proximal operator, FISTA updates the variable x at the k-th iteration using

two simple steps [27]:

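$$
\bar{x}^{(k)} = x^{(k-1)} + \frac{k-2}{k+1}\left(x^{(k-1)} - x^{(k-2)}\right) \tag{3.93}
$$

$$
x^{(k)} = \operatorname{prox}_{t^{(k)} g}\!\left(\bar{x}^{(k)} - t^{(k)} \nabla f(\bar{x}^{(k)})\right) \tag{3.94}
$$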

This method has been shown to converge with a rate O(1/k2), like Nesterov’s method described in

Section 3.4, provided that the step size $t^{(k)}$ satisfies $t^{(k)} \le t^{(k-1)}$ and $t^{(k)} \le 1/L$ in the limit. When

the Lipschitz constant is not known, line search methods can be employed in order to compute a

sequence of t(k) values that guarantee that the algorithm will converge; see [27] for details.
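The two steps of Eq. 3.93 and 3.94 translate almost directly into code. The sketch below assumes a fixed step size and takes the gradient and proximal operator as function handles; the names `fista`, `grad_f`, and `prox_g` are choices made for this example, and the small real-valued test problem is only meant to show the calling convention.

```python
import numpy as np


def fista(grad_f, prox_g, x0, t, iters=200):
    """Accelerated proximal gradient iteration with fixed step size t.

    grad_f(x) returns the gradient of the smooth term and prox_g(v, t)
    evaluates the proximal operator of t*g at v.
    """
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, iters + 1):
        x_bar = x + ((k - 2) / (k + 1)) * (x - x_prev)   # Eq. 3.93
        x_prev = x
        x = prox_g(x_bar - t * grad_f(x_bar), t)          # Eq. 3.94
    return x


# Basis pursuit denoising (Eq. 3.8) on a tiny real-valued example.
A = np.eye(3)
y = np.array([3.0, 0.5, -2.0])
lam = 1.0
grad_f = lambda x: A.T @ (A @ x - y)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
x_hat = fista(grad_f, prox_g, np.zeros(3), t=1.0)   # t = 1/L with L = ||A||^2 = 1
```

Only products with A and its adjoint are needed, which is why this family of methods remains usable when A is available only through function handles.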

It is straightforward to apply FISTA to the Augmented Lagrangian subproblem of Eq. 3.91.

By substituting the corresponding values into Eq. 3.93 and 3.94, this implementation can be written

as in Alg. 3. If the matrix norms $L_{A_q} = \|A_q\|_{\ell_2}^2$ and Lipschitz constant $L_f$ for f(·) are known, then the step size can be initialized to $t^{(0)} = \left(L_f + \rho Q + \sum_{q=1}^{Q} \rho L_{A_q}\right)^{-1}$ and held constant throughout the optimization procedure. This step size is guaranteed to satisfy $t^{(0)} \le 1/L$, where L is the unknown Lipschitz constant of the differentiable parts of Eq. 3.91. This result is easy to prove by applying the triangle inequality to the quadratic terms $(\rho/2)\,\|z_q - A_q x + u_q\|_{\ell_2}^2$.
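When the norm of a sensing matrix is not known in advance but products with the matrix and its adjoint are available, one possible way to obtain it is power iteration on $A^H A$. This approach is not part of the method described in this thesis; it is sketched here as an assumption-laden illustration, with hypothetical function names, of how the fixed step size could be computed from function handles alone.

```python
import numpy as np


def estimate_squared_norm(A_mv, AH_mv, n, iters=100, seed=0):
    """Estimate ||A||^2 by power iteration on A^H A.

    A_mv(x) computes A x and AH_mv(z) computes A^H z; n is the length of x.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v = v / np.linalg.norm(v)
    estimate = 0.0
    for _ in range(iters):
        w = AH_mv(A_mv(v))
        estimate = np.linalg.norm(w)   # tends to the largest eigenvalue of A^H A
        v = w / estimate
    return float(estimate)


def agal_step_size(L_f, L_A_list, rho):
    """Fixed step size (L_f + rho*Q + rho*sum_q L_Aq)^(-1)."""
    Q = len(L_A_list)
    return 1.0 / (L_f + rho * Q + rho * sum(L_A_list))


A = np.diag([3.0, 1.0])
L_A = estimate_squared_norm(lambda x: A @ x, lambda z: A.conj().T @ z, n=2)
t0 = agal_step_size(L_f=0.0, L_A_list=[1.0, L_A], rho=1.0)
```

Power iteration approaches the true value from below, so in practice one might inflate the estimate slightly before using it in the step size.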

3.6.2 AGAL for Traditional CS Problems

In order to demonstrate how this method can be applied to CS problems, we will first

consider the traditional CS Problem of Eq. 3.7. By introducing auxiliary variables z1 and z2, we can

recast this problem into the following form:

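$$
\begin{aligned}
\underset{x,\,z_1,\,z_2}{\text{minimize}} \quad & \|z_1\|_{\ell_1} \\
\text{subject to} \quad & \|z_2 - y\|_{\ell_2} \le \eta \\
& z_1 = x \\
& z_2 = Ax
\end{aligned}
\tag{3.95}
$$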


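Algorithm 3: Overview of the Accelerated Gradient Augmented Lagrangian Method subproblem.

Given $x^{(0)}, z_1^{(0)}, \ldots, z_Q^{(0)}, u_1, \ldots, u_Q$

$x^{(0)} = x^{(-1)} = x^{(-2)}$

$z_q^{(0)} = z_q^{(-1)} = z_q^{(-2)}, \quad q = 1, \ldots, Q$

for $k = 1, 2, 3, \ldots$ do

    Compute the extrapolated points $\bar{x}^{(k)}, \bar{z}_1^{(k)}, \ldots, \bar{z}_Q^{(k)}$:

    $$\bar{x}^{(k)} = x^{(k-1)} + \frac{k-2}{k+1}\left(x^{(k-1)} - x^{(k-2)}\right)$$

    $$\bar{z}_q^{(k)} = z_q^{(k-1)} + \frac{k-2}{k+1}\left(z_q^{(k-1)} - z_q^{(k-2)}\right), \quad q = 1, \ldots, Q$$

    Compute the updates $x^{(k)}, z_1^{(k)}, \ldots, z_Q^{(k)}$:

    $$x^{(k)} = \bar{x}^{(k)} - t^{(k)} \nabla f(\bar{x}^{(k)}) + \sum_{q=1}^{Q} t^{(k)} \rho\, A_q^H \left(\bar{z}_q^{(k)} - A_q \bar{x}^{(k)} + u_q\right)$$

    $$z_q^{(k)} = \operatorname{prox}_{t^{(k)} g_q}\!\left(\bar{z}_q^{(k)} - t^{(k)} \rho \left(\bar{z}_q^{(k)} - A_q \bar{x}^{(k)} + u_q\right)\right), \quad q = 1, \ldots, Q$$

end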

This problem is separable in the $\ell_1$ and quadratic constraint terms, and the sensing matrix A appears only in the equality constraint for $z_2$. The Augmented Lagrangian for this problem can be written as:

$$
\begin{aligned}
L_A(x, z_1, z_2, u_1, u_2) = {} & \|z_1\|_{\ell_1} + I_{\ell_2}(z_2) + (\rho/2)\,\|z_1 - x + u_1\|_{\ell_2}^2 \\
& + (\rho/2)\,\|z_2 - Ax + u_2\|_{\ell_2}^2
\end{aligned}
\tag{3.96}
$$

Equating the terms in Eq. 3.96 to those in Eq. 3.91 reveals the following correspondence: Q = 2, f(x) = 0, $g_1(z_1) = \|z_1\|_{\ell_1}$, $g_2(z_2) = I_{\ell_2}(z_2)$, $A_1 = I$, and $A_2 = A$. Therefore, this problem can be solved using the method outlined in Alg. 3. If the norm of the sensing matrix $L_A = \|A\|_{\ell_2}^2$ is known, then the algorithm is guaranteed to converge with the fixed step size $t^{(0)} = [\rho(3 + L_A)]^{-1}$.

It is worthwhile to mention here that the proximal operator for the indicator function of the quadratic constraint $I_{\ell_2}(z_2)$ has a much simpler solution than it does in Eq. 3.44. For this problem, the proximal operator can be expressed as the solution to the convex problem:

$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \|x - z\|_{\ell_2}^2 \\
\text{subject to} \quad & \|x - y\|_{\ell_2} \le \eta
\end{aligned}
\tag{3.97}
$$


It is easy to show that this problem has the closed-form solution given by Eq. 3.98. Therefore, each proximal gradient step in Alg. 3 can be computed very efficiently in traditional CS problems.

$$
x^* =
\begin{cases}
z, & \|z - y\|_{\ell_2} \le \eta \\
y + \dfrac{\eta}{\|z - y\|_{\ell_2}}\,(z - y), & \|z - y\|_{\ell_2} > \eta
\end{cases}
\tag{3.98}
$$
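To make the structure of the method concrete, the following sketch applies it to the quadratically constrained problem of Eq. 3.7 using only function handles for $Ax$ and $A^H z$. It is an illustration under several assumptions rather than the implementation used in this thesis: the function names, the fixed numbers of inner and outer iterations, the restart of the momentum term at every outer iteration, and the simple dual update after each inner loop are all choices made for this example.

```python
import numpy as np


def soft(v, kappa):
    """Element-wise soft-thresholding (proximal operator of kappa*||.||_1)."""
    mag = np.abs(v)
    return np.maximum(1.0 - kappa / np.maximum(mag, 1e-300), 0.0) * v


def proj_ball(v, y, eta):
    """Closed-form projection onto {x : ||x - y|| <= eta}, Eq. 3.98."""
    d = v - y
    r = np.linalg.norm(d)
    return v if r <= eta else y + (eta / r) * d


def agal_l1_quadratic(A_mv, AH_mv, y, eta, n, L_A, rho=1.0, outer=50, inner=100):
    """Sketch of the method for Eq. 3.7 with Q = 2, A_1 = I, A_2 = A."""
    t = 1.0 / (rho * (3.0 + L_A))          # fixed step size quoted in the text
    x = np.zeros(n, dtype=complex)
    z1 = x.copy()
    z2 = A_mv(x)
    u1 = np.zeros_like(z1)
    u2 = np.zeros_like(z2)
    for _ in range(outer):
        xp, z1p, z2p = x, z1, z2           # restart the momentum term
        for k in range(1, inner + 1):
            m = (k - 2) / (k + 1)
            xb = x + m * (x - xp)          # extrapolated points of Alg. 3
            z1b = z1 + m * (z1 - z1p)
            z2b = z2 + m * (z2 - z2p)
            xp, z1p, z2p = x, z1, z2
            r1 = z1b - xb + u1             # residual of z_1 = x
            r2 = z2b - A_mv(xb) + u2       # residual of z_2 = A x
            x = xb + t * rho * (r1 + AH_mv(r2))
            z1 = soft(z1b - t * rho * r1, t)
            z2 = proj_ball(z2b - t * rho * r2, y, eta)
        u1 = u1 + z1 - x                   # scaled dual updates
        u2 = u2 + z2 - A_mv(x)
    return x, z1, z2


# Tiny example with A = I: the minimum l1-norm point within distance 1 of y.
y = np.array([3.0, 0.0])
x_hat, z1_hat, z2_hat = agal_l1_quadratic(lambda v: v, lambda v: v, y,
                                          eta=1.0, n=2, L_A=1.0)
```

Every operation inside the loops is either a product with A or its adjoint, a soft-threshold, or the closed-form projection of Eq. 3.98, so no matrix factorization is required.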

Let us now consider the traditional CS Problem of Eq. 3.9. By introducing auxiliary

variable z1, we can recast this problem into the following form:

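$$
\begin{aligned}
\underset{x,\,z_1}{\text{minimize}} \quad & \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & \|z_1\|_{\ell_1} \le \tau \\
& z_1 = x
\end{aligned}
\tag{3.99}
$$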

This problem is separable in the $\ell_1$ constraint and quadratic objective terms, and the equality constraint for $z_1$ involves only the identity matrix. The Augmented Lagrangian for this problem can be written as:

$$
L_A(x, z_1, u_1) = \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2 + I_{\ell_1}(z_1) + (\rho/2)\,\|z_1 - x + u_1\|_{\ell_2}^2 \tag{3.100}
$$

Equating the terms in Eq. 3.100 to those in Eq. 3.91 reveals the following correspondence: Q = 1, $f(x) = \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2$, $g_1(z_1) = I_{\ell_1}(z_1)$, and $A_1 = I$. Therefore, this problem can be solved using the method outlined in Alg. 3. If the norm of the sensing matrix $L_A = \|A\|_{\ell_2}^2$ is known, then the algorithm is guaranteed to converge with the fixed step size $t^{(0)} = [2\rho + L_A]^{-1}$.

Finally, it is worth mentioning that the Accelerated Gradient Augmented Lagrangian

method can also be applied to solve the basis pursuit denoising problem of Eq. 3.8. In this case, the

problem is recast to the following form:

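$$
\begin{aligned}
\underset{x,\,z_1}{\text{minimize}} \quad & \lambda\|z_1\|_{\ell_1} + (1/2)\,\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & x = z_1
\end{aligned}
\tag{3.101}
$$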

Solving Eq. 3.101 is not recommended in practice because the equality constraint adds unnecessary

complexity to the problem. The traditional FISTA method, as described in [33] and in Eq. 3.92 -

3.94, is more appropriate for this problem. However, the equivalent problem for electromagnetic

applications, given by Eq. 3.15, cannot be solved using FISTA, and so the Augmented Lagrangian

formulation can be applied beneficially. This is described further in the next subsection.



3.6.3 AGAL for PCCS Problems

It is straightforward to extend the CS formulations from the previous subsection for the

PCCS problems. By introducing three auxiliary variables, Eq. 3.14 can be recast in the following

form:

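$$
\begin{aligned}
\underset{x,\,z_1,\,z_2,\,z_3}{\text{minimize}} \quad & \|z_1\|_{\ell_1} \\
\text{subject to} \quad & \|z_2 - y\|_{\ell_2} \le \eta \\
& \operatorname{Re}(\operatorname{diag}(\varepsilon_b)\,z_3 + \varepsilon_b) \succeq 1 \\
& \operatorname{Im}(\operatorname{diag}(\varepsilon_b)\,z_3 + \varepsilon_b) \succeq 0 \\
& z_1 = x \\
& z_2 = Ax \\
& z_3 = x
\end{aligned}
\tag{3.102}
$$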

The Augmented Lagrangian for this problem can be written as:

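$$
\begin{aligned}
L_A(x, z_1, z_2, z_3, u_1, u_2, u_3) = {} & \|z_1\|_{\ell_1} + I_{\ell_2}(z_2) + I_{Q_p}(z_3) + (\rho/2)\,\|z_1 - x + u_1\|_{\ell_2}^2 \\
& + (\rho/2)\,\|z_2 - Ax + u_2\|_{\ell_2}^2 + (\rho/2)\,\|z_3 - x + u_3\|_{\ell_2}^2
\end{aligned}
\tag{3.103}
$$

Equating the terms in Eq. 3.103 to those in Eq. 3.91 reveals the following correspondence: Q = 3, f(x) = 0, $g_1(z_1) = \|z_1\|_{\ell_1}$, $g_2(z_2) = I_{\ell_2}(z_2)$, $g_3(z_3) = I_{Q_p}(z_3)$, $A_1 = I$, $A_2 = A$, and $A_3 = I$. Therefore, this problem can be solved using the method outlined in Alg. 3. If the norm of the sensing matrix $L_A = \|A\|_{\ell_2}^2$ is known, then the algorithm is guaranteed to converge with the fixed step size $t^{(0)} = [\rho(5 + L_A)]^{-1}$.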

A similar process can be followed in order to solve Eq. 3.15. In this case, two auxiliary

variables can be introduced in order to recast the problem in the following form:

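$$
\begin{aligned}
\underset{x,\,z_1,\,z_2}{\text{minimize}} \quad & \lambda\|z_1\|_{\ell_1} + (1/2)\,\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & \operatorname{Re}(\operatorname{diag}(\varepsilon_b)\,z_2 + \varepsilon_b) \succeq 1 \\
& \operatorname{Im}(\operatorname{diag}(\varepsilon_b)\,z_2 + \varepsilon_b) \succeq 0 \\
& z_1 = x \\
& z_2 = x
\end{aligned}
\tag{3.104}
$$

The Augmented Lagrangian for this problem can be written as:

$$
\begin{aligned}
L_A(x, z_1, z_2, u_1, u_2) = {} & (1/2)\,\|Ax - y\|_{\ell_2}^2 + \lambda\|z_1\|_{\ell_1} + I_{Q_p}(z_2) \\
& + (\rho/2)\,\|z_1 - x + u_1\|_{\ell_2}^2 + (\rho/2)\,\|z_2 - x + u_2\|_{\ell_2}^2
\end{aligned}
\tag{3.105}
$$

Equating the terms in Eq. 3.105 to those in Eq. 3.91 reveals the following correspondence: Q = 2, $f(x) = (1/2)\,\|Ax - y\|_{\ell_2}^2$, $g_1(z_1) = \lambda\|z_1\|_{\ell_1}$, $g_2(z_2) = I_{Q_p}(z_2)$, $A_1 = I$, and $A_2 = I$. Therefore, this problem can be solved using the method outlined in Alg. 3. If the norm of the sensing matrix $L_A = \|A\|_{\ell_2}^2$ is known, then the algorithm is guaranteed to converge with the fixed step size $t^{(0)} = [4\rho + L_A]^{-1}$.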

Finally, let us now consider Eq. 3.16. By introducing two auxiliary variables, we can recast

this problem into the following form:

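$$
\begin{aligned}
\underset{x,\,z_1,\,z_2}{\text{minimize}} \quad & \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & \|z_1\|_{\ell_1} \le \tau \\
& \operatorname{Re}(\operatorname{diag}(\varepsilon_b)\,z_2 + \varepsilon_b) \succeq 1 \\
& \operatorname{Im}(\operatorname{diag}(\varepsilon_b)\,z_2 + \varepsilon_b) \succeq 0 \\
& z_1 = x \\
& z_2 = x
\end{aligned}
\tag{3.106}
$$

The Augmented Lagrangian for this problem can be written as:

$$
\begin{aligned}
L_A(x, z_1, z_2, u_1, u_2) = {} & \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2 + I_{\ell_1}(z_1) + I_{Q_p}(z_2) \\
& + (\rho/2)\,\|z_1 - x + u_1\|_{\ell_2}^2 + (\rho/2)\,\|z_2 - x + u_2\|_{\ell_2}^2
\end{aligned}
\tag{3.107}
$$

Equating the terms in Eq. 3.107 to those in Eq. 3.91 reveals the following correspondence: Q = 2, $f(x) = \tfrac{1}{2}\|Ax - y\|_{\ell_2}^2$, $g_1(z_1) = I_{\ell_1}(z_1)$, $g_2(z_2) = I_{Q_p}(z_2)$, $A_1 = I$, and $A_2 = I$. Therefore, this problem can be solved using the method outlined in Alg. 3. If the norm of the sensing matrix $L_A = \|A\|_{\ell_2}^2$ is known, then the algorithm is guaranteed to converge with the fixed step size $t^{(0)} = [4\rho + L_A]^{-1}$.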

3.7 Numerical Comparison of CS and PCCS Problems

This section presents a numerical comparison of the reconstruction accuracies of the traditional CS and PCCS problems when applied to an electromagnetic imaging problem. Suppose that the elements of the M × N sensing matrix A are drawn from i.i.d. complex Gaussian random variables with zero mean and standard deviation $1/\sqrt{M}$. Although it is not possible to design a sensing

matrix in this manner in electromagnetic imaging applications, it is considered for this analysis

because the performance of the traditional CS programs with such matrices is well documented in

the literature. Indeed, CS theory states that the unknown sparse vector x can be recovered exactly

from the M noiseless measurements by solving Eq. 3.7 provided that the sparsity S of the signal

satisfies [21, 18, 35]:

S ≤ CM/ log(N/M) (3.108)

This analysis compares the reconstruction capabilities of Eq. 3.7 and Eq. 3.14 for a sensing matrix

with dimensions M = 48 and N = 500 and with $\eta/\|y\|_{\ell_2} = 10^{-6}$ in order to approximately enforce equality. The performance of each CS program as a function of sparsity level S was evaluated by averaging the normalized errors $\|x_t - x_r\|_{\ell_2}/\|x_t\|_{\ell_2}$, where $x_t$ is the true contrast variable and $x_r$

is the reconstructed contrast variable, for 100 vectors at each sparsity level. The locations of the

non-zero elements in xt were drawn from a uniform distribution, and the contrast values themselves

were drawn from i.i.d. complex Gaussian random variables, which were projected to the physical set.

The background permittivity was set to freespace, i.e. εb = 1.
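The setup of this experiment can be sketched as follows. The snippet only generates one test problem and defines the error metric; it does not reproduce the solvers or the reported results. The function names, the random seed, and the interpretation of the standard deviation (real and imaginary parts each with variance 1/(2M)) are assumptions made for this illustration.

```python
import numpy as np


def make_problem(M=48, N=500, S=5, seed=0):
    """Generate a random sensing matrix and a physical, sparse contrast vector."""
    rng = np.random.default_rng(seed)
    # i.i.d. complex Gaussian entries, zero mean, standard deviation 1/sqrt(M)
    A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
    support = rng.choice(N, size=S, replace=False)   # uniformly drawn locations
    vals = (rng.standard_normal(S) + 1j * rng.standard_normal(S)) / np.sqrt(2)
    # With eps_b = 1 the physical set is Re(x) >= 0 and Im(x) >= 0, so the
    # projection clips both parts at zero (some entries may become exactly zero).
    x_true = np.zeros(N, dtype=complex)
    x_true[support] = np.maximum(vals.real, 0.0) + 1j * np.maximum(vals.imag, 0.0)
    y = A @ x_true                                   # noiseless measurements
    return A, x_true, y


def normalized_error(x_true, x_rec):
    """Normalized reconstruction error ||x_t - x_r|| / ||x_t||."""
    return np.linalg.norm(x_true - x_rec) / np.linalg.norm(x_true)


A, x_true, y = make_problem()
eta = 1e-6 * np.linalg.norm(y)    # tolerance used to approximately enforce equality
```

Averaging `normalized_error` over 100 such draws at each sparsity level S would give one point on a curve of the kind plotted in Figure 3.3.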

Figure 3.3 displays the average reconstruction accuracy for the traditional CS (blue) and

PCCS (red) programs for this numerical example. For small sparsity levels, the solutions provided

by the two methods are indistinguishable from each other. This result is to be expected. For the

cases where CS theory guarantees exact recovery for the standard problem of Eq. 3.7, the additional

physicality constraints of Eq. 3.14 are inactive. Another way to put it is that, in these cases, the

optimal solution of Eq. 3.7 satisfies the physicality constraints. The reconstruction accuracies of the

two algorithms start to diverge near the sparsity level S = 11. For sparsity levels greater than this,

the PCCS program produces accurate solutions more frequently than the standard CS program. In

these cases, some of the physicality constraints are active, i.e. the only way to decrease the $\ell_1$-norm any further would require values $\varepsilon_r < 1$ or $\sigma < 0$. This numerical analysis suggests that one can

improve upon the reconstruction capabilities of standard CS in electromagnetic imaging applications

by enforcing the physicality constraints. This is a very intuitive result. The PCCS programs utilize

more prior information about the problem than the traditional CS programs. Just as the traditional

CS programs outperform classical smooth techniques such as $\ell_2$-norm regularization by exploiting

sparse priors, the PCCS programs outperform the traditional programs in electromagnetic imaging

applications by exploiting the physicality constraints.



Figure 3.3: Reconstruction performance of CS and PCCS programs in electromagnetic imaging

example as a function of sparsity level.

3.8 PCCS for the Hybrid DBT / NRI System

In this section, we assess the performance of the PCCS algorithm in the Hybrid DBT /

NRI system using a set of numerical simulations. Following the segmentation process discussed

in Chapter 2, a 2D model of a healthy breast was generated by segmenting a 2D slice from a 3D

DBT image. In order to simulate data from a cancerous case, a lesion with frequency-dependent

electrical properties modeled after [10] was added to the healthy breast. A 2D version of the FDFD

code was used to generate the synthetic NRI measurements of the healthy breast, the synthetic NRI

measurements of the cancerous breast, and the sensing matrix of the healthy breast A according

to Eq. 2.6. Note that the FDFD model accounted for the dispersive properties of both the healthy

breast tissue and the cancerous tissue; only the inversion process utilized the simplifying assumptions

discussed in Chapter 2. In the simulation, the NRI system used six transmitting and receiving

antennas operating in a multiple monostatic configuration. Each antenna was excited with three

different frequencies, 500MHz, 600MHz, and 700MHz, for a total of 18 measurements among the

antennas.

Figure 3.4 displays the true contrast variable obtained when the fat percentage is perfectly

segmented from the DBT image. In this plot, the white dots represent the antenna positions and

the green curves represent the breast and lesion borders. Since the fat percentage was segmented

perfectly, the contrast variable is non-zero only at the location of the cancerous lesion. Figure 3.5 displays the estimated contrast variable obtained using noiseless measurements and the perfect fat percentage segmentation to compute the sensing matrix according to Eq. 2.6. It should be noted that all of the results in this section were generated using Nesterov’s method to solve the physicality constrained basis pursuit denoising problem of Eq. 3.15. The artifacts within the image are due to the error vector es(r, ω) that is introduced to the measurement vector when the simplifying assumptions of Chapter 2 are applied. Despite these artifacts, the algorithm is able to locate the cancerous lesion.

Figure 3.4: Real and imaginary parts of true contrast variable χε obtained when the DBT image is segmented perfectly.

Figure 3.6 displays the true contrast variable obtained when the fat percentage is segmented

from the DBT image with 10% random error. More specifically, the fat percentage values were

corrupted by i.i.d. random noise following a uniform distribution, taking values between ±10%

with equal probability. Since the fat percentage is not segmented correctly, the true contrast variable

is non-zero within the healthy tissue. Nevertheless, the true contrast variable is approximately

compressible, and so CS techniques can still be used to image the breast. This result can be seen in

Figure 3.7, which displays the estimated contrast variable obtained using the noisy fat percentage

segmentation and noiseless measurements.

Figure 3.8 displays the estimated contrast variable obtained using the noisy fat percentage

segmentation and measurements whose SNR = 49dB. It is important to note that the signals in this

SNR calculation are the electric fields scattered by the entire breast, and not just the fields scattered

by the cancerous lesion. In this example, the fields scattered by the lesion are approximately 40dB



lower in magnitude than the fields scattered by the rest of the breast, so that the “lesion signal to noise

ratio” is on the order of 10dB. Therefore, the NRI system must have a significant SNR to ensure that

the fields produced by cancerous lesions are not overwhelmed by the noise, or it must have antennas

with higher directivity in order to improve the SNR; the latter case may require the CS algorithm to use additional measurements. With this high SNR, the CS algorithm is able to image the cancerous lesion with some additional artifacts compared to the noiseless case. However, when the SNR is decreased to 43dB, the algorithm is no longer able to image the lesion, as can be seen in Figure 3.9.

Figure 3.5: Real and imaginary parts of reconstructed contrast variable χε obtained when the DBT image is segmented perfectly and there is no measurement noise.

Figure 3.6: Real and imaginary parts of true contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error.

Figure 3.7: Real and imaginary parts of reconstructed contrast variable χε obtained when the fat

percentage is segmented from the DBT image with 10% error and there is no measurement noise.


percentage is segmented from the DBT image with 10% error and and the measurement SNR

= 49dB.

48



Figure 3.9: Real and imaginary parts of reconstructed contrast variable χε obtained when the fat percentage is segmented from the DBT image with 10% error and the measurement SNR = 43dB.


Chapter 4

Model-based Design Method for

Compressive Antennas

4.1 Introduction

The previous chapter introduced compressed sensing (CS) theory, which dictates how

sparse vectors of interest can be recovered using $\ell_1$-norm minimization techniques provided that

the sensing matrix satisfies the restricted isometry property (RIP). While these techniques were

applied to electromagnetic imaging applications with some success, they were applied primarily

using the $\ell_1$-norm as a heuristic for generating sparse solutions. Indeed, it is very difficult to apply

the reconstruction guarantees of CS theory to electromagnetic imaging applications because we do

not have the flexibility to design sensing matrices that are guaranteed to satisfy the RIP. To address

this shortcoming, the traditional CS programs were augmented to include the physicality constraints

on the unknown contrast variables in order to improve reconstruction performance. In this chapter,

we attempt to address this issue directly by answering the following question: how can we design

sensing matrices in electromagnetic imaging applications with enhanced imaging capabilities?

Recent papers [36, 37] have introduced the concept of a compressive reflector antenna

for use in millimeter wave imaging applications. The compressive reflector antenna operates in a

manner similar to that of the coded apertures utilized in optical imaging applications [38, 39, 40]:

by introducing scatterers to the surface of a traditional reflector antenna, the compressive antenna

encodes a pseudo-random phase front on the scattered electric field. As a result of this, it was

observed that the sensing capacity of the compressive antenna was improved compared to the


traditional reflector antenna, and CS techniques could be employed in imaging applications utilizing

the compressive antenna with improved performance over the traditional reflector antenna.

This chapter expands upon the compressive reflector antenna concept in several ways. First,

we provide additional theoretical considerations describing why the reconstruction capabilities of CS

techniques are enhanced using an antenna with enhanced sensing capacity. Second, we describe a

model-based method for designing compressive antennas with improved CS imaging capabilities.

This method is an enhancement of the previous work [36, 37], which simply selected the constitutive

properties of the scatterers at random. A generalized framework of the design method is introduced,

and two specific instances of the design problem are described in detail. Third, we describe how our

design method is an enhancement of existing techniques that have been applied to Multiple Input

Multiple Output (MIMO) communication systems.

4.2 Motivation

Consider a general linear system, in which a set of noisy measurements $y \in \mathbb{C}^M$ of

the object of interest $x \in \mathbb{C}^N$ are obtained via the relationship $y = Ax + n$. As we discussed

in the previous chapter, when the scatterer x is sparse, that is it has a small number of non-zero

coefficients, then it can be accurately recovered using novel CS techniques [19, 18, 20]. Recall that

the reconstruction performance of the $\ell_1$-norm minimization techniques is guaranteed when the

sensing matrix A obeys a Restricted Isometry Property (RIP), which for completeness is repeated

here. For a given sparsity level S, the restricted isometry constant δS is defined as the smallest

constant such that:

$$(1 - \delta_S)\|x\|_{\ell_2}^2 \le \|Ax\|_{\ell_2}^2 \le (1 + \delta_S)\|x\|_{\ell_2}^2 \tag{3.10}$$

for all $x$ satisfying $\|x\|_{\ell_0} \le S$ [17]. According to Candès, stable reconstruction $\|x - x_t\|_{\ell_2} \le C_S \eta$ is then guaranteed when the restricted isometry constants satisfy [18]:

$$\delta_{3S} + 3\delta_{4S} < 2 \tag{3.12}$$
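To make the definition in Eq. 3.10 concrete, the following minimal sketch computes the restricted isometry constant of a small matrix by exhaustive search over supports. The function name `restricted_isometry_constant` and the random test matrix are illustrative assumptions, and the search is only practical for very small problems.

```python
import itertools
import numpy as np

def restricted_isometry_constant(A, S):
    """Brute-force delta_S of Eq. 3.10; assumes S <= number of rows of A."""
    n_cols = A.shape[1]
    delta = 0.0
    # Supports of size exactly S suffice: any sparser x lives inside one of them.
    for support in itertools.combinations(range(n_cols), S):
        sigma = np.linalg.svd(A[:, list(support)], compute_uv=False)
        # Smallest constant for this support with 1 - delta <= sigma^2 <= 1 + delta.
        delta = max(delta, sigma[0] ** 2 - 1.0, 1.0 - sigma[-1] ** 2)
    return delta

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))        # hypothetical toy sensing matrix
A /= np.linalg.norm(A, axis=0)          # l2-normalized columns
delta_1 = restricted_isometry_constant(A, 1)
delta_2 = restricted_isometry_constant(A, 2)
# For unit-norm columns, delta_2 coincides with the mutual coherence.
coherence = np.max(np.abs(A.T @ A - np.eye(12)))
```

The number of supports grows combinatorially with the number of columns, which is why this quantity is rarely evaluated directly for realistic sensing matrices.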

Compressed sensing can also be considered from the perspective of information theory.

Consider the affine mapping y = Ax + n, and assume that the set of feasible vectors x and the

noise term $n$ are constrained in their $\ell_2$-norms, i.e. $\|x\|_{\ell_2} \le T$ and $\|n\|_{\ell_2} \le \epsilon$. Considering the

linear system as a communication channel, the following question then naturally arises: how many

input vectors x can be uniquely defined within a tolerance ε from the measurements y? This is

a sphere-packing problem that is common to communication systems. The maximum amount of



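information that can be transmitted through the system in this case is given by the ε-capacity, which can be defined as follows [41, 4]:

$$H_\epsilon(A) = \log_2\left(\frac{\sqrt{\det\left(AA^H\right)}}{\epsilon^M}\right) = \sum_{m=1}^{M} \log_2\left(\frac{\sigma_m}{\epsilon}\right) \tag{4.1}$$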

where $\sigma_m$ are the singular values of $A$. Colloquially, we refer to the ε-capacity as the sensing capacity, or just the capacity. Consider instead a system $y = A_S x_S + n$, for $S < M$, where $A_S$ is generated by selecting $S$ columns from $A$. In this case, the ε-capacity takes the form [41, 4]:

$$H_\epsilon(A_S) = \sum_{s=1}^{S} \log_2\left(\frac{\sigma_s}{\epsilon}\right) \tag{4.2}$$

where $\sigma_s$ are the singular values of $A_S$. Now, the definition of the restricted isometry constant $\delta_S$ in Eq. 3.10 ensures that each singular value $\sigma_s$ satisfies:

$$(1 - \delta_S) \le \sigma_s^2 \tag{4.3}$$

Assuming the worst-case scenario, in which $\sigma_1 = \sigma_2 = \ldots = \sigma_S = \sqrt{1 - \delta_S}$, it is easy to show that the restricted isometry constant establishes the following lower bound on the ε-capacity:

$$H_\epsilon(A_S) \ge \frac{S}{2}\log_2\left(\frac{1 - \delta_S}{\epsilon^2}\right) \tag{4.4}$$

In this sense, $\delta_S$ defines the minimum amount of information that can be transmitted by $S$-sparse

vectors using the linear mapping y = Ax.
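As a numerical illustration of the capacity definitions above, the sketch below evaluates the sum-of-logarithms form of the ε-capacity and compares it with the worst-case value obtained when every singular value of a column submatrix equals its smallest one. The helper name `eps_capacity`, the toy dimensions, and the choice of the first S columns are assumptions made purely for illustration.

```python
import numpy as np

def eps_capacity(A, eps):
    """Sum of log2(sigma / eps) over the singular values of A (assumes full rank)."""
    sigma = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(np.log2(sigma / eps)))

rng = np.random.default_rng(0)
M, N, S, eps = 6, 10, 3, 0.1
A = rng.standard_normal((M, N))      # hypothetical toy sensing matrix
A /= np.linalg.norm(A, axis=0)       # l2-normalized columns
A_S = A[:, :S]                       # one particular choice of S columns

sigma_S = np.linalg.svd(A_S, compute_uv=False)
one_minus_delta = sigma_S[-1] ** 2   # largest (1 - delta) compatible with Eq. 4.3 on this support
capacity_S = eps_capacity(A_S, eps)
# Worst case: all S singular values equal sqrt(1 - delta).
worst_case = S * np.log2(np.sqrt(one_minus_delta) / eps)
```

Because each term of the sum is at least as large as the worst-case term, `capacity_S` can never fall below `worst_case`.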

In order to improve the ability of the sensing matrix A to recover sparse vectors, one would

ideally minimize the values of the restricted isometry constants δS . This is equivalent to maximizing

the lower bound of Eq. 4.4. Unfortunately, it is prohibitively expensive to do this in most practical

applications because the number of computations required grows exponentially with N . Instead,

let us consider a more practical measure using the singular values of the complete matrix A. By

convention, the smallest and largest singular values of A satisfy the following inequality for all

vectors x that do not lie in the null space of A:

$$\sigma_{\min}^2\|x\|_{\ell_2}^2 \le \|Ax\|_{\ell_2}^2 \le \sigma_{\max}^2\|x\|_{\ell_2}^2 \tag{4.5}$$

Comparing Eq. 3.10 to Eq. 4.5, we see that there are two possible inequalities relating δS and σmin:

$$1 - \delta_S \ge \sigma_{\min}^2 \tag{4.6}$$

$$1 - \delta_S \le \sigma_{\min}^2 \tag{4.7}$$



For all S ≥ ST , where ST is an unknown threshold, the inequality of Eq. 4.7 holds. It is easy to

prove this result. Given the CS requirement that the sensing matrix $A$ have $\ell_2$-normalized columns,

it is necessary that δ1 = 0 and σmin < 1, and so the first inequality holds for at least S = 1. When

this condition is satisfied, the ε-capacity Hε(AS) is necessarily bounded according to

$$S\log_2\left(\frac{\sqrt{1 - \delta_S}}{\epsilon}\right) \le H_\epsilon(A_S) \le H_\epsilon(A) \tag{4.8}$$

for ε ≤ 1. Solving Eq. 4.8 for δS results in the following lower bound on the restricted isometry

constants:

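$$\delta_S \ge 1 - \epsilon^2\, 2^{2H_\epsilon(A)/S} = 1 - \epsilon^{2(1 - M/S)}\left(\prod_{m=1}^{M}\sigma_m^2\right)^{1/S} \tag{4.9}$$

Since $\epsilon \le 1$ and $S < M$, this bound reduces to $\delta_S \ge 0$ as $\epsilon \to 0$. In addition, the singular value term $\left(\prod_{m=1}^{M}\sigma_m^2\right)^{1/S}$ approaches one as $S \to \infty$. Therefore, the strongest bound arises when $\epsilon = 1$ and $S = 1$:

$$\delta_S \ge 1 - 2^{2H_1(A)} = 1 - \prod_{m=1}^{M}\sigma_m^2 \tag{4.10}$$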

This relationship states that, for sparsity levels S > ST , the ε-capacity of the full sensing matrix

A provides a lower bound on the restricted isometry constants δS . If the singular values are poorly

distributed, i.e. $\prod_{m=1}^{M}\sigma_m^2$ is small, then the values of $\delta_S$ will be close to one. In order to provide the

best bound, the ε-capacity H1(A) should be as large as possible.
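The singular-value form of the bound in Eq. 4.9 is simple to evaluate once the singular values of the full matrix are known. The sketch below is only an illustration: the function name `ric_lower_bound` and the random matrix are assumptions, and, as stated in the text, the bound is meaningful only for sparsity levels above the unknown threshold.

```python
import numpy as np

def ric_lower_bound(A, S, eps=1.0):
    """Right-hand side of Eq. 4.9 in its singular-value form (assumes M <= N, full rank)."""
    sigma = np.linalg.svd(A, compute_uv=False)
    M = sigma.size
    return 1.0 - eps ** (2.0 * (1.0 - M / S)) * np.prod(sigma ** 2) ** (1.0 / S)

rng = np.random.default_rng(0)
A = 0.3 * rng.standard_normal((4, 6))            # hypothetical toy matrix
bound_410 = ric_lower_bound(A, S=1, eps=1.0)     # special case of Eq. 4.10

# Cross-check against the capacity form: 1 - eps^2 * 2^(2 H / S).
eps, S = 0.5, 2
H = np.sum(np.log2(np.linalg.svd(A, compute_uv=False) / eps))
bound_capacity_form = 1.0 - eps ** 2 * 2.0 ** (2.0 * H / S)
```

A negative value simply means the bound is vacuous for that matrix, since restricted isometry constants are non-negative by definition.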

We propose maximizing the ε-capacity of the Green’s function matrix G as an appropriate

method for maximizing the ε-capacity of the sensing matrix in electromagnetic inverse problems.

This is motivated by the fact that the sensing matrix A is dependent upon the fields radiated by the

transmitting antennas. Consider, for example, the Born Approximation (BA) formulation described

in Section 2.5, which is repeated here for convenience:

$$E_s(r, \omega) = \int G_b(r, r', \omega)\, k_b^2(r', \omega)\, E_b(r', \omega)\, \chi(r')\, dr' + e_s(r, \omega) \tag{2.6}$$

Intuitively, one expects that improving the ε-capacity of G also improves the ε-capacity of A.

4.3 A General Design Approach

In the optimization problem, the transmitting antenna system is described by a set of current

sources located at T locations. Each transmitting antenna excites the M positions in the imaging



region with stepped-frequency waveforms at K frequencies. The design procedure optimizes the

constitutive properties ε(r, ω) and µ(r, ω) of scattering elements located at N positions along the

reflector. In order to allow the scattering elements to be dispersive, the permittivity and permeability

of the scatterers at the $k$-th frequency will be jointly represented by the variable $x_k$. With this

convention, the matrix $G_k(x_k) \in \mathbb{C}^{3M \times 3T}$ can be defined as the Green’s function matrix for sources

radiating at frequency ωk, located at the T transmitter positions, and evaluated at the M positions in

the imaging region. This matrix is a nonlinear function of the design variables xk. By concatenating

the Green’s function matrices for multiple frequencies, the multi-frequency Green’s function matrix

$G(x) \in \mathbb{C}^{3M \times 3KT}$ can be expressed as:

$$G(x) = G(x_1, x_2, \ldots, x_K) = \left[G_1(x_1),\, G_2(x_2),\, \ldots,\, G_K(x_K)\right] \tag{4.11}$$

where the vector x is the vector of concatenated design variables for each frequency. Assuming that

M > KT , the channel capacity maximization problem can be expressed as a non-convex “max-det”

problem:

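$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H(x)G(x)\right) \\
\text{subject to} \quad & h_q(x) \le 0, \quad q = 1, \ldots, Q \\
& c_p(x) = 0, \quad p = 1, \ldots, P
\end{aligned} \tag{4.12}$$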

The constraint functions $h_q(x)$ and $c_p(x)$ can be non-convex and depend upon the specific design constraints placed on the dielectric scatterers. For example, if the scatterers are restricted to non-dispersive materials, then the equality constraint functions force the design variables $x_1, x_2, \ldots, x_K$ to produce the same permittivity and conductivity. As another example, if metamaterial scattering elements are disallowed, then the inequality constraint functions force the design variables to produce dielectric constants ≥ 1.
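The objective of the max-det problem can be evaluated stably with a log-determinant routine once the per-frequency matrices are concatenated as in Eq. 4.11. The following sketch uses random complex blocks as hypothetical stand-ins for the Green's function matrices; the function name `log_det_objective` is likewise an assumption.

```python
import numpy as np

def log_det_objective(G_blocks):
    """log det(G^H G) for G = [G_1, ..., G_K] stacked column-wise (Eq. 4.11)."""
    G = np.hstack(G_blocks)
    # slogdet avoids overflow/underflow of the determinant itself.
    sign, logdet = np.linalg.slogdet(G.conj().T @ G)
    return logdet

rng = np.random.default_rng(0)
M, T, K = 10, 2, 3                   # toy sizes satisfying M > K*T
blocks = [rng.standard_normal((3 * M, 3 * T)) + 1j * rng.standard_normal((3 * M, 3 * T))
          for _ in range(K)]
objective = log_det_objective(blocks)

# Equivalent expression through the singular values of the stacked matrix.
sigma = np.linalg.svd(np.hstack(blocks), compute_uv=False)
objective_from_svd = 2.0 * np.sum(np.log(sigma))
```

The singular-value form makes explicit that the objective rewards a well-spread spectrum: a single near-zero singular value drives the log-determinant toward negative infinity.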

4.4 A Simplified Design Approach

This section describes a method for solving a simplified version of Eq. 4.12. In this

approach, both the scatterers and the background medium at the scatterer locations are assumed

to be non-dispersive and non-conductive, so that the design variables x1, x2, . . . , xK are equal and

are real-valued. Moreover, the constraints simply restrict the electric permittivities and magnetic



permeabilities of the scatterers to lie within specified ranges, [εL, εR] and [µL, µR]. The simplified

optimization problem can therefore be expressed as:


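$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H(x)G(x)\right) \\
\text{subject to} \quad & x_L \le x \le x_R
\end{aligned} \tag{4.13}$$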

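Eq. 4.13 can be solved efficiently using the nonlinear conjugate gradient method [29]. This method requires expressions for the gradient of the cost function $\log\det F(x) = \log\det\left(G^H(x)G(x)\right)$. Assuming that $F(x)$ is invertible, the partial derivatives $\frac{\partial}{\partial x_l}\log\det F(x)$ and $\frac{\partial F(x)}{\partial x_l}$ are:

$$\frac{\partial}{\partial x_l}\log\det F(x) = \mathrm{tr}\left(F^{-1}(x)\frac{\partial F(x)}{\partial x_l}\right) \tag{4.14}$$

$$\frac{\partial F(x)}{\partial x_l} = \left(\frac{\partial G(x)}{\partial x_l}\right)^H G(x) + G^H(x)\frac{\partial G(x)}{\partial x_l} \tag{4.15}$$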
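The trace formula of Eq. 4.14 together with the product rule of Eq. 4.15 can be checked numerically against finite differences. In the sketch below, `G_of` is a hypothetical matrix-valued function that is affine in a real design vector; it stands in for the true Green's function matrix only so that the derivative is known in closed form.

```python
import numpy as np

rng = np.random.default_rng(1)
M, T, L = 12, 4, 3
G0 = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
dG = rng.standard_normal((L, M, T)) + 1j * rng.standard_normal((L, M, T))

def G_of(x):
    # Hypothetical stand-in for G(x): affine in the real design vector x.
    return G0 + np.tensordot(x, dG, axes=1)

def cost(x):
    G = G_of(x)
    return np.linalg.slogdet(G.conj().T @ G)[1]

def grad(x):
    G = G_of(x)
    F_inv = np.linalg.inv(G.conj().T @ G)
    g = np.empty(L)
    for l in range(L):
        dF = dG[l].conj().T @ G + G.conj().T @ dG[l]    # Eq. 4.15
        g[l] = np.trace(F_inv @ dF).real                # Eq. 4.14
    return g

x0 = 0.1 * rng.standard_normal(L)
h = 1e-6
# Central finite differences along each coordinate direction.
fd = np.array([(cost(x0 + h * e) - cost(x0 - h * e)) / (2 * h) for e in np.eye(L)])
analytic = grad(x0)
```

Both factors in the trace are Hermitian, so the trace is real up to rounding; taking the real part only discards numerical noise.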

A close examination of Eq. 4.11 reveals that the partial derivatives $\frac{\partial G(x)}{\partial x_l}$ consist of the partial derivatives $\frac{\partial G_k(x)}{\partial x_l}$. By defining $H_k(x)$ as the discretized version of the Helmholtz operator for frequency $k$, the Green’s function matrix $G_k(x)$ can be expressed as:

$$G_k(x) = \Phi H_k^{-1}(x)\Psi \tag{4.16}$$

where $\Phi \in \mathbb{C}^{3M \times 3L}$, $H_k(x) \in \mathbb{C}^{3L \times 3L}$, and $\Psi \in \mathbb{C}^{3L \times 3T}$. The matrices $\Phi$ and $\Psi$ are subsampling matrices corresponding to the imaging and transmitter positions respectively. From this relationship, the partial derivatives $\frac{\partial G_k(x)}{\partial x_l}$ take the following form:

$$\frac{\partial G_k(x)}{\partial x_l} = -\Phi H_k^{-1}(x)\frac{\partial H_k(x)}{\partial x_l}H_k^{-1}(x)\Psi \tag{4.17}$$
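The derivative-of-an-inverse identity behind Eq. 4.17 can be verified on a small scalar toy problem. Everything in the sketch below is an assumption for illustration: `H_of` is a random, well-conditioned matrix plus a diagonal term proportional to the design variables (mimicking the diagonal structure of Eq. 4.18 without the three field components), and `Phi`, `Psi` simply select rows and columns.

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, T, omega = 8, 3, 2, 2.0
H0 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) + 10.0 * np.eye(n)
Phi = np.eye(n)[:M, :]        # subsampling: keeps the first M grid points
Psi = np.eye(n)[:, n - T:]    # subsampling: sources at the last T grid points

def H_of(x):
    # Hypothetical operator: fixed part plus omega^2 * diag(x).
    return H0 + omega ** 2 * np.diag(x)

def G_of(x):
    return Phi @ np.linalg.solve(H_of(x), Psi)           # Eq. 4.16

def dG_dx(x, idx):
    dH = np.zeros((n, n))
    dH[idx, idx] = omega ** 2                            # derivative of H with respect to x[idx]
    H = H_of(x)
    return -Phi @ np.linalg.solve(H, dH @ np.linalg.solve(H, Psi))   # Eq. 4.17

x0 = np.ones(n)
idx, h = 4, 1e-6
e = np.zeros(n)
e[idx] = 1.0
fd = (G_of(x0 + h * e) - G_of(x0 - h * e)) / (2 * h)
analytic = dG_dx(x0, idx)
```

Using linear solves instead of explicit inverses mirrors how a forward-model solver would be called in practice.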

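The elements of the partial derivative matrix $\frac{\partial H_k(x)}{\partial x_l}$ differ depending upon whether $x_l$ is permittivity or permeability. If $x_l$ is the permittivity $\epsilon_j$ at position $j$, then the partial derivative matrix takes the form:

$$\frac{\partial H_k(x)}{\partial \epsilon_j} = \omega_k^2\, \mathrm{diag}\left(\mathbf{1}_3 \otimes \delta_{ij}\right) \tag{4.18}$$

where $\otimes$ is the Kronecker product and $\delta_{ij} \in \mathbb{C}^L$ is the Kronecker delta function expressed as a vector, i.e. the $j$-th element of $\delta_{ij}$ equals one and all others equal zero. If $x_l$ is the permeability $\mu_j$ at position $j$, then the partial derivative matrix takes the form:

$$\frac{\partial H_k(x)}{\partial \mu_j} = -\frac{1}{\mu_j^2} L_c\, \mathrm{diag}\left(\mathbf{1}_3 \otimes \delta_{ij}\right) L_c \tag{4.19}$$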



where Lc is the discretized curl operator. Computation of these derivatives requires K(N + T ) calls

to a forward model solver at each iteration in order to compute the Green’s functions.

With expressions for the gradients, the conjugate gradient method can now be discussed.

The nonlinear conjugate gradient method computes the search direction sk at each iteration recursively

using gradients in the following manner [29]:

$$d_k = \nabla_x \log\det\left(G^H(x)G(x)\right) \tag{4.20}$$

$$s_k = d_k + \beta_k s_{k-1} \tag{4.21}$$

The choice of βk depends upon the specific search direction method that is utilized. One such method,

the Polak-Ribiere search directions, computes the parameter βk as [29]:

$$\beta_k = \mathrm{Re}\left(\frac{d_k^H\left(d_k - d_{k-1}\right)}{d_{k-1}^H d_{k-1}}\right) \tag{4.22}$$

To compute the next iterate xk+1, the objective function is optimized along the search direction sk:

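$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H(x_k + \alpha s_k)\,G(x_k + \alpha s_k)\right) \\
\text{subject to} \quad & x_L \le x_k + \alpha s_k \le x_R \\
& \alpha \ge 0
\end{aligned} \tag{4.23}$$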

In practice, Eq. 4.23 is difficult to solve exactly, so we instead utilize inexact line-search methods,

which are well detailed in the literature [29].
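The recursion of Eqs. 4.20 to 4.22 combined with an inexact line search can be sketched in a few lines. The version below is a simplified illustration under several assumptions that are not taken from the text: the box constraint is handled by projecting trial points with `np.clip`, the Polak-Ribiere parameter is clipped at zero, the search restarts along the gradient when no acceptable step is found, and the objective is a toy concave quadratic rather than the log-det cost.

```python
import numpy as np

def line_search(f, x, d, s, lo, hi, c1=1e-4, max_halvings=40):
    """Backtracking search along s with projection onto the box [lo, hi]."""
    fx, alpha = f(x), 1.0
    for _ in range(max_halvings):
        cand = np.clip(x + alpha * s, lo, hi)
        gain = d @ (cand - x)                 # first-order predicted increase
        if gain > 0 and f(cand) >= fx + c1 * gain:
            return cand, True
        alpha *= 0.5
    return x, False

def polak_ribiere_ascent(f, grad, x0, lo, hi, iters=100):
    x = np.clip(x0, lo, hi)
    d_prev, s = None, None
    for _ in range(iters):
        d = grad(x)                                                 # Eq. 4.20
        if s is None:
            s = d
        else:
            beta = d @ (d - d_prev) / max(d_prev @ d_prev, 1e-30)   # Eq. 4.22, real case
            s = d + max(beta, 0.0) * s                              # Eq. 4.21
        x_new, ok = line_search(f, x, d, s, lo, hi)
        if not ok:                                                  # restart along the gradient
            s = d
            x_new, ok = line_search(f, x, d, s, lo, hi)
            if not ok:
                break
        d_prev, x = d, x_new
    return x

c = np.array([1.5, -0.3, 0.4])                # unconstrained maximizer of the toy objective
f = lambda x: -np.sum((x - c) ** 2)
grad = lambda x: -2.0 * (x - c)
x_opt = polak_ribiere_ascent(f, grad, np.full(3, 0.5), 0.0, 1.0)
```

For this separable toy problem the constrained maximizer is the unconstrained one clipped to the box; for the non-convex log-det cost, the same loop would only be expected to reach a local optimum.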

4.5 Reflection Mode Results

This section presents preliminary antenna design results, which were generated using the

simplified algorithm and a 2D forward model solver for computing $H_k^{-1}(x)$ based on finite differences

in the frequency domain (FDFD) [7]. The design method was executed for a configuration in which

the antenna operated in reflection mode. In this configuration, dielectric scatterers are added to the

surface of a Perfect Electric Conductor (PEC) reflector in order to further perturb the fields scattered

by the reflector.

Figure 4.1 displays the configuration for the optimization problem. Three line source

antennas, represented by the white circles, were used to excite the free-space imaging region, colored

in orange. The green pixels represent the locations of the scatterers to be optimized, and the red pixels

represent the PEC. The scatterer region was discretized into 40 rectangular blocks with dimensions



0.0263[m] × 0.01[m]. The antennas were constrained to transmit at five frequencies linearly spaced

between 3.1GHz and 3.5GHz, and the dielectric constant of the scatterers was constrained to the

range [1, 10]; the magnetic permeability was restricted to µ = µ0.

Figure 4.2 displays the optimized permittivity distribution. It is important to note that Eq. 4.13 is non-convex, and so it is probable that the solution displayed in Figure 4.2 is only a locally

optimal solution. If necessary, the optimization problem can be solved several times using different

starting points until a suitable design is found. It should also be noted that the solutions of Eq. 4.13

may be difficult to manufacture; however, the general approach of Eq. 4.12 can be used with the

appropriate constraint functions in order to ensure that the algorithm produces a feasible design.

Figure 4.3 displays the log2 of the singular values of the sensing matrices obtained using

Figure 4.1: Configuration for the compressive antenna operating in reflection mode. White =

Transmitter locations, Orange = Imaging region, Green = Scatterer locations, Red = PEC.

Figure 4.2: Permittivity distribution of the optimized reflection mode antenna.



Eq. 2.6 with the optimized reflection mode antenna (blue) and original reflection mode antenna

(red). In this configuration, the imaging system operates in a multi-static configuration, such that

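the total number of unique measurements is $\frac{N_a(N_a+1)}{2}N_f = 30$, where $N_a = 3$ is the number of antennas and $N_f = 5$ is the number of frequencies. In addition, the imaging region was

discretized into blocks of size 0.0360[m] × 0.0360[m], such that the sensing matrix $A \in \mathbb{C}^{30 \times 102}$.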

Figure 4.3 demonstrates the design method’s ability to improve the singular value distribution of

the sensing matrix, and therefore the lower bound on the capacity. Indeed, the ratio σmax/σmin of

the largest singular value to the smallest non-zero singular value has improved significantly, from

approximately 9000 in the original antenna to 25 in the optimized antenna. This is a desirable

property for any imaging system, even when alternative techniques such as regularized least squares

are used in lieu of CS techniques.
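The two figures of merit quoted above, the multi-static measurement count and the ratio of the largest to the smallest non-zero singular value, can be computed as in the following sketch. The function names and the relative tolerance used to decide which singular values count as non-zero are assumptions.

```python
import numpy as np

def num_multistatic_measurements(n_antennas, n_freqs):
    """Unique transmit/receive pairs (including monostatic) times the number of frequencies."""
    return n_antennas * (n_antennas + 1) // 2 * n_freqs

def singular_value_ratio(A, rel_tol=1e-12):
    """sigma_max / sigma_min over the singular values treated as non-zero."""
    sigma = np.linalg.svd(A, compute_uv=False)
    nonzero = sigma[sigma > rel_tol * sigma[0]]
    return sigma[0] / nonzero[-1]

n_meas = num_multistatic_measurements(3, 5)   # three antennas, five frequencies
ratio = singular_value_ratio(np.diag([50.0, 5.0, 2.0, 0.0]))
```

The threshold matters in practice: a singular value that is tiny but above the tolerance dominates the ratio.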

A numerical analysis was performed in order to demonstrate that the design method

improves the CS imaging capabilities of the antenna. In this analysis, the sensing matrices for the

baseline and optimized antenna configurations were used to solve the following PCCS reconstruction

problem:

$$\begin{aligned}
\underset{x}{\text{minimize}} \quad & \|x\|_{\ell_1} \\
\text{subject to} \quad & Ax = y
\end{aligned} \tag{4.24}$$



Figure 4.3: log2 of the singular values of the sensing matrices obtained using the optimized reflection

mode antenna (blue) and original reflection mode antenna (red) in a multi-static configuration.



Figure 4.4 displays the results of the numerical analysis as the fraction of vectors recovered within a

normalized error $\|x_t - x\|_{\ell_2}/\|x_t\|_{\ell_2}$ of 0.001. At small sparsity levels, the original and optimized

antennas provide comparable performance; however, at high sparsity levels, the optimized antenna

clearly outperforms the original antenna.
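A single trial of this kind of recovery experiment can be sketched as follows. The solver is a generic ADMM iteration for the equality-constrained $\ell_1$ problem, chosen here for brevity; it is not necessarily the solver used to produce Figure 4.4, it omits any physicality constraints, and the random real-valued matrix, sparsity pattern, and iteration count are all assumptions.

```python
import numpy as np

def basis_pursuit_admm(A, y, rho=1.0, iters=5000):
    """Sketch of min ||x||_1 subject to Ax = y using ADMM (A assumed full row rank)."""
    AH = A.conj().T
    right_inv = AH @ np.linalg.inv(A @ AH)       # A^H (A A^H)^{-1}
    x_ln = right_inv @ y                         # least-norm solution of Ax = y
    P = np.eye(A.shape[1]) - right_inv @ A       # projector onto the null space of A
    z = np.zeros_like(x_ln)
    u = np.zeros_like(x_ln)
    for _ in range(iters):
        x = P @ (z - u) + x_ln                   # Euclidean projection onto {x : Ax = y}
        v = x + u
        shrink = np.maximum(1.0 - (1.0 / rho) / np.maximum(np.abs(v), 1e-15), 0.0)
        z = shrink * v                           # soft threshold (prox of the l1 norm)
        u = u + x - z                            # scaled dual update
    return x

rng = np.random.default_rng(0)
M, N = 25, 40
A = rng.standard_normal((M, N))                  # hypothetical sensing matrix
A /= np.linalg.norm(A, axis=0)
x_t = np.zeros(N)
x_t[[3, 17, 30]] = [1.0, -2.0, 1.5]              # 3-sparse ground truth
y = A @ x_t
x_hat = basis_pursuit_admm(A, y)
rel_err = np.linalg.norm(x_t - x_hat) / np.linalg.norm(x_t)
```

Repeating the trial over many random supports and recording how often `rel_err` falls below the chosen tolerance yields a recovery-rate curve of the type described above.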

Figure 4.4: Numerical comparison of the reconstruction accuracies of Eq. 4.24 using the optimized

reflection mode design (blue) and baseline reflection mode design (red).

4.6 Transmission Mode Results

Although the design method improved the CS recovery capabilities of the baseline reflection

mode antenna in the previous section, we would like to emphasize again that the method is a heuristic,

and is not guaranteed to improve performance. This is evident from inspection of Eq. 4.9 and 4.10:

the sensing capacity provides a lower bound on the restricted isometry constants. As a result, it is

possible to decrease the lower bound on the restricted isometry constants without actually meeting

that lower bound. This can be demonstrated through the following transmission mode example. This

problem is analogous to that presented in the previous section, except that the antenna operates in

transmission mode. In transmission mode, dielectric scatterers are introduced in order to perturb the

fields radiated by antennas operating in the possibly heterogeneous, but known background medium.

Figure 4.5 displays the configuration for the optimization problem. Once again, three

line source antennas, represented by the white circles, were used to excite the free-space imaging

region, colored in orange. The green pixels represent the locations of the scatterers to be optimized.

The antennas were constrained to transmit at five frequencies linearly spaced between 3.1GHz and



3.5GHz, and the dielectric constant of the scatterers was constrained to the range [1, 10]; the magnetic

permeability was restricted to µ = µ0.

Figure 4.6 displays the optimized permittivity distribution, and Figure 4.7 displays the log2

of the singular values of the sensing matrices obtained using Eq. 2.6 with the optimized antenna

(blue) and original antenna (red). The sensing matrices in this example have the same dimensionality

as before, A ∈ C30×102. Once again, the optimization procedure significantly improves the singular

value distribution of the sensing matrix, decreasing the ratio σmax/σmin from approximately 17500 in

the original antenna to 38 in the optimized antenna. Unfortunately, the CS recovery capabilities of

the optimized antenna design are not improved compared to the original design. This result can be

seen in Figure 4.8, which displays the estimated reconstruction accuracies of Eq. 4.24 using the two

Figure 4.5: Configuration for the compressive antenna operating in transmission mode. White =

Transmitter locations, Orange = Imaging region, Green = Scatterer locations, Red = PEC.

Figure 4.6: Permittivity distribution of the optimized transmission mode antenna.



antenna designs. Although the lower bound on the restricted isometry constants is improved with the

optimized design, it actually recovered sparse vectors less frequently than the original design in this

numerical analysis. Again, this result is unavoidable because the capacity maximization procedure

simply improves the lower-bound of the restricted isometry constants without any guarantees that the

bound is actually met.

Figure 4.7: log2 of the singular values of the sensing matrices obtained using the optimized transmis-

sion mode antenna (blue) and original transmission mode antenna (red) in a multi-static configuration.


Figure 4.8: Numerical comparison of the reconstruction accuracies of Eq. 4.24 using the optimized transmission mode design (blue) and baseline transmission mode design (red).



4.7 Capacity Maximization in MIMO Communication Systems

It is obvious from inspection of Eq. 4.12 that the compressive antenna design technique

described in the previous sections can be readily applied to optimize the channel capacity of a

Multiple Input Multiple Output (MIMO) communication channel. Consider a MIMO communication

channel, in which Nt antennas are used to transmit data to Nr receiving antennas. The received

signal, $y \in \mathbb{C}^{N_r}$, is related to the transmitted signal, $s \in \mathbb{C}^{N_t}$, by the relationship $y = Gs + \eta$, where

G is the channel matrix and η is zero-mean white Gaussian noise [42]. When the transmitted signals

are statistically independent, the channel capacity can be expressed as [42]:

$$C = \log_2\det\left(I + \frac{\xi_s}{N_t N_0}GG^H\right) \tag{4.25}$$

where $\frac{\xi_s}{N_0}$ is the signal to noise ratio. Using the singular value decomposition $G = U\Sigma V^H$, the channel capacity can be equivalently expressed as:

$$C = \log_2\det\left(I + \frac{\xi_s}{N_t N_0}U\Sigma^2U^H\right) = \sum_{i=1}^{\min(N_r, N_t)} \log_2\left(1 + \frac{\xi_s}{N_t N_0}\sigma_i^2\right) \tag{4.26}$$
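The equivalence between the determinant form and the singular-value form of the capacity is easy to confirm numerically. In the sketch below, the random 4 × 4 channel matrix, the SNR value, and the function names are assumptions for illustration; `snr` plays the role of the ratio of signal energy to noise power.

```python
import numpy as np

def mimo_capacity_det(G, snr):
    """Capacity from the determinant form; G has shape (N_r, N_t)."""
    n_r, n_t = G.shape
    F = np.eye(n_r) + (snr / n_t) * (G @ G.conj().T)
    sign, logdet = np.linalg.slogdet(F)
    return logdet / np.log(2.0)

def mimo_capacity_svd(G, snr):
    """Capacity as a sum over the singular values of G."""
    n_t = G.shape[1]
    sigma = np.linalg.svd(G, compute_uv=False)
    return float(np.sum(np.log2(1.0 + (snr / n_t) * sigma ** 2)))

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))   # hypothetical channel
c_det = mimo_capacity_det(G, snr=100.0)
c_svd = mimo_capacity_svd(G, snr=100.0)
```

The singular-value form shows that each singular value acts as the gain of an independent sub-channel.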

The channel matrix G is directly related to the Green’s functions matrix. If the transmitted signals

are narrowband relative to the carrier frequency, the $ij$-th element of the channel matrix is $(G)_{ij} = G(r_i, r_j, \omega)$, where $r_i$ is the location of the $i$-th receiving antenna and $r_j$ is the location of the $j$-th

transmitting antenna. Expressing the matrix G in terms of the design parameter x, which for example

could be the effective permittivity ε and magnetic permeability µ of the reflector elements, we can

express the reflector design problem as:

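$$\begin{aligned}
\text{maximize} \quad & \log_2\det\left(I + \frac{\xi_s}{N_t N_0}G(x)G(x)^H\right) \\
\text{subject to} \quad & h_q(x) \le 0, \quad q = 1, \ldots, Q \\
& c_p(x) = 0, \quad p = 1, \ldots, P
\end{aligned} \tag{4.27}$$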

For high signal to noise ratio, i.e. $\xi_s \gg N_t N_0$, Eq. 4.27 simplifies to the capacity maximization

problem of Eq. 4.12. Given their similarities, Eq. 4.27 can also be solved using the design techniques

discussed in the previous sections. For the simplified problem discussed in Section 4.4, one simply

needs to substitute $F(x) = I + \frac{\xi_s}{N_t N_0}G(x)G(x)^H$ in Eq. 4.14 and Eq. 4.15.

Figures 4.9 - 4.11 display some preliminary results of the MIMO design approach. In this

example, both the transmitter and the receiver utilized four antenna elements. The dielectric constant

of the reflecting elements was optimized over the range [1, 10], and the magnetic permeability was



restricted to µ = µ0. The optimized design clearly outperforms the original design, in which each

transmitting antenna radiates in free-space. The gains of the four orthogonal MIMO communication

channels, given by $\sigma_{opt}^2/\sigma_{orig}^2$, are approximately 200 (23 dB), 41 (16 dB), 34 (15 dB), and 109 (20 dB);

this results in an increased channel capacity approximately 27 bits per second per Hz greater than

that of the original design for high signal to noise ratios.

Figure 4.9: Configuration for communications design.

4.8 Antenna Design using ELC Metamaterials

In the simplified design approach described in the previous sections, the design variables x

represented the dielectric constant and magnetic permeability of non-dispersive and non-conductive

objects. Due to these constraints, the compressive antennas were only able to exhibit spatial diversity

in the radiated electric fields. In order to improve the frequency diversity in the radiated fields, we

must allow the scattering elements to be dispersive. In this section, we consider a design scenario

in which the scatterers are Electric-LC (ELC) resonator elements. ELC resonators are a class of

metamaterial absorbers that have been used in many applications [43, 44, 45], which are largely

outside the scope of this work. The ELC theory relevant to the compressive antenna design method

states that the frequency-dependent permittivity of ELC resonators is given by the Drude-Lorentz



model [44]:

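$$\epsilon(\omega) = f_{dl}(\epsilon_\infty, \omega_p, \omega_0, \gamma, \omega) = \epsilon_\infty + \frac{\omega_p^2}{\omega_0^2 - \omega^2 - j\gamma\omega} \tag{4.28}$$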

where ε∞ is the dielectric constant at infinite frequency, ωp is the “plasma” frequency, ω0 is the

resonant frequency, and γ is the attenuation factor. ELC resonators allow us to create frequency

diversity in the radiated fields by configuring these resonance parameters. Clearly, for large values of

Figure 4.10: Optimized dielectric constant ε

Figure 4.11: Comparison of the log2 of the singular values of the channel matrix.



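$|\omega^2 - \omega_0^2|$, the permittivity reduces to $\epsilon(\omega) \approx \epsilon_\infty$. More importantly, for $\omega = \omega_0$, the permittivity reduces to:

$$\epsilon(\omega_0) = \epsilon_\infty + j\frac{\omega_p^2}{\gamma\omega_0} \tag{4.29}$$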

For appropriate values of ωp and γ, the ELC acts like a conductor near the resonant frequency ω0.

More interestingly, in the limit as γ → 0, the ELC theoretically behaves like a PEC at ω = ω0

and as a simple dielectric with $\epsilon = \epsilon_\infty$ for $\omega \neq \omega_0$. This result can be seen in Figures 4.12 and

4.13, which display the real and imaginary parts of the permittivity of two ELC resonators with the

same dielectric constant ε∞ = 1 and the same resonant and plasma frequencies ω0 = ωp = 1, but

different values of γ, namely γ = 1 and γ = 0.05. In general, the permeability of ELC resonators

also follows the Drude-Lorentz model with its own set of parameters $(\mu_\infty, \omega_{p,m}, \omega_{0,m}, \gamma_m)$. When

the four ELC parameters for the permittivity and permeability are equal, then the impedance of the

ELC is perfectly matched to free space [44]. More generally, $\mu_\infty$ and $\epsilon_\infty$ can be configured such that

the ELC is perfectly matched to the background medium except at the resonant frequency ω0, where

it acts as a conductor.
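The limiting behaviours described above follow directly from evaluating the Drude-Lorentz expression of Eq. 4.28. The sketch below uses the normalized parameter values quoted in the text; the function name `f_dl` follows the notation of Eq. 4.28, while the off-resonance test frequency is an arbitrary assumption.

```python
import numpy as np

def f_dl(eps_inf, omega_p, omega_0, gamma, omega):
    """Drude-Lorentz permittivity of Eq. 4.28."""
    return eps_inf + omega_p ** 2 / (omega_0 ** 2 - omega ** 2 - 1j * gamma * omega)

# At resonance the real part of the denominator vanishes, leaving a purely
# imaginary resonant term of magnitude omega_p^2 / (gamma * omega_0).
eps_res_narrow = f_dl(1.0, 1.0, 1.0, 0.05, 1.0)    # gamma = 0.05
eps_res_wide = f_dl(1.0, 1.0, 1.0, 1.0, 1.0)       # gamma = 1

# Far from resonance the permittivity approaches eps_inf.
eps_far = f_dl(1.0, 1.0, 1.0, 0.05, 100.0)
```

Reducing the damping factor from 1 to 0.05 increases the resonant term by a factor of 20, which is the sharpening visible when comparing the two resonators.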

Assuming that the ELC parameters for permittivity and permeability are equal for all of

the scatterers, then the capacity optimization problem of Eq. 4.12 can be expressed in terms of the

variables (ε∞, ωp, ω0, γ). In one particular formulation, the ELC capacity maximization problem can

Figure 4.12: Relative permittivity of ELC resonator for γ = 1



Figure 4.13: Relative permittivity of ELC resonator for γ = 0.05

be expressed in terms of equality and box constraints as follows:

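$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H\left(\epsilon^{(1)}, \ldots, \epsilon^{(N_f)}, \mu^{(1)}, \ldots, \mu^{(N_f)}\right) G\left(\epsilon^{(1)}, \ldots, \epsilon^{(N_f)}, \mu^{(1)}, \ldots, \mu^{(N_f)}\right)\right) \\
\text{subject to} \quad & f_{dl}\left(\epsilon_{\infty,i}, \omega_{p,i}, \omega_{0,i}, \gamma_i, \omega^{(j)}\right) = \epsilon_i^{(j)}, \quad i = 1, \ldots, N_r, \quad j = 1, \ldots, N_f \\
& f_{dl}\left(\epsilon_{\infty,i}, \omega_{p,i}, \omega_{0,i}, \gamma_i, \omega^{(j)}\right) = \mu_i^{(j)}, \quad i = 1, \ldots, N_r, \quad j = 1, \ldots, N_f \\
& l_{\epsilon_\infty} \le \epsilon_{\infty,i} \le u_{\epsilon_\infty}, \quad i = 1, \ldots, N_r \\
& l_{\omega_p} \le \omega_{p,i} \le u_{\omega_p}, \quad i = 1, \ldots, N_r \\
& l_{\omega_0} \le \omega_{0,i} \le u_{\omega_0}, \quad i = 1, \ldots, N_r \\
& l_{\gamma} \le \gamma_i \le u_{\gamma}, \quad i = 1, \ldots, N_r
\end{aligned} \tag{4.30}$$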

where Nr is the number of scatterers being optimized and Nf is the number of frequencies. Instead

of solving the problem in this form, which would require a significant number of calls to an

electromagnetic forward solver, we can explicitly enforce the equality constraints within the objective

function. In this case, the problem takes the much simpler form:


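$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H(x)G(x)\right) \\
\text{subject to} \quad & l \le x \le u
\end{aligned} \tag{4.31}$$

where $G(x) = G\left(\epsilon^{(1)}(x), \ldots, \epsilon^{(N_f)}(x), \mu^{(1)}(x), \ldots, \mu^{(N_f)}(x)\right)$ and the design parameters $(\epsilon_\infty, \omega_p, \omega_0, \gamma)$

of all of the scatterers have been compressed into the vector $x$. In this form, the box-constrained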

conjugate gradient algorithm discussed in Section 4.4 can be applied with some modifications. The



gradient of the log-det objective function can be written in terms of the partial derivatives:

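$$\frac{\partial}{\partial x_m}\log\det\left(G^H(x)G(x)\right) = \mathrm{Re}\left\{\sum_{j=1}^{N_f}\sum_{i=1}^{N_r}\left[\frac{\partial}{\partial \epsilon_i^{(j)}}\log\det\left(G^H(x)G(x)\right)\frac{\partial \epsilon_i^{(j)}}{\partial x_m} + \frac{\partial}{\partial \mu_i^{(j)}}\log\det\left(G^H(x)G(x)\right)\frac{\partial \mu_i^{(j)}}{\partial x_m}\right]\right\} \tag{4.32}$$

where the partial derivatives $\frac{\partial}{\partial \epsilon_i^{(j)}}\log\det\left(G^H(x)G(x)\right)$ and $\frac{\partial}{\partial \mu_i^{(j)}}\log\det\left(G^H(x)G(x)\right)$ are those derived in Section 4.4. The partial derivatives $\frac{\partial \epsilon_i^{(j)}}{\partial x_m}$ and $\frac{\partial \mu_i^{(j)}}{\partial x_m}$ are non-zero only when the parameter $x_m$ is one of the parameters $(\epsilon_\infty, \omega_p, \omega_0, \gamma)$ for the $i$-th scatterer. These values are found by differentiating Eq. 4.28: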

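$$\frac{\partial}{\partial \epsilon_\infty} f_{dl}(\epsilon_\infty, \omega_p, \omega_0, \gamma, \omega) = 1 \tag{4.33}$$

$$\frac{\partial}{\partial \omega_p} f_{dl}(\epsilon_\infty, \omega_p, \omega_0, \gamma, \omega) = \frac{2\omega_p}{\omega_0^2 - \omega^2 - j\gamma\omega} \tag{4.34}$$

$$\frac{\partial}{\partial \omega_0} f_{dl}(\epsilon_\infty, \omega_p, \omega_0, \gamma, \omega) = \frac{-2\omega_0\omega_p^2}{\left(\omega_0^2 - \omega^2 - j\gamma\omega\right)^2} \tag{4.35}$$

$$\frac{\partial}{\partial \gamma} f_{dl}(\epsilon_\infty, \omega_p, \omega_0, \gamma, \omega) = \frac{j\omega\omega_p^2}{\left(\omega_0^2 - \omega^2 - j\gamma\omega\right)^2} \tag{4.36}$$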
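The four partial derivatives of the Drude-Lorentz model can be validated against central finite differences. The parameter values in the sketch below are arbitrary assumptions chosen away from resonance so that the finite-difference estimate is well behaved.

```python
import numpy as np

def f_dl(p, omega):
    """Drude-Lorentz permittivity of Eq. 4.28 with p = (eps_inf, omega_p, omega_0, gamma)."""
    eps_inf, omega_p, omega_0, gamma = p
    return eps_inf + omega_p ** 2 / (omega_0 ** 2 - omega ** 2 - 1j * gamma * omega)

def grad_f_dl(p, omega):
    """Partial derivatives of Eqs. 4.33 to 4.36, in the same parameter order as p."""
    eps_inf, omega_p, omega_0, gamma = p
    D = omega_0 ** 2 - omega ** 2 - 1j * gamma * omega
    return np.array([
        1.0,                                   # Eq. 4.33
        2.0 * omega_p / D,                     # Eq. 4.34
        -2.0 * omega_0 * omega_p ** 2 / D ** 2,  # Eq. 4.35
        1j * omega * omega_p ** 2 / D ** 2,      # Eq. 4.36
    ])

p0 = np.array([1.0, 1.2, 1.0, 0.3])   # hypothetical (eps_inf, omega_p, omega_0, gamma)
omega, h = 0.8, 1e-6
fd = np.array([(f_dl(p0 + h * e, omega) - f_dl(p0 - h * e, omega)) / (2 * h)
               for e in np.eye(4)])
analytic = grad_f_dl(p0, omega)
```

Since the parameters are real while the model is complex-valued, each derivative is a complex number; the real part taken in the chain rule of the log-det gradient is what returns a real gradient.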

Although the box-constrained method can be applied to the ELC capacity maximization

problem, it is not the “optimal” method. The reason for this is that, as previously stated, ELC

metamaterials achieve frequency diversity by acting like conductors near the resonant frequency and

by acting like dielectrics far away from the resonant frequency. For a stepped-frequency system,

which operates on a discrete set of frequencies, one must ensure that the operating frequencies overlap

the resonant frequencies of the ELC metamaterials. If the resonant frequency of an ELC metamaterial

does not coincide with one of the operating frequencies of the sensing system, then the metamaterial

may act like a simple dielectric, thereby losing its frequency diversity. The box-constrained conjugate

gradient method cannot prevent this phenomenon.

Alternatively, let us consider a combinatorial approach for the ELC capacity maximization

problem. Suppose that the ELC parameter ε∞ is constant and matched to the background material,

and that the parameters (ωp, γ) are computed directly from the resonant frequency ω0 according to

the following relationships:

ωp = αω0 (4.37)

γ = βω0 (4.38)



where α and β are positive constants that determine the width and amplitude of the lobe centered

about the resonant frequency. By constraining the resonant frequencies to coincide with one of

the operating frequencies, the antenna design method can be expressed as the following integer

programming problem:

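$$\begin{aligned}
\text{maximize} \quad & \log\det\left(G^H(\omega_{0,1}, \ldots, \omega_{0,N_r})\,G(\omega_{0,1}, \ldots, \omega_{0,N_r})\right) \\
\text{subject to} \quad & \omega_{0,i} \in \{0, \omega_1, \ldots, \omega_{N_f}\}, \quad i = 1, \ldots, N_r
\end{aligned} \tag{4.39}$$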

Note that the value ω0 = 0 was added to the constraint set in order to allow the metamaterials to

act like dielectrics at all frequencies. Solving this problem exactly by exhaustive search is intractable, as one must loop through

all $(N_f + 1)^{N_r}$ possible combinations of materials. Instead of looping through all combinations,

let us consider a sub-optimal greedy approach. In this method, a single resonant frequency $\omega_{0,i}$ is

optimized over the set $\{0, \omega_1, \ldots, \omega_{N_f}\}$ while all other $\omega_{0,j}$ for $j \neq i$ are held fixed. Once the value

for ω0,i is determined, the optimization procedure moves on to ω0,i+1. This process continues until

the set of ω0,i converge, or until some number of iterations through the list of Nr materials has been

met.
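The greedy procedure described above amounts to a coordinate-wise search over a discrete candidate set. The following sketch illustrates the control flow only: `toy_objective` is a hypothetical separable function standing in for the log-det capacity, which in practice would require a forward-model solve for every trial assignment.

```python
def greedy_resonance_search(objective, candidates, n_scatterers, max_cycles=5):
    """Optimize one resonant frequency at a time while holding the others fixed."""
    assignment = [candidates[0]] * n_scatterers     # baseline: every scatterer at candidates[0]
    best = objective(assignment)
    for _ in range(max_cycles):
        changed = False
        for i in range(n_scatterers):
            for c in candidates:
                trial = list(assignment)
                trial[i] = c
                value = objective(trial)
                if value > best:                    # keep only strict improvements
                    assignment, best, changed = trial, value, True
        if not changed:                             # converged: a full cycle made no change
            break
    return assignment, best

candidates = [0.0, 3.1, 3.2, 3.3, 3.4, 3.5]         # 0 plus the operating frequencies in GHz
targets = [3.2, 0.0, 3.5, 3.1]                      # hypothetical optimum of the toy objective
toy_objective = lambda a: -sum((ai - ti) ** 2 for ai, ti in zip(a, targets))
assignment, best = greedy_resonance_search(toy_objective, candidates, len(targets))
```

Each cycle costs one objective evaluation per scatterer and candidate, so the work grows linearly rather than exponentially with the number of scatterers. For a non-separable objective such as the capacity, the result is only a coordinate-wise local optimum.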

In order to test the greedy ELC optimization procedure, let us return to the reflection

mode optimization problem discussed in Section 4.5. Consider the reflection mode configuration of

Figure 4.1, where the scatterers added to the PEC reflector are ELC metamaterials with parameters

$\epsilon_\infty = 1$, $\alpha = 0.1$, and $\beta = 0.001$. The objective is to select the resonant frequency $\omega_0 \in \{0, 3.1, 3.2, 3.3, 3.4, 3.5\}$ [GHz] such that the capacity of the antenna is maximized. Starting from

the baseline configuration, i.e. ω0 = 0 for all scatterers, the greedy optimization method was executed

for a single cycle through the 40 scatterer elements. Figure 4.14 displays the log2 of the singular

values of the baseline and optimized sensing matrices. The optimization procedure clearly improves

the capacity of the sensing matrix, as the condition number decreased from 36000 in the original

design to nearly 58 in the optimized design. The optimized design also leads to an improvement in

CS reconstruction capability, as can be seen in Figure 4.15.
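The two quantities reported here, the log2 of the singular values and the condition number, can be computed for any sensing matrix as sketched below. The diagonal matrix is a made-up example and does not correspond to the matrices used in this thesis.

```python
import numpy as np

def singular_value_summary(A):
    """Return log2 of the singular values of A and its condition number."""
    s = np.linalg.svd(A, compute_uv=False)   # sorted in descending order
    return np.log2(s), s[0] / s[-1]

A = np.diag([8.0, 4.0, 2.0, 1.0])            # illustrative matrix only
log2_s, cond = singular_value_summary(A)
print(log2_s, cond)
```

A flatter singular value spectrum gives a smaller condition number, which is the behavior the optimized design exhibits.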

Figure 4.14: log2 of the singular values of the sensing matrices obtained using the optimized reflection

mode antenna (blue) and original reflection mode antenna (red) in a multi-static configuration.


ELC reflection mode design (blue) and baseline transmission mode design (red).

Chapter 5

Conclusions

In this thesis, we have introduced two new contributions to the field of Compressed Sensing

(CS) in electromagnetic imaging applications. First, we introduced the concept of Physicality

Constrained Compressed Sensing (PCCS), which augments the standard ℓ1-norm optimization

programs of CS in order to ensure that the solution vector obeys the laws of physics. Our theoretical

and numerical analyses demonstrated that the reconstruction capabilities of standard CS are enhanced

when the PCCS techniques are used instead. PCCS was also investigated in the context of a hybrid

Digital Breast Tomosynthesis (DBT) / Nearfield Radar Imaging (NRI) system for breast cancer

detection. The numerical results of PCCS applied to the hybrid DBT / NRI system using synthetic

data indicate that it may be possible to successfully deploy the hybrid system in a clinical setting.

We also introduced three efficient methods for solving the PCCS problems, which are summarized in

Table 5.1 below. Each of the PCCS algorithms has its own pros and cons that make it suitable

for solving different problems.

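| Problem       | Nesterov (+) | ADMM (*) | AGAL (+) |
|---------------|--------------|----------|----------|
| P1 (Eq. 3.14) | N            | Y        | Y        |
| P2 (Eq. 3.15) | Y            | Y        | Y        |
| P3 (Eq. 3.16) | N            | Y        | Y        |

(+) requires methods for computing Ax and A^H z
(*) requires the sensing matrix A to be known exactly

Table 5.1: Summary of the PCCS algorithms.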

In the second contribution, we introduced a novel numerical optimization method for

designing so-called “compressive antennas” with enhanced CS recovery capabilities. This design

method operates by adding scatterers to a baseline antenna configuration such that the capacity

of the system is enhanced. Our theoretical analysis demonstrated that by enhancing the capacity

of the antenna system, and by extension the capacity of the sensing matrix, one can improve the

lower bound on CS reconstruction performance, as measured by the Restricted Isometry Property

(RIP). We presented several numerical examples, using both dielectric scatterers and Electric-LC

(ELC) metamaterial scatterers, to demonstrate how the new design method can enhance the CS

reconstruction capabilities of the antenna. We also briefly discussed the application of the antenna

design method to Multiple Input Multiple Output (MIMO) communication systems.

The work presented in this thesis can serve as the foundation for future research. In the
remainder of this chapter, we discuss some of the extensions that stem from it.
First, the imaging results presented in Chapter 3 suggest that it may be possible to employ

PCCS techniques in the hybrid DBT / NRI system. Future research on this topic should include

an extension of the numerical analysis using a 3D model for the breast, and an assessment of the

imaging capabilities in practice using a prototype system. Research into both of these topics is

currently underway within our research group. In our opinion, the theoretical considerations for PCCS

presented in Chapter 3 are far from satisfying. Indeed, while the ℓ1-norm heuristic arguments and the
ℓ0-norm analysis provide intuition for why PCCS outperforms standard CS (when the physicality
constraints are applicable, of course), it remains an open question whether a more rigorous theoretical argument

akin to the RIP exists. This is an interesting topic that we believe should be investigated further.

To conclude the discussion of PCCS, it is worth mentioning that the algorithms discussed in

Chapter 3 can be applied to several related problems in Compressed Sensing. Although our analysis

only considered the case where the vector x is sparse, it is straightforward to extend the algorithms
to the scenarios where the vector is sparse when expressed in a different basis, i.e. α = Wx, and
when it is piecewise smooth such that the total variation ‖x‖_TV is small. They can also be extended to
block CS applications [46], where the support of x is restricted to a small number of disjoint blocks.

Indeed, physicality constrained block CS may be appropriate for the hybrid DBT / NRI system, given

the contiguous nature of cancerous lesions.
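For reference, each of these extensions amounts to replacing the ℓ1-norm objective with a different sparsity-promoting function. The standard forms are sketched below under two assumptions that are not defined in this thesis: D denotes a finite-difference operator (anisotropic total variation) and B_1, ..., B_K denote disjoint index blocks.

```latex
\begin{aligned}
\text{transform sparsity:} \quad & f(x) = \|Wx\|_{\ell_1} \\
\text{total variation:}    \quad & f(x) = \|x\|_{TV} = \|Dx\|_{\ell_1} \\
\text{block sparsity:}     \quad & f(x) = \sum_{k=1}^{K} \|x_{B_k}\|_{\ell_2}
\end{aligned}
```

In each case the physicality constraint x ∈ Q_p is left unchanged; only the objective differs.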

It is also worth mentioning that the physicality constrained algorithms are by no means

restricted to either CS or Electromagnetic Imaging applications. More generally, the physicality

constrained optimization program can be expressed in one of three equivalent forms:

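$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & f(x) \\
\text{subject to} \quad & \ldots \\
& x \in Q_p
\end{aligned}
\tag{5.1}
$$

$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \lambda f(x) + \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & x \in Q_p
\end{aligned}
\tag{5.2}
$$

$$
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \frac{1}{2}\|Ax - y\|_{\ell_2}^2 \\
\text{subject to} \quad & f(x) \leq \tau \\
& x \in Q_p
\end{aligned}
\tag{5.3}
$$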

where f(x) is the objective function and Qp is the physicality set. In order to apply the physicality

constrained algorithms to these problems, f(x) must be differentiable with a Lipschitz continuous gradient or have
an easy-to-compute proximal operator, and there must be an efficient method for projecting onto the

physicality set Qp. In many practical sensing applications, the physicality constraints are separable,

such that they can be enforced on each voxel independently. Electromagnetic imaging, which has

been discussed extensively in this thesis, is one such application. Some other applications include,

but are certainly not limited to, the following:

• In acoustic sensing applications, the density ρ and bulk modulus κ must be strictly positive, i.e. ρ ≻ 0, κ ≻ 0

• In x-ray CT, the x-ray attenuation coefficient α must be strictly positive, i.e. α ≻ 0

• In a gray-scale camera, each pixel in the image is constrained to lie within a certain range, i.e. 0 ⪯ x ⪯ x_max

• Temperature sensors, such as the passive GeoSTAR satellite system [47], should constrain the temperature to lie within a certain range, i.e. T_min ⪯ T ⪯ T_max
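Because constraints of this kind are separable, projecting onto the physicality set reduces to an element-wise clip. The sketch below assumes real-valued unknowns and box-shaped bounds; the function name and the example values are illustrative only. Strict inequalities are relaxed to their closed counterparts, since a projection onto an open set is not defined.

```python
import numpy as np

def project_box(x, lower=-np.inf, upper=np.inf):
    """Euclidean projection onto {x : lower <= x <= upper}, applied element-wise."""
    return np.clip(x, lower, upper)

x = np.array([-0.2, 0.4, 1.7])
x_nonneg = project_box(x, lower=0.0)              # non-negativity constraint
x_range = project_box(x, lower=0.0, upper=1.0)    # range constraint
print(x_nonneg, x_range)
```

The cost of this projection is linear in the number of voxels, which is what makes the physicality constraints cheap to enforce inside an iterative solver.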

The choice of objective function depends upon the prior information that is to be exploited. This

choice will also determine which of the three algorithms described in Chapter 3 can be used to solve

the problem. If the objective function is differentiable, or can be smoothed using a technique similar

to the one applied in Nesterov’s method, then accelerated gradient techniques such as Nesterov’s

method and FISTA can be used to solve the general PCCS problem of Eq. 5.2. If the objective

function has a closed-form or easy to compute proximal operator, then the ADMM and the AGAL

method can be used in order to solve Eq. 5.1, Eq. 5.2, and Eq. 5.3.
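As an illustration of how a proximal operator and a projection combine, the sketch below applies a plain, unaccelerated proximal gradient iteration to Eq. 5.2 with f(x) = ‖x‖ℓ1 and a box-shaped physicality set. It is not one of the three algorithms of Chapter 3. For separable box constraints, the proximal operator of λ‖x‖ℓ1 plus the box indicator is soft-thresholding followed by clipping. The problem size, the value of λ, and the random data are arbitrary assumptions.

```python
import numpy as np

def prox_l1_box(v, threshold, lower, upper):
    """Soft-threshold, then clip: prox of threshold*||x||_1 restricted to a box."""
    soft = np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)
    return np.clip(soft, lower, upper)

def proximal_gradient(A, y, lam, lower, upper, n_iter=500):
    """Minimize lam*||x||_1 + 0.5*||Ax - y||^2 subject to lower <= x <= upper."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)               # gradient of the data-fidelity term
        x = prox_l1_box(x - step * grad, lam * step, lower, upper)
    return x

def objective(A, y, lam, x):
    return lam * np.sum(np.abs(x)) + 0.5 * np.sum((A @ x - y) ** 2)

# Synthetic test problem (assumption, for illustration only).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [0.8, 0.5, 1.0]
y = A @ x_true

lam = 0.05
x_hat = proximal_gradient(A, y, lam, lower=0.0, upper=1.0)
print(objective(A, y, lam, np.zeros(50)), objective(A, y, lam, x_hat))
```

Accelerated variants such as FISTA add a momentum step to the same iteration; the proximal step itself is unchanged.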

Finally, there are many extensions to the compressive antenna design method that should

be investigated in the future. All of the methods described in Chapter 4 optimized the capacity

of the antenna system operating as a transmitter. As it turns out, enhancing this quantity also

happens to enhance the capacity of the sensing matrix obtained when the Born Approximation (BA)

is applied. A natural extension to this method is to maximize the capacity of the sensing matrix

directly. Although this does increase the computational complexity of the optimization problem,

especially for the first-order method, it is the more desirable solution. Developing this method will

also allow the technique to be applied to other CS applications where the sensing matrix is generated

deterministically. One might also consider a different objective function than the capacity for the

design method. Minimization of the mutual coherence [48] is one possible candidate for the objective

function, as it is already deeply rooted within CS theory. Although the mutual coherence provides

weaker reconstruction guarantees than the RIP, it is significantly easier to compute. One drawback to

the coherence, however, is that it is not differentiable. Nevertheless, there is some existing work in the

literature (see [49] and the references therein), which suggests that it may be possible to develop such

a technique. We have begun to research this technique as it applies to general sensing applications,

and plan to develop it further for the compressive antenna design problem in the future.
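To make the computational simplicity concrete, the sketch below computes the mutual coherence in its standard form, i.e. the largest absolute normalized inner product between distinct columns of the sensing matrix. The function name and the example matrices are illustrative, and all columns are assumed to be nonzero.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute normalized inner product between distinct columns of A."""
    norms = np.linalg.norm(A, axis=0)
    A_unit = A / norms                          # normalize each column
    gram = np.abs(A_unit.conj().T @ A_unit)     # pairwise |<a_i, a_j>|
    np.fill_diagonal(gram, 0.0)                 # ignore the i == j terms
    return gram.max()

mu_identity = mutual_coherence(np.eye(3))
mu_skewed = mutual_coherence(np.array([[1.0, 1.0], [0.0, 1.0]]))
print(mu_identity, mu_skewed)
```

The final maximum over the off-diagonal entries is what makes the coherence non-differentiable, which is the drawback noted above.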

Bibliography

[1] P. Van Den Berg and A. Abubakar, “Contrast source inversion method: state of art,” Journal of

Electromagnetic Waves and Applications, vol. 15, no. 11, pp. 1503–1505, 2001.

[2] A. Abubakar, G. Pan, M. Li, L. Zhang, T. Habashy, and P. van den Berg, “Three-dimensional

seismic full-waveform inversion using the finite-difference contrast source inversion method,”

Geophysical Prospecting, vol. 59, no. 5, pp. 874–888, 2011.

[3] A. Zakaria, I. Jeffrey, and J. LoVetri, “Full-vectorial parallel finite-element contrast source

inversion method,” Progress In Electromagnetics Research, vol. 142, pp. 463–483, 2013.

[4] M. D. Migliore and D. Pinchera, “Compressed sensing in electromagnetics: Theory, applications

and perspectives,” in Antennas and Propagation (EUCAP), Proceedings of the 5th European

Conference on. IEEE, 2011, pp. 1969–1973.

[5] Y. Rodriguez-Vaqueiro, Y. Alvarez-Lopez, B. Gonzalez-Valdes, J. Martinez-Lorenzo, F. Las-Heras,
and C. Rappaport, “On the use of compressed sensing techniques for improving multi-static
millimeter-wave portal-based personnel screening,” 2013.

[6] A. Massa, P. Rocca, and G. Oliveri, “Compressive sensing in electromagnetics-a review,”

Antennas and Propagation Magazine, IEEE, vol. 57, no. 1, pp. 224–238, 2015.

[7] C. Rappaport, A. Morgenthaler, and M. Kilmer, “FDFD modeling of plane wave interactions

with buried objects under rough surfaces,” in 2001 IEEE Antenna and Propagation Society

International Symposium, 2001, p. 318.

[8] U.S. Department of Health and Human Services, Centers for Disease Control and Prevention,
and National Cancer Institute, “U.S. Cancer Statistics Working Group. U.S. cancer statistics:
1999-2009 incidence and mortality web-based report,” Tech. Rep., 2013.

[9] D. Kopans, S. Gavenonis, E. Halpern, and R. Moore, “Calcifications in the breast and digital

breast tomosynthesis,” The breast journal, vol. 17, no. 6, pp. 638–644, 2011.

[10] M. Lazebnik, D. Popovic, L. McCartney, C. B. Watkins, M. J. Lindstrom, J. Harter, S. Sewall,

T. Ogilvie, A. Magliocco, T. M. Breslin et al., “A large-scale study of the ultrawideband

microwave dielectric properties of normal, benign and malignant breast tissues obtained from

cancer surgeries,” Physics in Medicine and Biology, vol. 52, no. 20, p. 6093, 2007.

[11] J. Martinez Lorenzo, R. Obermeier, F. Quivira, C. Rappaport, R. Moore, and D. Kopans, “Fusing

digital-breast-tomosynthesis and nearfield-radar-imaging information for a breast cancer detection
algorithm,” in Antennas and Propagation Society International Symposium (APSURSI),

2013 IEEE, July 2013, pp. 2038–2039.

[12] R. Obermeier, M. Tivnan, C. Rappaport, and J. A. Martinez-Lorenzo, “3D microwave nearfield
radar imaging (NRI) using digital breast tomosynthesis (DBT) for non-invasive breast cancer

detection,” in URSI 2014 — USNC-URSI Radio Science Meeting, July 2014.

[13] R. Obermeier, J. Heredia Juesas, and J. A. Martinez-Lorenzo, “Imaging breast cancer in a
hybrid DBT / NRI system using compressive sensing,” in Antennas and Propagation Society

International Symposium (APSURSI), 2015 IEEE, 2015.

[14] J. Martinez Lorenzo, A. Basukoski, F. Quivira, C. Rappaport, R. Moore, and D. Kopans,

“Composite models for microwave dielectric constant characterization of breast tissues,” in

Antennas and Propagation Society International Symposium (APSURSI), 2013 IEEE, July 2013,

pp. 2036–2037.

[15] Y. Saad and M. H. Schultz, “GMRES: A generalized minimal residual algorithm for solving

nonsymmetric linear systems,” SIAM Journal on scientific and statistical computing, vol. 7,

no. 3, pp. 856–869, 1986.

[16] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge university press, 2009.

[17] E. J. Candes and T. Tao, “Decoding by linear programming,” Information Theory, IEEE

Transactions on, vol. 51, no. 12, pp. 4203–4215, 2005.

[18] E. J. Candes, J. K. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate

measurements,” Communications on pure and applied mathematics, vol. 59, no. 8, pp. 1207–1223, 2006.

[19] E. J. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction
from highly incomplete frequency information,” Information Theory, IEEE Transactions

on, vol. 52, no. 2, pp. 489–509, 2006.

[20] D. L. Donoho, “Compressed sensing,” Information Theory, IEEE Transactions on, vol. 52,

no. 4, pp. 1289–1306, 2006.

[21] E. Candes and T. Tao, “Near-optimal signal recovery from random projections and universal
encoding strategies,” submitted to IEEE Trans. Inform. Theory, Nov. 2004.

[22] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version

2.1,” http://cvxr.com/cvx, Mar. 2014.

[23] M. Grant and S. Boyd, “Graph implementations for nonsmooth convex programs,” in Recent Advances in
Learning and Control, ser. Lecture Notes in Control and Information Sciences, V. Blondel,
S. Boyd, and H. Kimura, Eds. Springer-Verlag Limited, 2008, pp. 95–110,
http://stanford.edu/~boyd/graph_dcp.html.

[24] Y. Nesterov, “Smooth minimization of non-smooth functions,” Mathematical programming,

vol. 103, no. 1, pp. 127–152, 2005.

[25] S. Becker, J. Bobin, and E. J. Candes, “NESTA: a fast and accurate first-order method for sparse

recovery,” SIAM Journal on Imaging Sciences, vol. 4, no. 1, pp. 1–39, 2011.

[26] Y. Nesterov, “A method for unconstrained convex minimization problem with the rate of
convergence O(1/k^2),” in Doklady AN SSSR, vol. 269, no. 3, 1983, pp. 543–547.

[27] L. Vandenberghe, “Lecture notes from UCLA EE236C,” 2014.

[28] S. R. Becker, “Practical compressed sensing: modern data acquisition and signal processing,”

Ph.D. dissertation, California Institute of Technology, 2011.

[29] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. Springer, 2006.

[30] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical

learning via the alternating direction method of multipliers,” Foundations and Trends® in

Machine Learning, vol. 3, no. 1, pp. 1–122, July 2011.

[31] N. Parikh and S. Boyd, “Proximal algorithms,” Foundations and Trends in optimization, vol. 1,

no. 3, pp. 123–231, 2013.

[32] H. Kuhn and A. Tucker, “Nonlinear programming,” in Proceedings of the 2nd Berkeley
Symposium on Mathematical Statistics and Probability.

[33] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse

problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.

[34] P. Tseng, “On accelerated proximal gradient methods for convex-concave optimization,”
submitted to SIAM J. Optim., 2008.

[35] D. L. Donoho, “For most large underdetermined systems of linear equations the minimal
ℓ1-norm solution is also the sparsest solution,” Communications on pure and applied mathematics,

vol. 59, no. 6, pp. 797–829, 2006.

[36] J. A. Martinez-Lorenzo, J. Heredia Juesas, and W. Blackwell, “A single-transceiver compressive

reflector antenna for high-sensing-capacity imaging,” IEEE Antennas and Wireless Propagation

Letters, 2015.

[37] J. Heredia-Juesas, G. Allan, A. Molaei, L. Tirado, W. Blackwell, and J. A. Martinez-Lorenzo,
“Consensus-based imaging using ADMM for a compressive reflector antenna,” in IEEE AP-S
International Symposium, 2015.

[38] A. Busboom, H. D. Schotten, and H. Elders-Boll, “Coded aperture imaging with multiple

measurements,” JOSA A, vol. 14, no. 5, pp. 1058–1065, 1997.

[39] G. D. De Villiers, N. T. Gordon, D. A. Payne, I. K. Proudler, I. D. Skidmore, K. D. Ridley, C. R.

Bennett, R. A. Wilson, and C. W. Slinger, “Sub-pixel super-resolution by decoding frames

from a reconfigurable coded-aperture camera: theory and experimental verification,” in SPIE

Optical Engineering+ Applications. International Society for Optics and Photonics, 2009, p. 746806.

[40] R. F. Marcia and R. M. Willett, “Compressive coded aperture superresolution image reconstruction,”
in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International

Conference on. IEEE, 2008, pp. 833–836.

[41] M. D. Migliore, “On electromagnetics and information theory,” Antennas and Propagation,

IEEE Transactions on, vol. 56, no. 10, pp. 3188–3200, 2008.

[42] J. G. Proakis and M. Salehi, Digital Communications, 5th ed. McGraw-Hill, 2008.

[43] B. Arritt, D. Smith, and T. Khraishi, “Equivalent circuit analysis of metamaterial strain-

dependent effective medium parameters,” Journal of Applied Physics, vol. 109, no. 7, p.

073512, 2011.

[44] C. M. Watts, X. Liu, and W. J. Padilla, “Metamaterial electromagnetic wave absorbers,”
Advanced Materials, vol. 24, no. 23, 2012.

[45] H. Lee, J. Park, and H. Lee, “Design of double negative metamaterial absorber cells using

electromagnetic-field coupled resonators,” in Asia-Pacific Microwave Conference 2011. IEEE,

2011, pp. 1062–1065.

[46] Y. C. Eldar, P. Kuppinger, and H. Bolcskei, “Block-sparse signals: Uncertainty relations and

efficient recovery,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3042–3054,

2010.

[47] A. Molaei, G. Allan, J. Heredia, W. Blackwell, and J. Martinez-Lorenzo, “Interferometric
sounding using a compressive reflector antenna,” in 2016 10th European Conference on Antennas

and Propagation (EuCAP). IEEE, 2016, pp. 1–4.

[48] D. L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition,”
Information Theory, IEEE Transactions on, vol. 47, no. 7, pp. 2845–2862, 2001.

[49] Z. Lin, C. Lu, and H. Li, “Optimized projections for compressed sensing via direct mutual

coherence minimization,” arXiv preprint arXiv:1508.03117, 2015.
