
arXiv:1910.05613v1 [eess.IV] 12 Oct 2019

Recent Advances in Imaging Around Corners

Tomohiro Maeda, Guy Satat, Tristan Swedish, Lagnojita Sinha, Ramesh Raskar
MIT Media Lab

Figure 1: Applications of non-line-of-sight (NLOS) imaging. (Material from [1], adapted with permission from the authors.) (a) Detection of approaching vehicles around corners results in safer autonomous navigation. (b) Localizing survivors in hidden areas makes search-and-rescue operations more efficient and safer. (c) Imaging inaccessible regions in endoscopy enables better diagnostics.

Abstract

Seeing around corners, also known as non-line-of-sight (NLOS) imaging, is a computational method to resolve or recover objects hidden around corners. Recent advances in imaging around corners have attracted significant interest. This paper reviews different types of existing NLOS imaging techniques and discusses the challenges that need to be addressed, especially for their applications outside of a constrained laboratory environment. Our goal is to introduce this topic to broader research communities, as well as to provide insights that would lead to further developments in this research area.

1. Introduction

Recent advances in computational imaging techniques have made it possible to image around corners, a capability known as non-line-of-sight (NLOS) imaging. The ability to see around corners would be beneficial in various applications. For example, the detection of objects around a corner enables autonomous vehicles to avoid collisions. Detection and localization of people without the need to enter dangerous environments make rescue operations safer and more efficient. NLOS imaging can also be used for medical applications such as endoscopy, where the region of interest is hard for sensors to access directly (Fig. 1).

Fig. 2 shows typical scene setups for NLOS imaging techniques. While there are many techniques that use the electromagnetic (EM) spectrum or acoustic waves to image around corners and through walls [2–4], we focus on works that use light (electromagnetic waves in the visible and infrared spectrum). Fig. 3 shows an overview of the current state of NLOS imaging.

NLOS imaging was first proposed by Raskar and Davis [5], and demonstrated by Kirmani et al. [6] for the recovery of hidden planar patches. Velten et al. [7] showed the first full reconstruction of a 3D object hidden around a corner. These works showed that time-of-flight (ToF) measurements of photons returning from the hidden scene after three bounces (Fig. 2 (a)) contain sufficient information to recover the occluded scene. Following these works, different sensing modalities were used for NLOS imaging. For example, Katz et al. [8] first demonstrated the use of a speckle pattern to reconstruct 2D images.



Figure 2: Layouts of typical NLOS imaging setups with different sensing principles. (a) ToF-based techniques use three-bounce photons that give distance constraints. (b) Coherence-based techniques use speckle patterns or spatial coherence that are preserved during scattering. (c) Intensity-based methods mainly exploit occlusions that cast shadows.

Figure 3: Landscape of NLOS imaging works by detector, illumination source, and scene-space coding.

Tancik et al. [9, 10] and Bouman et al. [11] demonstrated reconstruction and tracking of a hidden object with a traditional RGB camera.

The main challenge in NLOS imaging is that the photons reflected from the hidden object are scattered at the line-of-sight surfaces (e.g., the wall and floor). Computational imaging techniques model light transport to recover information about the hidden scene. However, their applications in practice are still limited. This paper reviews the recent advances in NLOS imaging and discusses the challenges towards real-world applications. We note that other types of imaging tasks, such as seeing through scattering media, are often also referred to as NLOS imaging, but in this paper we use the term "NLOS imaging" to refer solely to seeing around corners.


| Task | ToF | Coherence | Intensity |
|---|---|---|---|
| 3D Reconstruction | µm–cm resolution [7, 12–30] | None | Planar/specular objects [31] |
| 2D Reconstruction | Included in 3D | Diffraction-limited resolution [8, 32]; cm resolution [33, 34] | Occlusion dependent [9, 10, 36–40]; coarse resolution [31]; thermal [35] |
| Localization/Tracking | cm precision [41–46] | 3D tracking [47]; 1D distance recovery [33] | Occlusion dependent [10, 11, 48]; 6D tracking [49]; 3D localization [35, 50] |
| Classification | Human identification [45] | MNIST, human pose classification [51] | Human pose detection [35]; object type classification [10, 50] |

Table 1: Overview of existing techniques to see around corners for different tasks. The color of the table cell indicates the quality of the methods relative to other NLOS imaging techniques. Green and yellow cells indicate that NLOS imaging tasks have been shown at practical and limited recovery complexities, respectively. Red cells indicate that the tasks have not been demonstrated.

1.1. What is NLOS Imaging?

Fig. 2 illustrates different strategies to see around corners. Mathematically, NLOS imaging can be formulated with the following forward model:

y = f(x) + ε, (1)

where x represents the hidden scene parameters, such as the albedo, motion, or class of the hidden object; y is the measurement; f(·) is the mapping from the hidden scene to the measurement, which depends on the choice of illumination, sensor, and the geometry of the corner; and ε denotes the measurement noise.

The goal of NLOS imaging is to design sensing schemes such that f(·) can be inverted, and to build algorithms such that x is robustly and efficiently recovered given y. Recently, data-driven approaches showed that the inversion of Eq. 1 can be learned even when f(·) is not directly available [10].

Full Scene Reconstruction: The objective of NLOS imaging is to see around a corner as if there were a direct line of sight into the hidden scene. Reconstruction tasks include the recovery of the 3D, 2D, or 1D shape of the hidden object, or of light fields. In reconstruction, x is a direct representation of the hidden scene, such as a voxelized probability map, a surface, or the light transport in the hidden scene.

Direct Scene Inference: The reconstruction procedure often requires long acquisition and time-consuming computation. However, full scene reconstruction might not even be necessary in many scenarios; for example, detecting the presence of a moving person is sufficient for rescue operations. We define "inference" as the estimation of latent parameters (e.g., location, class, or motion) without direct reconstruction of the hidden scene. In this case, the unknown parameter x can be the location, type, or movement of the hidden object. Because inference does not require full reconstruction, inference around corners can potentially be performed with less computation and fewer measurements than full scene reconstruction.

1.2. Overview of Sensing Schemes for NLOS Imaging

We classify the existing techniques into three categories based on their sensing modalities: time of flight, coherence, and intensity. Table 1 provides an overview of the current state of each method regarding NLOS imaging tasks. Sections 2–4 review the NLOS imaging techniques that exploit the different sensing modalities, and Section 5 presents the challenges of NLOS imaging for each category.

Time-of-Flight-Based Approach (Section 2): As the photons travel from the source, the first bounce occurs on an adjoining wall or directly on the floor near the corner. The second bounce is on the hidden object around the corner. The third bounce is again on a wall or floor in the line-of-sight region, directing photons back towards the measurement system, as shown in Fig. 2 (a). Time-resolved imaging techniques utilize the photons associated with the third bounce, which constrain the possible locations of the hidden object. Section 2 reviews the reconstruction and inference algorithms for methods based on time-resolved sensing.

Coherence-Based Approach (Section 3): While the information about the hidden geometry is largely lost through the diffuse reflection of photons at the relay wall, some coherence properties of light are preserved. Coherence-based methods exploit the fact that this preserved coherence carries information about the occluded scene. Section 3 reviews NLOS imaging techniques that utilize speckle and the spatial coherence of light.

Intensity-Based Approach (Section 4): Occlusions from the corner or occlusions within the hidden scene can provide rich information about the hidden scene. Using occlusions, it is possible to recover the hidden scene with a typical intensity (RGB) camera, such as a smartphone camera (Fig. 2 (c)). While many intensity-based approaches rely on occlusions, some works demonstrate NLOS imaging by exploiting the surface reflectance of the relay wall or the object. Section 4 reviews the intensity-based approach.

1.3. Cameras to Image Around Corners

Different hardware has been used to see around corners. We briefly introduce such hardware to describe which sensors and illumination sources are used for NLOS imaging.

Streak Camera + Pulsed Laser: A streak camera acquires temporal information by mapping the arrival time of photons onto 1D space with sweep electrodes, resulting in a 2D measurement of space and time. Streak cameras offer the best time resolution among ToF cameras, down to sub-picosecond or even 100 femtoseconds (sub-mm, down to 0.03 mm, distance resolution). The disadvantages of streak cameras are the high cost (typically more than $100k), low photon efficiency, high noise level, and the need for line scanning or additional optics [52]. This results in a potentially longer acquisition time for a sufficient signal-to-noise ratio (SNR). The first works on NLOS imaging used a Hamamatsu C5680 for the experiments, with a Kerr-lens mode-locked Ti:sapphire laser (50 fs pulse width).

SPAD + Pulsed Laser: A single-photon avalanche diode (SPAD) can detect the arrival of a single photon with a time jitter of 20–100 ps (6–30 mm distance resolution). Unlike streak cameras, it is possible to have 2D SPAD arrays that capture 3D measurements without the need for line scanning. While the temporal resolution of a SPAD is poorer than that of streak cameras, the photon efficiency and SNR are better. Commonly used single-pixel SPAD detectors and time-correlated single-photon counting devices are those from Micro Photon Devices and the HydraHarp from PicoQuant; these cost around $5k and $20–35k, respectively.

AMCW ToF Camera + Modulated Light Source: An amplitude-modulated continuous-wave (AMCW) ToF camera compares the phase shift of emitted and received modulated light through three or more correlation measurements [53, 54]. The distance resolution of AMCW ToF cameras is limited by the modulation frequency and the SNR. While AMCW ToF cameras can achieve few-mm resolution [55], they are susceptible to strong ambient light due to the need for a longer exposure time than pulse-based ToF cameras. AMCW ToF cameras have a much lower cost (ranging from $400 to $2000) than SPADs and streak cameras, and are used in commercial products, including the Microsoft Kinect and the Photonic Mixer Device (PMD).

Traditional Camera: In this paper, we use "traditional camera" to refer to sensors such as charge-coupled device (CCD) and complementary metal-oxide-semiconductor (CMOS) arrays, which measure irradiance images without ToF information. Traditional cameras can be used for both intensity and speckle measurement modalities. The combination of a diffuser and a traditional camera enables the measurement of the cross-correlation of temporal coherence to provide passive ToF measurement [46]. Traditional cameras are the most ubiquitous and affordable sensors discussed in this paper but do not acquire direct measurements of time-of-flight information.

Interferometer: Interference between multiple light waves provides depth information at the µm scale, which can be used for ToF-based NLOS imaging of microscopic scenes. For example, an optical coherence tomography system with a temporally and spatially incoherent LED achieved 10 µm resolution to demonstrate NLOS reconstruction of an object as small as a coin [25]. Heterodyne interferometry, which utilizes the interference of light with different wavelengths, demonstrated NLOS imaging at 70 µm precision [56]. A dual-phase Sagnac interferometer captures complex-valued spatial coherence functions as a change of measured intensity and is used for spatial coherence-based methods. Such an interferometer is not commercially available, and we refer readers to [57] for the design and details of a dual-phase Sagnac interferometer. An interferometer is a sensitive instrument and often requires careful alignment and isolation from mechanical vibrations.

2. Time-of-Flight-Based NLOS Imaging

Among the many works on NLOS imaging, ToF-based techniques are the most popular due to their ability to resolve the path length of the three-bounce photons that carry the information of the hidden scene. The ToF measurement of three-bounce photons can also be used for estimating the bidirectional reflectance distribution function (BRDF) of materials [58]. While this review only considers imaging around corners, ToF measurements are also useful for seeing through scattering media [59–63], analyzing light transport [64–71], and novel imaging systems [72–74].

Figure 4: Need for time-resolved information (Material from Velten et al., "Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging," published 2012, Nature Communications [7], with permission of the authors). (a) The intensity change due to the displacement of a patch is below a traditional camera's sensitivity. (b) Temporal information resolves the displacement of a patch within a practical sensitivity range.

The Benefit of ToF Measurement: As discussed in the supplementary material of Velten et al. [7], the change of intensity due to the displacement of a patch, illustrated in Fig. 4 (a), is proportional to (∆x)²∆z/z³, where ∆x, ∆z, and z are the size, displacement, and depth of the patch, respectively. In contrast, the intensity change of a time-resolved measurement, as illustrated in Fig. 4 (b), is proportional to 3∆x/(2πz). Drawing upon an example from [7], the fractional change of intensity for ∆x = 5 mm, ∆z = 5 mm, z = 20 cm is 0.00003 for the traditional camera and 0.01 for the ToF camera. This example shows that the intensity change is below the sensitivity of the traditional camera. Furthermore, the SNR is small because the number of three-bounce photons is small. For this reason, existing demonstrations of intensity-based NLOS imaging rely on occlusions or a more specular BRDF of the wall. ToF measurement can resolve the change of the measurement and recover the 3D geometry of the hidden object.

2.1. Reconstruction Algorithms

ToF measurements provide ellipsoidal constraints on the possible object locations in the hidden scene, as illustrated in Fig. 5. Let w1 and w2 denote the points on the wall where a photon undergoes its first and third bounces, and let o denote the point on the object where the photon undergoes its second bounce. Moreover, let l and v denote the locations of the light source and the camera. Then the hidden object o must lie on an ellipsoid that satisfies

|w1 − o| + |w2 − o| = ct − |w1 − l| − |w2 − v|,   (2)

where c and t are the speed of light and the time of travel, respectively.

Figure 5: Time-of-flight information gives constraints on the potential location of the object around the corner. When the path length of a photon is known, the location of the hidden object is constrained to an ellipsoid.

Figure 6: Visualization of reconstruction via back-projection (Material from: Velten et al., "Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging," published 2012, Nature Communications [7], adapted with permission of SNCSC). (a) The hidden object around the corner. (b) Probability map generated by a back-projection algorithm. (c) The contour of the object can be sharpened by applying a sharpening filter.

Most of the existing techniques consider discretized voxels to recover the hidden geometry. This physics-based forward model maps the hidden scene to the measurement. Many works express the forward model with the ellipsoidal constraint as a linear inverse problem, which can be solved by back-projection or optimization:

y = Ax + ε, (3)

where x ∈ R^n, y ∈ R^m, and ε ∈ R^m denote the vectorized target voxels, the ToF measurements, and the noise, respectively. A ∈ R^{m×n} represents the transient light transport, including the ellipsoidal constraints described by Eq. 2, the intensity fall-off due to the surface reflectance, and the distance between the object and the wall. The theory of light transport of multi-bounce photons was studied by Seitz et al. [75] for non-time-resolved imaging, and by Raskar et al. [5] for time-resolved imaging.

In NLOS imaging, Eq. 3 is often ill-posed, so filtering or regularization of the reconstruction is often necessary. Moreover, A becomes large because, in general, y is a 5D measurement (a 3D measurement for each point of a 2D illumination scan) and x is a 3D voxel grid. For an nx × ny × nt measurement with scanning illumination over ns points, the ith column of A has m = nx·ny·nt·ns elements, which maps the voxel xi to the measurement. If we take a 64 × 64 × 64 3D measurement with 64 laser illumination spots for a 64 × 64 × 64 voxel reconstruction, A is a 64^4 × 64^3 matrix. ToF-based methods mainly focus on efficient and robust algorithms to recover x.

Recently, forward models beyond the linear model have been studied. For example, the shape of the surface can be recovered using fewer photons by modeling photons that travel along specific paths called Fermat paths [25]. Phasor-field virtual wave optics enables modeling of the full light transport in the hidden scene, including photons that bounce more than three times [29].

Back-Projection: A naive way to solve Eq. 3 is to consider each voxel (an element in x), and compute the heat map of an object occupying the voxel given the measurement y and the light transport model A. This back-projection method was exploited in the first demonstration of the 3D reconstruction of hidden objects [7], and in other NLOS imaging works for reconstruction [14, 15, 17–20]. Back-projection usually produces a blurry reconstruction, as shown in Fig. 6 (b), because of the ill-posed nature of A. Sharpening filters and thresholding are used to improve the reconstruction quality. Instead of considering each voxel, a probability map of x can be recovered by considering the intersections of the ellipsoidal constraints to perform efficient back-projection that is up to three orders of magnitude faster than naive back-projection [16]. Back-projection can also be implemented optically, by illuminating and scanning along ellipsoids on the relay wall. This enables focusing the measurement on a single voxel in the hidden scene [23].
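As a concrete illustration, the following toy sketch (our own Python example; the 2D geometry, grid sizes, fall-off model, and noise level are illustrative assumptions, not settings from any cited work) builds the matrix A of Eq. 3 from the constraint of Eq. 2 for a confocal-style setup (w1 = w2, l = v = w1) and recovers a hidden point by naive back-projection, x̂ = Aᵀy:

```python
import numpy as np

c, dt = 3e8, 10e-12                    # speed of light [m/s], 10 ps bins (~3 mm path)
n_bins, n_wall = 512, 16               # time bins, sampled wall points
wall = np.stack([np.linspace(-0.25, 0.25, n_wall), np.zeros(n_wall)], axis=1)

# Hidden scene: a 32 x 32 grid of voxels in front of the wall (z > 0).
xs = np.linspace(-0.25, 0.25, 32)
zs = np.linspace(0.05, 0.55, 32)
vox = np.stack(np.meshgrid(xs, zs, indexing="ij"), axis=-1).reshape(-1, 2)

# Confocal-style simplification of Eq. 2: illumination and detection share the
# same wall point, so the three-bounce path length reduces to 2|w - o|.
dists = np.linalg.norm(wall[:, None, :] - vox[None, :, :], axis=2)  # (n_wall, n_vox)
bins = (2 * dists / (c * dt)).astype(int)                           # ToF bin per pair

# Forward matrix A: each row (one wall point and time bin) sums the voxels lying
# on one ellipsoid (a circle here), with 1/r^4 round-trip intensity fall-off.
A = np.zeros((n_wall * n_bins, vox.shape[0]))
rows = bins + n_bins * np.arange(n_wall)[:, None]
cols = np.tile(np.arange(vox.shape[0]), (n_wall, 1))
A[rows.ravel(), cols.ravel()] = (1.0 / dists**4).ravel()

x_true = np.zeros(vox.shape[0]); x_true[500] = 1.0     # one hidden scatterer
y = A @ x_true + 1e-6 * np.random.default_rng(0).standard_normal(A.shape[0])
x_bp = A.T @ y                                         # naive back-projection
print("true voxel: 500, back-projection peak:", int(np.argmax(x_bp)))
```

Even in this tiny example the matrix has 8192 × 1024 entries, which hints at why real 3D reconstructions at 64³ voxels demand efficient algorithms rather than explicit matrices.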

Optimization: Priors on the hidden object x can be incorporated into the inversion of Eq. 3 by formulating the inverse problem as the minimization of a least-squares error with a regularizer [13, 14, 21, 27, 28]:

x̂ = arg min_x ‖y − Ax‖² + Γ(x),   (4)

where Γ(x) encourages x to follow priors on the hidden scene. Iterative algorithms such as CoSaMP [76], FISTA [77], and ADMM [78] can be used to solve this optimization problem based on the priors. Enforcing priors generally results in better reconstruction quality compared to back-projection.
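A minimal sketch of how Eq. 4 might be solved with FISTA under a sparsity prior Γ(x) = λ‖x‖₁ (the random A and the toy sparse scene are illustrative assumptions, not an NLOS light-transport model):

```python
import numpy as np

def fista_l1(A, y, lam=0.5, n_iter=200):
    """Minimize ||y - Ax||^2 + lam * ||x||_1 with FISTA (Beck & Teboulle)."""
    L = 2 * np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1]); z = x.copy(); t = 1.0
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ z - y)              # gradient of the data term at z
        u = z - grad / L                          # gradient step
        x_new = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0)  # soft threshold
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2                 # momentum update
        z = x_new + (t - 1) / t_new * (x_new - x)
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 400))               # stand-in for light transport
x_true = np.zeros(400); x_true[[30, 170, 320]] = [1.0, -0.5, 0.8]  # sparse scene
y = A @ x_true + 0.01 * rng.standard_normal(200)
x_hat = fista_l1(A, y)
print("support found:", np.nonzero(np.abs(x_hat) > 0.1)[0])        # ~ [30, 170, 320]
```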

While the computation and memory complexity can be enormous, the optimization formulation gives flexibility for more accurate reconstruction. For example, A can be factorized to account for partial occlusion and the surface normals of the hidden object, recovering the visibility and surface normals as well as the albedo of the hidden object [13].

Figure 7: Layout of the confocal imaging setup (Material from: O'Toole et al., "Confocal non-line-of-sight imaging based on the light-cone transform," published 2018, Nature [12], adapted with permission of SNCSC). (a) The SPAD pixel and the laser see the same point on the wall. Due to the confocal optical design and a change of variables, the measurement can be written as a convolution of the hidden object and a 3D kernel. (b) The 3D kernel can be used to perform a deconvolution that recovers the hidden object.

Confocal Imaging: In confocal imaging, the relay wall is raster-scanned, and the detector collects photons from the same point as the illuminated spot [12]. This makes w1 = w2, and the ellipsoidal constraints become spherical constraints, which turns the forward operation into a 3D convolution (Fig. 7). The confocal setup relaxes the need to solve back-projection or optimization problems; instead, a simple deconvolution solves the reconstruction problem. 3D deconvolution makes confocal imaging both memory- and computationally efficient. Confocal imaging makes the inverse problem simple but suffers from the first-bounce photons, as the detection and illumination points are the same. This issue can be mitigated by introducing a slight misalignment of the detector and illumination, or by time gating of the SPAD. However, the SNR is still limited because the single-bounce light is much stronger than the three-bounce light.
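The following sketch illustrates the deconvolution view with a generic 3D Wiener filter under an assumed shift-invariant kernel; it is a simplified stand-in for, not an implementation of, the light-cone transform of [12]:

```python
import numpy as np

def wiener_deconv3d(meas, kernel, snr=1e4):
    """Deconvolve a 3D measurement with a known 3D kernel via the FFT."""
    K = np.fft.fftn(kernel, meas.shape)
    M = np.fft.fftn(meas)
    X = np.conj(K) * M / (np.abs(K) ** 2 + 1.0 / snr)   # regularized Wiener filter
    return np.real(np.fft.ifftn(X))

# Toy hidden volume and a compact blurring kernel standing in for the resampled
# spherical-shell response of the confocal setup.
vol = np.zeros((32, 32, 32)); vol[16, 16, 20] = 1.0
zz, yy, xx = np.meshgrid(*[np.arange(32)] * 3, indexing="ij")
kernel = np.exp(-((xx - 4)**2 + (yy - 4)**2 + (zz - 4)**2) / 8.0)
meas = np.real(np.fft.ifftn(np.fft.fftn(vol) * np.fft.fftn(kernel, vol.shape)))

recon = wiener_deconv3d(meas, kernel)
print("recovered peak:", np.unravel_index(np.argmax(recon), recon.shape))  # ~ (16, 16, 20)
```

Because the forward operator is a single convolution, the cost is a handful of FFTs over the volume rather than a solve against a huge explicit matrix, which is the source of the method's efficiency.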

The above three methods formulate NLOS imaging as a linear inverse problem. Other problem formulations can be constructed to solve NLOS imaging from different perspectives.

Wave-Based Reconstruction: Reza et al. [79] showed that the intensity waves of modulated light in the NLOS setting can be modeled as the propagation of a wave (phasor field). The hidden scene's impulse response can be recorded with a pulsed laser and an ultrafast detector such as a SPAD. A modulated light pulse can then be virtually synthesized over time using the recorded impulse response. Constructive interference of the synthesized wave appears at the hidden object. Hence, virtual propagation of a pulsed modulation over the impulse response of the hidden scene results in reconstruction [29]. The virtual wave optics approach to NLOS imaging models the full light transport in the hidden scene, including photons that bounce more than three times. Lindell et al. [30] modeled the light transport of the confocal imaging system as wave propagation, where the measurement is one boundary condition. Acquiring the other boundary conditions of the wave propagation with the frequency-wavenumber (f-k) migration algorithm results in efficient and robust reconstruction of hidden scenes containing objects with various surface reflectances.

Inverse Rendering: A renderer can be used to model the physics-based forward model instead of analytical forward operations such as Eq. 3. This "analysis-by-synthesis" approach, also known as inverse rendering, adjusts the scene parameters such that the rendered and experimental measurements match. Inverse rendering provides more flexible reconstructions than the voxel-based reconstruction discussed above. For example, more detailed reconstruction is possible by representing the hidden object as a mesh, and non-Lambertian surface reflection can be incorporated [22]. However, accurate rendering can be time-consuming, especially because each iteration of the optimization requires computationally expensive rendering. The long run-time problem can be addressed by a differentiable renderer that efficiently computes gradients with respect to the hidden scene parameters [26].

Shape Recovery: The methods discussed above use the full ToF measurement for reconstruction. However, the surface can be recovered without using all the multi-bounce photons. Tsai et al. showed that the first-returning photons provide the length of the shortest path to the hidden object, which can be used to reconstruct the boundary and surface normals of the hidden object [24]. Later, discontinuities of the ToF measurement were shown to follow specific paths (Fermat paths) that give rich information about the boundary of the hidden geometry [25]. Because this approach does not rely on intensity information, it is robust to BRDF variations of the object around the corner.

2.2. Inference Algorithms for Localization, Tracking, and Classification

We have reviewed algorithms that reconstruct objects around corners. Often, however, reconstruction of the hidden object might not be the end goal. Instead, inference of properties of the hidden object, such as its location or class, is sufficient. While reconstruction suffices for such tasks, direct inference without reconstruction can be performed more efficiently and with fewer measurements.

Back-Projection: The back-projection algorithm discussed in the reconstruction section can also be used for point localization. Instead of considering small voxels to recover the details of the object, the object can be treated as a single voxel to recover its location [42, 43]. Because the goal is not to recover the 3D shape, localization can be performed at a larger scale with far fewer measurements and less computation than reconstruction [43]. While reconstruction of a cluttered scene is challenging, tracking and size estimation of a moving object in clutter has been demonstrated by Pandharkar et al. [41].

Deep Learning: Data-driven algorithms have become a powerful tool for pattern recognition and have found promising applications in computer vision. Though data-driven approaches for full reconstruction require further theoretical development [80, 81], deep learning is useful for the inference of unknown parameters such as object class and location. The objective of the deep learning approach is to learn x = f^{-1}(y), where f(·) may be hard to model explicitly. For example, neural networks with a SPAD array demonstrated point localization and identification of a person around the corner [45]. While the imaging setup differs from a corner, Satat et al. [82] demonstrated calibration-free NLOS classification of objects behind a diffuser, where the transmissive scattering at the diffuser is similar to the reflective scattering at the relay wall.

Figure 8: The principle of the speckle-based imaging technique for imaging after diffuse scattering (Material from: Katz et al., "Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations," published 2014, Nature Photonics [8], adapted with permission of SNCSC). (a) The camera captures a speckle pattern after scattering on a diffuse wall. (b) The angular auto-correlation of the captured image is the same as the angular auto-correlation of the original image. (c) Phase-retrieval algorithms recover the image before the scattering on the wall.

3. Coherence-based NLOS Imaging

The challenges of NLOS imaging come from the scattering of photons on the diffuse relay wall. However, some coherent properties of light are preserved after scattering. Coherence-based methods exploit speckle patterns or spatial coherence to see around corners.

3.1. Speckle-Based NLOS Imaging

A speckle pattern is an intensity fluctuation generated by the interference of coherent light waves. Although a speckle pattern may seem random, the observed pattern encodes information about the hidden scene.

Reconstruction: The angular correlation of the object's intensity pattern is preserved in the speckle pattern observed after scattering on the relay wall [83]. This is known as the memory effect [84, 85]:

y ∗ y = x ∗ x,   (5)

where y and x are the observed speckle pattern and the object intensity pattern, respectively, and ∗ here denotes the (auto)correlation operation. Katz et al. [8] found that the memory effect also applies to spatially incoherent light sources such as a fluorescent bulb, and demonstrated single-shot imaging through scattering layers and around corners. Reconstruction of the hidden object can be performed with phase-retrieval algorithms [86, 87]. A speckle pattern can also be generated with active coherent illumination when no speckle pattern can be observed passively [32]. While this approach achieves diffraction-limited resolution, its field of view is limited by the memory effect (on the order of a few degrees of angular field of view [8]). Because scattering at a diffuse wall has a similar nature to scattering at a diffuser, speckle-based reconstruction can be used to see through scattering media as well [88].
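A minimal sketch of this pipeline (our illustrative example; real systems estimate the autocorrelation from the measured speckle, and robust variants such as hybrid input-output are typically used instead of plain error reduction):

```python
import numpy as np

rng = np.random.default_rng(1)
obj = np.zeros((64, 64)); obj[28:36, 24:30] = 1.0      # hidden intensity pattern

# By Eq. 5 the speckle autocorrelation equals the object autocorrelation, and by
# Wiener-Khinchin its Fourier transform gives the object's Fourier magnitude.
mag = np.sqrt(np.abs(np.fft.fft2(obj)) ** 2)

# Error-reduction phase retrieval: alternate between the known Fourier magnitude
# and a non-negativity constraint in the object domain.
x = rng.random((64, 64))                               # random initial guess
for _ in range(500):
    X = np.fft.fft2(x)
    X = mag * np.exp(1j * np.angle(X))                 # enforce known magnitude
    x = np.clip(np.real(np.fft.ifft2(X)), 0, None)     # enforce non-negativity

print("recovered energy:", x.sum(), "true energy:", obj.sum())
```

Note that the recovery is inherently ambiguous up to translation and flipping, since those transformations leave the autocorrelation unchanged.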

Inference for Tracking: When coherent light illuminates objects around a corner, the light scattered from the objects creates a speckle pattern on the relay wall. When an object moves, the speckle pattern moves as well. The motion of the object can therefore be tracked by computing the cross-correlation of two images taken at different times [89]. Smith et al. [47] showed that this principle applies to NLOS imaging, and demonstrated motion tracking of multiple hidden reflectors using active coherent illumination. Speckle-based tracking has a precision better than 10 µm, but is currently limited to microscopic motions because of the small field of view of the memory effect [47]. A data-driven approach demonstrated MNIST and pose-estimation classification from speckle patterns [51].
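A minimal sketch of shift estimation by FFT-based phase correlation (an illustrative stand-in for the cross-correlation tracking of [89]; the mapping from speckle shift to object motion is setup-specific):

```python
import numpy as np

rng = np.random.default_rng(2)
frame1 = rng.random((256, 256))                         # speckle pattern at time t0
frame2 = np.roll(frame1, shift=(3, -7), axis=(0, 1))    # pattern shifted by motion

F1, F2 = np.fft.fft2(frame1), np.fft.fft2(frame2)
R = np.conj(F1) * F2
R /= np.abs(R) + 1e-12                                  # normalized cross-power spectrum
corr = np.real(np.fft.ifft2(R))
dy, dx = np.unravel_index(np.argmax(corr), corr.shape)  # correlation peak location
dy = dy - 256 if dy > 128 else dy                       # map indices to signed shifts
dx = dx - 256 if dx > 128 else dx
print("estimated shift:", dy, dx)                       # ~ (3, -7)
```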

3.2. Spatial-Coherence-Based NLOS Imaging

Spatial coherence refers to the correlation of the phase of a light wave at different observation points (while temporal coherence refers to the correlation at different observation times). Spatial coherence can be used to reconstruct the scene in the camera's field of view [90]. Because the spatial coherence of light is preserved through scattering at the diffuse wall, such reconstruction techniques can be applied to NLOS imaging [33, 34]. Measurement of spatial coherence requires a specialized imaging system such as the Dual-Phase Sagnac Interferometer (DuPSaI) [57].


Figure 9: The layout of NLOS imaging exploiting scene occlusion (Material from: Bouman et al., "Turning Corners into Cameras: Principles and Methods," published 2017 [11], adapted with permission from IEEE. © 2017 IEEE). (a) The corner occlusion casts shadows that depend on the location of the hidden objects. (b) An intensity camera observes the intensity over angle from the relay wall. (c) The transfer (measurement) matrix A is constructed based on the occlusion.

Figure 10: Neural networks can locate, classify, and reconstruct the hidden scene (Material from: Tancik et al., "Flash Photography for Data-Driven Hidden Scene Recovery," published 2018 [10], adapted with permission from the authors). (a) Neural network architectures for different tasks. (b) Heatmaps of the "important" regions of the scene for localization and identification tasks. The neural networks learn to exploit the occlusion as proposed by [11] (left). They also learn to use measurements that were previously not realized to be useful for object identification tasks (right).

Reconstruction: The propagation of coherence in free space can be approximated in closed form. This lets the spatial coherence measurement y be expressed as a system of linear equations, as written in Eq. 3, where A describes the propagation of the spatial coherence function through space and scattering. This inverse problem can be solved as the minimization of a least-squares error with regularizers that incorporate priors such as sparsity and small total variation. Beckus et al. [34] showed that a coherence measurement and an intensity measurement can be incorporated into a single optimization problem for reconstruction.

Inference for Localization: Using spatial coherence, the distance of an incoherent light source from the detector can be estimated from the phase of the spatial coherence function. While only 1D localization has been demonstrated [33], multiple measurements of the spatial coherence function could be used to localize the light source by triangulation.

4. Intensity-Based NLOS Imaging

Velten et al. [7] first studied the use of intensity measurements for NLOS imaging and showed that a traditional camera requires high sensitivity for diffuse surfaces. We illustrate this in Fig. 4, and refer readers to the supplementary material of [7] for further details. To overcome this challenge, most of the existing intensity-based NLOS imaging works exploit occlusions in the scene. Even before the development of NLOS imaging, occlusions had long been used in computational imaging. For example, light transport from occlusions has been used to synthesize images from a different point of view [91], to capture incident light with spatially varying BRDFs [92], to recover a 4D light field for refocusing [93], and to create an anti-pinhole camera that sees outside the field of view of the camera [94]. Tancik et al. [9, 10] and Bouman et al. [11] first addressed the NLOS imaging problem directly: Bouman et al. showed 1D tracking, and Tancik et al. introduced the idea of using intensity-based reconstruction for NLOS imaging. We also review another class of techniques that exploits non-Lambertian surface reflection.

4.1. Exploiting Occlusions

Spatially varying occlusions make it feasible to use intensity measurements, since the variability of intensity due to occlusions increases the rank of the underlying ray-transport matrix. Passive intensity-based NLOS imaging with inexpensive cameras has been demonstrated in real time. However, the ambient light must be separated from the signal light of the object, which often requires moving hidden objects or background subtraction.

Localization: Bouman et al. [11] demonstrated practical passive tracking of hidden objects using the occlusion from a wall, as illustrated in Fig. 9 (a), and Tancik et al. showed that neural networks learn to exploit occlusions [10], as shown in Fig. 10 (b). The occlusion in the scene can be considered a specific type of aperture. As shown in Fig. 9 (c), the problem can be formulated as a linear inverse problem similar to Eq. 3, where A represents the light transport with occlusions. With priors on the floor albedo, it is possible to estimate the 1D angular location of the hidden object from a single measurement by estimating the ambient illumination [48].
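The sketch below illustrates the idea with an idealized 1D corner-camera model (our own construction, not the authors' code): the wall edge lets each observed floor angle see only hidden-scene angles up to its own, so A is lower triangular and a regularized solve recovers the hidden sources:

```python
import numpy as np

n = 90                                          # angular bins of the hidden scene
A = np.tril(np.ones((n, n)))                    # cumulative visibility past the edge
scene = np.zeros(n); scene[25] = 1.0; scene[60] = 0.4   # two hidden sources
y = A @ scene + 0.02 * np.random.default_rng(3).standard_normal(n)

# Tikhonov-regularized least squares; differentiating y along the angle also
# works but amplifies noise.
lam = 0.1
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)
print("strongest bins:", np.sort(np.argsort(x_hat)[-2:]))   # ~ [25, 60]
```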

Reconstruction: More complex occluder geometries enable reconstruction of the hidden scene, as A becomes well-posed [38, 39]. A 4D light field can be recovered within an occlusion-based inverse-problem framework [37]. While the above reconstruction techniques assume known occlusions, unknown occluders can be estimated by exploiting the sparse motion of the hidden object [40]. Deep learning-based approaches have shown the ability to perform reconstruction, tracking, and object classification for a specific scene setup, while their generalizability is yet to be explored [9, 10, 50].

4.2. Exploiting Surface Reflectance

Another class of intensity-based NLOS imaging techniques exploits the bidirectional reflectance distribution function (BRDF) of the relay wall to reconstruct the hidden scene. A specular BRDF makes the inverse problem less ill-posed. When the reflectance function of the wall is known, the light field from the hidden scene can be reconstructed [36]. However, current demonstrations are limited to scenes without ambient light and to walls with some specular surface reflection [36]. Chen et al. [31] demonstrated the reconstruction of specular planar objects with active illumination. A data-driven approach showed that it is also possible to reconstruct diffuse objects with active illumination [31]. Tracking of an object using active illumination and intensity measurements can be performed by matching simulations with the experimental measurements (inverse rendering) [49]. Thermal imaging benefits from the specular BRDF of long-wave IR, which has been exploited for passive localization and near real-time pose detection around corners [35].

5. Challenges and Future Directions

We reviewed the major techniques for NLOS imaging in the previous sections. Here, we introduce common challenges towards real-time and robust applications, to inspire further research.

5.1. Properties of Different Sensing Modalities

Table 1 shows the current landscape of NLOS techniques, and Table 2 summarizes the properties of the different sensing modalities. In this section, we discuss which challenges need to be solved to complete Table 1.

ToF: Since ToF-based NLOS imaging was first demonstrated, there have been significant improvements in reconstruction algorithms, which have become orders of magnitude faster than the originally proposed algorithm. The challenges towards practical ToF-based NLOS imaging are:

• Low signal-to-background ratio of three-bounce photons.

• Estimation of line-of-sight scene parameters.


| Modality | Illumination | Sensor | Cost | Sensitivity to Ambient Light | Need for Priors on Geometry |
|---|---|---|---|---|---|
| ToF (pulsed) | Pulsed laser | SPAD, streak camera | High | Robust | Not required |
| ToF (AMCW) | Modulated laser, LED | Correlation camera | Medium | Sensitive to strong light | Not required |
| Coherence (passive) | None | Traditional camera | Low | Sensitive | Scene geometry |
| Coherence (passive, spatial) | None | Dual-phase Sagnac interferometer | Medium | Sensitive | Scene geometry |
| Coherence (active) | Coherent source | Traditional camera | Low | Sensitive | Scene geometry |
| Intensity (passive) | None | Traditional camera | Low | Sensitive | Occlusion geometry, scene geometry |
| Intensity (active) | Laser, flashlight | Traditional camera | Low | Sensitive | Scene geometry |

Table 2: Properties of techniques to see around corners for different sensing schemes. This table summarizes fundamental properties of seeing-around-corners techniques.

First, only a small fraction of the emitted photons is captured in the measurement, which forces long acquisition times. Many recent works still require minutes to an hour of acquisition for diffuse objects, which precludes real-time applications. Second, the majority of existing works treat the line-of-sight scene as known geometry, given that the line-of-sight scene is much easier to recover than the hidden scene. However, a fully automated procedure that recovers both the line-of-sight and non-line-of-sight geometry is necessary for practical applications of NLOS imaging, where the imaging platform could be moving.

Coherence: The coherence-based approaches exploit the observation that certain coherent properties are preserved after scattering on the diffuse wall. The challenges towards practical speckle-based methods are:

• Small field of view due to the memory effect.

• Lack of depth information.

Correlography techniques exploit the memory effect, which has a limited angular field of view [8]. This limits their application to macroscopic scenes such as those relevant to autonomous vehicles. Speckle-based reconstruction methods recover a 2D projection of the image but do not recover depth.

Spatial coherence-based methods do not suffer from the small field of view of the speckle-based methods, but they require sensitive interferometric detectors that are mechanically translated [33].

Intensity: The ease of image acquisition is the main advantage of intensity-based approaches. Detection of hidden moving objects around corners using shadows was demonstrated on an autonomous wheelchair [95]. The challenges towards practical intensity-based NLOS imaging are:

• Unknown occluding geometries in the hidden scene.

• Separation of the background and signal photons.

• Low signal-to-noise ratio.

First, the quality of passive, occlusion-based NLOS imaging depends heavily on the occluding geometries. For example, if there are no occluding geometries in the hidden scene, the reconstruction is extremely challenging [38, 39]. However, such information about occlusions might not be available a priori. Second, the ambient illumination may be much stronger than the signal light, which makes it hard to isolate the signal from the measurement. This requires active illumination, background subtraction, or motion of the hidden scene to extract the signal. Lastly, the algorithm has to be sensitive enough to capture the small signal from the hidden scene while being robust enough to reject intensity variations due to noise.

5.2. Limitations on Reconstruction

ToF-based NLOS imaging has a derived resolution limit. Reza et al. [79] showed that the virtual-optics formulation of NLOS imaging gives the following Rayleigh lateral resolution limit:

∆x ≈ 1.22 cτL / d,   (6)

where c, τ, L, and d are the speed of light, the full-width-at-half-maximum of the temporal response of the imaging system, the distance between the wall and the hidden object, and the diameter of the virtual aperture (the scanning area), respectively. However, it is hard to evaluate the resolution of some NLOS imaging techniques. The reconstruction capability of occlusion-based techniques relies heavily on the scene geometry. It is essential to evaluate different methods with common datasets such as those proposed in [96].
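For intuition, plugging representative (illustrative, not taken from [79]) numbers into Eq. 6:

```python
# A 70 ps system response, a hidden object 1 m from the wall, and a 1 m wide
# scanning area on the wall.
c, tau, L, d = 3e8, 70e-12, 1.0, 1.0
dx = 1.22 * c * tau * L / d
print(f"Rayleigh lateral resolution: {dx * 100:.1f} cm")   # ~2.6 cm
```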

Liu et al. [97] showed that a hidden object can become impossible to reconstruct depending on the direction of its surface normal. It has also been shown that the reconstruction of multiple objects may fail under a simple linear light-transport model when the signal from one object is stronger than the others [13]. Further theoretical investigation of such limitations is necessary for the practical use of NLOS imaging.

5.3. Acquisition Time

A low signal-to-noise ratio (SNR) or signal-to-background ratio is a common issue for most NLOS imaging techniques, as discussed in the previous section. This is a fundamental problem, as the number of photons returning from the hidden object is typically small, and it sets a lower bound on the acquisition time necessary for satisfactory reconstruction and inference. Fig. 11 summarizes reported demonstrations of NLOS imaging techniques across different scales of acquisition time, illumination power, and scene size.

Active Sensing: Most of the emitted photons will not be captured by the camera. Moreover, the three-bounce photons that contain the information about the hidden scene constitute only a small fraction of the captured photons. Current demonstrations of real-time NLOS imaging at eye-safe power levels are limited to retroreflective objects. Recent work on ToF-based NLOS imaging uses a laser of up to 1 W power, yet still requires 10 minutes of acquisition time [30]. More than 10 W of power would be necessary to perform reconstruction of the same quality in under a minute, but increasing laser power does not scale in terms of safety and cost. One could use light at a different wavelength, such as near-infrared radiation, with a higher eye-safe power threshold. Recently, a new scanning method was proposed to focus on a single voxel in the hidden scene, which can improve the SNR for a specific region of interest [23].

Figure 11: Overview of state-of-the-art NLOS imaging demonstrations across different acquisition times, illumination powers, and scene scales.

Passive Sensing: In many practical scenarios, most of the passively captured photons are ambient light, which does not interact with the hidden objects. This results in low SNR, as well as the need to separate ambient photons from photons coming from the hidden objects. The signal photons can be isolated by exploiting movements in the hidden scene [11], or by background subtraction when a measurement without the hidden object is available. Priors on the ambient illumination can also be exploited to remove the ambient term from the measurement [48]. Because uncontrolled ambient light is hard to model, adding active illumination to a passive technique may improve the signal-to-noise ratio [10, 50].
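A minimal sketch of isolating a weak signal under strong ambient light by frame averaging and differencing when the hidden object moves (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
ambient = np.full((64, 64), 100.0)              # static ambient flux [photons/frame]
sig0 = np.zeros((64, 64)); sig0[30, 10] = 5.0   # weak hidden-object signal at t0
sig1 = np.zeros((64, 64)); sig1[30, 14] = 5.0   # ... after the object has moved

# Averaging N Poisson-noisy frames shrinks the shot noise by sqrt(N).
N = 400
stack0 = rng.poisson(ambient + sig0, size=(N, 64, 64)).mean(axis=0)
stack1 = rng.poisson(ambient + sig1, size=(N, 64, 64)).mean(axis=0)

diff = stack1 - stack0                          # the static ambient term cancels
sigma = np.sqrt(2 * ambient.mean() / N)         # residual shot-noise std of diff
print("moved to:", np.argwhere(diff > 4 * sigma))    # ~ [[30, 14]]
print("moved from:", np.argwhere(diff < -4 * sigma)) # ~ [[30, 10]]
```

The example also shows why the problem is fundamental: a per-pixel signal of 5 photons against 100 ambient photons is invisible in a single frame and only emerges after hundreds of frames of averaging.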

5.4. Integration of Multi-Modal Techniques

Measurements from different sensing modalities can be jointly exploited for NLOS imaging. Beckus et al. [34] showed that fusing an intensity measurement and a coherence measurement in a single optimization framework produces better reconstruction results.

Insights from different approaches can also be shared. For example, occlusions are not exploited in the ToF-based approaches. Occlusion-based techniques might be able to narrow down the region of interest in the hidden scene to make ToF-based reconstruction faster.

5.5. Data-driven Approach

We use the term "data-driven approach" to refer to the use of data to model the mapping between the measurement and the hidden scene, and to produce priors on the solution to the inverse problem.

Learned Priors: Handcrafted priors such as total variation are often used for NLOS reconstruction. More flexible priors or representations of the hidden objects can be learned from relevant data. Recent results in deep learning show the use of generative or discriminative models to enforce learned priors in linear inverse problems [98, 99]. Theoretical connections between convolutional neural networks and dictionary learning [100] suggest that emerging deep learning methodology is well suited to incorporating learned representations as priors for solving inverse problems.

End-to-End Learning: In recent years, deep learning has provided a powerful set of techniques with applications to pattern recognition, and has been studied for computational imaging applications such as CT and MRI [101]. If correlations are present in a dataset, then deep learning can offer powerful and flexible ways to approximate the function that maps novel inputs to the desired outputs. With a large dataset, a deep learning model can learn the mapping between a measurement and the desired inference (e.g., reconstruction or localization). The same network may also be adapted to learn the forward model by swapping the inputs and outputs and changing the model architecture. Potentially, a data-driven approach could offer calibration-free, efficient, and flexible solutions to NLOS imaging problems [9, 10, 31]. Fig. 10 shows the use of deep learning for localization, identification, and reconstruction around corners.

Many NLOS imaging methods attempt to find efficient and robust algorithms to solve the optimization problem in Eq. 3. In contrast, the data-driven approach attempts to solve the following minimization problem to find the function f that approximates the mapping between the measurement y and the target x (the 3D shape, location, or class of the hidden object):

f̂ = arg min_{f ∈ F} (1/n) Σ_{i=1}^{n} l(f(y_i), x_i) + r(θ),   (7)

where f is parameterized by θ, the regularizer r(θ) discourages over-fitting, and (1/n) Σ_{i=1}^{n} l(f(y_i), x_i) denotes the error between the reconstruction and the ground truth.
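A minimal end-to-end sketch of Eq. 7 in PyTorch (a hypothetical toy task with an assumed crude simulator; not the architecture or data of [9, 10, 45]): a small network regresses a hidden point's 2D location from simulated transient histograms, with weight decay playing the role of r(θ):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
DETS = torch.tensor([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])  # virtual wall points

def simulate_transient(loc, n_bins=64):
    """Crude stand-in forward model f: one 1/r^2-scaled ToF pulse per wall point."""
    r = torch.cdist(loc, DETS)                       # (batch, 3) path lengths
    bins = torch.arange(n_bins) / n_bins * 1.5       # path lengths spanned by bins
    y = torch.exp(-((bins - r[..., None]) ** 2) / 1e-3) / (r[..., None] ** 2 + 0.1)
    y = y.flatten(1)                                 # concatenate the 3 histograms
    return y + 0.01 * torch.randn_like(y)            # measurement noise eps

net = nn.Sequential(nn.Linear(192, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-4)  # r(theta)
loss_fn = nn.MSELoss()                               # l(f(y_i), x_i)

for step in range(2000):
    x = torch.rand(64, 2) * 0.5 + 0.25               # hidden locations in [0.25, 0.75]^2
    y = simulate_transient(x)
    loss = loss_fn(net(y), x)                        # empirical risk of Eq. 7
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 500 == 0:
        print(step, loss.item())
```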

The main problems of the data-driven approach are generalizability and explainability.

Generalizability: The potential advantage of data-driven approaches is that they can learn models that are more robust to scene variation than brittle model-based approaches. For example, a data-driven approach may handle variations in wall reflectance, while a model-based approach is often limited to simple, diffuse reflectance. However, the key concern is that data-driven approaches fail in unpredictable ways when encountering scenes that are not represented in the training data.

One approach towards generalization is to produce a sufficiently large dataset that contains rich variations, such that any possible practical scene is represented in the training dataset. However, this is ultimately limited by the ability to simulate or collect enough experimental data to sufficiently sample the parameter space. Another approach is to design flexible neural networks for specific scene types. This makes it easier for the neural networks to learn the algorithm, which works robustly if the scene type is identified correctly.

Explainability: Another challenge for data-driven approaches is that Eq. 7 produces a mapping that is difficult to interpret. Rather than incorporating a known forward model, the learned mapping is based on the statistics present in the training set and is susceptible to "hallucinating" output in order to produce a "reasonable" output with respect to the training data.

Instead of learning Eq. 7 directly, a combination of physics-based forward models and data-driven approaches could make these methods more explainable [102]. To solve the optimization problem in Eq. 3, iterative algorithms such as ADMM and ISTA demonstrate improved performance when learning priors or forward models from a specific distribution [80, 103, 104]. While traditional model-based approaches typically rely on a handcrafted sparsity prior, a data-driven approach can offer a prior that more accurately models the scene distribution [99, 100]. Furthermore, these algorithms can be embedded directly within the architecture of a deep learning model [105]. Incorporating data-driven techniques into traditional model-based optimization schemes shows promise for NLOS imaging without sacrificing explainability.
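As a sketch of this direction, the following plug-and-play style loop replaces ISTA's proximal step with a denoiser standing in for a learned prior (here a simple Gaussian smoother; in the cited works this would be a trained network, and the toy A and scene are our own assumptions):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pnp_ista(A, y, shape, n_iter=200, sigma=1.0):
    """ISTA-style iterations where the prox/shrinkage step is a denoiser."""
    step = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)   # safe step size for the data term
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - step * 2 * A.T @ (A @ x - y)       # gradient step on ||y - Ax||^2
        x = gaussian_filter(x.reshape(shape), sigma).ravel()  # denoiser as prior
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((512, 1024))               # generic compressive measurement
x_true = np.zeros((32, 32)); x_true[10:14, 18:22] = 1.0   # smooth blob-like scene
y = A @ x_true.ravel() + 0.05 * rng.standard_normal(512)
x_hat = pnp_ista(A, y, shape=(32, 32))
print("correlation with truth:", np.corrcoef(x_hat, x_true.ravel())[0, 1])
```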

6. Conclusions and Future Directions

We reviewed the existing NLOS imaging techniques that rely on different principles and discussed the challenges towards practical, real-time NLOS imaging. We hope that this paper will help and inspire further research towards practical NLOS imaging.


7. Acknowledgement

The authors thank Adithya Pediredla, Ravi Athale, and Sebastian Bauer for valuable feedback. This work is supported by the DARPA REVEAL program (N0014-18-1-2894) and MIT Media Lab consortium funding.

References

[1] http://web.media.mit.edu/~raskar/cornar/.

[2] Fadel Adib and Dina Katabi. See through walls with WiFi! SIGCOMM Comput. Commun. Rev., 43(4):75–86, August 2013.

[3] Fadel Adib, Chen-Yu Hsu, Hongzi Mao, Dina Katabi, and Fredo Durand. Capturing the human figure through a wall. ACM Trans. Graph., 34(6):219:1–219:13, October 2015.

[4] David B. Lindell, Gordon Wetzstein, and Vladlen Koltun. Acoustic non-line-of-sight imaging. In IEEE Intl. Conf. Computer Vision and Pattern Recognition (CVPR), 2019.

[5] Ramesh Raskar and James Davis. 5D time-light transport matrix: What can we reason about scene properties? 2008.

[6] Ahmed Kirmani, Tyler Hutchison, James Davis, and Ramesh Raskar. Looking around the corner using transient imaging. In 2009 IEEE 12th International Conference on Computer Vision, pages 159–166, Sep. 2009.

[7] Andreas Velten, Thomas Willwacher, Otkrist Gupta, Ashok Veeraraghavan, Moungi G Bawendi, and Ramesh Raskar. Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging. Nature Communications, 3:745, 2012.

[8] Ori Katz, Pierre Heidmann, Mathias Fink, and Sylvain Gigan. Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations. Nature Photonics, 8, March 2014.

[9] Matthew Tancik, Tristan Swedish, Guy Satat, and Ramesh Raskar. Data-driven non-line-of-sight imaging with a traditional camera. In Imaging and Applied Optics 2018 (COSI), page IW2B.6. Optical Society of America, 2018.

[10] Matthew Tancik, Guy Satat, and Ramesh Raskar. Flash photography for data-driven hidden scene recovery, 2018.

[11] Katherine L. Bouman, Vickie Ye, Adam B. Yedidia, Fredo Durand, Gregory W. Wornell, Antonio Torralba, and William T. Freeman. Turning corners into cameras: Principles and methods. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2289–2297, Oct 2017.

[12] Matthew O'Toole, David B. Lindell, and Gordon Wetzstein. Confocal non-line-of-sight imaging based on the light-cone transform. Nature, 2018.

[13] Felix Heide, Matthew O'Toole, Kai Zang, David B. Lindell, Steven Diamond, and Gordon Wetzstein. Non-line-of-sight imaging with partial occluders and surface normals. ACM Trans. Graph., 2019.

[14] Otkrist Gupta, Thomas Willwacher, Andreas Velten, Ashok Veeraraghavan, and Ramesh Raskar. Reconstruction of hidden 3D shapes using diffuse reflections. Opt. Express, 20(17):19096–19108, Aug 2012.

[15] Mauro Buttafava, Jessica Zeman, Alberto Tosi, Kevin Eliceiri, and Andreas Velten. Non-line-of-sight imaging using a time-gated single photon avalanche diode. Opt. Express, 23(16):20997–21011, Aug 2015.

[16] Victor Arellano, Diego Gutierrez, and Adrian Jarabo. Fast back-projection for non-line of sight reconstruction. Opt. Express, 25(10):11574–11583, May 2017.

[17] Martin Laurenzis and Andreas Velten. Nonline-of-sight laser gated viewing of scattered photons. Optical Engineering, 53, 2014.

[18] Marco La Manna, Fiona Kine, Eric Breitbach, J. Jackson, Talha Sultan, and Andreas Velten. Error backprojection algorithms for non-line-of-sight imaging. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

[19] Martin Laurenzis, Jonathan Klein, Emmanuel Bacher, and Nicolas Metzger. Multiple-return single-photon counting of light in flight and sensing of non-line-of-sight objects at shortwave infrared wavelengths. Opt. Lett., 40(20):4815–4818, Oct 2015.

[20] Chenfei Jin, Jiaheng Xie, Siqi Zhang, ZiJing Zhang, and Yuan Zhao. Reconstruction of multiple non-line-of-sight objects using back projection based on ellipsoid mode decomposition. Opt. Express, 26(16):20089–20101, Aug 2018.


[21] Adithya Pediredla, Mauro Buttafava, AlbertoTosi, Oliver Cossairt, and Ashok Veeraraghavan.Reconstructing rooms using photon echoes: Aplane based model and reconstruction algorithmfor looking around the corner. In 2017 IEEE In-ternational Conference on Computational Pho-tography (ICCP), pages 1–12, May 2017.

[22] Julian Iseringhausen and Matthias B. Hullin.Non-line-of-sight reconstruction using efficienttransient rendering, 2018.

[23] Adithya Pediredla, Akshat Dave, and AshokVeeraraghavan. Snlos: Non-line-of-sight scanningthrough temporal focusing. 2019.

[24] Chia-Yin Tsai, Kiriakos Kutulakos, SrinivasaG. Narasimhan, and Aswin Sankaranarayanan.The geometry of first-returning photons for non-line-of-sight imaging. pages 2336–2344, 07 2017.

[25] Shumian Xin, Sotiris Nousias, Kiriakos N. Kutulakos, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan, and Ioannis Gkioulekas. A theory of Fermat paths for non-line-of-sight shape reconstruction. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, CVPR ’19. IEEE Computer Society, 2019.

[26] Chia-Yin Tsai, Aswin C. Sankaranarayanan, and Ioannis Gkioulekas. Beyond volumetric albedo — A surface optimization framework for non-line-of-sight imaging. In IEEE Intl. Conf. Computer Vision and Pattern Recognition (CVPR), 2019.

[27] Felix Heide, Lei Xiao, Wolfgang Heidrich, and Matthias B. Hullin. Diffuse mirrors: 3d reconstruction from diffuse indirect illumination using inexpensive time-of-flight sensors. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, CVPR ’14, pages 3222–3229, Washington, DC, USA, 2014. IEEE Computer Society.

[28] Achuta Kadambi, Hang Zhao, Boxin Shi, and Ramesh Raskar. Occluded imaging with time-of-flight sensors. ACM Trans. Graph., 35:15:1–15:12, 2016.

[29] Xiaochun Liu, Ibon Guillen, Marco La Manna, Ji Hyun Nam, Syed Azer Reza, Toan Huu Le, Adrian Jarabo, Diego Gutierrez, and Andreas Velten. Non-line-of-sight imaging using phasor-field virtual wave optics. Nature, 572:620–623, 2019.

[30] David B. Lindell, Gordon Wetzstein, and Matthew O’Toole. Wave-based non-line-of-sight imaging using fast f-k migration. ACM Trans. Graph. (SIGGRAPH), 38(4):116, 2019.

[31] Wenzheng Chen, Simon Daneau, Fahim Mannan, and Felix Heide. Steady-state non-line-of-sight imaging. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.

[32] Aparna Viswanath, Prasanna Rangarajan, Duncan MacFarlane, and Marc P. Christensen. Indirect imaging using correlography. In Imaging and Applied Optics 2018 (3D, AO, AIO, COSI, DH, IS, LACSEA, LS&C, MATH, pcAOP), page CM2E.3. Optical Society of America, 2018.

[33] Mahed Batarseh, Sergey Sukhov, Zhean Shen, Heath Gemar, Roxana Rezvani, and Aristide Dogariu. Passive sensing around the corner using spatial coherence. Nature Communications, 9, 2018.

[34] Andre Beckus, Alexandru Tamasan, and George K. Atia. Multi-modal non-line-of-sight passive imaging. IEEE Transactions on Image Processing, 28(7):3372–3382, July 2019.

[35] Tomohiro Maeda, Yiqin Wang, Ramesh Raskar, and Achuta Kadambi. Thermal non-line-of-sight imaging. In IEEE International Conference on Computational Photography (ICCP), 2019.

[36] Takahiro Sasaki and James R. Leger. Light-field reconstruction from scattered light using plenoptic data. Proc. SPIE, 10772, 2018.

[37] Manel Baradad, Vickie Ye, Adam B. Yedidia, Fredo Durand, William T. Freeman, Gregory W. Wornell, and Antonio Torralba. Inferring light fields from shadows. In CVPR, 2018.

[38] Christos Thrampoulidis, Gal Shulkind, Feihu Xu, William T. Freeman, Jeffrey H. Shapiro, Antonio Torralba, Franco N. C. Wong, and Gregory W. Wornell. Exploiting occlusion in non-line-of-sight active imaging. IEEE Transactions on Computational Imaging, 4:419–431, 2018.

[39] Charles Saunders, John Murray-Bruce, and Vivek Goyal. Computational periscopy with an ordinary digital camera. Nature, 565:472–475, 2019.

[40] Adam B. Yedidia, Manel Baradad, Christos Thrampoulidis, William T. Freeman, and Gregory W. Wornell. Using unknown occluders to recover hidden scenes. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.

[41] Rohit Pandharkar, Andreas Velten, Andrew Bardagjy, Everett Lawson, Moungi Bawendi, and Ramesh Raskar. Estimating motion and size of moving non-line-of-sight objects in cluttered environments. In CVPR 2011, pages 265–272, June 2011.

[42] Genevieve Gariepy, Francesco Tonolini, Ryan Warburton, Susan Chan, Robert Henderson, Jonathan Leach, and Daniele Faccio. Detection and tracking of moving objects hidden from view. In Imaging and Applied Optics 2016, page CTh4B.3. Optical Society of America, 2016.

[43] Susan Chan, Ryan E. Warburton, Genevieve Gariepy, Yoann Altmann, Stephen McLaughlin, Jonathan Leach, and Daniele Franco Angelo Faccio. Fast tracking of hidden objects with single-pixel detectors. Electronics Letters, 53(15):1005–1008, July 2017.

[44] Susan Chan, Ryan E. Warburton, Genevieve Gariepy, Jonathan Leach, and Daniele Faccio. Non-line-of-sight tracking of people at long range. Optics Express, 25(9):10109–10117, 2017.

[45] Piergiorgio Caramazza, Alessandro Boccolini, Daniel Buschek, Matthias B. Hullin, Catherine F. Higham, Robert Henderson, Roderick Murray-Smith, and Daniele Faccio. Neural network identification of people hidden from view with a single-pixel, single-photon detector. Scientific Reports, 2018.

[46] Jeremy Boger-Lombard and Ori Katz. Non line-of-sight localization by passive optical time-of-flight, 2018.

[47] Brandon M. Smith, Matthew O’Toole, and Mohit Gupta. Tracking multiple objects outside the line of sight using speckle imaging. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

[48] Sheila W. Seidel, Yanting Ma, John Murray-Bruce, Charles Saunders, William T. Freeman, Christopher Yu, and Vivek Goyal. Corner occluder computational periscopy: Estimating a hidden scene from a single photograph. In IEEE International Conference on Computational Photography (ICCP), 2019.

[49] Jonathan Klein, Christoph Peters, Jaime Martín, Martin Laurenzis, and Matthias B. Hullin. Tracking objects outside the line of sight using 2d intensity images. Scientific Reports, 6:32491, August 2016.

[50] Sreenithy Chandran and Suren Jayasuriya. Adaptive lighting for data-driven non-line-of-sight 3d localization and object identification, May 2019.

[51] Xin Lei, Liangyu He, Yixuan Tan, Ken Xingze Wang, Xinggang Wang, Yihan Du, Shanhui Fan, and Zongfu Yu. Direct object recognition without line-of-sight using optical coherence. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.

[52] Liang Gao, Jinyang Liang, Chiye Li, and Lihong V. Wang. Single-shot compressed ultrafast photography at one hundred billion frames per second. Nature, 516:74–77, 2014.

[53] Bernhard Buttgen, Thierry Oggier, Michael Lehmann, Rolf Kaufmann, and Felix Lustenberger. CCD/CMOS lock-in pixel for range imaging: Challenges, limitations and state-of-the-art. 2005.

[54] Bernhard Buttgen and Peter Seitz. Robust optical time-of-flight range imaging based on smart pixel structures. IEEE Transactions on Circuits and Systems I: Regular Papers, 55(6):1512–1525, July 2008.

[55] Lin Yang, Longyu Zhang, Haiwei Dong, Abdulhameed Alelaiwi, and Abdulmotaleb El Saddik. Evaluating and improving the depth accuracy of Kinect for Windows v2. IEEE Sensors Journal, 15(8):4275–4285, August 2015.

[56] Florian Willomitzer, Fengqiang Li, Prasanna Rangarajan, and Oliver Cossairt. Non-line-of-sight imaging using superheterodyne interferometry. In Imaging and Applied Optics 2018 (3D, AO, AIO, COSI, DH, IS, LACSEA, LS&C, MATH, pcAOP), page CM2E.1. Optical Society of America, 2018.

[57] Roxana Rezvani Naraghi, Heath Gemar, Mahed Batarseh, Andre Beckus, George Atia, Sergey Sukhov, and Aristide Dogariu. Wide-field interferometric measurement of a nonstationary complex coherence function. Opt. Lett., 42(23):4929–4932, Dec 2017.

[58] Nikhil Naik, Shuang Zhao, Andreas Velten, Ramesh Raskar, and Kavita Bala. Single view reflectance capture using multiplexed scattering and time-of-flight imaging. In Proceedings of the 2011 SIGGRAPH Asia Conference, SA ’11, pages 171:1–171:10, New York, NY, USA, 2011. ACM.

[59] Guy Satat, Barmak Heshmat, Christopher Barsi, Dan Raviv, Ou Chen, Moungi G. Bawendi, and Ramesh Raskar. Locating and classifying fluorescent tags behind turbid layers using time-resolved inversion. Nature Communications, 6:6796, 2015.

[60] Anand T. N. Kumar, Scott Raymond, Greg Boverman, David Boas, and Brian J. Bacskai. Time resolved fluorescence tomography of turbid media based on lifetime contrast. Optics Express, 14:12255–12270, 2007.

[61] Dan Raviv, Christopher Barsi, Nikhil Naik, Micha Feigin, and Ramesh Raskar. Pose estimation using time-resolved inversion of diffuse light. Opt. Express, 22(17):20164–20176, Aug 2014.

[62] Guy Satat, Barmak Heshmat, Dan Raviv, and Ramesh Raskar. All photons imaging through volumetric scattering. Scientific Reports, 6:33946, 2016.

[63] Guy Satat, Matthew Tancik, and Ramesh Raskar. Towards photography through realistic fog. In 2018 IEEE International Conference on Computational Photography (ICCP), pages 1–10, May 2018.

[64] Andreas Velten, Di Wu, Adrian Jarabo, Belen Masia, Christopher Barsi, Chinmaya Joshi, Everett Lawson, Moungi Bawendi, Diego Gutierrez, and Ramesh Raskar. Femto-photography: capturing and visualizing the propagation of light. ACM Trans. Graph., 32:44:1–44:8, 2013.

[65] Achuta Kadambi, Refael Whyte, Ayush Bhandari, Lee Streeter, Christopher Barsi, Adrian Dorrington, and Ramesh Raskar. Coded time of flight cameras: Sparse deconvolution to address multipath interference and recover time profiles. ACM Trans. Graph., 32(6):167:1–167:10, November 2013.

[66] Ayush Bhandari, Achuta Kadambi, Refael Whyte, Christopher Barsi, Micha Feigin, Adrian Dorrington, and Ramesh Raskar. Resolving multipath interference in time-of-flight imaging via modulation frequency diversity and sparse regularization. Opt. Lett., 39(6):1705–1708, Mar 2014.

[67] Achuta Kadambi, Ayush Bhandari, Refael Whyte, Adrian Dorrington, and Ramesh Raskar. Demultiplexing illumination via low cost sensing and nanosecond coding. In IEEE International Conference on Computational Photography (ICCP), pages 1–10, May 2014.

[68] Di Wu, Andreas Velten, Matthew O’Toole, Belen Masia, Amit Agrawal, Qionghai Dai, and Ramesh Raskar. Decomposing global light transport using time of flight imaging. International Journal of Computer Vision, 107(2):123–138, Apr 2014.

[69] Genevieve Gariepy, Nikola Krstajic, Robert K. Henderson, Chunyong Li, Robert R. Thomson, Gerald S. Buller, Barmak Heshmat, Ramesh Raskar, Jonathan Leach, and Daniele Faccio. Single-photon sensitive light-in-flight imaging. Nature Communications, 6:6021, 2015.

[70] Achuta Kadambi, Jamie Schiel, and Ramesh Raskar. Macroscopic interferometry: Rethinking depth estimation with frequency-domain time-of-flight. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 893–902, June 2016.

[71] M. O’Toole, F. Heide, D. Lindell, K. Zang, S. Diamond, and G. Wetzstein. Reconstructing transient images from single-photon sensors. Proc. IEEE CVPR, 2017.

[72] Guy Satat, Matthew Tancik, and Ramesh Raskar. Lensless imaging with compressive ultrafast sensing. IEEE Transactions on Computational Imaging, 3(3):398–407, 2017.

[73] Barmak Heshmat, Ik Hyun Lee, and Ramesh Raskar. Optical brush: Imaging through permuted probes. Scientific Reports, 6:20217, 2016.

[74] Barmak Heshmat, Matthew Tancik, Guy Satat, and Ramesh Raskar. Photography optics in the time dimension. Nature Photonics, 12:560–566, 2018.

[75] Steven M. Seitz, Yasuyuki Matsushita, and Kiriakos N. Kutulakos. A theory of inverse light transport. In Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2, ICCV ’05, pages 1440–1447, Washington, DC, USA, 2005. IEEE Computer Society.

[76] Deanna Needell and Joel A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Communications of the ACM, 53(12):93–100, 2010.


[77] Amir Beck and Marc Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11):2419–2434, Nov 2009.

[78] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3(1):1–122, January 2011.

[79] Syed Azer Reza, Marco La Manna, and Andreas Velten. A physical light transport model for non-line-of-sight imaging applications. arXiv preprint arXiv:1802.01823, 2018.

[80] Karol Gregor and Yann LeCun. Learning fast approximations of sparse coding. In ICML, 2010.

[81] Xiaohan Chen, Jialin Liu, Zhangyang Wang, and Wotao Yin. Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 9079–9089. Curran Associates, Inc., 2018.

[82] Guy Satat, Matthew Tancik, Otkrist Gupta, Barmak Heshmat, and Ramesh Raskar. Object classification through scattering media with deep learning on time resolved measurement. Opt. Express, 25(15):17466–17479, Jul 2017.

[83] Isaac Freund. Looking through walls and around corners. Physica A: Statistical Mechanics and its Applications, 168:49–65, 1990.

[84] Shechao Feng, Charles Kane, Patrick A. Lee, and A. Douglas Stone. Correlations and fluctuations of coherent wave transmission through disordered media. Physical Review Letters, 61:834–837, 1988.

[85] Isaac Freund, M. Rosenbluh, and Shechao Feng. Memory effects in propagation of optical waves through disordered media. Physical Review Letters, 61:2328–2331, 1988.

[86] Yoav Shechtman, Yonina C. Eldar, Oren Cohen, Henry N. Chapman, Jianwei Miao, and Mordechai Segev. Phase retrieval with application to optical imaging, 2014.

[87] Kishore Jaganathan, Yonina C. Eldar, and Babak Hassibi. Phase retrieval: An overview of recent developments, 2015.

[88] Jacopo Bertolotti, Elbert G. van Putten, Christian Blum, Ad Lagendijk, Willem L. Vos, and Allard P. Mosk. Non-invasive imaging through opaque scattering layers. Nature, 491:232–234, 2012.

[89] Brandon M. Smith, Pratham Desai, Vishal Agarwal, and Mohit Gupta. CoLux: Multi-object 3d micro-motion analysis using speckle imaging. ACM Trans. Graph., 36(4):34:1–34:12, July 2017.

[90] Andre Beckus, Alexandru Tamasan, Aristide Dogariu, Ayman F. Abouraddy, and George K. Atia. On the inverse problem of source reconstruction from coherence measurements. J. Opt. Soc. Am. A, 35(6):959–968, Jun 2018.

[91] Pradeep Sen, Billy Chen, Gaurav Garg, Stephen R. Marschner, Mark Horowitz, Marc Levoy, and Hendrik P. A. Lensch. Dual photography. In ACM SIGGRAPH 2005 Papers, SIGGRAPH ’05, pages 745–755, New York, NY, USA, 2005. ACM.

[92] Neil G. Alldrin and David J. Kriegman. A planar light probe. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2, CVPR ’06, pages 2324–2330, Washington, DC, USA, 2006. IEEE Computer Society.

[93] Ashok Veeraraghavan, Ramesh Raskar, Amit Agrawal, Ankit Mohan, and Jack Tumblin. Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing. In ACM SIGGRAPH 2007 Papers, SIGGRAPH ’07, New York, NY, USA, 2007. ACM.

[94] Antonio Torralba and William T. Freeman. Accidental pinhole and pinspeck cameras: Revealing the scene outside the picture. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 374–381, June 2012.

[95] Felix Naser, Igor Gilitschenski, Guy Rosman, Alexander Amini, Fredo Durand, Antonio Torralba, Gregory W. Wornell, William T. Freeman, Sertac Karaman, and Daniela Rus. ShadowCam: Real-time detection of moving obstacles behind a corner for autonomous vehicles. pages 560–567, November 2018.

[96] Jonathan Klein, Martin Laurenzis, Dominik Ludewig Michels, and Matthias B. Hullin. A quantitative platform for non-line-of-sight imaging problems. In BMVC, 2018.


[97] Xiaochun Liu, Sebastian Bauer, and Andreas Velten. Analysis of feature visibility in non-line-of-sight measurements. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.

[98] Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G. Dimakis. Compressed sensing using generative models. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 537–546, International Convention Centre, Sydney, Australia, 06–11 Aug 2017. PMLR.

[99] Sebastian Lunz, Carola Schoenlieb, and Ozan Oktem. Adversarial regularizers in inverse problems. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 8516–8525. Curran Associates, Inc., 2018.

[100] Aviad Aberdam, Jeremias Sulam, and Michael Elad. Multi-layer sparse coding: The holistic way. SIAM Journal on Mathematics of Data Science, 1(1):46–77, 2019.

[101] Kyong Hwan Jin, Michael McCann, Emmanuel Froustey, and Michael Unser. Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing, 26(9):4509–4522, Sep. 2017.

[102] Chengqian Che, Fujun Luan, Shuang Zhao, Kavita Bala, and Ioannis Gkioulekas. Inverse transport networks, 2018.

[103] Yan Yang, Jian Sun, Huibin Li, and Zongben Xu. Deep ADMM-Net for compressive sensing MRI. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 10–18. Curran Associates, Inc., 2016.

[104] J. H. Rick Chang, Chun-Liang Li, Barnabas Poczos, B. V. K. Vijaya Kumar, and Aswin C. Sankaranarayanan. One network to solve them all — solving linear inverse problems using deep projection models. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 5889–5898, Oct 2017.

[105] Steven Diamond, Vincent Sitzmann, Felix Heide, and Gordon Wetzstein. Unrolled optimization with deep priors, 2017.
