Light field imaging: modelling, parameterization and sparsification

Atanas Gotchev, Tampere University

11.6.2019

The most popular city in Finland to live and study in

Tampere Universities: Tampere University of Technology, University of Tampere and Tampere University of Applied Sciences
• 35,000 students
• 5,000 employees

Tampere: third largest city in Finland, 220,000 inhabitants, and one of the fastest growing urban centres in Finland

Methods for capture, representation and processing of real-world 3D visual data

Knowledge about perception of depth and visual cues

Optimal visualization on emerging 3D displays

This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 764951.

The science of more exciting tomorrow

Presentation Outline

• Introduction to plenoptic function, 4D light field and light field displays

• Epipolar plane image representation and densely sampled light fields (DSLF)

• DSLF reconstruction

• Angular super-resolution

• Spatial super-resolution

• DSLF compression

• DSLF applications

Plenoptic function, 4D light field and light field displays

Plenoptic function (PF)

• Introduced by Adelson and Bergen (1991)

• Plenus (complete) + Optic = Plenoptic

• 7-D continuous function that describes the light field: P(θ, φ, λ, t, Vx, Vy, Vz)

• (Vx, Vy, Vz) – location in 3D space

• (θ, φ) – angles determining the direction

• λ – wavelength

• t – time

[Figure: viewing position (Vx, Vy, Vz) and direction angles (θ, φ) in the (x, y, z) coordinate system]

Two-plane parameterization

• A 4-D approximation of PF, parameterized through two parallel planes L(u,v,s,t)

[Figure: two-plane parameterization with camera plane (s, t), image plane (u, v), and sampling intervals Δs, Δu]

Levoy and Hanrahan (1996) – light field

Gortler et al. (1996) – Lumigraph
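To make the two-plane indexing concrete, here is a minimal NumPy sketch (array names and sizes are illustrative assumptions, not part of the slides) showing how a sampled 4D light field L(s, t, u, v) yields a sub-aperture view by fixing the camera-plane coordinates, and an epipolar plane image by fixing one camera coordinate and one image row:

import numpy as np

# Hypothetical sampled light field: S x T camera grid, U x V pixels per view.
S, T, U, V = 9, 9, 256, 256
L = np.zeros((S, T, U, V), dtype=np.float32)   # L[s, t, u, v]

# Sub-aperture view: fix the camera-plane coordinates (s, t).
view_center = L[S // 2, T // 2]                # shape (U, V)

# Horizontal-parallax EPI: fix the camera row s and the image row u, vary (t, v).
epi = L[S // 2, :, U // 2, :]                  # shape (T, V)

print(view_center.shape, epi.shape)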

Light field displays

• Perceptual light field (PLF): how the human eyes sample the light field

• Light field displays aimed at reconstructing:

• Stereo

• Focus (accommodation and retinal blur)

• Continuous parallax

A. Stern, Y. Yitzhaky, B. Javidi, "Perceivable light fields: Matching the requirements between the human visual system and autostereoscopic 3-D displays," Proc. IEEE, Oct. 2014.

M. Banks, D. Hoffman, J. Kim, G. Wetzstein, "3D Displays," Annual Review of Vision Science, 2016.

A non-exhaustive LF display nomenclature

• Integral Imaging displays

• Super-multiview displays

• Tensor displays

H. Huang and H. Hua, "Systematic characterization and optimization of 3D light field displays," Opt. Express, 2017.

G. Wetzstein et al., "Tensor displays: Compressive light field synthesis using multilayer displays with directional backlighting," ACM Trans. Graph., July 2012.

Y. Takaki, "Development of super multi-view displays," ITE Transactions on Media Technology and Applications, 2014.

Projection Based Light Field Displays

• Ray generators

• Discrete to continuous conversion

• LF reconstruction instead of views

T. Balogh, “The HoloVizio system,” Proc. SPIE 6055, 2006

Epipolar plane images, their Fourier domain characteristics, and the densely-sampled light field

Forming epipolar plane image (EPI) from a 3D scene

[Figure: two-plane parameterization with (s, t) the camera plane and (u, v) the image plane; a scene at depth z observed from camera positions t_A, t_B, t_C, t_D, t_E]

Forming epipolar plane image (EPI) from a 3D scene

$$v = \frac{v_2 - v_1}{t_2 - t_1}\,(t - t_1) + v_1 = \frac{f}{z_0}\,(t - t_1) + v_1$$

$$\Delta t = t_2 - t_1, \qquad \Delta v = v_2 - v_1, \qquad \Delta v = \frac{f}{z_0}\,\Delta t$$

[Chai et al., "Plenoptic sampling," SIGGRAPH 2000]

[Figure: a scene point at depth z0 appears at image coordinates v1 and v2 in the views captured at camera positions t1 and t2, tracing a line in the (t, v) EPI]
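As a quick sanity check of the slope relation (the numbers are illustrative assumptions, not from the slides): with $f = 50$ mm, $z_0 = 2$ m and camera spacing $\Delta t = 0.2$ mm,

$$\Delta v = \frac{f}{z_0}\,\Delta t = \frac{0.05~\mathrm{m}}{2~\mathrm{m}} \times 0.0002~\mathrm{m} = 5~\mu\mathrm{m},$$

i.e. about one pixel for a sensor with 5 µm pixel pitch, which already anticipates the ≤ 1 px disparity criterion of the densely sampled light field introduced below.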

[Figure: example EPIs in the (t, v) plane, extracted at several image rows (panels A-F) of a multiview capture]

Full parallax

• 4D EPI hyper-cube

[Figure: the 4D EPI hyper-cube spanned by the camera-plane coordinates (s, t) and the image-plane coordinates (u, v)]

EPI in continuous Fourier domain

[Figure: an EPI in (v, t) and its continuous Fourier spectrum in (Ω_v, Ω_t); the spectral support is bounded by lines whose slopes correspond to the scene depth range from z_min to z_max (Chai et al., SIGGRAPH 2000)]

Discretization in spatial and angular domains

[Figure: discretization of the EPI. Sampling in v (image resolution) and in t (camera density) replicates the EPI spectrum in (Ω_v, Ω_t) at intervals 2π/Δv and 2π/Δt, respectively]

Alias free sampling

• Scene with Lambertian properties and without occlusions

• Practical estimation for scene sensing / rendering

[Figure: minimum-sampling trade-off curve between the number of images and the number of depth layers, for a given rendering resolution]

J.-X. Chai, X. Tong, S.-C. Chan, H.-Y. Shum, "Plenoptic sampling," SIGGRAPH (Computer Graphics), July 2000.

$$\Delta t = \frac{N_d}{K_{f_v}\, f\, h_d}, \qquad N_d \ge 1, \qquad h_d = \frac{1}{z_{min}} - \frac{1}{z_{max}}$$

$$K_{f_v} = \min\!\left( B_{vs},\; \frac{1}{2\Delta v},\; \frac{1}{2\delta v} \right)$$

B_{vs} – highest (texture) frequency
Δv – sampling camera resolution
δv – rendering camera resolution
N_d – number of depth layers
Δt – camera sampling interval
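A small numeric sketch of this minimum-sampling rule (the parameter values are illustrative assumptions; the formula follows the reconstruction above, after Chai et al. 2000):

def max_camera_spacing(f, z_min, z_max, b_texture, dv_sample, dv_render, n_layers=1):
    """Maximum alias-free camera spacing Δt for a Lambertian, occlusion-free scene."""
    h_d = 1.0 / z_min - 1.0 / z_max                                       # depth-range term h_d
    k_fv = min(b_texture, 1.0 / (2 * dv_sample), 1.0 / (2 * dv_render))   # K_fv
    return n_layers / (k_fv * f * h_d)                                    # Δt = N_d / (K_fv * f * h_d)

# Illustrative scene: 50 mm lens, depth range 1-5 m, 5 µm pixel pitch for both
# sampling and rendering cameras, texture bandwidth below the pixel Nyquist limit.
dt = max_camera_spacing(f=0.05, z_min=1.0, z_max=5.0,
                        b_texture=5e4, dv_sample=5e-6, dv_render=5e-6, n_layers=1)
print(f"maximum camera spacing ≈ {dt * 1e3:.2f} mm")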

Densely sampled light field (DSLF)

• Sampling that allows treating the disparity space as a continuous space

• Less than 1px disparity between adjacent views

• Lines in EPI become unambiguous

• Influenced by

• Sampling density on the t and v plane

• (Minimal) depth and (smallest) details in the scene

• Bilinear interpolation can be used for finding finer details without introducing any major aliasing errors (see the sketch after this slide)

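A minimal sketch of what "treating the disparity space as continuous" buys (illustrative NumPy code, not from the slides): once adjacent views differ by at most about one pixel, an intermediate view can be obtained by plain linear blending of its two neighbours.

import numpy as np

def intermediate_view(view_a, view_b, w):
    """Linearly blend two adjacent DSLF views at fractional camera position w in [0, 1].

    In a densely sampled light field the disparity between the two views is at
    most ~1 px, so cross-fading approximates the in-between view without visible
    ghosting; for larger disparities this simple blend breaks down.
    """
    return (1.0 - w) * view_a + w * view_b

# Illustrative use with two neighbouring views of shape (H, W):
va = np.random.rand(256, 256).astype(np.float32)
vb = np.roll(va, 1, axis=1)               # synthetic neighbour with ~1 px disparity
v_half = intermediate_view(va, vb, 0.5)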

Densely sampled light field (DSLF)

[Figure: EPIs with maximum disparity d_max = 1 px and d_max = 10 px between adjacent views; recovering the continuous plenoptic function from the d_max = 1 case needs only linear interpolation, while the d_max = 10 case requires more advanced interpolation]

Densely sampled light field reconstruction (aka angular super-resolution) by sparsification in the shearlet transform domain

Reconstruction by processing EPIs

[Figure: a set of captured views forms a coarsely sampled EPI in (t, v); the goal is the densely sampled EPI with ≤ 1 px disparity between adjacent views]

• Lines in the EPI domain, cones in the spectral domain: structured data

Reconstruction by processing EPIs

• Inpainting: fill in holes (missing pixels) with visually acceptable values

$$\hat{\alpha} = \arg\min_{\alpha} \tfrac{1}{2}\,\| H D \alpha - H y \|_2^2 + \lambda\, \| \alpha \|_1,$$

where H is the operator selecting the given samples and masking the missing ones

• D is a proper dictionary / transform domain in which the light field becomes sparse

[Figure: the full EPI y and its masked version Hy containing only the captured views]

Shearlet elements in Fourier and spatial domain

• Dictionary is formed by shearlet atoms and the coefficients are found by Shearlet transform

$$y = D\alpha, \qquad \alpha = S(y), \qquad y = S^{*}(\alpha)$$

Vagharshakyan, Bregovic, Gotchev, "Light Field Reconstruction using Shearlet Transform," IEEE Trans. PAMI, 2017.

The algorithm

• Reconstruction formula: $\hat{y} = \arg\min_{y} \|S(y)\|_1$, subject to $x = Hy$

[Figure: the coarsely sampled EPI $x = Hy$ (measured views) and the densely sampled EPI $y$ to be reconstructed, both in the (t, v) plane]

The algorithm

• How to solve this?

$$\hat{y} = \arg\min_{y} \|S(y)\|_1, \quad \text{subject to } x = Hy$$

• A regularizer is needed to minimize the $\ell_1$ norm

• The regularizer is applied as hard thresholding in the shearlet domain, in the fashion of denoising:

$$(T_\lambda s)(k) = \begin{cases} s(k), & |s(k)| \ge \lambda \\ 0, & |s(k)| < \lambda \end{cases}$$

The algorithm

• Iterative procedure

$$y_{n+1} = S^{*}\!\left( T_{\lambda_n}\!\big( S\!\left( y_n + \alpha_n (x - H y_n) \right) \big) \right),$$

where

$$(T_\lambda s)(k) = \begin{cases} s(k), & |s(k)| \ge \lambda \\ 0, & |s(k)| < \lambda \end{cases}$$

is a hard thresholding operator and $\alpha_n$ is an acceleration parameter controlling the convergence.

Vagharshakyan, Bregovic, Gotchev, "Light Field Reconstruction using Shearlet Transform," IEEE Trans. PAMI, 2017.
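A compact NumPy sketch of this iteration. The shearlet analysis/synthesis pair is abstracted as callables S and S_adj, and the masking operator H as an elementwise binary mask; these names, the linear threshold schedule and the constant step α are assumptions made for illustration, not the authors' reference implementation.

import numpy as np

def reconstruct_epi(x, mask, S, S_adj, n_iter=50, lam_max=0.5, lam_min=0.01, alpha=1.0):
    """Iterative hard-thresholding reconstruction of a densely sampled EPI.

    x       : measured EPI with the missing rows set to zero
    mask    : binary array, 1 where a pixel was actually captured (plays the role of H)
    S, S_adj: shearlet analysis and synthesis transforms (assumed to be given)
    """
    y = x.copy()
    lambdas = np.linspace(lam_max, lam_min, n_iter)   # decreasing threshold schedule
    for lam in lambdas:
        residual = x - mask * y                       # (x - H y_n) on the captured samples
        coeffs = S(y + alpha * residual)              # analysis step S(.)
        coeffs[np.abs(coeffs) < lam] = 0.0            # hard thresholding T_lambda
        y = S_adj(coeffs)                             # synthesis step S*(.)
    return y

In this sketch the threshold decreases over the iterations so that the strongest, unambiguous EPI line structures are recovered first and finer details are filled in later; the schedule and α are tuning choices.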

Epipolar-plane image reconstruction (x32)

[Figure: input (16 views), ground truth, and reconstructed EPI]

Epipolar-plane image reconstruction (x32)

How to handle full parallax

• Hierarchical reconstruction allows using a smaller number of layers

Full parallax

Semi-transparent scenes

Non-Lambertian Scene

[Figure: ground truth, shearlet reconstruction, SFFT [2]]

[2] L. Shi, H. Hassanieh, A. Davis, D. Katabi, and F. Durand, "Light field reconstruction using sparsity in the continuous Fourier domain," ACM Trans. on Graphics (TOG), vol. 34, no. 1, p. 12, 2014.

Sergio Moreschini, Robert Bregovic, Atanas Gotchev, "Shearlet-Based Light Field Reconstruction of Scenes with non-Lambertian Properties," 3DTV-CON 2018.

Non-Lambertian Scene

[Figure: ground truth, shearlet reconstruction, SFFT [2]]

Non-Lambertian Scene

[Figure: shearlet reconstruction, SFFT [2]]

Joint spatial-angular super-resolution

• In the angular direction: from a horizontal-parallax light field to a densely sampled light field

• In the spatial direction: from low-spatial-resolution multi-perspective images to the required high-resolution images, i.e. a high-resolution and densely sampled light field

$\mathbf{y}$ – observed low-resolution multi-perspective images, $\mathbf{x}_{\mathrm{sr}}$ – spatially super-resolved views, $\mathbf{x}_{\mathrm{ds}}$ – densely sampled light field

$$\mathbf{y} = \mathbf{H}_{\mathrm{spt}}\,\mathbf{x}_{\mathrm{sr}}, \qquad \mathbf{x}_{\mathrm{sr}} = \mathbf{H}_{\mathrm{an}}\,\mathbf{x}_{\mathrm{ds}}$$

$\mathbf{H}_{\mathrm{spt}}$ – given decimation matrix in the spatial domain
$\mathbf{H}_{\mathrm{an}}$ – decimation matrix in the angular dimension

$$\arg\min_{\mathbf{x}_{\mathrm{sr}},\,\mathbf{x}_{\mathrm{ds}}} \;\|\mathbf{y} - \mathbf{H}_{\mathrm{spt}}\mathbf{x}_{\mathrm{sr}}\|_2^2 + \gamma\,\|\mathbf{x}_{\mathrm{sr}} - \mathbf{H}_{\mathrm{an}}\mathbf{x}_{\mathrm{ds}}\|_2^2 + \lambda\,\|\mathbf{S}\,\mathbf{x}_{\mathrm{ds}}\|_0$$

Spatial and angular super-resolution formulated as a variational optimization problem

Formulating the problem….

$$\|\mathbf{y} - \mathbf{H}_{\mathrm{spt}}\mathbf{x}_{\mathrm{sr}}\|_2^2 + \gamma\,\|\mathbf{x}_{\mathrm{sr}} - \mathbf{H}_{\mathrm{an}}\mathbf{x}_{\mathrm{ds}}\|_2^2 + \lambda\,\|\mathbf{S}\,\mathbf{x}_{\mathrm{ds}}\|_0$$

$$\mathbf{x}_{\mathrm{sr}}^{k} = \mathbf{x}_{\mathrm{sr}}^{k-1} + \tau\left(\mathbf{A}\big(\mathbf{y} - \mathbf{H}_{\mathrm{spt}}\mathbf{x}_{\mathrm{sr}}^{k-1}\big) + \gamma\,\mathbf{z}^{k}\right)$$

$\mathbf{A} \approx \mathbf{H}_{\mathrm{spt}}^{-1}$: an almost inverse, implemented as an interpolation filter + guided filtering

$\|\mathbf{y} - \mathbf{H}_{\mathrm{spt}}\mathbf{x}_{\mathrm{sr}}\|_2^2 + \gamma\,\|\mathbf{x}_{\mathrm{sr}} - \mathbf{z}^{k}\|_2^2$, for fixed $\mathbf{z}^{k} = \mathbf{H}_{\mathrm{an}}\mathbf{x}_{\mathrm{ds}}$: gradient descent

Spatial super-resolution

$$\|\mathbf{y} - \mathbf{H}_{\mathrm{spt}}\mathbf{x}_{\mathrm{sr}}\|_2^2 + \gamma\,\|\mathbf{x}_{\mathrm{sr}} - \mathbf{H}_{\mathrm{an}}\mathbf{x}_{\mathrm{ds}}\|_2^2 + \lambda\,\|\mathbf{S}\,\mathbf{x}_{\mathrm{ds}}\|_0$$

S. Vagharshakyan, R. Bregovic and A. Gotchev, "Accelerated Shearlet-Domain Light Field Reconstruction," IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 7, pp. 1082-1091, Oct. 2017.

$\gamma\,\|\mathbf{x}_{\mathrm{sr}}^{k} - \mathbf{H}_{\mathrm{an}}\mathbf{x}_{\mathrm{ds}}\|_2^2 + \lambda\,\|\mathbf{S}\,\mathbf{x}_{\mathrm{ds}}\|_0$, for fixed $\mathbf{x}_{\mathrm{sr}}^{k}$: iterative thresholding in the shearlet transform domain

Angular super-resolution
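A schematic NumPy sketch of the resulting alternating scheme: a gradient step on the spatial sub-problem for fixed z = H_an x_ds, followed by shearlet-domain iterative thresholding on the angular sub-problem for fixed x_sr. The operators (Hspt, Han, the almost-inverse A, the angular adjoint and the shearlet pair) are passed in as callables, and the loop counts, step sizes and threshold are illustrative assumptions rather than the authors' implementation; the spatial step below uses the full gradient of the γ coupling term.

import numpy as np

def joint_sr(y, Hspt, A, Han, Han_adj, S, S_adj,
             n_outer=10, n_inner=20, tau=0.5, gamma=1.0, lam=0.1):
    """Alternate spatial super-resolution (gradient descent on x_sr)
    and angular super-resolution (shearlet thresholding on x_ds)."""
    x_sr = A(y)                    # initial spatial upsampling of the observed views
    x_ds = Han_adj(x_sr)           # initial densely sampled light field estimate
    for _ in range(n_outer):
        # Spatial sub-problem: one gradient step for fixed z = Han(x_ds)
        z = Han(x_ds)
        x_sr = x_sr + tau * (A(y - Hspt(x_sr)) + gamma * (z - x_sr))
        # Angular sub-problem: iterative hard thresholding for fixed x_sr
        for _ in range(n_inner):
            resid = Han_adj(x_sr - Han(x_ds))          # data-consistency step
            coeffs = S(x_ds + resid)
            coeffs[np.abs(coeffs) < lam] = 0.0         # hard thresholding
            x_ds = S_adj(coeffs)
    return x_sr, x_ds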

Results: block-average x2

Mattia Rossi and Pascal Frossard, "Geometry-Consistent Light Field Super-Resolution via Graph-Based Regularization," IEEE Trans. on Image Processing, vol. 27, no. 9, pp. 4207-4218, Sep. 2018.

Results: block-average x3

Results: block-average x4

Results: Gaussian average x2

Martin Alain, Aljosha Smolic, "Light Field Super-Resolution via LFBM5D Sparse Coding," IEEE International Conference on Image Processing (ICIP 2018), 2018.

Results: Gaussian average x3

Results: Gaussian average x4

Compression

[Figure: view sequence with key views at POC 0, 4, 8, 12, 16]

• POC 0,4,8,12,16 are encoded with MV-HEVC.

• Predict & encode intermediate views with MV-HEVC.

• Predict & encode intermediate views with Shearlet transform.

• Anchor: POC 0 to POC 16 encoded with HEVC.

Research Methodology (Single Layer Example)

Compression

Proposed Compression Scheme: Encoder

[Block diagram: the input 17x17 views are sub-sampled to 5x5 views and encoded with the MV-HEVC encoder; the MV-HEVC-decoded 5x5 views drive a shearlet-transform prediction of the remaining [17x17 - 5x5] views; the residual between the predicted and the reference [17x17 - 5x5] views is estimated and encoded with MV-HEVC into the output stream]

Proposed Compression Scheme: Decoder

[Block diagram: the encoded stream is pre-processed and split; the encoded 5x5 views are decoded with MV-HEVC, and the decoded 5x5 views feed the shearlet-transform prediction of the [17x17 - 5x5] views; the encoded residual [17x17 - 5x5] views are decoded with MV-HEVC and added to the predicted views in residual compensation, yielding the 17x17 output views]
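The whole scheme in schematic Python pseudocode. The codec and prediction functions are passed in as callables standing in for MV-HEVC and the shearlet-based view prediction described above; their names and signatures are assumptions made for illustration only.

def encode_light_field(views, codec_encode, codec_decode, predict, step=4):
    """Encoder sketch: transmit a sub-sampled anchor grid plus prediction residuals.

    views         : array of shape (17, 17, H, W) holding the full view grid
    codec_encode / codec_decode : stand-ins for the MV-HEVC encoder / decoder
    predict       : stand-in for the shearlet-transform prediction of the full grid
    """
    anchors = views[::step, ::step]                  # 5x5 anchor views (every 4th view)
    anchor_stream = codec_encode(anchors)
    decoded_anchors = codec_decode(anchor_stream)    # closed loop: predict from decoded anchors
    predicted = predict(decoded_anchors, out_shape=views.shape)
    residual = views - predicted                     # only the non-anchor [17x17 - 5x5] views matter
    residual_stream = codec_encode(residual)
    return anchor_stream, residual_stream

def decode_light_field(anchor_stream, residual_stream, codec_decode, predict, out_shape):
    """Decoder sketch: repeat the prediction and apply residual compensation."""
    decoded_anchors = codec_decode(anchor_stream)
    predicted = predict(decoded_anchors, out_shape=out_shape)
    return predicted + codec_decode(residual_stream) # 17x17 output views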

Compression

Truck image: rate-distortion curves for the 17x17 grid encoded by HEVC and x265 (anchors) vs. the 17x17 grid reconstructed by shearlet prediction from the 5x5 HEVC-decoded grid

• Delta SNR of -2.21 and -3.88 dB against the two anchors (HEVC and x265)

[Figure: example decoded views, HEVC at 35 dB vs. shearlet at 39 dB]

Compression

Bunny image: rate-distortion curves for the 17x17 grid encoded by HEVC and x265 (anchors) vs. the 17x17 grid reconstructed by shearlet prediction from the 5x5 HEVC-decoded grid

• Delta SNR of -0.17 and -1.85 dB against the two anchors (HEVC and x265)

[Figure: ground truth, HEVC-decoded view at PSNR 37 dB, shearlet-reconstructed view at PSNR 41 dB]

Applications

Continuous refocusing for integral microscopy with Fourier plane recording

• Conversion of light fields to several different types of holographic representations (e.g. holographic stereogram, Fresnel holograms) is studied.

• For example, hogels of holographic stereograms consist of several windowed plane waves (propagating in different directions) whose intensities are defined by the captured light field:

$$O_{HS}(x) = \sum_m \mathrm{rect}\!\left(\frac{x - m\Delta x}{\Delta x}\right) \sum_i L(m, i)\, \exp\!\left(j 2\pi f_x^{mi}\, x\right)$$
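A 1-D NumPy sketch of this hogel construction (the grid sizes, propagation angles and the sampled light field L are illustrative assumptions):

import numpy as np

# Hypothetical sampled light field: M hogel positions, N_DIR directions per hogel.
M, N_DIR = 32, 16
L = np.random.rand(M, N_DIR)              # L(m, i): captured ray intensities
dx = 64e-6                                # hogel width Delta_x [m]
wavelength = 532e-9                       # [m]
n_samples = 4096
x = (np.arange(n_samples) - n_samples / 2) * (M * dx / n_samples)

# Carrier frequency f_x^{mi} of each windowed plane wave, set here by the
# ray direction: f_x = sin(theta_i) / lambda (assumed independent of m).
thetas = np.deg2rad(np.linspace(-5, 5, N_DIR))
f_x = np.sin(thetas) / wavelength

O_HS = np.zeros(n_samples, dtype=complex)
for m in range(M):
    center = (m - M / 2 + 0.5) * dx
    window = np.abs(x - center) <= dx / 2                 # rect((x - m*dx) / dx)
    for i in range(N_DIR):
        O_HS += window * L[m, i] * np.exp(1j * 2 * np.pi * f_x[i] * x)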

Hologram generation from light fields

• Holograms usually impose dense capture of light fields, which requires tedious work, e.g. with camera rigs.

• We demonstrated that the capture constraints can be significantly relieved by utilizing the shearlet-decomposition-based light field reconstruction; thus, it becomes possible to use sparse camera arrays.

[Figure: perceived images by the human eye at different viewpoints, corresponding to LFs with (a) 1 mm and (b) 8 mm baselines; hologram reconstructions are simulated via wave field propagation]

Sahin E., Vagharshakyan S., Mäkinen J., Bregovic R., Gotchev A., "Shearlet-domain light field reconstruction for holographic stereogram generation," 2016 IEEE Int. Conf. Image Processing (ICIP), pp. 1479-1483, 2016.

Hologram generation from light fields

Light field displays

Fourier analysis of Light Field displays

• Projection-based LF displays

• Optical modules

• Ray generators

• Holographic screen

• Discrete to continuous conversion

• LF reconstruction instead of views

T. Balogh, “The HoloVizio system,” Proc. SPIE 6055, 2006

Ray Propagation in LF displays

[Figure: ray-generator (RG) plane with Np generators of pitch dp and field of view FOVp, the screen plane at (x, z) = (0, 0) with pitch ds, an example ray r, and the sampling patterns evaluated at planes zp1, zp2, zp3, zp4]

LF sampling topologies in ray space

[Figure: display ray-sampling patterns in position/direction space evaluated at the planes zp1, zp2, zp3, zp4 and at the screen plane z = 0]

Display bandwidth

• Angular-spatial bandwidth at the screen level is determined by the size of the Voronoi cells calculated for the sampling grid at the screen plane (see the sketch after this slide)

• Determines the display passband (throughput)

[Figure: Voronoi cells of the ray sampling pattern at the screen plane]

R. Bregović, P. T. Kovács, A. Gotchev, "Optimization of light field display-camera configuration based on display properties in spectral domain," Opt. Express, Feb. 2016.
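To make the Voronoi-cell idea concrete, here is a small sketch (the display geometry values are invented for illustration) that builds the position/direction sampling pattern of a projection-type display at the screen plane and measures the typical Voronoi cell size with SciPy:

import numpy as np
from scipy.spatial import Voronoi, ConvexHull

# Hypothetical projection-display geometry (all values are assumptions).
n_proj = 9                          # number of ray generators
proj_pitch = 0.05                   # spacing between ray generators [m]
n_rays = 64                         # rays emitted per generator
fov = np.deg2rad(30)                # field of view of one generator
z_p = 0.6                           # distance from ray-generator plane to screen [m]

# Ray-generator positions and emission angles.
xp = (np.arange(n_proj) - (n_proj - 1) / 2) * proj_pitch
ang = np.linspace(-fov / 2, fov / 2, n_rays)

# Each ray hits the screen (z = 0) at x = xp + z_p * tan(angle);
# the light field sample at the screen is the (position, direction) pair.
X, A = np.meshgrid(xp, ang, indexing="ij")
x_screen = X + z_p * np.tan(A)
samples = np.column_stack([x_screen.ravel(), A.ravel()])
# (In practice the position and angle axes should be scaled to comparable units.)

# Voronoi cells of the sampling pattern; the typical bounded-cell area
# indicates the spatio-angular resolution, i.e. the display passband.
vor = Voronoi(samples)
areas = []
for region_idx in vor.point_region:
    region = vor.regions[region_idx]
    if len(region) == 0 or -1 in region:     # skip unbounded cells at the pattern border
        continue
    areas.append(ConvexHull(vor.vertices[region]).volume)   # .volume is the area in 2-D
print("median Voronoi cell area:", np.median(areas))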

Display-camera setup

[Figure: display-camera setup with the ray-generators plane (Np generators, FOVproj) at distance zp, the screen plane at (0, 0), and the viewing/camera plane (Nc cameras, FOVcam) at distance zc]

• Finite number of rays generated by the display
• Limited display bandwidth
• Enough 'correct' rays captured by cameras

Display bandwidth at camera plane

[Figure: display bandwidth (passband) evaluated at the camera plane in the ray-space frequency domain (Ω_x, Ω_angle)]

Camera setup for optimal capture

[Figure: ray-space sampling at the camera plane; blue: $S_{\bar{z}_c} P^{*} V(\bar{x}_p, \bar{\alpha}_p, \bar{z}_p)$, green: $P^{*} V(\bar{x}_c, \bar{\alpha}_c)$, red: $P^{*} V(x_c^{BIG}, \alpha_c^{BIG})$]

• Optimal with respect to a given display

• Desired visualization quality

• Determine the optimal display setup

• Given a display setup

• Determine the optimal capture (data) setup

• Will never have matching data

• LF interpolation / reconstruction needed

Conclusions

• Light Field technologies capable of recreating 3D visual cues beyond binocularity

• Densely Sampled Light Field as an LF representation capable of delivering the desired density of rays for recreating focus and continuous-parallax visual cues

• Computational imaging tools for DSLF reconstruction from sparse cameras

• Research challenges related to the computational complexity of LF reconstruction techniques

• Research challenges related to LF display technologies

Thank you for your attention!
