Neural Rendering - dvl.in.tum.de · Photo-realistic Image Synthesis The Rendering Equation [Kajiya...

Neural Rendering

Prof. Leal-Taixé and Prof. Niessner

Rendering

3D Scene:
- Material
- Lighting
- Geometry (incl. animation)

Camera View Point:
- Extrinsics
- 6 DoF (3 rot, 3 trans)

Camera Def.:
- Intrinsics
- Often: focal length, principal point
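How extrinsics and intrinsics combine can be sketched with a pinhole camera. This is a minimal illustration, not taken from the slides: the function name `project_points`, the focal length of 500 px, and the principal point (320, 240) are assumptions, and lens distortion is ignored.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (pinhole model)."""
    # Extrinsics (6 DoF): world -> camera coordinates, x_cam = R @ x_world + t
    points_cam = points_world @ R.T + t
    # Intrinsics: camera coordinates -> homogeneous pixel coordinates
    pixels_h = points_cam @ K.T
    # Perspective divide by depth
    return pixels_h[:, :2] / pixels_h[:, 2:3]

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                    # no rotation
t = np.array([0.0, 0.0, 2.0])    # world origin sits 2 units in front of the camera

points = np.array([[0.0, 0.0, 0.0],
                   [0.2, -0.1, 0.0]])
pixels = project_points(points, K, R, t)
```

The world origin lands on the principal point, and points off the optical axis are shifted in proportion to focal length divided by depth.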

Photo-realistic Image Synthesis

The Rendering Equation [Kajiya 86]
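The equation itself did not survive extraction. For reference, the commonly quoted hemispherical form is given below; the notation is an assumption and may differ from the slide and from Kajiya's original surface-to-surface formulation. Here $L_o$ is outgoing radiance at point $\mathbf{x}$ in direction $\omega_o$, $L_e$ is emitted radiance, $f_r$ is the BRDF, $L_i$ is incoming radiance, and $\mathbf{n}$ is the surface normal.

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, \mathrm{d}\omega_i
```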

Need 3D Content for Rendering

Textures, Material & Lighting, Geometry

Computer Vision for Reconstruction

ICCV’09 [Agarwal et al.]: Building Rome in a Day

Computer Graphics

3D Digitization

Computer Vision

Traditional Graphics vs Deep Learning

3D Model + Textures + Shading -> Synthetic Image (e.g., Star Wars Rogue One)

Generative Adversarial Networks [Karras et al. 18]
- Discriminator loss
- Generator loss
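The loss formulas on this slide were not recoverable. As a rough sketch, the two losses in the original GAN formulation (with the non-saturating generator loss) can be written as below. This is an assumption about which variant is meant; the cited work may use a different objective. The function names are illustrative.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """d_real, d_fake: discriminator probabilities in (0, 1) for real / generated images."""
    # The discriminator wants d_real -> 1 and d_fake -> 0
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: the generator wants d_fake -> 1."""
    return -np.mean(np.log(d_fake + eps))
```

With an undecided discriminator (all outputs 0.5), the discriminator loss is 2 ln 2 and the generator loss is ln 2; both shrink as the respective player improves.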

Idea of Neural Rendering

Novel view-point synthesis:
- Input: 6 DoF Camera Pose / View Point
- Neural Network: encodes entire scene description, lighting, materials, etc.

Neural Rendering with Pix2Pix

Ground truth for training:
- Pose + Target Image (e.g., observed from real world)
- Constrain with re-rendering loss

Testing:
- Given unseen pose, generate image

Other Neural Rendering
- Conditioned on Faces (Deep Video Portraits)
- Conditioned on Human Skeleton (Everybody Dance Now)

Deep Voxels

[Sitzmann et al. CVPR’19] Deep Voxels

• Main idea for video generation:
  – Why learn 3D operations with 2D convs?
  – We know how 3D transformations work
    • E.g., 6 DoF rigid pose [ R | t ]
  – Incorporate these into the architectures
    • Need to be differentiable!
  – Example application: novel view-point synthesis
    • Given rigid pose, generate image for that view
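To make "we know how 3D transformations work" concrete, here is a minimal sketch of lifting pixels with known depth back to world space using a rigid pose [ R | t ]. It is not the Deep Voxels lifting layer itself, which scatters 2D features into a voxel grid; the function name and the camera parameters are assumptions. Every step is a matrix product or a scaling, so it is differentiable with respect to its inputs.

```python
import numpy as np

def unproject(pixels, depth, K, R, t):
    """Lift Nx2 pixels with per-pixel depth to Nx3 world points."""
    ones = np.ones((pixels.shape[0], 1))
    pixels_h = np.hstack([pixels, ones])
    # Back-project through the intrinsics: ray directions with z = 1
    rays_cam = pixels_h @ np.linalg.inv(K).T
    points_cam = rays_cam * depth[:, None]
    # Invert x_cam = R @ x_world + t, i.e. x_world = R^T (x_cam - t)
    return (points_cam - t) @ R

# Round trip with hypothetical camera parameters
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle), np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.1, -0.2, 3.0])

points_world = np.array([[0.0, 0.0, 0.0],
                         [0.5, 0.2, 1.0]])
points_cam = points_world @ R.T + t
pixels = (points_cam @ K.T)[:, :2] / points_cam[:, 2:3]
recovered = unproject(pixels, points_cam[:, 2], K, R, t)
```

Projecting and then unprojecting with the true depth recovers the original world points.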

Deep Voxels

[Figure: simplified overview for novel view synthesis. Source image -> 2D Feature Extraction (2D U-Net) -> Lifting Layer (2D -> 3D, source view R, t) -> 3D Features -> 3D U-Net -> Projection Layer (3D -> 2D, target view R, t) -> Rendering (2D U-Net) -> Output. Image labels: Source, Output, Target.]

Deep Voxels

Occlusion Network:

Issue: we don’t know the depth for the target!
-> Per-pixel softmax along the ray
-> Network learns the depth
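The per-pixel softmax along the ray can be sketched as follows. In the actual method the scores come from a learned occlusion network; here they are just an input array, and the array layout and function names are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis):
    x = x - np.max(x, axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def collapse_along_ray(features, scores, depths):
    """Collapse features sampled along each pixel's ray into one feature per pixel.

    features: (H, W, D, C) features at D depth samples per pixel
    scores:   (H, W, D)    per-sample visibility scores (network output in practice)
    depths:   (D,)         depth value of each sample
    """
    weights = softmax(scores, axis=-1)                               # (H, W, D), sums to 1 per pixel
    pixel_features = np.sum(weights[..., None] * features, axis=2)   # (H, W, C)
    expected_depth = np.sum(weights * depths, axis=-1)               # (H, W) soft depth map
    return pixel_features, expected_depth, weights
```

Because the softmax weights are differentiable, the depth implied by them can be learned from the image loss alone, without depth supervision for the target view.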

Deep Voxels: Insights

• Lifting from 2D to 3D works great
  – No need to take specific care for temporal coherency!
• All 3D operations are differentiable
• Currently, only for novel view-point synthesis
  – I.e., cGAN for new pose in a given scene
• But: limited resolution due to dense 3D voxel grid

[Sitzmann et al. ’18] Deep Voxels
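The resolution limit follows from the cubic growth of a dense grid. A back-of-the-envelope sketch, where the 64 feature channels and 4-byte floats are assumptions chosen only for illustration:

```python
def dense_grid_bytes(n, channels, bytes_per_value=4):
    """Memory of a dense n x n x n feature grid."""
    return n ** 3 * channels * bytes_per_value

for n in (32, 64, 128, 256):
    gib = dense_grid_bytes(n, channels=64) / 2**30
    print(f"{n}^3 grid, 64 channels: {gib:.3f} GiB")
```

Each doubling of the resolution multiplies memory by eight, and that is before counting activations and gradients during training.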

Importing 3D structure from CG

Scene Representation | Renderer
Multi-Plane Images | (Alpha) compositing
Voxelgrids | Volumetric, ray-based
Image-based | Rasterization
Point Clouds | Splatting
Implicit Function | Sphere-traced, volumetric

Both Scene Representation and Differentiable Renderer are often adapted from traditional computer graphics.

Slides: Vincent Sitzmann (Eurographics State-of-the-art on Neural Rendering)
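As one example of a renderer borrowed from classical graphics, alpha compositing of multi-plane images reduces to the "over" operator applied back to front. The sketch below assumes the planes are already warped into the target view (a real multi-plane-image renderer performs that warp first); the array layout is an assumption.

```python
import numpy as np

def composite_planes(colors, alphas):
    """Back-to-front 'over' compositing of fronto-parallel planes.

    colors: (D, H, W, 3) RGB per plane, ordered from farthest (index 0) to nearest
    alphas: (D, H, W, 1) opacity per plane in [0, 1]
    """
    image = np.zeros(colors.shape[1:])
    for color, alpha in zip(colors, alphas):
        # A nearer plane covers what is behind it in proportion to its opacity
        image = alpha * color + (1.0 - alpha) * image
    return image
```

The operation is a chain of multiplies and adds, so gradients flow to every plane's color and opacity.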

Scene Representation Networks
Sitzmann et al., NeurIPS 2019

Scene Representation: ℝ³ → ℝⁿ, ReLU MLP
Renderer: Generalized (learned) sphere-tracing
Generalization: Hypernetwork

Full 3D reconstruction from a single image!

Slides: Vincent Sitzmann (Eurographics State-of-the-art on Neural Rendering)
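For orientation, classic sphere tracing against an analytic signed distance function looks like the sketch below. This is the textbook algorithm, not the learned variant used by Scene Representation Networks, where the step length is predicted by a network; the function names and default parameters are assumptions.

```python
import numpy as np

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere surface."""
    return np.linalg.norm(p - center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """March along a ray, stepping by the distance to the nearest surface.

    Returns the ray parameter t at the hit, or None if nothing is hit.
    """
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        distance = sdf(origin + t * direction)
        if distance < eps:
            return t          # close enough to the surface: report a hit
        t += distance         # safe step: no surface is closer than this
        if t > max_dist:
            break
    return None

center = np.array([0.0, 0.0, 5.0])
scene = lambda p: sphere_sdf(p, center, 1.0)
hit = sphere_trace(np.zeros(3), np.array([0.0, 0.0, 1.0]), scene)
miss = sphere_trace(np.zeros(3), np.array([0.0, 1.0, 0.0]), scene)
```

The learned version replaces the fixed step `t += distance` with a step size predicted from the scene features at the current point.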

NeRF: Neural Radiance Fields
Mildenhall et al., arXiv 2020

Scene Representation: ℝ⁶ → ℝ³, ReLU MLP + Positional Encoding, View Direction
Renderer: Volumetric, stratified sampling
Generalization: None.

Slides: Vincent Sitzmann (Eurographics State-of-the-art on Neural Rendering)
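The positional encoding maps each input coordinate to sines and cosines at exponentially growing frequencies before it enters the MLP. A minimal sketch follows; the function name is illustrative, and the output ordering (all sines, then all cosines per coordinate) is an implementation choice that may differ from the reference code.

```python
import numpy as np

def positional_encoding(x, num_freqs):
    """Encode (..., D) coordinates as (..., D * 2 * num_freqs) features.

    Uses frequencies 2^0 * pi, 2^1 * pi, ..., 2^(num_freqs - 1) * pi.
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi                      # (L,)
    angles = x[..., None] * freqs                                      # (..., D, L)
    encoded = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (..., D, 2L)
    return encoded.reshape(*x.shape[:-1], -1)

positions = np.zeros((5, 3))
encoded = positional_encoding(positions, num_freqs=10)   # shape (5, 60)
```

The high-frequency components let a ReLU MLP represent fine detail that it tends to smooth over when fed raw coordinates.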

Photorealistic, including view-dependence! (~100 images)
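The volumetric renderer with stratified sampling can be sketched for a single ray as below. Densities and colors would come from the MLP in practice; here they are plain arrays. The function names and the large final interval length are assumptions for illustration.

```python
import numpy as np

def stratified_samples(near, far, num_samples, rng):
    """Draw one uniform sample inside each of num_samples equal bins of [near, far]."""
    edges = np.linspace(near, far, num_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    return lower + (upper - lower) * rng.random(num_samples)

def render_ray(sigmas, colors, t_vals):
    """Numerical quadrature of the volume rendering integral along one ray.

    sigmas: (N,) densities, colors: (N, 3) RGB, t_vals: (N,) sorted sample positions
    """
    # Distance between consecutive samples; the last interval is treated as very long
    deltas = np.append(t_vals[1:] - t_vals[:-1], 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: fraction of light that reaches sample i without being absorbed
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas
    return weights @ colors, weights

rng = np.random.default_rng(0)
t_vals = stratified_samples(2.0, 6.0, 8, rng)
```

Jittering the sample positions inside each bin means that, over many training iterations, the field is queried at continuous locations rather than on a fixed grid.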

Requirements

Scene Representation | Renderer | Pros | Cons
Multi-Plane Images | (Alpha) compositing | Fast rendering, high quality, generalizes | Only 2.5D, size
Voxelgrids | Volumetric, ray-based | “True 3D”, high quality | No reconstruction priors, memory O(n³)
Image-based | Rasterization | High quality | Requires good SfM, no compact representation
Point Clouds | Splatting | High quality | Requires good SfM
Implicit Function | Sphere-tracing, volumetric | High quality, may generalize! | Expensive rendering, training

Slides: Vincent Sitzmann (Eurographics State-of-the-art on Neural Rendering)

Neural Textures: Features on 3D Mesh

Siggraph’19 [Thies et al.]: Neural Textures

[Figure: 3D Geometry + View R, t -> Rendering (3D -> 2D) -> UV-Map; UV-Map + Neural Texture -> Sampled Texture -> Renderer -> Output Image]
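The sampling step, looking up a learned feature texture at the rasterized UV coordinates of each pixel, can be sketched with bilinear interpolation. The conventions here are assumptions: u indexes texture columns, v indexes rows, both lie in [0, 1], and the texture is at least 2 x 2 texels.

```python
import numpy as np

def sample_texture(texture, uv):
    """Bilinearly sample a (Ht, Wt, C) feature texture at (H, W, 2) UV coordinates."""
    ht, wt, _ = texture.shape
    x = uv[..., 0] * (wt - 1)
    y = uv[..., 1] * (ht - 1)
    # Top-left texel of the 2x2 neighborhood, clamped so x0 + 1 and y0 + 1 stay in range
    x0 = np.clip(np.floor(x).astype(int), 0, wt - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, ht - 2)
    fx = (x - x0)[..., None]
    fy = (y - y0)[..., None]
    top = (1 - fx) * texture[y0, x0] + fx * texture[y0, x0 + 1]
    bottom = (1 - fx) * texture[y0 + 1, x0] + fx * texture[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom   # (H, W, C) screen-space feature image
```

The result is a screen-space feature image that the neural renderer then turns into the output image. Since the interpolation is linear in the texel values, gradients reach the texture.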

Deferred Neural Rendering

Deferred Renderer: handcrafted “feature maps” (Albedo, Depth, Normal, Lighting)

Deferred Neural Renderer: learned “feature maps”

Siggraph’19 [Thies et al.]: Neural Textures
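For contrast with the learned version, a classical deferred shading pass over handcrafted feature maps might look like this. It is a minimal Lambertian sketch with a single directional light; the function name, the ambient term, and the light parameters are assumptions for illustration.

```python
import numpy as np

def deferred_shade(albedo, normals, light_dir, light_color, ambient=0.1):
    """Shade per pixel from screen-space buffers.

    albedo:      (H, W, 3) base color buffer
    normals:     (H, W, 3) unit surface normals
    light_dir:   (3,) direction from the surface towards the light
    light_color: (3,) RGB intensity of the light
    """
    light_dir = light_dir / np.linalg.norm(light_dir)
    # Lambert's cosine term, clamped so back-facing pixels receive no direct light
    n_dot_l = np.clip(normals @ light_dir, 0.0, None)[..., None]
    return albedo * (ambient + n_dot_l * light_color)
```

Deferred neural rendering keeps this structure (rasterize buffers first, shade second) but learns both the buffer contents and the shading function.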

Novel View-Point Synthesis
[Figure panels: Input UV-Map, Ours]
[Figure panels: Ground Truth, Ours]

Scene Editing
[Figure panels: Input Sequence, Geometry Editing]

Facial Animation

Siggraph’19 [Thies et al.]: Neural Textures

Neural Voice Puppetry

[Hannun et al.] DeepSpeech RNN

Output of the RNN of DeepSpeech:
- Logits of alphabet (|alphabet|=29)

We use a time window (n=16)
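The windowing of per-frame logits can be sketched as below. Only the alphabet size of 29 and the window length of 16 come from the slide; centering the window on each frame and padding the sequence ends by repeating the edge frames are assumptions.

```python
import numpy as np

def sliding_windows(logits, window=16):
    """Turn (T, 29) per-frame logits into (T, window, 29) overlapping windows."""
    before = window // 2
    after = window - before - 1
    # Repeat the first and last frame so every frame gets a full window
    padded = np.pad(logits, ((before, after), (0, 0)), mode="edge")
    return np.stack([padded[i:i + window] for i in range(logits.shape[0])])

logits = np.random.default_rng(0).normal(size=(40, 29))   # stand-in for DeepSpeech output
windows = sliding_windows(logits)                          # shape (40, 16, 29)
```

Each window gives the downstream network some temporal context around the frame whose expression it predicts.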

Person-specific Blendshape Expression Model
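A blendshape expression model is, at its core, a linear combination of per-vertex offsets added to a neutral face. The sketch below shows that generic form; the array shapes and names are assumptions, and the person-specific model on the slide may differ in detail.

```python
import numpy as np

def blendshape_mesh(neutral, basis, coefficients):
    """Evaluate a linear blendshape model.

    neutral:      (V, 3)    vertices of the neutral face
    basis:        (K, V, 3) per-vertex offsets for each of K expression blendshapes
    coefficients: (K,)      expression weights (e.g., predicted from audio)
    """
    # Weighted sum over the K blendshapes, added to the neutral geometry
    return neutral + np.tensordot(coefficients, basis, axes=1)
```

Predicting a handful of coefficients per frame is a much lower-dimensional target than predicting vertices or pixels directly.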

Audio2Expression Training

Hundreds of commentator videos available, all with ‘neutral’ talking style

[Figure panels: Flame Model, Basel Model with pose]

Big Open Challenges

Photo-realistic Reconstruction

Big Open Challenges: How much can AI do?

Siggraph’19 [Thies et al.]: Neural Textures

Big Open Challenges: 3D in Networks

Why learn 3D operations, such as transformations?
-> differentiate known operators

Capsule networks are motivated by inverse graphics [Sabour et al. 17]

See you next week

Some Extra Slides:

Facial Reenactment
[Figure panels: Dense, Sparse]

Neural Rendering and Reenactment of Human Actor Videos [Liu et al. 19]

Body Reenactment
[Figure panels: Dense, Sparse]

Open Challenges

• Motion Capturing
• Person-specific Motions/Expressions
• Temporal Stability
• Image Quality