Probabilistic Approaches for RGB-D Video Enhancement and Applications

Speaker: Lu Sheng
Supervisor: Prof. King Ngi Ngan

Lu Sheng, Thesis Oral Defense



Why Is RGB-D Data Essential?

RGB: 2D visual pattern
Depth: 3D geometry

An RGB image cannot explicitly tell the computer the 3D structure of each object

Depth cannot tell us the overlaid texture patterns

RGB + Depth helps us comprehensively understand the 3D visual world


Why Is RGB-D Data Essential?

Explosive growth of 3D applications

3D reconstruction

Novel view synthesis

Virtual reality / augmented reality

3DTV & FTV

Refocus

Motion sensing / gesture recognition


Why Is RGB-D Data Essential?

Explosive growth of 3D applications

Autonomous navigation & safety

Personal & industrial robots

Scene understanding

Pedestrian detection

Action recognition


Recent Depth Acquisition Methods

Passive methods: stereo vision, shape-from-shading, structure-from-motion

Drawbacks

Usually computationally intensive

Mediocre quality

Require simple or artificial shooting conditions


Recent Depth Acquisition Methods

Active sensors: Kinect, time-of-flight camera, laser scanner

Compared with passive methods

Standard-resolution depth frames at video frame rate

More robust to difficult shooting conditions

Drawbacks

Poor depth quality keeps depth-based tasks from reaching their full potential


High Quality Depth Data are Important

A lot of applications require high quality depth data

Spatiotemporal depth video enhancement is necessary

Depth data cannot perform structural regularization on their own

If accompanied by synchronized RGB data, multi-modal structural features shared by texture and geometry let the texture features guide the regularization of the depth maps


Depth is NOT Texture

Depth links to the 3D geometry of the captured scene

Learn effective methods to encode these observations

Spatial relationships between objects

Depth ordering

Occlusion reasoning

Object segmentation

Geometric structures inside each object

Piecewise smoothness

Distinctive discontinuities


Goals

Explore effective ways to achieve robust spatiotemporal enhancement of RGB-D depth video

Learn specific treatments compatible with 3D geometry for enhancement and depth-based applications

Employ probabilistic approaches to model these tasks


Hybrid Geometric Hole Filling Strategy for Spatial Enhancement

Spatial RGB-D Enhancement


Introduction

Input: RGB-D images with an upsampled raw depth image

Problems in the raw depth: low resolution, noise & outliers, depth-missing holes, structure distortions

Goal: an enhanced depth image that is high-definition, structure-optimized, and complete


Introduction

Observations

Co-occurrences between depth discontinuities and image edges

Homogeneous texture patterns have similar 3D geometries


Hybrid Geometric Hole Filling Strategy

Input RGB-D pair → Hole Filling → Depth Map Refinement → Output RGB-D pair

Hole Filling combines two strategies:

Filtering-based Depth Interpolation

Segment-based Depth Propagation


Hole Partitioning

Up-sample low-resolution depth map into sparse grid

Pixels are divided into two sets: pixels in the hole region and pixels with depth values

The hole region is further partitioned into two parts, based on the valid depth pixels in each hole pixel's neighborhood


Filtering-based Depth Interpolation

Filtering-based depth interpolation applies to hole pixels with valid depth nearby

Requires enough depth information in the neighborhood to infer a reliable depth value

Joint Bilateral Filtering
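
The interpolation step can be sketched as below. This is a minimal pure-Python illustration, not the thesis implementation: the function name `joint_bilateral_fill`, the use of `None` to mark holes, and the parameters `radius`, `sigma_s`, `sigma_r`, and `min_valid` are all assumptions made for the example. Each hole pixel is filled with a weighted average of valid neighboring depths, where the weights combine spatial nearness with similarity in the guidance (RGB or grayscale) image.

```python
import math


def joint_bilateral_fill(depth, guide, radius=2, sigma_s=1.5, sigma_r=10.0, min_valid=3):
    """Fill hole pixels (None) in `depth` using weights taken from `guide`.

    depth and guide are 2D lists of equal size. Pixels with fewer than
    `min_valid` valid neighbors are left as holes (assumed rule).
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if depth[y][x] is not None:
                continue  # measured depth is kept untouched
            num = den = 0.0
            n_valid = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue
                    d = depth[yy][xx]
                    if d is None:
                        continue
                    # spatial weight: nearer neighbors count more
                    ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    # range weight: neighbors that look alike in the guide count more
                    diff = guide[y][x] - guide[yy][xx]
                    wr = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    num += ws * wr * d
                    den += ws * wr
                    n_valid += 1
            if n_valid >= min_valid and den > 1e-12:
                out[y][x] = num / den
    return out
```

Because the range weight comes from the guidance image, a hole next to a color edge is filled mostly from the side of the edge it belongs to, which matches the observation that depth discontinuities co-occur with image edges.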


Depth Propagation under Segment Constraint

Depth propagation for the remaining hole region

Segment constraint

Depth variation is smooth in an over-segmented RGB patch

One parametric surface model in one patch

Generate segments

Superpixel – simple linear iterative clustering (SLIC)

[Figure: patch types (hole patch, patch with known depth, partially filled patch) and the result after filling]


Depth Propagation under Segment Constraint

Filling the partially filled patches by surface fitting with RANSAC

Surface propagation for hole patches

Assign each hole patch the surface model of its most similar RGB patch with a known surface model in the neighborhood

The cost function models statistical texture similarity and spatial distance

A greedy algorithm is used
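
The surface-fitting step for partially filled patches can be sketched as follows. This is an illustrative example under stated assumptions: the slides do not say which parametric surface is used, so a plane z = a*x + b*y + c is assumed here, and the names `plane_from_3` and `ransac_plane` and the parameters `n_iter`, `tol`, and `seed` are hypothetical.

```python
import random


def plane_from_3(p, q, r):
    """Plane z = a*x + b*y + c through three (x, y, z) points; None if degenerate."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p, q, r
    det = (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)
    if abs(det) < 1e-9:
        return None  # the three points are collinear in the image plane
    a = ((z1 - z3) * (y2 - y3) - (z2 - z3) * (y1 - y3)) / det
    b = ((x1 - x3) * (z2 - z3) - (x2 - x3) * (z1 - z3)) / det
    c = z3 - a * x3 - b * y3
    return a, b, c


def ransac_plane(points, n_iter=200, tol=0.5, seed=0):
    """Return the plane with the most inliers among random 3-point hypotheses."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(n_iter):
        model = plane_from_3(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = sum(1 for x, y, z in points if abs(a * x + b * y + c - z) <= tol)
        if inliers > best_inliers:
            best, best_inliers = model, inliers
    return best, best_inliers
```

Once a plane is found from the valid depth pixels of a patch, a missing pixel at (x, y) in the same patch can be filled with a*x + b*y + c.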


Depth Propagation under Segment Constraint

Generate segments → fill in partially filled patches → fill in hole patches

Depth map refinement

Various filtering methods can be used here

A standard joint bilateral filter is used for simplicity


Experimental Results

Middlebury dataset

Error metric: Bad Pixel Ratio (BP), where a pixel with Δ𝑑 ≥ 1 counts as bad

[Figure columns: RGB images, depth images, ground truth, Multi-res JBU [1], Wang et al. [2], proposed method]

         Multi-res JBU [1]   Wang et al. [2]   Proposed method
Row 1    BP: 8.35%           BP: 3.65%         BP: 3.33%
Row 2    BP: 14.10%          BP: 3.10%         BP: 2.51%

[1] C. Richardt et al., "Coherent spatiotemporal filtering, upsampling and rendering of RGBZ videos," CGF, 2012.

[2] L. Wang et al., "Stereoscopic inpainting: Joint color and depth completion from stereo images," CVPR, 2008.
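
The Bad Pixel Ratio can be computed as in the short sketch below. The function name `bad_pixel_ratio` and the convention of marking pixels without ground truth as `None` are assumptions for this example; the threshold of 1 follows the metric stated on the slide.

```python
def bad_pixel_ratio(estimate, ground_truth, threshold=1.0):
    """Percentage of pixels whose absolute depth error is at least `threshold`.

    Pixels whose ground truth is None are skipped (assumed convention).
    """
    bad = total = 0
    for est_row, gt_row in zip(estimate, ground_truth):
        for e, g in zip(est_row, gt_row):
            if g is None:
                continue
            total += 1
            if abs(e - g) >= threshold:
                bad += 1
    return 100.0 * bad / total if total else 0.0
```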


Weighted Structure Filters Based on Parametric Structural Decomposition

Spatial RGB-D Enhancement


Introduction

A variety of popular image filters are related to the local statistics of the input image

Median filter: finds the halfway point of the local cumulative distribution

Mode filter: seeks the global mode of the local distribution

Average filter: estimates the expectation of the local distribution
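
The link between these three filters and the local distribution can be illustrated with a small sketch. The function name `local_stats` and its parameters are assumptions for this example; it builds a (possibly weighted) histogram of the integer values in one local window and reads the median, mode, and mean off that histogram.

```python
def local_stats(values, weights=None, n_bins=256):
    """Return (median, mode, mean) of one local window via its histogram.

    values are integers in [0, n_bins); weights default to 1 for every sample.
    """
    if weights is None:
        weights = [1.0] * len(values)
    hist = [0.0] * n_bins
    for v, w in zip(values, weights):
        hist[int(v)] += w
    total = sum(hist)

    # median: first bin where the cumulative distribution reaches one half
    cum = 0.0
    median = None
    for b, h in enumerate(hist):
        cum += h
        if cum >= 0.5 * total:
            median = b
            break

    # mode: bin with the largest mass
    mode = max(range(n_bins), key=lambda b: hist[b])

    # mean: expectation of the distribution
    mean = sum(b * h for b, h in enumerate(hist)) / total
    return median, mode, mean
```

Passing non-uniform weights turns the same routine into the weighted variants of the filters, since only the histogram construction changes.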


Introduction

Provided with a guidance feature map

Image intensity, patches, edge maps, …

These filters can be extended to joint weighted filters

Propagate local feature statistics into the target image

Various applications

Enhancement / de-noising / style manipulation / structure decomposition, etc.


Introduction

Disparity enhancement

Image denoising

JPEG artifact removal

Contrast enhancement

Image stylization

Joint depth upsampling


Weighted Distribution Estimation

The weighted distribution combines two terms:

a weight that encodes both the spatial nearness and range affinity

a kernel that measures the data compatibility

Brute-force implementation has a high computational cost

Computational cost depends on the number of samples 𝑔𝑖

✓ Hundreds of filtering operations are required to output a satisfactory distribution

✓ How can this number be reduced without distorting the distribution?
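
The cost of the brute-force approach can be seen in the sketch below. The slide's formula did not survive extraction, so this example assumes a commonly used form: for each sample value g, the distribution at pixel x is the sum over neighbors y of a weight (spatial nearness times range affinity in the guidance signal) multiplied by a Gaussian kernel comparing g with the neighbor's value. All names and parameters (`weighted_distribution`, `sigma_s`, `sigma_r`, `sigma_k`, `radius`) are hypothetical, and a 1D signal is used to keep the example short.

```python
import math


def gauss(d, sigma):
    return math.exp(-d * d / (2.0 * sigma * sigma))


def weighted_distribution(signal, guide, x, samples,
                          radius=3, sigma_s=2.0, sigma_r=10.0, sigma_k=2.0):
    """Brute-force local distribution at position x of a 1D signal.

    One pass over the window is needed per sample value g, which is why
    a fine sampling of g is expensive when done for every pixel.
    """
    lo, hi = max(0, x - radius), min(len(signal), x + radius + 1)
    dist = []
    for g in samples:
        acc = 0.0
        for y in range(lo, hi):
            w = gauss(x - y, sigma_s) * gauss(guide[x] - guide[y], sigma_r)
            acc += w * gauss(g - signal[y], sigma_k)
        dist.append(acc)
    total = sum(dist)
    return [d / total for d in dist] if total > 0 else dist
```

With hundreds of sample values, the inner loop runs hundreds of times per pixel, which corresponds to the hundreds of filtering operations mentioned above.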


Structures in a Local Patch

[Figure: a local patch containing cloud, object, tower, and sky structures]

A patch of a natural image does not contain a large number of structures

Nearby patches share similar structures

Two pixels are similar if they both have high likelihoods to the same local structures

The distribution of a local patch can be constructed with a mixture model


A Probabilistic Kernel

Conventional kernel for data compatibility

Measures the difference between 𝑓𝑥 and 𝑓𝑦

Assume the image is composed of several (e.g. 𝐿) structures throughout the image domain


A Probabilistic Kernel

Each structure is a probabilistic model

Gaussian distribution with the noise standard deviation

Two pixels are similar if they both have high responses to the 𝑙-th model

Assemble all models


Weighted Distribution Estimation

Conventional Distribution

Kernel: Gaussian, Kronecker delta, etc.

Distribution estimation: needs hundreds of filtering operations

The Proposed Distribution

Kernel: local structure similarity

Distribution estimation: only 𝐿 filtering operations to get 𝜓𝐱(𝑙), 𝑙 ∈ ℒ

A mixture model!
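
A rough sketch of why only 𝐿 filtering operations are needed is given below. The exact kernel is not recoverable from the slides, so this example assumes that each structure is a Gaussian with mean mu_l and a shared noise standard deviation, that 𝜓𝐱(𝑙) is the weighted sum of the neighbors' responses to model 𝑙, and that the local distribution is the mixture of the model Gaussians weighted by 𝜓𝐱(𝑙). The names `structure_responses` and `mixture_distribution` and all parameters are hypothetical.

```python
import math


def gauss(d, sigma):
    return math.exp(-d * d / (2.0 * sigma * sigma))


def structure_responses(signal, guide, x, means, sigma_n=3.0,
                        radius=3, sigma_s=2.0, sigma_r=10.0):
    """psi_x(l) for each model l: one weighted filtering pass per model."""
    lo, hi = max(0, x - radius), min(len(signal), x + radius + 1)
    psi = []
    for mu in means:
        acc = 0.0
        for y in range(lo, hi):
            w = gauss(x - y, sigma_s) * gauss(guide[x] - guide[y], sigma_r)
            acc += w * gauss(signal[y] - mu, sigma_n)  # response of pixel y to model l
        psi.append(acc)
    return psi


def mixture_distribution(psi, means, samples, sigma_n=3.0):
    """Evaluate the mixture at any sample values without further filtering."""
    dist = [sum(p * gauss(g - mu, sigma_n) for p, mu in zip(psi, means))
            for g in samples]
    total = sum(dist)
    return [d / total for d in dist] if total > 0 else dist
```

The filtering cost now scales with the number of models rather than the number of sample values: once the responses are known, the distribution can be evaluated at as many values as needed with simple arithmetic.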


Gaussian Models for the Local Structures

Gaussian distributions define the models for the local structures

Uniformly Quantized Models (UQM)

Locally Adaptive Models (LAM)


Gaussian Models for the Local Structures

Estimation of the Locally Adaptive Models

Hierarchical Clustering by Binary Space Partition Tree

[Figure: binary space partition tree with nodes 1 to 7; sets 𝑆1, 𝑆2, 𝑆3 are split into + and − branches]
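
A simplified sketch of hierarchical clustering by binary partitioning is shown below. The thesis's actual splitting rule is not recoverable from the slide, so this example makes several assumptions: values are scalar intensities, the leaf with the largest variance is split at its mean into a "−" and a "+" branch, and splitting stops once every leaf's variance is below a manual threshold. The names `mean_var` and `bsp_cluster` and the parameters `var_threshold` and `max_models` are hypothetical.

```python
def mean_var(vals):
    m = sum(vals) / len(vals)
    return m, sum((v - m) ** 2 for v in vals) / len(vals)


def bsp_cluster(values, var_threshold=25.0, max_models=8):
    """Split values recursively; return one (mean, variance) Gaussian per leaf."""
    leaves = [list(values)]
    while len(leaves) < max_models:
        # choose the leaf that is least well described by a single Gaussian
        idx = max(range(len(leaves)), key=lambda i: mean_var(leaves[i])[1])
        m, v = mean_var(leaves[idx])
        if v <= var_threshold:
            break  # every leaf is compact enough
        minus = [x for x in leaves[idx] if x - m < 0]   # "-" branch
        plus = [x for x in leaves[idx] if x - m >= 0]   # "+" branch
        leaves[idx:idx + 1] = [minus, plus]
    return [mean_var(leaf) for leaf in leaves]
```

A homogeneous region such as a disparity map needs only a few leaves before the threshold is met, which is consistent with a small number of locally adaptive models.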


Experimental Results & Discussions

The speedup of the proposed method

Generally 2~4x faster for grayscale images

6~12x faster for color images

Even faster for disparity maps or cartoon-style images, due to their high structural homogeneity

A manual threshold stops model generation

Runtime comparison

Estimate the necessary LAM models on the BSD3000 dataset


Experimental Results & Discussions

Application-I: Disparity Enhancement (error metric: RMSE)

[Figure: runtimes shown with the compared results: ~16 s, ~4 s, and <1 s]


Experimental Results & Discussions

Application-I: Disparity Enhancement

Covers more details & avoids staircase artifacts

Although a small number of LAM models cannot cover all the details, they are still superior to the UQM models


[Video panels: raw, color frame, spatial filter, spatiotemporal filter]


Experimental Results & Discussions

Application-II: JPEG Block Artifact Removal

Yields piecewise-smooth results and reduces staircase artifacts without distorting necessary structures


Experimental Results & Discussions

Application-III: Contrast Enhancement

[Figure: source image, after structure-preserving filtering, after detail enhancement]


Experimental Results & Discussions

Application-IV: Joint Depth Map Upsampling


Spatiotemporal Enhancement Based on Static Structure

Spatiotemporal RGB-D Enhancement


Introduction

A raw depth video of a natural scene

Contains various complex and even unpredictable dynamic contents

Suffers from spatial and temporal artifacts

[Videos: raw Kinect video; color-coded raw TOF video]


After the spatial enhancement

Reduces artifacts in the spatial domain

But introduces temporal flickering

No temporal consistency

Aggravates flickering artifacts

[Videos: raw Kinect video; spatial JBF]


Introduction

After a conventional spatiotemporal enhancement

Still contains temporal flickering

Distorts depth variation on dynamic objects

[Videos: coherent spatiotemporal JBF; spatial JBF]

How can temporal flickering be eliminated without distorting the necessary depth variation along dynamic objects?


Static Structure

[Figure: a Kinect or another depth camera observing a moving object, a static object, and the static background, with the captured depth map]


Static Structure

The intrinsic structure underlying the captured scene

It lies on or behind the surface of the input depth frame

It serves as a probabilistic medium that indicates whether a region is static


Simple observations

Moving objects stay in front of it

Static regions or visible background areas are fused into it

[Figure: a Kinect or another depth camera viewing a moving object, a static object, and the static background, with the static structure marked]


Static Structure Spatiotemporal Enhancement

Robust static/dynamic region detection by the static structure

Spatiotemporally enhance the static region with the static structure

Spatially optimize the dynamic foreground

Temporally coherent for static regions, with depth variation preserved for dynamic contents

How to estimate the static structure?


Generative Model for Static Structure

A Probabilistic Generative Model

[Figure: camera center, line of sight, and current static structure, with the regions behind and in front of the structure]


If the incoming depth belongs to

State-I: the static structure

State-II: outliers in front of it, or moving objects

State-III: outliers behind it, or revealed background

The indicator function equals 1 when its argument is true and 0 otherwise


Generative Model for Static Structure

A Probabilistic Generative Model

The likelihood of the incoming depth w.r.t. the given static structure

Gaussian prior over the depth of the static structure

Dirichlet prior over the frequency of each state
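
The three-state idea can be illustrated with the sketch below. The slide's equations were lost in extraction, so the distributions here are assumptions chosen to match the state descriptions: a Gaussian around the static-structure depth for State-I, and uniform densities restricted by an indicator to depths in front of or behind the structure for State-II and State-III. The names `state_likelihoods` and `state_posterior`, the depth range `d_min` to `d_max`, and the state frequencies `freqs` are hypothetical.

```python
import math


def state_likelihoods(d, z, sigma=2.0, d_min=0.0, d_max=1000.0):
    """Likelihood of observed depth d under each state, given structure depth z.

    Assumes d_min < z < d_max.
    """
    # State-I: the observation fits the static structure up to sensor noise
    fit = math.exp(-(d - z) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
    # State-II: uniform over depths in front of the structure (indicator d < z)
    front = 1.0 / (z - d_min) if d_min <= d < z else 0.0
    # State-III: uniform over depths behind the structure (indicator d > z)
    rear = 1.0 / (d_max - z) if z < d <= d_max else 0.0
    return fit, front, rear


def state_posterior(d, z, freqs=(0.8, 0.1, 0.1), **kwargs):
    """Posterior probability of each state for one observation."""
    liks = state_likelihoods(d, z, **kwargs)
    joint = [f * l for f, l in zip(freqs, liks)]
    total = sum(joint)
    return [j / total for j in joint] if total > 0 else [0.0, 0.0, 0.0]
```

In the model on the slide the state frequencies carry a Dirichlet prior and the structure depth a Gaussian prior; this sketch fixes both to constants purely to show how the three states compete for an observation.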


Online Update Scheme

A Probabilistic Generative Model

The posterior is conditioned on two sets: the previous depth samples and the current samples

If the input frame only contains the static scene and outliers, the updated static structure will be governed by the posterior, and we have

Its probable depth is

The reliability of the model is

Variational approximation for efficiency


Layer Assignment

Label the input depth frame into three layers

𝑙𝑖𝑠𝑠: agrees with the estimated static structure

𝑙𝑑𝑦𝑛: belongs to dynamic objects

𝑙𝑜𝑐𝑐: refers to previously occluded structure

𝑙𝑖𝑠𝑠 and 𝑙𝑜𝑐𝑐 define the current static regions

Fully connected conditional random fields, with effective inference based on real-time high-dimensional filters


Layer Assignment & Online Update of the Static Structure

[Figure: frames #1 to #5 of a sequence; rows (a) to (e) show raw depth, raw color, layer assignment, depth static structure, and color static structure]

Layer Assignment & Online Update of the Static Structure

[Figure: frames #1 to #5; rows (a) to (c) show raw depth, layer assignment, and depth static structure]


Spatiotemporal Depth Video Enhancement

[Flowchart blocks: Input data (𝑡), Layer Assignment, Variational Approximation, Spatial Enhancement, Static Structure (𝑡), Static Structure (𝑡 − 1), Online Static Structure Updating Scheme, Enhanced depth frame (𝑡)]


Result Comparisons

(a) Raw RGB-D videos (b) Proposed method (c) Lang et al. [3]

Superior in static scene reconstruction and dynamic object enhancement

[1] C. Richardt et al., "Coherent spatiotemporal filtering, upsampling and rendering of RGBZ videos," Computer Graphics Forum, 2012.

[2] D. Min et al., "Depth video enhancement based on weighted mode filtering," TIP, 2012.

[3] M. Lang et al., "Practical temporal consistency for image-based graphics applications," TOG, 2012.


Result Comparisons

(a) Raw RGB-D videos (b) Proposed method (c) CSTF [1] (d) WMF [2] (e) Lang et al. [3]


[Figure rows: color frames, depth frames, CSTF [1], WMF [2], Lang et al. [3], ours; with close-ups]


Result Comparisons

[Figure: dyn_kinect_1 and dyn_kinect_2, each with (a) raw RGB-D videos, (b) proposed method, (c) Lang et al. [3]]


Result Comparisons

[Figure: (a) dyn_kinect_2, (b) dyn_kinect_3; rows: color, depth, Lang et al. [3], ours]


Applications

Application-I: Background Subtraction

[Figure: color image, followed by background subtraction results using the raw depth image, the proposed method, Lang et al. [3], CSTF [1], and WMF [2]]
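
One simple way to use the static structure for background subtraction is sketched below. It is an assumed stand-in, not the method evaluated in the slides: since moving objects stay in front of the static structure, a pixel is marked as foreground when its depth is closer to the camera than the static structure by more than a margin. The function name `foreground_mask`, the `None` convention for missing depth, and the `margin` parameter are hypothetical.

```python
def foreground_mask(depth, static_depth, margin=30.0):
    """True where the observed depth lies clearly in front of the static structure."""
    mask = []
    for d_row, s_row in zip(depth, static_depth):
        mask.append([
            d is not None and s is not None and d < s - margin
            for d, s in zip(d_row, s_row)
        ])
    return mask
```

Noisy raw depth produces speckled masks with such a threshold, which is one reason enhanced depth gives cleaner subtraction results.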


Applications

Application-II: Novel View Synthesis

(a) color image (b) raw depth image (c) enhanced depth image

Novel views synthesized (d) by raw depth image (e) by static structure (f) by enhanced depth image


A Generative Model for Robust 3D Facial Pose Tracking

Depth-based Application


Introduction

Why is facial pose tracking interesting?

Immersive video communication

3DTV & free-viewpoint TV

VR / AR, etc.

With expression added?

Image/video editing

Performance capture, etc.


Introduction

How to make it

Markerless

No explicit or manual markers

Real-time

Cannot afford sophisticated correspondence estimation & face shape representation

Robustness and Smoothness

Robust to illumination variations, occlusions & outliers

Robust to varying facial expressions

Temporally coherent tracking

Adaptive to any user on-the-fly without manual calibration


Introduction

RGB-based facial pose tracking has been performed successfully in well-constrained scenes

It is fragile under unconstrained capture conditions

Illumination variations

Shadows

Large and severe occlusions

These conditions are common in numerous consumer-level applications


Introduction

Commodity real-time range sensors

Explicitly give the spatial relationships

Unaffected by illumination variations & shading

Easier inference for occlusions

BUT new challenges arise

Noise, missing values & outliers

Complex occlusions

Varying expressions

Online user adaptation


The Proposed Method

A framework that

unifies pose tracking and face model adaptation on-the-fly

offers accurate, occlusion-aware and uninterrupted 3D facial pose tracking

A visibility constrained criterion for

correspondence-free and occlusion-aware rigid facial pose estimation

A generative multilinear face model

models both identity and expression

facilitates online face model personalization without interference from expression variations

Page 72

Probabilistic 3D Face Parameterization

Multilinear Face Model

Unifies the representations of identity and expression

Models the face dataset as a 3D tensor

Decomposes it by higher-order singular value decomposition (HOSVD)

Any face can be reconstructed as
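A minimal NumPy sketch of the decomposition and reconstruction steps listed on this slide, offered as an illustration and not as the thesis implementation. It assumes the face dataset is stored as a tensor of shape (vertex coordinates, identities, expressions) and that only the identity and expression modes are factorized. The names `unfold`, `mode_product`, `hosvd` and `reconstruct` are hypothetical, and the slide's own reconstruction formula is not reproduced here.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(data, ranks):
    """Truncated HOSVD over the identity (axis 1) and expression (axis 2) modes.

    Returns the core tensor and the two orthonormal factor matrices.
    """
    u_id = np.linalg.svd(unfold(data, 1), full_matrices=False)[0][:, :ranks[0]]
    u_exp = np.linalg.svd(unfold(data, 2), full_matrices=False)[0][:, :ranks[1]]
    core = mode_product(mode_product(data, u_id.T, 1), u_exp.T, 2)
    return core, u_id, u_exp

def reconstruct(core, w_id, w_exp):
    """Contract the core with one identity and one expression weight vector."""
    face = mode_product(mode_product(core, w_id[None, :], 1), w_exp[None, :], 2)
    return face.reshape(-1)
```

With untruncated ranks, passing row `i` of `u_id` and row `j` of `u_exp` to `reconstruct` recovers the training face `data[:, i, j]`; other weight vectors give new identity and expression combinations.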

Page 73

Probabilistic 3D Face Parameterization

Generative models for face modeling

Model the uncertainties of the shape, identity, and expression

Make it feasible to simulate and predict the face identity and expression

Enable group-wise rigid facial pose estimation suitable for any face

The generative face model can be learned from a training dataset

FaceWarehouse Dataset

150 identities, 47 expressions

Different ages, genders, races …

Its diversity lets the learned face model cover most common identities and expressions

Page 74

Probabilistic 3D Face Parameterization

Identity and Expression Priors

Multilinear Gaussian Face Model

Learned from the FaceWarehouse dataset, together with the core tensor

Figure: (a) mean face; (b), (c), (d) variance maps (mm)

Page 75

Probabilistic Facial Pose Tracking

Rigid Pose Tracking

Identity Adaptation

Input

Output

Identity distribution

Pose Parameters Face Model

Page 76

Robust Facial Pose Estimation

Transform a canonical face model to match the input point cloud

The warped face model has the distribution

Figure: face model in canonical coordinates (mean face and variance maps, in mm) mapped to the input point cloud by scale, rotation, and translation
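A short NumPy sketch of the scale, rotation, and translation warp named on this slide, under the assumption that each model point is a 3D Gaussian. It uses the standard result that a linear map x -> sRx + t sends a Gaussian with mean mu and covariance Sigma to one with mean sR mu + t and covariance s^2 R Sigma R^T. The function names are hypothetical, and the slide's own distribution formula is not reproduced here.

```python
import numpy as np

def rotation_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def warp_points(points, scale, rotation, translation):
    """Apply x -> scale * R x + t to an (N, 3) array of points."""
    return scale * points @ rotation.T + translation

def warp_gaussian(mean, cov, scale, rotation, translation):
    """Push a 3D Gaussian (mean, cov) through the same similarity transform."""
    new_mean = scale * rotation @ mean + translation
    new_cov = scale ** 2 * rotation @ cov @ rotation.T
    return new_mean, new_cov
```

The rotation is passed as a matrix so that any parameterization (Euler angles, quaternions) can be used upstream; `rotation_y` is only a convenience for trying the functions out.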

Page 77

Ray Visibility Constraint

Occlusions are inevitable in uncontrolled scenarios

Occluded human faces are always behind the occluding objects, such as hair, fingers/gestures, glasses, and accessories

Figure panels: self-occlusion; occluded by hair; occluded by hand/gesture; occluded by accessories

Page 78

Ray Visibility Constraint

Ray Visibility Constraint

If correctly aligned

the visible face model points are those that overlap with the input point cloud

the remaining face model points should always be occluded by the input point cloud

Figure: (a) Case-I: face point is visible; (b) Case-II: face point is occluded; (c) Case-III: should be prevented

Page 79

Ray Visibility Constraint

Connect point pair along a ray

their distance along the surface of the input data

The distribution of one face model point is mapped along the surface normal direction

The face model point is visible

The face model point is occluded

Figure labels: camera, line of sight, face distribution, visible, occluded

Page 80

Ray Visibility Constraint

Ray Visibility Score

Measures the compatibility between the distributions of the face model and the input point cloud

Applies the Kullback-Leibler Divergence

data distribution

projected model distribution

Minimizing the ray visibility score yields the optimal compatibility between these two distributions

Solver: quasi-Newton method, further refined by particle swarm optimization

Occlusions receive constant penalties

Visible points penalize misalignment & model uncertainties

More robust than an ICP-based cost function
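A toy Python sketch of the two ingredients this slide names: a Kullback-Leibler divergence between distributions along a ray, and a constant penalty for occluded points. It is not the thesis's score, whose formula is not in the extracted text. The sketch assumes 1D Gaussians along the line of sight, depth increasing away from the camera, and a hypothetical occlusion test (`margin`); the direction of the divergence and all parameter names are assumptions.

```python
import math

def kl_gauss_1d(mean_p, var_p, mean_q, var_q):
    """Closed-form KL( N(mean_p, var_p) || N(mean_q, var_q) ) for 1D Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0)

def toy_ray_score(data_depth, data_var, model_depth, model_var,
                  occlusion_penalty=1.0, margin=3.0):
    """Per-ray cost under the assumptions stated above (illustrative only).

    A model point lying well behind the observed depth is treated as occluded
    and receives a constant penalty; otherwise the cost is the KL divergence,
    which grows with misalignment and with mismatched uncertainties.
    """
    if model_depth - data_depth > margin * math.sqrt(data_var + model_var):
        return occlusion_penalty
    return kl_gauss_1d(data_depth, data_var, model_depth, model_var)
```

Because an occluded point contributes the same cost wherever it sits behind the occluder, it cannot pull the pose toward the occluding surface, which is the behaviour the slide contrasts with an ICP-style cost.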

Page 81

Robust Facial Pose Estimation

Result comparison with the generic face model

(a) Color image (b) Point cloud (c) Initial alignment

(d) ICP (e) RVC + ML (f) RVS (g) RVS + PSO

Page 82

Robust Facial Pose Estimation

More results with the generic face model

(a) Color image (b) Point cloud (c) Initial alignment (d) Ours

no explicit correspondences

handles occlusions even with a poor initial pose

less vulnerable to bad local minima

PSO increases the robustness

Page 83

Online Identity Adaptation

Variational Approximation

The face model is identified by the identity distribution

It can be estimated online through assumed density filtering (ADF)

The data likelihood

A mixture distribution encoding the model and outliers

The model fitting function is robust to quantization with a modified projection distance

The variance of the identity is enlarged per frame to prevent overfitting
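A scalar Python sketch of one assumed density filtering step with an inlier/outlier mixture likelihood, to make the bullets above concrete. It is a textbook-style toy, not the thesis's update: the identity is reduced to a single Gaussian variable, the outlier component is a constant density, and the per-frame variance enlargement is modeled as an additive term. All names and default values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def adf_update(mean, var, obs, obs_var, outlier_weight=0.1,
               outlier_density=0.01, inflation=0.0):
    """One ADF step: inflate the prior, form the exact two-component
    posterior, then project it back to a single Gaussian by moment matching."""
    var = var + inflation                      # enlarge variance per frame
    z_in = (1.0 - outlier_weight) * gaussian_pdf(obs, mean, var + obs_var)
    z_out = outlier_weight * outlier_density
    total = z_in + z_out
    if total == 0.0:                           # no evidence either way
        return mean, var
    r = z_in / total                           # responsibility of the inlier model
    gain = var / (var + obs_var)
    mean_in = mean + gain * (obs - mean)       # posterior if the point is an inlier
    var_in = (1.0 - gain) * var
    new_mean = r * mean_in + (1.0 - r) * mean  # match first moment
    second = r * (var_in + mean_in ** 2) + (1.0 - r) * (var + mean ** 2)
    return new_mean, second - new_mean ** 2    # match second moment
```

With `outlier_weight=0` the step reduces to an ordinary Gaussian (Kalman-style) update, while a gross outlier leaves the estimate nearly unchanged.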

Page 84

Online Identity Adaptation

Figure: (a), (b), (c) Results of online model adaptation

Page 85

Experimental Results & Discussions

Experiments on public depth-based facial pose datasets

Biwi dataset ICT-3DHP dataset

| Dataset | N_seq | N_frm | N_subj | Occlusions | Expressions | ω_max |
|---|---|---|---|---|---|---|
| Biwi | 24 | ~15K | 25 | accessories, hair | neutral ~ slight | ±75 yaw, ±60 pitch |
| ICT-3DHP | 10 | ~14K | 10 | accessories, hair | slight ~ exaggerated | ±75 yaw, ±45 pitch |

Page 86

Experimental Results & Discussions

Robust to profiled faces caused by large rotations, and to occlusions from hair and accessories.

Figure: example frames labeled profiled face, occlusions, and expressions

Page 87

Experimental Results & Discussions

The proposed system is also robust to expression variations

Ray visibility constraint

efficiently infers the occlusions against the face model

optimizes the visible face area against the occlusions

Personalized face model

enables compact fitting

robust to changes in the personalized expressions

Page 88

Experimental Results & Discussions

Adaptation between different users

Three different identities are presented in three adjacent frames

Page 89

Experimental Results & Discussions

Comparison with the state of the art

Biwi dataset (errors)

| Method | Yaw (deg) | Pitch (deg) | Roll (deg) | Trans (mm) |
|---|---|---|---|---|
| Ours | 2.3 | 2.0 | 1.9 | 6.9 |
| RF | 8.9 | 8.5 | 7.9 | 14.0 |
| Martin | 3.6 | 2.5 | 2.6 | 5.8 |
| CLM-Z | 14.8 | 12.0 | 23.3 | 16.7 |
| TSP | 3.9 | 3.0 | 2.5 | 8.4 |
| PSO | 11.1 | 6.6 | 6.7 | 13.8 |
| Meyer | 2.1 | 2.1 | 2.4 | 5.9 |
| Li* | 2.2 | 1.7 | 3.2 | - |

ICT-3DHP dataset (errors)

| Method | Yaw (deg) | Pitch (deg) | Roll (deg) |
|---|---|---|---|
| Ours | 3.4 | 3.2 | 3.3 |
| RF | 7.2 | 9.4 | 7.5 |
| CLM-Z | 6.9 | 7.1 | 10.5 |
| Li* | 3.3 | 3.1 | 2.9 |

*This method is based on RGB-D data

Discriminative: RF

Model fitting: CLM-Z, PSO, Martin et al., Meyer et al.

Feature-based: TSP

RGB-D: Li*

Page 90

Conclusion

Page 91

Conclusions

Hybrid Geometric Hole filling Strategy for Spatial enhancement

• Hybrid hole filling merging the interpolation and parametric structure propagation

• A novel texture-constrained patch matching method for a robust structure inference

Weighted Structure Filters Based on Parametric Structural Decomposition

• An efficient distribution estimation that is adaptive to local image structure

• Accelerating joint weighted filters without structural distortions

Page 92

Conclusions

Spatiotemporal Enhancement based on Static Structure

• Robust temporally consistent depth enhancement based on a probabilistic static structure of the captured scene

• The dynamic content is enhanced spatially while the static region favors a long-range spatiotemporal optimization

A Generative Model for Robust 3D Facial Pose Tracking

• A robust depth-based facial pose tracking system with an adaptive face model personalization

• The multilinear generative face model and the visibility-constrained rigid pose estimation improve the robustness

Page 93

Publications

Lu Sheng, King Ngi Ngan, Chern-Loon Lim and Songnan Li, Online Temporally Consistent Indoor Depth Video Enhancement via Static Structure, TIP, 2015.

Songnan Li, King Ngi Ngan, Raveendran Paramesran and Lu Sheng, Real-time Head Pose Tracking with Online Face Template Reconstruction, TPAMI, 2016.

Lu Sheng, Tak-Wai Hui and King Ngi Ngan, Accelerating the Distribution Estimation for the Weighted Median/Mode Filters, ACCV, 2014.

Lu Sheng, Songnan Li and King Ngi Ngan, Temporal Depth Video Enhancement Based On Intrinsic Static Structure, ICIP, 2014.

Lu Sheng, King Ngi Ngan and Songnan Li, Depth Enhancement Based On Hybrid Geometric Hole Filling Strategy, ICIP, 2013.

Chi Ho Cheung, Lu Sheng and King Ngi Ngan, A disocclusion filling method using multiple sprites with depth for virtual view synthesis, ICMEW, 2015.

Songnan Li, King Ngi Ngan and Lu Sheng, Screen-camera Calibration Using a Thread, ICIP, 2014.

Songnan Li, King Ngi Ngan and Lu Sheng, A Head Pose Tracking System Using RGB-D Camera, ICVS, 2013.

Lu Sheng, Jianfei Cai and King Ngi Ngan, A Generative Model for Robust 3D Facial Pose Tracking, TIP, in preparation.

Lu Sheng and King Ngi Ngan, Weighted Structural Prior for Structure-preserving Image and Video Applications, TIP, in preparation.

Page 94

Thanks to

My supervisor Prof. King Ngi Ngan

Prof. Jianfei Cai

Committee members Prof. Wai Kuen Cham, Prof. Thierry Blu,

and Prof. Kwanghoon Sohn

My lovely IVP labmates

& my sweet family!

Page 95

Depth Propagation under Segment Constraint

Cost function construction

Randomly select 𝑘 sub-patches in each patch

Estimate similarity between two sub-patches

Calculate the cost of the 𝑗th sub-patch of 𝑢 with 𝑣, and find the patch 𝑣∗ with the minimum cost

Form a histogram indicating the number of sub-patches in 𝑢 that match 𝑣

Add spatial constraint, the cost is
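A small NumPy sketch of the voting steps listed on this slide (random sub-patches, per-sub-patch best match, histogram of matches). It is an illustration under assumptions, not the thesis's cost: similarity is taken to be the sum of squared differences, each sub-patch of 𝑢 is compared with the sub-patch at the same offset in each candidate 𝑣, and the spatial constraint term is left out because its formula is not in the extracted text. The function name and parameters are hypothetical.

```python
import numpy as np

def subpatch_votes(patch_u, candidates, k=8, size=3, seed=0):
    """Count, for each candidate patch v, how many of k random sub-patches
    of patch_u find their minimum-cost match in v."""
    rng = np.random.default_rng(seed)
    height, width = patch_u.shape
    votes = np.zeros(len(candidates), dtype=int)
    for _ in range(k):
        # Randomly pick the top-left corner of a size x size sub-patch.
        r = rng.integers(0, height - size + 1)
        c = rng.integers(0, width - size + 1)
        sub_u = patch_u[r:r + size, c:c + size]
        # Sum-of-squared-differences cost against every candidate.
        costs = [np.sum((sub_u - v[r:r + size, c:c + size]) ** 2)
                 for v in candidates]
        votes[int(np.argmin(costs))] += 1      # vote for the best match v*
    return votes
```

The returned histogram can then be turned into a matching cost, for example by favoring the candidate with the most votes; how the thesis combines it with the spatial constraint is not shown here.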

Page 96

Gaussian Models for the Local Structures

Kernel Specification

Distribution is a mixture of Gaussian models

Constant time filter: Domain transform filter [2], Guided image filter [1]

[1] K. He et al., ECCV 2010

[2] E. Gastal and M. Oliveira, ACM ToG 2011

Noise variance

Page 97

Gaussian Models for the Local Structures

Noise std

Page 98

Online Update Scheme

Variational Parameter Estimation

Factorize the posterior into independent Gaussian and Dirichlet distributions

The reliability of the model

The probable depth is

The posterior can be approximated by

Recursive estimation is possible!

Page 99

Online Update Scheme

Variational Parameter Estimation

Factorize the posterior into independent Gaussian and Dirichlet distributions

The posterior can be approximated by

Moment matching to estimate the hyperparameters

Closed-form solutions!
