
rod-tv : surface reconstruction on demand by tensor voting

page 2

Outline

1. Introduction and motivation
2. Rod-tv framework overview
3. Rod-tv modules
3a. Multi-scale data hierarchy
3b. Levels-of-detail control
3c. Visibility culling
3d. Tensor voting
4. Results
5. Conclusions and future work




page 4

1. Introduction and motivation

Problem statement

• Digital models are in high demand in some industries, for instance the 3D movie industry and the 3D game industry.

• Models can be acquired with 3D scanning devices.

> Bunny dataset – 35,948 points (scanned by Stanford University)


page 5


Problem statement

• Digital models are usually represented as a set of isolated 3D points, i.e. (x_i, y_i, z_i)

• 3D scanned models are huge and noisy
- Huge: real, detailed, accurate 3D models
- Noisy: measurement errors in the scanning process

• Our concern: how can these 3D models be rendered efficiently?
- [Approach 1] point rendering in computer graphics
- [Approach 2] surface reconstruction in computer vision


page 6


Approach 1: Point-rendering

• Point rendering is a graphics technique that uses simple geometric shapes as rendering primitives

• Example
- QSplat [Rusinkiewicz and Levoy, 2000]


page 7


Point-rendering in QSplat

[Figure: bunny dataset (35,948 points) rendered by point rendering with a sphere as the rendering primitive; panels: high resolution, low resolution, close-up view]


page 8


Point-rendering in QSplat

• In QSplat, the input 3D points are organized as a hierarchy

• QSplat uses this hierarchy for view-dependent, levels-of-detail rendering

> A spectrum of data resolutions

[Figure: the model shown at decreasing data resolution]


page 9


Problems in point-rendering

[Figure: bunny, zoomed in (8,365 points), rendered by point rendering]

• Problems
- Rendering quality is bounded above by the scanning resolution
- Only noise-free 3D point input can be handled


page 10


Summary of point-rendering

• Advantages
- Fast and cheap to render
[Fast: hierarchy for view-dependent, levels-of-detail control]
[Cheap: simple geometric shapes]

• Limitations
- Bounded above by the scanning resolution
- Sensitive to noise

• Comments
- We need:
- A flat and smooth surface at any data resolution
- A noise-robust system


page 11


Approach 2: Surface reconstruction

• Surface reconstruction is a vision technique that uses polygonal patches to represent a surface

• Example
- Tensor voting [Medioni, Mi-Suen Lee and Chi-Keung Tang, 2000]


page 12

Tensor voting ~ Flat and smooth surface

• Tensor voting can generate a smooth and flat surface at any scanning resolution


page 13


Tensor voting ~ Noise robustness

• Tensor voting can handle noisy 3D points as input


page 14


Problems in tensor voting

• Problems
- The surface reconstruction process is usually costly, for two reasons:

1. Lack of levels-of-detail control

- The surface is reconstructed at a single data resolution, without considering the viewer position

2. Lack of visibility control

- All input 3D points are processed, whether or not they are visible in the current viewing volume


page 15


Summary of tensor voting

• Advantages
- A flat and smooth surface is generated at any data resolution
- Noise-insensitive reconstruction system

• Limitations
- Lack of levels-of-detail control
- Lack of visibility control

• Comments
- We need:
- A view-dependent, levels-of-detail surface reconstruction system


page 16


Our contributions: Rod-tv

• Our approach: surface reconstruction on demand by tensor voting
- Abbreviated as rod-tv
- A “graphics for vision” approach

• General idea
- Defer the vision-side surface reconstruction [tensor voting]
- Use graphics techniques to query a better visible subset that meets the “on demand” requirement [levels-of-detail control and visibility culling]




page 18

2. Rod-tv framework overview

Input: isolated 3D points

1. Generate a spectrum of data resolutions

2. Select an appropriate data resolution for the given viewing position

3. Cull all the invisible parts

4. Reconstruct the surface on the visible subset

Output: on-demand surface





page 20

3. Rod-tv modules

Module 1: Multi-scale data hierarchy

• Module overview
- Input: a 3D dataset at a single resolution
- Goal: use a single data structure to generate a spectrum of dataset resolutions


page 21

Module 1: Multi-scale data hierarchy

• Module description
- Uses a hierarchical spatial data structure
~ Grid Pyramid
~ Octree
- The structure is usually organized as a tree
- Resolutions are encoded in the internal nodes of the tree
- Different tree heights yield different resolutions

• Module implementation
- Component 1: Grid Pyramid
- Component 2: Octree

[both are spatial volume hierarchies]


page 22

3. Rod-tv module 1 – multi-scale data hierarchy

Component 1: Grid Pyramid

• A Grid Pyramid is:
- A hierarchy of volumes [3D arrays]
- Volume resolution decreases from the bottom of the pyramid to the top

A. low resolution – the entire scene is represented by 1 grid cell

B. medium resolution – the entire scene is represented by 8 grid cells

C. high resolution – the entire scene is represented by 64 grid cells

D. the corresponding tree representation of the Grid Pyramid


page 23

Component 1: Grid Pyramid

Top-down approach: the construction proceeds downwards by iteratively subdividing each current cell into 8 equal cells in the 3-D case and 4 equal cells in the 2-D case

Bottom-up approach: the construction proceeds upwards by iteratively consolidating a group of 8 cells in the 3-D case and a group of 4 cells in the 2-D case

A. input 3D points (12 pts)

B. Grid Pyramid – level 0: cell size = 1, number of points = 11

C. Grid Pyramid – level 1: cell size = 2, number of points = 8

D. Grid Pyramid – level 2: cell size = 4, number of points = 3

E. Grid Pyramid – level 3: cell size = 8, number of points = 1

F. the corresponding tree representation of the Grid Pyramid
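The bottom-up construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the rod-tv implementation: the function names `build_grid_pyramid` and `representatives` are hypothetical, each occupied cell is summarized by the centroid of its points (the slides do not say how a cell's representative is chosen), and only occupied cells are stored in a dictionary rather than a full 3D array.

```python
def build_grid_pyramid(points, base_cell=1.0, levels=4):
    """Bottom-up Grid Pyramid: level 0 uses cells of size base_cell,
    and every higher level doubles the cell size.
    Each level maps an integer cell index to [sum_x, sum_y, sum_z, count]."""
    # Level 0: bin the raw points into the finest cells.
    level = {}
    for p in points:
        key = tuple(int(c // base_cell) for c in p)
        acc = level.setdefault(key, [0.0, 0.0, 0.0, 0])
        for i in range(3):
            acc[i] += p[i]
        acc[3] += 1
    pyramid = [level]
    # Higher levels: consolidate each 2x2x2 group of child cells into a parent.
    for _ in range(1, levels):
        parent = {}
        for key, (sx, sy, sz, n) in pyramid[-1].items():
            pkey = tuple(k // 2 for k in key)
            acc = parent.setdefault(pkey, [0.0, 0.0, 0.0, 0])
            acc[0] += sx
            acc[1] += sy
            acc[2] += sz
            acc[3] += n
        pyramid.append(parent)
    return pyramid


def representatives(level):
    """One representative point (the centroid, an assumption) per occupied cell."""
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in level.values()]


if __name__ == "__main__":
    pts = [(0.5, 0.5, 0.5), (0.6, 0.6, 0.6), (1.5, 0.5, 0.5), (7.5, 7.5, 7.5)]
    for k, lvl in enumerate(build_grid_pyramid(pts)):
        print("level", k, "occupied cells:", len(lvl))
```

Two points that fall in the same finest cell collapse into one representative, which is one plausible reading of why 12 input points give 11 points at level 0 in the example above.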


page 24


Component 1: Grid Pyramid

• Advantages
- Multi-scale representation capability
- Grid cell elements can be accessed in constant time

• Limitations
- Memory is wasted on tessellating large groups of empty grid cells
- The total number of resolutions is limited


page 25

Component 2: Octree

• An Octree is:
- A subset of the Grid Pyramid
- An adaptive cell tessellation
~ a large octant is used to represent a low-density region
~ a small octant is used to represent a high-density region

A. Octree example in 3-D space

B. the corresponding tree representation of the Octree


page 26

Octree building rules

Subdivision of a cell stops if:
1) the number of points in the cell is less than a certain threshold, or
2) the tree height exceeds a certain threshold

A. input 3D points (12 pts)

B. Octree – level 3: number of cells = 1, number of points = 1

C. Octree – level 2: number of cells = 4, number of points = 3

D. Octree – level 1: number of cells = 13, number of points = 8

E. Octree – level 0: number of cells = 19, number of points = 11

F. the corresponding tree representation of the Octree
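The two stopping rules can be illustrated with a small recursive builder. This is a minimal sketch, not the rod-tv code: `OctreeNode`, `build_octree`, `count_leaves`, and the parameters `max_points` and `max_depth` are hypothetical names, a cell is treated as a leaf when it holds at most `max_points` points, and only non-empty octants are created (the adaptive tessellation).

```python
class OctreeNode:
    def __init__(self, center, half, points, depth):
        self.center = center    # (x, y, z) of the cell center
        self.half = half        # half of the cell edge length
        self.points = points    # points that fall inside this cell
        self.depth = depth      # 0 at the root
        self.children = []      # stays empty for a leaf


def build_octree(points, center, half, max_points=1, max_depth=8, depth=0):
    node = OctreeNode(center, half, points, depth)
    # Stopping rules: few enough points, or the depth limit is reached.
    if len(points) <= max_points or depth >= max_depth:
        return node
    # Distribute the points over the (up to) eight octants.
    buckets = {}
    for p in points:
        octant = tuple(int(p[i] >= center[i]) for i in range(3))
        buckets.setdefault(octant, []).append(p)
    quarter = half / 2.0
    for octant, members in buckets.items():
        child_center = tuple(
            center[i] + (quarter if octant[i] else -quarter) for i in range(3)
        )
        node.children.append(
            build_octree(members, child_center, quarter,
                         max_points, max_depth, depth + 1)
        )
    return node


def count_leaves(node):
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


if __name__ == "__main__":
    pts = [(1.0, 1.0, 1.0), (7.0, 7.0, 7.0), (7.1, 7.1, 7.1)]
    root = build_octree(pts, center=(4.0, 4.0, 4.0), half=4.0)
    print("leaves:", count_leaves(root))
```

The depth limit is what keeps the recursion finite when points coincide or lie very close together; without it, rule 1 alone would never be satisfied for duplicate points.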


page 27


Component 2: Octree

• Advantages
- Memory is saved by the adaptive cell tessellation

• Limitations
- Points that are too close together can cause unlimited subdivision
- The Octree is not balanced


page 28

Summary of multi-scale data hierarchy

• Comments
- Choosing between the data structures:
~ Large memory: Grid Pyramid
~ Limited memory: Octree
- The multi-scale dataset is used to:
~ Facilitate levels-of-detail [module 2]
~ Facilitate visibility culling [module 3]





page 30

Module 2: Levels-of-detail (on demand)

• Module overview
- Input: a multi-scale dataset and a viewing position
- Goal: select a dataset of appropriate resolution

[Figure (labelled STAN): the model along the line-of-sight distance at 10,242 points, 3,708 points, 1,016 points and 272 points. Closer to the viewer: more points are needed. Farther from the viewer: fewer points are needed.]


page 31

Module 2: Levels-of-detail (on demand)

• Module description
- Uses the hierarchy built in module 1
- Traverses the hierarchy from coarse to fine resolution, in other words from root to leaf
- A selection criterion is evaluated at each internal node, and the appropriate data resolution is returned
~ the selection criterion can be defined in either object space or image space

• Module implementation
- Component 1: Range-based [object space]
- Component 2: Screen-based [image space]


page 32

3. Rod-tv module 2 – levels-of-detail (on demand)

Component 1: range-based method

• Levels-of-detail selection criteria:
- Separate the 3-D space into different ranges {d1, d2, d3}
- Selection is based on the viewer-object distance

[Figure: ranges d1, d2 and d3 marked along the line-of-sight distance]

Decision rule

If distance < d1 then LOD 0 is returned

elseif d1 <= distance < d2 then LOD 1 is returned

elseif d2 <= distance < d3 then LOD 2 is returned

else LOD 3 is returned

endif

[LOD 0] [LOD 1] [LOD 2] [LOD 3]
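The decision rule maps directly onto a sorted-list lookup. The sketch below is illustrative only: `select_lod` and `viewer_object_distance` are hypothetical names, the thresholds are passed as a sorted list, and a distance that falls exactly on a threshold is assigned to the farther range (an assumption, since the slide does not say which side a boundary value belongs to).

```python
import math
from bisect import bisect_right


def viewer_object_distance(viewer, obj):
    """Euclidean distance between the viewer and the object position."""
    return math.dist(viewer, obj)


def select_lod(distance, ranges):
    """Return the LOD index for a distance.

    ranges is a sorted list such as [d1, d2, d3]:
    distance < d1 gives 0, d1 <= distance < d2 gives 1, and so on;
    anything at or beyond the last threshold gives len(ranges).
    """
    return bisect_right(ranges, distance)


if __name__ == "__main__":
    ranges = [10.0, 20.0, 30.0]  # d1, d2, d3 (example values)
    d = viewer_object_distance((0.0, 0.0, 0.0), (0.0, 0.0, 25.0))
    print("distance", d, "-> LOD", select_lod(d, ranges))
```

Using `bisect_right` keeps the rule correct for any number of ranges without extending the if/elseif chain by hand.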


page 33

step 1. define the ranges d1, d2, d3 and d4

step 2. compute the viewer-object distance

step 3. start traversing from the root level

step 4. check whether the criterion is met

If yes, stop. Otherwise, continue to the next lower level and repeat step 4

[Figure: viewer and object along the line of sight, with the ranges d1, d2, d3 and d4 marked on the viewer-object distance and the tree levels LOD 3, LOD 2, LOD 1 and LOD 0]




page 35

3. Rod-tv module 2 – levels-of-detail (on demand)

Component 2: screen-based method

• Levels-of-detail selection criteria:
- Project the cell onto the image plane
- Use a sphere instead of the octant/cube to avoid alignment issues
- Selection is based on the number of pixels covered on the image plane


page 36

step 1. Define a threshold t for pixel coverage

step 2. Start traversing from the root of the tree

step 3. Compute the cell projection s

If s < t, stop. Otherwise, continue to the next lower level and repeat step 3

[Figure: object projected onto the image plane at tree levels LOD 3, LOD 2, LOD 1 and LOD 0; labels: 1, 4, 12, 24]
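The three steps can be sketched as follows. Several things here are assumptions rather than details from the slides: the projection s is measured as the on-screen radius in pixels of the cell's bounding sphere under a standard perspective projection, every level halves the cell radius, a single depth is used for all levels, and the names `projected_radius_px` and `select_level` and their parameters are hypothetical.

```python
import math


def projected_radius_px(radius, depth, fov_y_deg, viewport_height_px):
    """Approximate on-screen radius, in pixels, of a sphere at a given depth."""
    focal_px = (viewport_height_px / 2.0) / math.tan(math.radians(fov_y_deg) / 2.0)
    return radius * focal_px / depth


def select_level(root_radius, depth, t, max_level,
                 fov_y_deg=60.0, viewport_height_px=600):
    """Walk from the root (level max_level, coarsest) towards level 0 (finest).

    Stops at the first level whose projected cell size s is below the
    threshold t, or at level 0 if no level satisfies it.
    """
    radius = root_radius
    for level in range(max_level, -1, -1):
        s = projected_radius_px(radius, depth, fov_y_deg, viewport_height_px)
        if s < t or level == 0:
            return level
        radius /= 2.0  # child cells have half the radius of their parent
    return 0


if __name__ == "__main__":
    for z in (1.0, 100.0, 1000.0):
        print("depth", z, "-> level", select_level(8.0, z, t=5.0, max_level=3))
```

A single threshold t drives the whole selection: objects far from the viewer stop near the root, while close objects are refined down to the finest level.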


page 37

3. Rod-tv modules

Summary of levels-of-detail (on demand)

• Advantages
- Range-based method: easy to implement
- Screen-based method: a single threshold is used

• Limitations
- Range-based method: the range thresholds are hard to set

• Comments
- The decision is made on the fly (on demand)
- This module produces a dataset of appropriate resolution, which is how “on demand” processing is achieved







page 39

Module 3: Visibility culling

• Module overview
- Input: a dataset of appropriate resolution
- Goal: cull all the invisible parts

• Module description
- Cull all tokens that are outside the viewing frustum
- Cull all tokens that are occluded by the frontier (the tokens nearest the viewer)

• Module implementation
- Component 1: View frustum culling
- Component 2: Occlusion culling


page 40

3. Rod-tv module 3 – visibility culling

Component 1: View frustum culling
- Find the visible subset that lies inside the frustum
- Uses a token-frustum intersection test
- A frustum usually consists of six clipping planes: left, right, top, bottom, near and far


page 41


Component 1: View frustum culling

- A token is inside the view frustum if D >= 0 for all clipping planes

- Plane equation: $a x + b y + c z + d = 0$, with normal $\vec{N} = (a, b, c)$

- Token position: $\vec{T} = (x_{token}, y_{token}, z_{token})$

- Token-plane intersection test: $D = \vec{T} \cdot \vec{N} + d$

D > 0 – in front of the plane
D = 0 – on the plane
D < 0 – behind the plane
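A minimal sketch of the token-plane test, assuming each clipping plane is stored as a tuple (a, b, c, d) whose normal points into the frustum. The function names are hypothetical, and the example uses an axis-aligned box as a stand-in for a real six-plane frustum so the numbers are easy to check by hand.

```python
def signed_distance(token, plane):
    """D = a*x + b*y + c*z + d for a plane (a, b, c, d) and a token (x, y, z)."""
    a, b, c, d = plane
    x, y, z = token
    return a * x + b * y + c * z + d


def inside_frustum(token, planes):
    """A token is kept only if it is on or in front of every clipping plane."""
    return all(signed_distance(token, plane) >= 0 for plane in planes)


def frustum_cull(tokens, planes):
    return [t for t in tokens if inside_frustum(t, planes)]


if __name__ == "__main__":
    # Six inward-facing planes bounding the box [-1, 1]^3 (stand-in for a frustum).
    box = [
        (1, 0, 0, 1), (-1, 0, 0, 1),   # left, right
        (0, 1, 0, 1), (0, -1, 0, 1),   # bottom, top
        (0, 0, 1, 1), (0, 0, -1, 1),   # near, far
    ]
    tokens = [(0, 0, 0), (2, 0, 0), (1, 1, 1)]
    print(frustum_cull(tokens, box))
```

If the plane normals are unit length, D is the actual distance to the plane; for the inside/outside decision only its sign matters.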


page 42

Component 2: Occlusion culling

- Find a better visible subset by removing tokens that are inside the viewing frustum but not visible in the final result

- In other words, cull all tokens that are occluded by the frontier


page 43


Component 2: Occlusion culling

- Project each token onto the image plane; the decision is based on the projection and the depth information

- Uses an occlusion map (o-map) and a depth map (d-map)
- o-map: records potential occluders and marks the pixels they cover
- d-map: records the depth of each projected token

[Figure: image plane, occlusion map (o-map), depth map (d-map)]


page 44

Depth test

If a token's projection overlaps a marked pixel in the o-map, we compare the token's depth with the depth stored in the d-map. We keep the token that is closer to the viewer and cull the other one.
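The depth test can be sketched with two small pixel arrays. This is a simplified illustration, not the rod-tv implementation: each token is assumed to cover exactly one pixel, the caller supplies a hypothetical `project` function that returns integer pixel coordinates and a depth, and `occlusion_cull` is an illustrative name.

```python
def occlusion_cull(tokens, project, width, height):
    """Keep, for every covered pixel, only the token closest to the viewer.

    project(token) must return (px, py, depth) with integer pixel coordinates.
    """
    o_map = [[False] * width for _ in range(height)]          # pixel covered?
    d_map = [[float("inf")] * width for _ in range(height)]   # nearest depth so far
    owner = {}                                                # pixel -> visible token
    for token in tokens:
        px, py, depth = project(token)
        if not (0 <= px < width and 0 <= py < height):
            continue  # projects outside the image plane
        if o_map[py][px] and depth >= d_map[py][px]:
            continue  # occluded by a token that is at least as close
        # This token becomes the new frontier for the pixel.
        o_map[py][px] = True
        d_map[py][px] = depth
        owner[(px, py)] = token
    return list(owner.values())


if __name__ == "__main__":
    # Toy orthographic projection: x and y are pixel coordinates, z is depth.
    project = lambda t: (int(t[0]), int(t[1]), t[2])
    tokens = [(1, 1, 5.0), (1, 1, 2.0), (2, 1, 9.0), (1, 1, 7.0)]
    print(sorted(occlusion_cull(tokens, project, width=4, height=4)))
```

A token that covers several pixels would need the same test on every pixel of its footprint; the one-pixel footprint keeps the sketch short.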


page 45

Summary of visibility culling

• Advantages
- No CPU time is wasted on invisible parts:
~ Outside the viewing volume
~ Occluded by the frontier
~ Covering fewer than a certain number of pixels on the image plane

• Limitations
- It is hard to produce an exact occluder set

• Comments
- This module produces a better visible subset of the dataset, so “on demand” processing is, once again, achieved







page 47

Module 4: Tensor voting

• Module overview
- Input: an “on demand” visible point set, produced by
~ [module 2] range/screen levels-of-detail selection
~ [module 3] view frustum and occlusion culling
- Goal: reconstruct “on demand” surfaces


page 48

Module 4: Tensor voting

• Module description
- Uses the original tensor voting algorithm
- Tensors are the data representation
- Voting is the data communication
- Consists of four stages:
~ stage 1: information encoding
~ stage 2: sparse tensor voting
~ stage 3: dense tensor voting
~ stage 4: feature extraction


page 49

3. Rod-tv module 4 – tensor voting

Data representation

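- Tensor voting uses a second-order symmetric tensor as its data representation

- A symmetric tensor can be visualized as an ellipsoid
- Mathematically, the ellipsoid can be represented by its eigensystem:

$$S = \begin{bmatrix} e_{max} & e_{mid} & e_{min} \end{bmatrix} \begin{bmatrix} \lambda_{max} & 0 & 0 \\ 0 & \lambda_{mid} & 0 \\ 0 & 0 & \lambda_{min} \end{bmatrix} \begin{bmatrix} e_{max}^T \\ e_{mid}^T \\ e_{min}^T \end{bmatrix}$$

$$S = \lambda_{max}\, e_{max} e_{max}^T + \lambda_{mid}\, e_{mid} e_{mid}^T + \lambda_{min}\, e_{min} e_{min}^T$$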


page 50


Data representation

- After reorganizing the eigensystem, the ellipsoid can be decomposed into stick, plate and ball tensors:

$$T_{symmetric} = (\lambda_{max} - \lambda_{mid})\, T_{stick} + (\lambda_{mid} - \lambda_{min})\, T_{plate} + \lambda_{min}\, T_{ball}$$

$$T_{stick} = e_{max} e_{max}^T$$

$$T_{plate} = e_{max} e_{max}^T + e_{mid} e_{mid}^T$$

$$T_{ball} = e_{max} e_{max}^T + e_{mid} e_{mid}^T + e_{min} e_{min}^T$$

[Figure: stick tensor, plate tensor, ball tensor]
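The decomposition can be checked numerically: recombining the three components with their weights must give back the original tensor. The sketch below assumes the eigenvalues are already sorted in descending order and the eigenvectors are orthonormal; the helper names (`outer`, `add`, `scale`, `decompose`, `recombine`) are hypothetical, and plain nested lists are used instead of a linear algebra library.

```python
def outer(u, v):
    """Outer product u v^T of two 3-vectors."""
    return [[ui * vj for vj in v] for ui in u]


def add(*mats):
    """Element-wise sum of 3x3 matrices."""
    return [[sum(m[i][j] for m in mats) for j in range(3)] for i in range(3)]


def scale(s, m):
    return [[s * x for x in row] for row in m]


def decompose(eigvals, eigvecs):
    """Split an eigensystem into stick, plate and ball components.

    eigvals = (l_max, l_mid, l_min), sorted in descending order;
    eigvecs = (e_max, e_mid, e_min), the matching orthonormal eigenvectors.
    """
    l_max, l_mid, l_min = eigvals
    e_max, e_mid, e_min = eigvecs
    t_stick = outer(e_max, e_max)
    t_plate = add(t_stick, outer(e_mid, e_mid))
    t_ball = add(t_plate, outer(e_min, e_min))
    weights = {"stick": l_max - l_mid, "plate": l_mid - l_min, "ball": l_min}
    parts = {"stick": t_stick, "plate": t_plate, "ball": t_ball}
    return weights, parts


def recombine(weights, parts):
    return add(*(scale(weights[k], parts[k]) for k in ("stick", "plate", "ball")))


if __name__ == "__main__":
    eigvals = (5, 3, 1)
    eigvecs = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    weights, parts = decompose(eigvals, eigvecs)
    print(weights)
    print(recombine(weights, parts))
```

With the axis-aligned eigenvectors of the example, the recombined tensor is the diagonal matrix of the original eigenvalues, which confirms that the two forms of the eigensystem describe the same tensor.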


page 51


Data communication

- Tensor voting uses voting as its data communication
- Each voter site propagates its data to its neighborhood through a predefined voting kernel
- Votes are then collected and accumulated at the votee site

[Figure: voter sites p and r casting votes to votee site q]
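The collect-and-accumulate step can be sketched as a sum of 3x3 tensors at the votee. The kernel used here is a deliberately simplified stand-in and not the voting fields of the original tensor voting algorithm: each voter casts the tensor I - v v^T, where v is the unit direction from voter to votee, attenuated by a Gaussian of the distance. The names `ball_vote` and `accumulate_votes` and the parameters `sigma` and `radius` are hypothetical.

```python
import math


def ball_vote(voter, votee, sigma):
    """Hypothetical kernel: (I - v v^T) weighted by exp(-dist^2 / sigma^2)."""
    diff = [b - a for a, b in zip(voter, votee)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist == 0.0:
        return [[0.0] * 3 for _ in range(3)]  # a site does not vote for itself
    v = [c / dist for c in diff]
    w = math.exp(-(dist * dist) / (sigma * sigma))
    return [[w * ((1.0 if i == j else 0.0) - v[i] * v[j]) for j in range(3)]
            for i in range(3)]


def accumulate_votes(votee, voters, sigma=1.0, radius=3.0):
    """Sum the votes of all voters within the neighborhood radius."""
    total = [[0.0] * 3 for _ in range(3)]
    for voter in voters:
        if math.dist(voter, votee) > radius:
            continue  # outside the votee's neighborhood
        vote = ball_vote(voter, votee, sigma)
        for i in range(3):
            for j in range(3):
                total[i][j] += vote[i][j]
    return total


if __name__ == "__main__":
    q = (0.0, 0.0, 0.0)                     # votee site
    voters = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (10.0, 0.0, 0.0)]
    for row in accumulate_votes(q, voters):
        print(row)
```

In this example the two nearby voters lie in the xy-plane, so the accumulated tensor is largest along z, the normal of that plane; the distant voter is ignored because it falls outside the neighborhood.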


page 52

3. rod-tv module 4 – tensor voting

tensor voting system


page 53


Stage 1: information encoding


page 54


Stage 2: sparse tensor voting


page 55


Stage 3: dense tensor voting


page 56


Stage 4: feature extraction







page 58

4. Results

Surface reconstruction (on demand) by tensor voting

STAN

Surface reconstruction on the LUNG dataset
left: original tokens (3,035 tokens) and noisy tokens (759 tokens, 25% of the original)
right: surface reconstruction by tensor voting


page 59

4. Results


STAN

Surface reconstruction on demand of the LUNG dataset
original tokens: 3,035 tokens
on-demand tokens: 1,823 tokens


page 60

4. Results


STAN

Surface reconstruction on demand of the LUNG dataset
original tokens: 3,035 tokens
on-demand tokens: 1,805 tokens


page 61

4. Results


STAN

Surface reconstruction on demand of the LUNG dataset
original tokens: 3,035 tokens
on-demand tokens: 232 tokens



page 65

5. Conclusions and future work

Conclusions

• Introduced levels-of-detail surface reconstruction for dense and noisy 3D points

- [module 1] exploit the Grid Pyramid or Octree data structure for multi-scale data representation

- [module 2] use a range-based or screen-based method for “on demand” levels-of-detail selection

- [module 3] apply culling algorithms to find a better “on demand” visible subset

- [module 4] reconstruct the surface by means of tensor voting

• Implementation issue

- “On demand” surfaces can be generated on the fly in nearly real time (tensor voting is the bottleneck)


page 66

Future work

• System optimization
- Optimized neighborhood searching for tensorial support
- Parallel Octree data structure
- Compact hierarchical data structures, such as wavelets
- Sub-pixel techniques in both the occlusion map and the depth map
- Viewer caching for dynamically updating the surface


page 67

Question and Answer
