A COMPLEX-VALUED FIELD MODEL FOR SHAPE REPRESENTATION
WITH APPLICATIONS IN COMPUTER VISION AND GRAPHICS
By
JOHN R. CORRING
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2017
© 2017 John R. Corring
For Alexander
ACKNOWLEDGMENTS
Thanks to my wife, Emily, for her patience, love, support, and input. She listened to
me go on for hours about everything ranging from the theoretical aspects of my thesis to
specific engineering problems. For that, and her hard work, I will always be grateful.
Thanks to Anand Rangarajan for creative motivation. Anand always believed in
me throughout this long process. His breadth of knowledge helped me to craft a thesis
that reflects the scope of my own interests in Shape Analysis. I’ll never forget this, or our
extra-curricular conversations.
Thanks to Arunava Banerjee, Alireza Entezari, Michael Jury, and Paul Robinson. I
learned a lot from each member of my committee. The depth of theoretical and practical
knowledge gained from you all was invaluable.
Thanks to Joe Wilson for supporting me during the last four years. By working in the
CSI Lab I gained a lot of practical knowledge. My colleagues from the CSI Lab — Brandon
Smock, Ferit Toska, Gus Munoz, Maks Levental, Pete Dobbins — were great sources of
friendship and support over the years. Thanks to Nuri Yeralan, Karthik Gurumoorthy,
Subhojit Sengupta, Jan Stuehmer, Thomas Mollenhof. Stimulating conversations with
brilliant people are the principal form of payment provided to a PhD student and you were
the best sources.
Thanks to Dr. Frank Schmidt and Dr. Daniel Cremers for inviting me to Munich,
Germany to experience working in your lab. The time spent there was inspiring and
invaluable for deciding what to do after graduation.
Thanks to my mom and dad for their encouragement. Thanks to my brothers for their
support and love.
Finally, I want to remember Grandmother Maryann Corring who passed away while I
was working on my PhD. She introduced me to great literature, architecture, history, and
arts at a young age and instilled a passion for learning in me that helped define who I am
today. She is missed.
TABLE OF CONTENTS
page
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER

1 INTRODUCTION
   1.1 Prior Work on Shape Models
       1.1.1 Implicit Shape Representations
       1.1.2 Explicit Shape Representations
   1.2 Prior Work on Registration and Matching
   1.3 Prior Work on Surface Reconstruction
   1.4 Outline of this Document

2 REPRESENTING SHAPE WITH PHASE
   2.1 The Complex Wave Representation
       2.1.1 Analysis of ψ
       2.1.2 ψ for Oriented Multi-curve Shapes
   2.2 Wave Mixtures as Geometric Primitives
       2.2.1 A Note on Gabor Analysis
       2.2.2 Square-root Densities and Probabilistic Interpretation of the Complex Wave Mixture
   2.3 Relationship Between Signed Distances and Complex Wave Mixtures
   2.4 An Embedding Theorem for Complex Wave Mixtures

3 REGISTRATION: RESONANT DEFORMABLE MATCHING
   3.1 Hypothesis Classes for Registration
       3.1.1 Euclidean Transformations
           3.1.1.1 Euler angles
           3.1.1.2 Quaternions
           3.1.1.3 Action of Euclidean transformations on the normal vector
       3.1.2 Affine Transformations
       3.1.3 Nonrigid Transformations and Regularization
           3.1.3.1 Thin-plate spline radial basis functions
           3.1.3.2 Gaussian radial basis functions
   3.2 Introducing Normal Variables for the Target Oriented Point-set
   3.3 Choosing a Suitable Distance Function
   3.4 Gradient Computation and Optimization Details
   3.5 A Brief Comparison with Currents
   3.6 Analysis of the RDM Objective Function
       3.6.1 Inner Product of CWRs
           3.6.1.1 Isotropic CWRs
           3.6.1.2 Anisotropic CWRs
       3.6.2 Asymptotic Behavior of the RDM Objective Function

4 REGISTRATION: EMPIRICAL ANALYSIS
   4.1 Experimental Validation
   4.2 Rigid and Affine Registration
       4.2.1 Range of Rotation
       4.2.2 Gaussian Noise on Points
       4.2.3 Missing Points With Outliers
   4.3 Synthetic Normal Recovery, Warps, and Occlusions
   4.4 Non-Synthetic Matching Experiments
   4.5 CMU House Dataset
   4.6 3-D Subcortical Structure Registration
   4.7 Maximum Likelihood Registration with |ψ|² as a Density

5 THEORY OF THE REPRESENTATION: CONNECTEDNESS, COMPLETENESS, AND CONTRIBUTIONS TO THE GABOR EXPANSION
   5.1 Connectedness of Pairs of Complex Waves
       5.1.1 Zeros of ψ
       5.1.2 Connectedness of θψ = 0 for Symmetric Configurations
       5.1.3 Connectedness of θψ = 0 for Asymmetric Configurations
           5.1.3.1 Numerical analysis of asymmetric connectedness
           5.1.3.2 An analytical condition for asymmetric connectedness
   5.2 Going Beyond Two Atoms with Im ψ
       5.2.1 Stability of Level-sets of Im ψ
       5.2.2 The Class of Curves Approximated By F
   5.3 Asymptotic Approximation of Modular Distance Fields by Gabor Atoms

6 FURTHER EXPLORATIONS AND APPLICATIONS
   6.1 Curve Extraction from ψ
       6.1.1 Mean Shortest-Path Error Evaluation on 2-D Data: MPEG7 Dataset
       6.1.2 Hausdorff Distance-based Evaluation of 3-D Data: Spheres, Bunny, FAUST Datasets
   6.2 ψ for kPCA on Curves
   6.3 Generalization of the CWR to Embedded Surfaces

7 CONCLUSION AND FUTURE WORK
   7.1 Contributions
   7.2 Future Work
REFERENCES

BIOGRAPHICAL SKETCH
LIST OF TABLES

1-1 The Fast Marching algorithm for constructing signed distance functions.
2-1 Technical Layout of the Operations on ψ.
4-1 Range of convergence for rotations.
4-2 Average (standard deviation) initial and final DICE scores over a set of four subcortical structures registered along the boundary.
6-1 An algorithm for extracting the shape corresponding to a collection of oriented points.
LIST OF FIGURES

1-1 Visualization of the fast marching process.
2-1 A simple example of composition of a distance function from oriented points using the Complex Wave Representation (CWR).
2-2 Visualization of the phase of ψ.
2-3 Merging of two curves as oriented points move closer together.
2-4 Zero level-sets of the phase of ψ for subject 1 of FAUST under several different values of σ.
3-1 Tait-Bryan angles. ψ provides the rotation about the initial z-axis, θ the rotation about the subsequent y-axis, and Φ the rotation about the subsequent x-axis.
3-2 An example of surface reconstruction by RDM.
3-3 An example of curve reconstruction by RDM.
3-4 Profile of the L2 distance function over several transformations and choices of parameters σ, λ.
4-1 Median error and variance for rigid transformation with pointwise Gaussian noise.
4-2 Median error and variance for rigid transformation with dropped inliers and outliers added.
4-3 Median error and variance for affine transformations.
4-4 Comparison of different techniques for estimating normal vectors from an organized point-set and an oriented point-set.
4-5 Experimental comparison of RDM and other matching algorithms on a 2-D dataset.
4-6 Experimental comparison of RDM and other matching algorithms on 3-D datasets.
4-7 Recall graphs and area under the curve for the CMU House.
4-8 Maximum likelihood alignment using |ψ(x)|² as a density.
5-1 Visualization of g along a vertical slice of the set containing a zero of Im ψ.
5-2 Numerical experiments showing the connectedness and non-connectedness at different values of parameters.
5-3 Plots showing the zero crossings of interest for the analytical solution to the disconnection problem.
5-4 An explanatory figure to accompany the proof of approximation for the multi-atom case.
6-1 Zero level-sets of subject 5 of the FAUST sequence.
6-2 Average error on shortest path between 250 randomly chosen pairs of points in the estimated mesh at 10 sampling rates.
6-3 Sphere reconstruction over different sampling rates.
6-4 Face reconstruction over different sampling rates.
6-5 Bunny (closed surface) reconstruction over different noise levels.
6-6 Bunny (closed surface) reconstruction over different sampling rates.
6-7 Closed curves and density estimates from linear combinations of CWRs. kPCA basis of CWRs as a subspace classifier.
6-8 Recovery of closed curves in training and testing samples for the Gatorbait dataset.
6-9 A fish drawn on the surface of a sphere using the spherical CWR.
6-10 A diamond drawn on a mesh from FAUST using 4 oriented points.
6-11 Registration of spherical oriented point-sets.
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

A COMPLEX-VALUED FIELD MODEL FOR SHAPE REPRESENTATION
WITH APPLICATIONS IN COMPUTER VISION AND GRAPHICS

By

John R. Corring

May 2017

Chair: Anand Rangarajan
Major: Computer Engineering
Shape processing and analysis is a growing area of research straddling computer
vision and computer graphics. Probability density and signed distance representations
have become central to the field. While probability densities model uncertainty about
observations and provide a convex representation, they lack geometric precision and often
exhibit topological inaccuracies. Signed distances lack the robustness of densities, and the
geometry of their feature space is very complicated.
In this thesis, we develop a parametric model for approximating signed distance
functions. This allows us to build up a useful shape representation with accurate
topological and geometric data from point and normal estimates. The representation
is approximately linear under union of parameters, thus allowing for easy computation and
conventional statistical methods to be leveraged for shape processing.
We develop an algorithm for registering oriented point-sets that leverages the
representation. We compare this algorithm with various contenders in registration. We
also develop algorithms for mesh extraction from sparsely scattered oriented points,
dimensionality reduction on a collection of shapes, and approximation of distance
functions on Riemannian surfaces. Empirical validation of all of the approaches outlined
in this work is performed. Theoretical contributions include the description of a family of
curves for which the representation converges uniformly, clarification of the regularizing
principle on the geometric component of the representation, and an analysis of the
asymptotic behavior of the representation with respect to the magnitude variance and
frequency variance parameters.
CHAPTER 1
INTRODUCTION
Shape modeling is a core area of research in Computer Vision and Graphics. The
principal problems in shape modeling are:

• Representation: the parameters stored to reconstruct or retrieve the shape.
  Choosing a representation entails a variety of limitations and requirements; there is
  no one-size-fits-all representation.
  – Representations can be focused on visualization,
  – Easy retrieval or lookup may be a major component,
  – Ease of estimation from noisy measurements and sparsity may be a major feature.

• Deformation: this component implements the morphology of the shapes. It is
  usually represented as the class of maps used to map one shape object into another.
  Depending on the representation, the deformations may need to have certain
  invariant subgroups.
  – Deformations may be rigid, limiting the degrees of freedom of the shape to an
    extrinsic pose,
  – Deformations may also be nonrigid, with a variety of different penalties available
    to restrict the range of nonrigid motions.

• Incorporation: once a hypothesis on the type of morphology the shape will undergo
  is established, external influences that implement task-specific processes need to be
  imposed on the shape. These are usually a component in inferring an appropriate
  deformation.
  – By representing the shape as a field, the class of deformations acting on the field
    typically must be evaluated everywhere, but the modeling can be very precise,
  – Representations that take samples from the field allow quicker evaluation of
    incorporated effects but may suffer from accuracy issues.
Once these problems are addressed the shape framework for Computer Vision is available:
using prior shape knowledge to regularize ill-posed problems of Vision. Problems such as
image segmentation can be conditioned by incorporating prior shape knowledge about
the subject to be segmented. This thesis is about a new approach to implementing this
framework.
Shape models are useful in their own right as symbolic virtual objects, representative
of a class or a collection of observations of a form. Concretely, they can be used to
estimate otherwise difficult to measure statistics: such as range of body dimensions for a
given organism. While the mathematical study of shape spans multiple fields and has a
deep history [41, 70, 93, 97], we have a more narrow focus in this work. We will emphasize
distinctions between shape representations that influence the range of modeling situations
handled by the representations. We will also review the mathematics underlying the
representation when necessary.
This thesis outlines a new representation of shape. The representation has both
implicit components, a Complex-valued function on an ambient embedding space, and
explicit components, sampled from the normal bundle of a co-dimension 1 manifold.
It has a meaningful extension away from the set or surface of interest: the magnitude
stores the probability of observing a point while the phase is approximately the signed
distance in a narrow band around the surface. We have included several useful properties
of the representation and prove useful theorems regarding the class of shapes that can
be represented. In this thesis document we will lay out the progress on studying the
representation and describe how we have approached the problems of shape modeling
using the representation. We also present an empirical study of registration of curves
as oriented point-sets, performing statistics on collections of shapes, and extending the
representation in various ways.
1.1 Prior Work on Shape Models
The representation problem from above can be approached from an abstract
standpoint first and specified after decisions are made regarding the task type and
subservient modeling. In general, a shape representation for a set A ⊂ Rd consists of a
domain Ω (possibly an embedding space, but often simply an index set) mapped to range
R (which may in general consist of objects) by fA such that one can reconstruct A up
to an equivalence class. There are many vagaries in this abstract definition: what does it
mean to “reconstruct A up to an equivalence class?” what are the structures of Ω, R? etc.
These all depend on the representation. I’ll give examples below, as we explain specific
representations.
1.1.1 Implicit Shape Representations
Implicit shape representations portray embedded sets and shapes. Suppose that we
are given A ⊂ R2, with A open. Furthermore, suppose that A has a boundary given
by a family of C2 closed curves. Often, from the shape standpoint, we are interested in
representing these closed curves ∂A. An implicit representation fA : R2 → F should have
some distinguished property along ∂A that allows us to “reconstruct A”. For example, we
could say that fA should have a maximum value along ∂A. Then we could reconstruct A
by considering the maximal α such that {fA ≥ α} ≠ ∅. There are a few hiccups with this
representation, and that makes it a good example to start from. First off, you couldn’t
reconstruct A in general from this since we have no further properties of fA to exploit:
we wouldn’t be able to distinguish A from Ac. Second, this representation is not injective:
there just aren’t enough constraints on fA, and there are a continuum of equivalent
functions representing A this way. Finally, very importantly for computer applications,
this representation is not very stable to small perturbations in f . I’ll return to a specific
implementation of a shape representation that deals with these problems after motivating
the registration problem.
Implicit representations abound in the literature [15, 40, 43]. The signed distance
function (SDF) is an example in which the sign encodes interior/exterior properties with
the absolute value encoding the distance to the nearest point in the set of curves (surfaces)
[74, 75, 77]. Contrast this with the unsigned distance function which lacks interior/exterior
information. Surprisingly, there is little work on registering template and target SDFs. We
address the technical reasons for this below.
Figure 1-1. Visualization of the fast marching process. a) shows the field after 20 iterations while b) shows the field after 40 iterations.
The signed distance b_S : R^d → R for an open set S (I'll assume S has a smooth
boundary consisting of a collection of simply connected hypersurfaces) satisfies

|∇b_S| = 1,    (1–1)
b_S|_∂S = 0,   b_S ∈ C^1(∂S).
[26] offers a full accounting of the analysis of SDFs as a shape representation. Signed
distance functions are typically constructed via the fast marching or fast sweeping
algorithm [91, 99, 109]. All of these methods appear to be related to Dijkstra’s algorithm
[30] in that they propagate distance information outward from the seeding zone of the
boundary. Note that in general fast marching etc. may be used for computing things other
than distances. Furthermore, the unsigned distance function—in which the values are all
non-negative and typically the seeding zone is a set of points—can also be constructed by
fast marching. Here we give an overview of the fast marching algorithm as a reference for
the rest of the paper.
Table 1-1. The Fast Marching algorithm for constructing signed distance functions. This is the standard way to construct implicit representations of curves and surfaces. In this thesis we will explore an alternative.

Require: I a domain of nodes, S ⊂ I an initial seeding zone.
 1: function FastMarching(I, S)
 2:   l(i) ← Unvisited for all i ∈ I \ S
 3:   b(i) ← ∞ for all i ∈ I \ S
 4:   l(i) ← Visited for all i ∈ S
 5:   b(i) ← 0 for all i ∈ S
 6:   for i with l(i) = Unvisited do
 7:     b(i) ← update by numeric approximation of (1–1); if the new value decreases b(i), then l(i) ← Visited
 8:   end for
 9:   q ← arg min over {i : l(i) = Visited} of b(i)
10:   l(q) ← Accepted
11:   for j in the neighbors of q such that l(j) is not Accepted do
12:     b(j) ← re-estimate by (1–1)
13:     if b(j) decreased in the previous step then
14:       l(j) ← Visited
15:     end if
16:   end for
17:   if there is a node k with l(k) = Visited then
18:     return to step 9
19:   end if
20:   return b
21: end function
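A runnable sketch of the procedure in Table 1-1, under illustrative assumptions not taken from the dissertation: unit grid spacing, 4-neighborhoods, and a lazy-deletion heap standing in for the Visited set (the common Dijkstra-like implementation the text alludes to). The `solve` step is a first-order upwind approximation of (1–1).

```python
import heapq
import math

def fast_marching(shape, seeds):
    """Propagate distance outward from a seeding zone on a unit grid,
    in the Dijkstra-like style of Table 1-1 (an illustrative sketch)."""
    rows, cols = shape
    INF = math.inf
    b = [[INF] * cols for _ in range(rows)]          # b(i) <- infinity off S
    accepted = [[False] * cols for _ in range(rows)]
    heap = []
    for i, j in seeds:                               # b(i) <- 0 on S
        b[i][j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def solve(i, j):
        # Upwind solve of |grad b| = 1: smallest neighbor value per axis.
        a = min(b[i - 1][j] if i > 0 else INF, b[i + 1][j] if i + 1 < rows else INF)
        c = min(b[i][j - 1] if j > 0 else INF, b[i][j + 1] if j + 1 < cols else INF)
        lo, hi = min(a, c), max(a, c)
        if hi - lo >= 1.0:                           # one-sided update
            return lo + 1.0
        return 0.5 * (lo + hi + math.sqrt(2.0 - (hi - lo) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if accepted[i][j]:
            continue
        accepted[i][j] = True                        # l(q) <- Accepted
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and not accepted[ni][nj]:
                new = solve(ni, nj)                  # re-estimate by (1-1)
                if new < b[ni][nj]:                  # keep only decreases
                    b[ni][nj] = new
                    heapq.heappush(heap, (new, ni, nj))
    return b
```

Seeding the center of an 11×11 grid gives exact distances along the axes and a slight first-order overestimate of the Euclidean distance toward the corners, the expected behavior of this scheme.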
The signed distance function is considered a staple of shape modeling. Here are a few
reasons why.
Stability of reconstruction: since the function goes smoothly through 0 along theshape, the representation is robust to small, smooth additive noise. Standardalgorithms, such as marching squares [61], can be used for recovering the shape.
Computability: by the algorithm above the representation is constructible.
Methodology: there is a broad literature of algorithms that leverage variationalmethods to compute solutions to problems ranging from graphics to vision [74].
However, we will encounter several problems with the signed distance function when
we discuss registration below. The more common implicit representation to use for
registration is a probability density.
The probability density representation of shapes is very simple: given a set A with
smooth boundary ∂A, draw samples {m_a}_{a=1}^N from ∂A that cover it evenly. Then form a
density estimate from these samples. The simplest estimate is the Parzen window estimate
[78]. Given the samples, the Parzen window estimate is the function

f(x | {m_a}_{a=1}^N) = (1 / (σ^d N)) Σ_{a=1}^N g((x − m_a) / σ),
where the g are probability density functions or kernels. σ is a spatial variance or radius
associated with the kernel g, for which there are reasonable estimates based on the original
data set.
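The estimate above can be written out directly; a minimal sketch with an isotropic Gaussian kernel g (the function name and argument handling are my own, not from the text):

```python
import numpy as np

def parzen_density(x, samples, sigma):
    """Parzen window estimate f(x | {m_a}) with an isotropic Gaussian
    kernel g of bandwidth sigma (illustrative sketch)."""
    samples = np.atleast_2d(samples).astype(float)   # N x d boundary samples
    x = np.atleast_1d(x).astype(float)
    N, d = samples.shape
    z = (x - samples) / sigma                        # (x - m_a) / sigma
    g = np.exp(-0.5 * (z ** 2).sum(axis=1)) / (2 * np.pi) ** (d / 2)
    return g.sum() / (sigma ** d * N)
```

In 1-D with a single sample at the origin and sigma = 1, the estimate at 0 is the standard normal peak 1/sqrt(2π) ≈ 0.3989, and the mixture integrates to 1 as a density should.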
Other techniques for estimating densities include K-Means, EM [27, 47], and Bayesian
methods [83]. Recall the example of the implicit shape representation with the shape
encoded in the largest level-set. Probability density estimates suffer from the same
problems that this representation does. Since peaked values of functions are not stable,
recovering geometry from densities is complicated. The marching squares approach that
can be used to recover a mesh from the signed distance function no longer applies since
the level-sets are now shaped like the cross-sectional profile of the kernel g. Note that
unimodal kernels cannot have broad and smooth profiles for the level-sets of large values
that cross through the centroid: then the expected value under the model density g would
not represent the centroid. Bi-modal and multi-modal densities frequently arise in the
wavelet literature in the context of approximating curves [13].
There is a close relationship between Gaussian Parzen density estimators and
unsigned distance functions: as σ → 0 the logarithm of the mixture approaches the
unsigned distance. This fact as well as the ease of computation and robustness of the
densities biases one towards this representation in applications. On the other hand, the
unsigned distance lacks the crucial topological information that the signed distance has.
In this work, we try to bridge the gap between signed distances and densities by breaking
the symmetry of the Gaussian density and using Gabor functions (wavelets) as square-root
density estimators. We find that square-root densities are an interesting new area ripe for
exploration in the field of shape representation.
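The σ → 0 relationship mentioned above can be checked numerically. The precise sense I assume here (the standard Laplace-approximation reading, not a statement from the text) is that −σ² times the log of the unnormalized Gaussian mixture tends to half the squared unsigned distance to the sample set:

```python
import numpy as np

def scaled_neg_log_mixture(x, m, sigma):
    """-sigma^2 * log sum_a exp(-(x - m_a)^2 / (2 sigma^2)),
    computed with the usual max-shift for numerical stability."""
    e = -(x - m) ** 2 / (2 * sigma ** 2)
    return -sigma ** 2 * (e.max() + np.log(np.exp(e - e.max()).sum()))

# Illustrative samples on the "shape" and a query point.
m = np.array([0.0, 1.0, 3.0])
x = 2.2
for sigma in (0.5, 0.1, 0.02):
    print(sigma, scaled_neg_log_mixture(x, m, sigma))
print(np.min((x - m) ** 2) / 2)   # the sigma -> 0 limit: half the squared distance
```

As sigma shrinks, the nearest sample dominates the sum and the scaled negative log approaches min_a (x − m_a)² / 2, recovering the (squared) unsigned distance but, as the text notes, none of the interior/exterior information of a signed distance.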
1.1.2 Explicit Shape Representations
Explicit shape representations differ from implicit representations in that the
domain of the function which is used to recover the shape is a set of indices that refers to
parameters. So explicit representations include sets of points, meshes, and graphs. This
work focuses on the implicit shape representation that emerges from a particular choice
of representing function, given an explicit set of locations and directions. This section
serves to prime the reader on competing methods for solving the problems that we use to
validate the quality of the representation.
At the very simplest end we have point-sets: Ω = I = {1, 2, . . . , n} and R = R^d.
In this setting reconstruction of an embedded shape is ill-posed, but we can hypothesize
polygonal reconstructions up to a rigid transformation by choosing an ordering on the
points. Unfortunately, there is no universally accepted hypothesis test or metric for
choosing such a surface. If we suppose that we also have explicit neighborhood information
we could let Ω = I = {1, 2, . . . , n} and R be a set of neighbors and distances to the
neighbors. In this setting we are specifying the shape by an adjacency list, which we
could convert into a matrix. However, there are many ways to embed a set of points
corresponding to indices 1, . . . , n into e.g. R^3. For instance, if we have an embedding
of points m_1, . . . , m_n ∈ R^3 and then we apply a rotation R and a translation T, then
{T R m_i}_i ∼ {m_i}_i under this representation. Moreover, the entire class of isometries
(distance preserving maps) is an invariance class for this representation. This can be a
good thing: we may want to encode some invariance in our representation, so that we
identify shapes based on intrinsic properties.
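The isometry invariance just described is easy to verify numerically: the pairwise-distance (adjacency) representation is unchanged by any rotation R and translation T. The random transform below is illustrative (QR of a random matrix gives an orthogonal map, possibly a reflection, which is still an isometry):

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.standard_normal((6, 3))              # embedded points m_1, ..., m_6

def pairwise(points):
    """The intrinsic representation: matrix of pairwise distances."""
    d = points[:, None, :] - points[None, :, :]
    return np.sqrt((d ** 2).sum(-1))

q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthogonal map R
t = rng.standard_normal(3)                          # translation T
moved = m @ q.T + t                                 # T R m_i

# {T R m_i} ~ {m_i}: identical pairwise distances, different coordinates.
assert np.allclose(pairwise(m), pairwise(moved))
```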
Another example of an explicit shape representation is a mesh. Meshes can be
represented in an abstract way without specifying any locations for the simplices.
However, to do so we need to specify additional intrinsic properties: one can specify a
set of surrounding triangles (as embedded triangles in R3, not abstract simplicial features)
and their associated vertices in a particular order.
Neither meshes nor adjacency matrices are the primary explicit shape representation
of interest for comparison with this work. Rather, an intermediate representation with
Ω = I and R = (Rd)2 is considered, with the first entry being a point and the second entry
a normal vector to an underlying curve or surface. We refer to these as oriented point-sets.
An example of an oriented point set is the set of barycenters and normal vectors from a
mesh. Note that an oriented point may have a normal vector component with magnitude
zero, in which case it effectively acts as an un-oriented point. While the curve
estimation problem remains ill-posed with oriented point-sets, we now have a first-order
condition for C^1 surfaces that allows us to better evaluate a hypothesized sequence of points, curve, or
spline.
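The mesh example above (barycenters and normals) can be sketched for a single embedded triangle; the function name and the zero-normal convention for degenerate triangles are mine, chosen to match the un-oriented-point remark:

```python
import numpy as np

def oriented_point(tri):
    """Barycenter and unit normal of an embedded triangle in R^3,
    giving one (point, normal) pair of an oriented point-set."""
    a, b, c = np.asarray(tri, dtype=float)
    center = (a + b + c) / 3.0
    n = np.cross(b - a, c - a)            # orientation from vertex order
    norm = np.linalg.norm(n)
    if norm == 0.0:                       # degenerate: acts as an un-oriented point
        return center, np.zeros(3)
    return center, n / norm
```

A triangle in the z = 0 plane with counter-clockwise vertex order yields the normal (0, 0, 1); reversing the vertex order flips the orientation.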
1.2 Prior Work on Registration and Matching
Many problems in computer vision require us to determine correspondences between
similar sets of features. Matching generally refers to these types of problems. Despite
the frequency with which these types of problems arise, researchers are often faced with
scenarios where it is very difficult to even define what a correspondence between two
objects should be—no natural map, moreover bijection, may exist at all. This is often due
to mismatched representations. Work focused on determining point correspondences for
matching organized features has been abundant, as is highlighted below, but there remains
a clear need for handling mismatched representations. With this limitation in mind, given
feature sets A = {f_i}_{i=1}^N and B = {g_i}_{i=1}^M, a correspondence between A and B is a function
h : {1, . . . , N} → {1, . . . , M} ∪ {∅}.
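For point features, one concrete realization of such a partial map is thresholded nearest-neighbor assignment, with ∅ modeled as None for unmatched features; the threshold and data below are illustrative, not from the text:

```python
import numpy as np

def correspondence(A, B, max_dist=1.0):
    """h : {1,...,N} -> {1,...,M} u {None}: nearest neighbor in B,
    or None when no neighbor lies within max_dist (unmatched)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    h = {}
    for i, f in enumerate(A):
        d = np.linalg.norm(B - f, axis=1)
        j = int(d.argmin())
        h[i] = j if d[j] <= max_dist else None
    return h

A = [[0.0, 0.0], [5.0, 5.0], [40.0, 0.0]]
B = [[0.1, 0.0], [5.0, 4.8]]
print(correspondence(A, B))   # the far point maps to None
```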
First note the subtle use of matching vs. registering: when working with an implicit
representation of shape registering is more appropriate since accounting for a “matching”
between two fields is intuitively difficult to check while a “registration” implies an
underlying point-wise alignment. A registration can be used to determine a matching
using a nearest-neighbor assignment; a matching can determine a registration once a
space of transformations is determined and a choice of estimator for the selection of a
transformation is chosen. This work provides a method for producing a registration for the
mismatched case where the template consists of oriented points and the target consists of
points, under the assumption that both template and target are drawn from outlines of
shapes coming from the same class.
Given a collection of features describing a set or object embedded in Rd, correspondences
can be obtained via registration and vice versa. One of the first broadly successful
approaches to registering point-sets was Iterated Closest Point (ICP) alignment [5].
ICP is a simple algorithm that alternates between two stages:

1. Find the nearest neighbors to points in the moving template,

2. Fit a transformation to the correspondences generated from these neighbors.
ICP terminates when the change in the transformation is sufficiently small or none of
the neighbors change. ICP uses both registration and correspondence in the alignment
process. There are many variants of ICP [85]. Some of the most popular employ
restrictions on the types of matches considered or use sampling strategies to improve
the robustness of the final fit. We highlight ICP because it is one of the few approaches to
registration or alignment in which the authors emphasized the algorithm's ability to handle
mismatched representations: they explain how to compute point-to-parametric-entity
distances in their paper [5]. Note that ICP can be used for a variety of transformation
types, just like the algorithms presented below. The field of registration has gone in two
very different directions since the development of ICP: measure-based registration and
correspondence-based registration. We will discuss both now.
In medical image processing, special emphasis is placed on the modeling of the features
and objects being registered. A major component in the modeling is the use of density-
or measure-based shape models, such as discussed above. Registering these models is
often done with measure-based registration approaches. In these approaches, sparse
feature sets are first converted into scalar field representations. Then the fields are
aligned under some metric on the representations. Finally, non-rigid registration of the
template field with that of the target yields dense point-to-point correspondences after
post-processing. In this representation we expect high values to be associated with
the object (set or its boundary), require that our functions are positive valued and
integrate to 1 on Ω, and can only reconstruct points on the object by sampling. When
point-features or images are converted into non-parametric densities then information- and
estimator-theoretic distances can be employed. Kernel Correlation [98] and gmmreg [52]
both use Parzen-window densities, relying on correlation and L2-distance for registration
objectives, respectively. The method of matching distributions, or currents, [37] allows
singular measures to be matched. A crossover between the density and distance fields
[deng2014riemannian] utilized distance transforms, yielding a density field which is
matched by a geodesic distance. In these works the unifying theme is a field that organizes
the point-features in terms of uncertainty.
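To make such objectives concrete: for equal-weight isotropic Gaussian mixtures, an L2 registration objective of the kind used in gmmreg [52] is available in closed form, because the integral of a product of two Gaussians is itself a Gaussian evaluation. A sketch under those assumptions (our own, not the reference implementation):

```python
import numpy as np

def mixture_overlap(A, B, sigma):
    """∫ p_A(x) p_B(x) dx for equal-weight isotropic Gaussian mixtures with
    centers in the rows of A and B and common bandwidth sigma. Uses the
    identity ∫ N(x; a, σ²I) N(x; b, σ²I) dx = (4πσ²)^(−D/2) exp(−‖a−b‖²/(4σ²))."""
    D = A.shape[1]
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise ‖a−b‖²
    return (4 * np.pi * sigma**2) ** (-D / 2) * np.exp(-d2 / (4 * sigma**2)).mean()

def l2_distance_sq(A, B, sigma):
    """Closed-form ∫ (p_A − p_B)² dx, a gmmreg-style registration objective."""
    return (mixture_overlap(A, A, sigma)
            - 2 * mixture_overlap(A, B, sigma)
            + mixture_overlap(B, B, sigma))
```

Minimizing `l2_distance_sq` over a transformation applied to A registers the two point-sets; Kernel Correlation [98] instead maximizes the cross term.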
SDFs organize features in terms of geometry. Registering SDFs has also been
attempted using an approach [51, 77] based on modeling pixel-wise behavior in a
pre-computed distance transform. The first technical problem encountered in registering
SDFs is the choice of a distance measure between them. The aforementioned methods [51, 77]
use likelihood and mutual information based approaches on the values of the signed
distances treated as per-pixel random variables. While this is effective for the problem
at hand, it leaves much to be desired: these methods only consider small ranges of deformation,
are very slow, and do not address some of the problems that we point out below. In any
framework for registering transforms, such as distance functions, one must address the
problems inherent to the transform functions themselves which we now list. Far away
from the shape boundary (in the far field) SDFs take large values. This renders many
standard distances useless, like Lp or W p, unless one restricts to a compact domain
beforehand. Choosing this domain in a general way to allow arbitrary transformations
registering template to target requires the selection of invalid location values or the
repeated computation of intersections (and hence changing of the objective domain). Not
only is this awkward, it is also mathematically inconsistent. The second problem (referenced
previously) is that SDFs are usually not available in closed form, in sharp contrast to
parametric density representations. This implies that closed form distances between SDFs
are elusive. Third, note that registering is extremely difficult to perform within the space
of SDFs. For ϕ ∈ H to maintain the properties of SDFs, H must be included in the
rigid transformations; to go beyond this, some tampering with the values of the function
is required. The difficulty of managing this constraint is related to the reinitialization
problem in level-set methods [33, 39, 94] where ϕ is the (instantaneous) motion of an
interface represented by a level set function.
Point-set based methods usually feature explicit estimation of correspondences,
possibly in a soft or probabilistic fashion. Coherent Point Drift (CPD) [72] and TPS-RPM [18]
are two standard-bearers. TPS-RPM alternates between estimating the (soft) correspondence
and a TPS deformation. CPD uses a similar formulation, but also imposes additional
constraints (arising from motion coherence theory) on the deformation. RPM-LNS [111]
imposes symmetric neighborhood structures to preserve local shape while allowing global
deformation.
Graph matching methods have also been employed for point registration [38].
Local and global relations can be encoded in graphs, yielding a powerful structure for
correspondence estimation. While graph matching is a computationally hard problem,
algorithms for structured graphs and relaxation techniques show promise for point
matching [55, 112, 113]. When a planar shape is available as a cyclic graph elastic
matching can be done quickly given a choice of point descriptor (such as curvature)
[87]. Manifolds induce Laplace-Beltrami eigenfunctions [60], providing a canonical basis
from which to perform matching from a joint coordinate perspective [58] or a function
mapping perspective [76]. These methods all rely on equivalent organization of source
and target. While organization elevates the richness of the matching techniques available,
it also presents a difficulty: these methods require a level footing between template and
target. Estimating a graph or mesh from points can be very challenging.
Point feature organization can be viewed from many perspectives: computational
geometric methods [31], psychological gestalt principles [9, 45, 67], clustering [79, 108],
and level-set methods [69, 110] all organize points in some sense. Shape representations
are typically chosen to engender a desired organizational aspect of shapes [56]. Through
a multi-valued function or a distributional representation, different aspects of shapes
can be embodied in fields that interact predictably [15, 20, 43]. These works provide a
spectrum of organizational principles that can be used to temper the difficulty of the
point matching problem. In this work we obtain a reconstruction while registering, which
means that no target structure needs to be estimated before registering. Few works touting
simultaneous registration and reconstruction are currently available [17, 66]. Next we
discuss an important class of geometry processing algorithms that obtain structure from
unstructured observations.
1.3 Prior Work on Surface Reconstruction
Surface reconstruction is the problem of choosing a surface to represent a collection of
partial observations in 3-D (points, oriented points, contours, and patches). This problem
is clearly ill-posed as an infinite number of solutions exist for almost any collection of
features in space. Most of the research on surface reconstruction consists of choosing
a form of regularization, engineering methods to improve the speed of inference from
features to an implicit field and then to a mesh, and data-driven methods. In the first
two cases, surfaces are taken to belong to general classes of smooth shapes. We will
focus on these methods in this review, but will mention how data-driven methods can be
incorporated into our approach as well.
Note that in the following we are focused on reconstructing compact surfaces without
boundary. They have no holes or openings and therefore no boundaries. Representing
boundaries cannot be done easily with a single, continuous implicit function. Also, we
note that most of the methods we will discuss reconstruct implicit fields. Thus, meshes are
obtained by extracting a level-set by marching cubes.
Early in the development of implicit field models the “blobby model”, metaball, or
mixture of densities model became popular for visualizing potential fields of molecules in
Quantum Mechanics [6]. This model is similar to our own and serves as a good starting
point for introduction to the field. The basic idea is to represent a surface as a level-set
of a mixture of Gaussian components. This approach takes advantage of the fact that a
T-level-set of isotropic elements with different variance values can be re-expressed as a
0.5-level-set, and formulates the free parameters in terms of the variance and location
of each element. General quadrics featured in the exponent can be used to model
non-physical shapes including people and natural scenes. While the fitting process is not a
focus of this work, it was studied subsequently [65]. While this metaball approach is less
popular today, it is still commonly used for modeling particle-based matter interactions
(such as fluids flowing against each other) and for modeling molecules [84].
Simplistic atom-fitting approaches slowly gave way to more robust and faster
approaches based on splines and radial basis functions (r.b.f.’s) used in interpolation and
approximation theory [14, 32, 53, 73]. In this approach a norm on the smoothness of the
approximating implicit function is used along with some spline constraints. Researchers
frequently encode normal vectors as “normal points” which have a pre-defined distance
from the surface [14]. Rising to popularity before the wavelet revolution [24, 62], modeling
with metaballs missed out on the scale-space approach of function approximation. As
spline-wavelets [32, 53, 73] and other methods from computational physics [14] came
to maturity these approaches to function approximation gave rise to new surface
reconstruction algorithms. The standard approach to fast fitting of B-splines is using
a k-d-tree approach to localization and fitting a level-set of a simple class of functions
(typically algebraic) to the points in the neighborhood. Care must be taken to maintain a
neighborhood structure and weighting scheme enforcing smoothness of the final function
[73]. Fitting objectives range from least-squares based approaches to physically and
variationally inspired approaches [53, 54].
In the Poisson-based surface reconstruction [53, 54], a set of oriented points (an OPS)
S = {(mi, νi) : i = 1, …, N} is provided and the question of estimating an underlying indicator field
for the implied surface is addressed. Let the indicator field be represented by the unknown
function χ : Rd → R+. The authors of [53] make the observation that the gradient of a
smoothed χ, ∇χ, will be approximated well by a (carefully chosen) interpolated vector
field V . Specifically, they employ the Gaussian interpolant and then solve the Poisson
equation (instead of the vector differential equation ∇χ = V due to integrability concerns)
∆χ = ∇ · V = ∇ · ( (1/N) ∑_{i=1}^{N} g(x − mi) νi ).   (1–2)
The authors address scalability and compare with several standard-bearers on the basis of
computation-time and memory requirements.
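On a periodic grid, the essence of Equation (1–2) can be reproduced in a few lines: smear the normals into a vector field V, take its divergence spectrally, and invert the Laplacian in Fourier space. This is only a toy sketch of the idea (uniform weights, a Gaussian interpolant, an FFT Poisson solve, and normals oriented into the interior so that V approximates ∇χ), not the adaptive octree solver of [53, 54]:

```python
import numpy as np

n = 64
xs = np.linspace(0.0, 1.0, n, endpoint=False)
X, Y = np.meshgrid(xs, xs, indexing="ij")

# Oriented points on a circle; normals chosen to point INTO the interior
# so that V approximates the gradient of the (smoothed) indicator.
M = 64
t = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
mx, my = 0.5 + 0.25 * np.cos(t), 0.5 + 0.25 * np.sin(t)
nx, ny = -np.cos(t), -np.sin(t)

# Smear the normals into a vector field V with a Gaussian interpolant g.
sigma = 0.03
Vx = np.zeros((n, n))
Vy = np.zeros((n, n))
for a in range(M):
    g = np.exp(-((X - mx[a]) ** 2 + (Y - my[a]) ** 2) / (2.0 * sigma**2)) / M
    Vx += g * nx[a]
    Vy += g * ny[a]

# Solve ∆χ = ∇·V spectrally on the periodic grid.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / n)
kx, ky = k[:, None], k[None, :]
div_hat = 1j * kx * np.fft.fft2(Vx) + 1j * ky * np.fft.fft2(Vy)
k2 = kx**2 + ky**2
k2[0, 0] = 1.0            # avoid division by zero; pin the mean of χ to zero
chi_hat = -div_hat / k2
chi_hat[0, 0] = 0.0
chi = np.fft.ifft2(chi_hat).real  # indicator-like field: higher inside the circle
```

Thresholding `chi` (e.g., by marching cubes/squares in 3-D/2-D) would then yield the reconstructed boundary, mirroring the level-set extraction step described above.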
Our own approach can be viewed as similar to this last one in that we exploit a
continuation field. Note that the following reasoning is not rigorous, but provides
a heuristic for the suggested analogy with the Poisson-based reconstruction. The
continuation field that we use comes from a Hamilton-Jacobi equation that only yields a
linear equation under the correct parameterization (Chapter 2). One way to parameterize
it comes from the classical superposition principle: if ψ1, ψ2 ∈ C²(Rd, C) decay quickly
enough and |ψ1*(x) ψ2(x)| = ϵ at x ∈ supp(|ψ1|), then

Arg(ψ1 + αψ2)(x) = tan⁻¹( (|ψ1(x)|² sin(θ1(x)) + αϵ sin(θ2(x))) / (|ψ1(x)|² cos(θ1(x)) + αϵ cos(θ2(x))) ).
Now we can contrast this with the RHS of Equation (1–2) (the argument to the divergence
operator). First off, we require a d-dimensional field to store the spline-based vector
field; the phase-based field only requires the 2-D complex representation. Another issue
is that this vector field is clearly not always integrable and so estimating a distance field
from it is not easy. Indeed, identifying a contour associated with the oriented point-set
can run into stability issues since following streamlines going perpendicular to the normal
direction will lead into regions with very small magnitudes—not to mention vanishing of
the vector field. Finally, the representation by spline-based vector fields is not injective
since {(m, ν1), (m, −ν1)} leads to a zero field. These technical issues suggest that there
is room in the shape canon for yet another continuation-based representation. While this
thesis focuses on the atomic representation explained below in Chapter 2, at the end of
the thesis we outline a direction towards exploiting the superposition principle above for a
general approach.
1.4 Outline of this Document
In Chapter 2 we present the basic genesis of this idea, drawn from the 2014 paper
[20]. This chapter provides a top-down view of the motivation behind this representation.
It explains how the CWR can be thought of as a linearization of the solution space for
Equation (1–1). An anecdotal surface reconstruction example is featured, the modular
distance field is defined, and some theoretical claims are made concerning how well
the finite approximation preserves properties of the modular distance field as well as a
projective embedding theorem.
In Chapter 3 we introduce a model and an algorithm for simultaneous point-set
registration and normal estimation. The normal estimation can be used to extract a
surface reconstruction (as mentioned above) but this is not the focus of this work. We
do compare the model to an algorithm that uses the normal-field interpolant provided
by the action of the deformation on the original oriented point-set, but no extensive
surface-reconstruction metrics are devised.
In Chapter 4 experiments validating the model and algorithm are presented. We also
include an early algorithm for likelihood-based registration and show results from applying
kPCA (using the RKHS corresponding to the image of L2(Rd) under the appropriate
Gabor Transform) to a subset of the MPEG7 dataset (planes) for surface reconstruction,
density estimation, and curve classification.
Chapter 5 outlines some of the theoretical work we have done so far. We
have proved a theorem on the connectedness of the zero level-set of the phase of a mixture
of two atoms under special conditions. We have also shown the conditions under which one
can extend beyond 2 atoms, and the overall class of curves that can be approximated
under these conditions. We also provide the theoretical justification for the representation
which goes through the modular distance field estimation.
In Chapter 6 we show several applications of the idea that go beyond registration.
One can more readily perform shape statistics using the CWR and we discuss this in a
section. We also show how to extract curves from the implicit function given by the phase
of the CWR. Beyond that, we extend the representation in two new directions: nonlinear
phase components and embedding on general geodesically complete, closed, Riemannian
manifolds. This allows one to represent co-dimension > 1 shapes with the CWR—a
distinguishing characteristic among implicit shape representations.
Finally, in Chapter 7 we review the thesis and discuss future directions of research.
CHAPTER 2
REPRESENTING SHAPE WITH PHASE
In this chapter we seek to address the lack of rapprochement between distance
transforms and density functions. The main advantage of the distance transform is its
implicit curve representation whereas that of the density function is its representation of
uncertainty and noise. A unified representation would be beneficial to shape analysis
provided the respective advantages of both the distance and density functions are
preserved. To set the stage, we first turn to the mathematical underpinnings of the
distance function since they hold the key to subsequent integration.
Distance transforms satisfy the static Hamilton-Jacobi equation ∥∇S(x)∥ = 1 where
S(x) is the distance function. If the signed distance function is sought, the zero level set
of S(x) is a set of curves embedded in 2-D (or a set of surfaces embedded in 3-D). The
typical way to compute these fields given smooth initial conditions S|∂A = 0 is to use
the fast marching method [91]. However, in real-world settings the problem of interest is
typically a harder one:
Given partial boundary data ∂S = P, compute the distance function of S.   (2–1)
In this case the problem is ill-posed: many potential solutions can exist depending on the
structure of P (it could be partial curves, a point-set, an oriented point-set consisting of
locations and normal vectors, etc.). It is worth pointing out that even the fast-marching
method uses an initialization heuristic since the contours are often given as an explicit
sequence of points. The difficulty of computing the signed distance function from an
unorganized set of points is well known (see the massive body of work on reconstruction).
What is not so well known is the curious fact that the static Hamilton-Jacobi equation—a
nonlinear differential equation—is closely related to the static Schrödinger equation—a
linear differential equation [11]. It turns out that the Hamilton-Jacobi scalar field S(x)
is approximately the phase of the complex Schrodinger wave function ψ(x) which is the
solution to the wave equation.
As pointed out in previous work [20, 90], if we parameterize ψ = R exp(iS/ℏ), the
phase that we estimate approximately obeys the eikonal equation:

−ℏ²∆ψ = ψ
−ℏ²∆(R exp(iS/ℏ)) = ψ
−ℏ²(∆R + 2i(∇R · ∇S)/ℏ + iR∆S/ℏ − R(∇S · ∇S)/ℏ²) exp(iS/ℏ) = ψ
⟹ ∥∇S∥² ≈ 1,   (2–2)
with Equation (2–2) converging to the local condition of Equation (1–1) uniformly as ℏ ↓
0. In previous work this relation was used to motivate a distance function representation
using point- and line-potentials and the K0 Green's function for the Schrödinger equation
[90].
Here we are interested in the signed distance. It obeys the same eikonal equation but
also has a continuity condition for the boundary, employing the boundary and domain
condition of Equation (1–1). This is difficult to model with a point-based Green function
for two reasons.
• A Green function g for L will solve Lg = δ, where δ ∈ S′ is the delta distribution. Such a g will have some discontinuity in it (as we see with the Green functions of the Helmholtz and Schrödinger operators). On the other hand, signed distances are smooth through points on the boundary.
• We will need to have different directions associated with different points, to reflect the local direction of the eikonal field.
We do not pursue the Green function formalism in this work but provide a heuristic
parametric solution as a finite mixture model of complex ‘atoms’ (to use a suggestive
term). Since a wave function magnitude is related to a normalizable density function, it is
natural to ask a follow-up question: can probabilistic information concerning a shape
be embedded in the wave function magnitude? To answer this question, we turn to the
consideration of density functions next.
Density functions estimated from unorganized point-sets come in both parametric and
non-parametric flavors. Shape densities in the form of histograms, mixtures of Gaussians,
wavelets and kernel expansions are all used in the literature [23, 81]. If a mixture of
Gaussians is sought, the density function p(x) is peaked at a set of 2-D (or 3-D) “cluster
centers” with the degree of “peakedness” depending on the variance of the underlying
cluster. The difficulty of computing the mixture density function from an unorganized set
of points is well known—involving a search for cluster exemplars and associated covariance
matrices. The question at hand is whether one can associate a density function p(x)
with the squared magnitude of the wave function with the phase of the wave function
continuing to play the role of the distance function. This will boil down to normalizing the
Complex-valued continuation field that results from our choice of mixture below. By doing
so we have a candidate for an integrated shape representation with the wave function
magnitude and phase representing location uncertainty and curve geometry respectively.
2.1 The Complex Wave Representation
We begin by summarizing previous work which introduced an approximation to the
unsigned distance function [90] by solving the Schrödinger equation corresponding to the
static Hamilton-Jacobi equation ∥∇S∥ = 1:
Sτ(x) ≈ −τ log ϕτ(x; µ) = −τ log ∑_{k=1}^{N} exp(−∥x − µk∥/τ),   (2–3)

where µ = {µk : k = 1, …, N} is a collection of locations and the scalar field ϕτ(x; µ) is the solution to
the linear differential equation

−τ²∇²ϕτ(x; µ) + ϕτ(x; µ) = ∑_{k=1}^{N} δ(x − µk).   (2–4)
In Equation (2–4), τ is a free parameter and the approximation Sτ becomes increasingly
accurate as τ → 0. Superpositions of solutions are allowed, in sharp contrast to standard
distance transforms, which break down under addition. In the present work, we seek to go
beyond the unsigned distance transform (and linear differential equation approximations
thereof). In shape analysis, connectedness is fundamental to applications, but is rarely
available explicitly from the representation. Unsigned distance transforms solve a
wave-front equation that is not suited to dealing with issues of connectedness. The
approximation in Equation (2–4) does not fare any better since it is based on an isotropic
Green’s function evaluated at a point-set. To embody connectedness as a feature, ϕτ
must be modified. Drawing inspiration from the complex nature of wave functions in
physics, we introduce a complex modulation factor to Equation (2–3) that encodes normal
information. Intuitively, we can extend the geometric information contained in the normal
of a shape by propagating the phase as suggested by the (classical) superposition principle.
This leads to a phase factor exp(i νkᵀ(x − µk)/λ) modulating the real function ϕτ above [82].
The modulation that results from the newly introduced phase factor [20] acts as a
local spline, borne out in level sets of the phase of the complex wave. The magnitude
then acts as a regularizer to enable joining and hand-off of the spline through neighboring
kernels. Note that the normal information νk is attached to the center µk, and therefore
the statistics needed for this representation are oriented point-sets. The resulting
representation, in its simplest form, is
ψσ,λ(x; µ, ν) = ∑_{k=1}^{N} exp(−∥x − µk∥²/(2σ²) + i νkᵀ(x − µk)/λ),

where ν = {νk : k = 1, …, N} is a collection of normal vectors. N will be used to represent the
number of oriented points in C, with Ni referring to the points in Ci when considering
multiple oriented point-sets. Henceforth, oriented point-sets have unit normals unless
otherwise specified. We will write ψ(x; C) as shorthand for ψσ,λ(x;µ, ν) with σ and
λ suppressed wherever not explicitly needed. The wave function ψ(x; C) contains
geometry information of the curve through the level sets of the phase, probability
Figure 2-1. A simple example of composition of a distance function from oriented points using the Complex Wave Representation (CWR). a) The level-sets of the angle are shown with higher values in red and lower in blue. b) The level-sets of the density are shown. Each oriented point on its own forms a linear wave propagating in the direction of the normal vector. The density associated with each atom alone is Gaussian. By superimposing several waves we get a non-Gaussian density and the phase acts like an approximate signed distance function.
density information via the squared magnitude, and distance information [90] through
the logarithm of the magnitude (as λ→∞ and σ → 0).
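The formulas above are straightforward to evaluate; the following minimal NumPy sketch (our own rendering, with the arctan taken as the four-quadrant arctan2 so the modular distance lands in (−πλ, πλ]) checks the single-atom behavior noted in Figure 2-1: one oriented point produces a linear wave whose phase recovers the signed offset along its normal.

```python
import numpy as np

def cwr(x, mus, nus, sigma, lam):
    """Unnormalized ψ_{σ,λ}(x; µ, ν): a sum of Gaussian-windowed plane waves."""
    diff = x - mus                                   # (N, d) offsets x − µk
    return np.exp(-(diff**2).sum(-1) / (2 * sigma**2)
                  + 1j * (diff * nus).sum(-1) / lam).sum()

def modular_distance(x, mus, nus, sigma, lam):
    """d(x; C) = λ arctan(Im ψ / Re ψ), via the four-quadrant arctan2."""
    psi = cwr(x, mus, nus, sigma, lam)
    return lam * np.arctan2(psi.imag, psi.real)

# A single atom at the origin with normal (1, 0): the phase is x1/λ, so the
# modular distance recovers the offset along the normal (for |x1| < πλ).
mu = np.zeros((1, 2))
nu = np.array([[1.0, 0.0]])
d = modular_distance(np.array([0.2, 0.5]), mu, nu, sigma=1.0, lam=1.0)
# d ≈ 0.2, the signed offset along the normal
```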
A technical issue arises due to the wrapped nature of the 2-D wave function phase. At
any location x, we obtain the modular distance along the normal vector to the zero level
set of the phase
d(x; C) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)]).
Note that the phase (carrying orientation data) is now a property of the field (Figure 2-2),
and is therefore defined everywhere. The unsigned distance transform obtained from
a point-set, despite also being defined everywhere, lacks the crucial connectedness
information, causing its zero level-sets to be broken islands marooning the original
points. The connectedness component afforded by the phase is critical to shape boundary
representation and perceptual grouping. In Chapter 5 proof of the continuation and
connectedness properties are carried out.
A principal advantage of using distance transforms is the integration of point
information via a field. Robust analysis of shapes is enabled by this property. Unfortunately,
the tight constraints imposed by the distance transform (such as ∥∇S∥ = 1) do not permit
averaging, component analysis and the like, thereby limiting the effectiveness of the
representation. The wave representation ψ allows for superposition and other operations
(Table 2-1) enabling a richer variety of potential applications than standard distance
transforms. With distance information in the magnitude along with connectedness and
orientation information provided by the phase, ψ preserves the attractive properties of the
signed distance function.
1 http://www.cise.ufl.edu/~anand/GatorBait 100.tgz
Table 2-1. Technical Layout of the Operations on ψ. This table shows the diversity of the representation ψ for shape analysis. To the best of our knowledge, we have not seen a field representation for shapes with both connectedness and probability information over the whole of the embedding space. Note that the inner product depends on the normalization, by Equation (3–8).
Unsigned Distance:            d²(x; C) ≈ −2σ² log(ψψ*)
Modular Distance (MD):        d(x; C) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
Curve Geometry:               n(x; C) = ∇ λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
Sampling Probability:         p(x; C) = |ψ|² / ∥ψ∥₂²
Spatial, Frequency Variance:  σ, λ
MD Linearity:                 ψ3(x; C3) = ψ1(x; C1) + ψ2(x; C2) → d(x; C3) ≈ d(x; C1 ∪ C2)
Kernel:                       k((µj, νj), (µk, νk)) = exp(−∥µj − µk∥²/(4σ²) − σ²∥νj − νk∥²/(4λ²) + i(νj + νk)ᵀ(µj − µk)/(2λ)) / ((2π)^{D/2} σ^D)
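The kernel row can be sanity-checked by quadrature, since the inner product of two unnormalized atoms is a Gaussian in both the center difference and the normal difference with a linear phase. The sketch below is our own check with arbitrary parameters; our atoms are unnormalized, which produces a (πσ²)^{D/2} factor in place of the table's (2π)^{D/2}σ^D denominator, and the sign of the phase depends on which atom is conjugated:

```python
import numpy as np

sigma, lam = 0.5, 0.5
mu_j, nu_j = np.array([0.0, 0.0]), np.array([1.0, 0.0])
mu_k, nu_k = np.array([0.3, 0.1]), np.array([0.0, 1.0])

# 2-D quadrature grid (the integrand decays like a Gaussian, so [−4, 4]² suffices).
xs = np.linspace(-4.0, 4.0, 400)
h = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")
P = np.stack([X, Y], axis=-1)

def atom(mu, nu):
    d = P - mu
    return np.exp(-(d**2).sum(-1) / (2 * sigma**2) + 1j * (d @ nu) / lam)

# Numerical inner product ∫ ψ_j(x) conj(ψ_k(x)) dx.
inner = (atom(mu_j, nu_j) * np.conj(atom(mu_k, nu_k))).sum() * h * h

dmu, dnu = mu_j - mu_k, nu_j - nu_k
pred_mag = (np.pi * sigma**2) * np.exp(-(dmu @ dmu) / (4 * sigma**2)
                                       - sigma**2 * (dnu @ dnu) / (4 * lam**2))
pred_phase = (nu_j + nu_k) @ dmu / (2 * lam)  # sign depends on conjugation convention
```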
2.1.1 Analysis of ψ
There are some similarities between the wave function ψ (with a Gaussian kernel) and
a Gabor wavelet. The latter has been extensively studied and used with great success in
pattern recognition [59]. One interpretation of ψ is as a square-root density. The study
of Gabor frames for square-root density approximation is basically nonexistent, and is a
possible direction leading out of this research. The unnormalized function |ψ(x)|² is

|ψ(x)|² ∝ ∑_{j=1}^{N} ∑_{k=j}^{N} cos((νjᵀ(x − µj) − νkᵀ(x − µk))/λ) exp(−∥x − µj∥²/(2σ²) − ∥x − µk∥²/(2σ²)).
Note that this is not the L2 norm but the squared magnitude of ψ at location x. It is not
obvious from the expression above, but as |ψ(x)|2 is the magnitude squared of a complex
number, it is nonnegative everywhere. However, unlike Parzen estimates using kernels
with support equal to the entire domain (such as Gaussian or exponential kernels) zeros
may occur when a mixture is formed. When suitably normalized, |ψ(x)|2 can be treated
as a probability density function which immediately connects it to the plethora of shape
density functions used in the literature.
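The k ≥ j form above folds symmetric terms together; if one instead sums over all ordered pairs (j, k), so that each off-diagonal term appears twice, the expansion becomes an exact identity for |ψ(x)|², which is easy to confirm numerically (a spot-check of our own with arbitrary atoms):

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma, lam = 5, 0.8, 0.5
mus = rng.normal(size=(N, 2))
nus = rng.normal(size=(N, 2))
x = np.array([0.3, -0.2])

d = x - mus
amps = np.exp(-(d**2).sum(-1) / (2 * sigma**2))        # Gaussian envelopes
phases = (d * nus).sum(-1) / lam                       # phases νkᵀ(x−µk)/λ
direct = abs((amps * np.exp(1j * phases)).sum()) ** 2  # |ψ(x)|² directly

# Expansion over all ordered pairs:
# Σ_j Σ_k cos(θj − θk) exp(−‖x−µj‖²/(2σ²) − ‖x−µk‖²/(2σ²))
expansion = sum(amps[j] * amps[k] * np.cos(phases[j] - phases[k])
                for j in range(N) for k in range(N))
```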
Figure 2-2. Visualization of the phase of ψ. Left: level sets of the unsigned distance transform. Center: oriented point-set. Right: level sets of the phase of the CWR. In the second row, the scanlines indicated by the red lines are shown. Near the point locations, the level set of the phase is much clearer, while for the unsigned distance function the normal information is totally inaccurate near the data points. Conforming to the continuity requirement for perceptual grouping [86] is a primary attribute of ψ. In the unsigned distance transform, the pectoral fin (the small fin below the gill fin) is basically indiscernible whereas in the phase it is very clear. The point-set used is a sampling from the GatorBait data set1, specifically Acanthuridae Acanthurus Chronixis, composed of a linear superposition of shapes that individually form closed curves. See also Figure 3-3.
2.1.2 ψ for Oriented Multi-curve Shapes
As discussed above, ψ has unique properties (relative to distance functions) stemming
from the additivity of the representation, i.e., its “superimposability”.
Depending on the choices of the free parameters σ and λ, modifying a shape with new
position and orientation data can be very easy. We briefly justify the viability of this
attractive property, and its limitations, below.
When

d(x) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
     = λ arctan( ∑_{k=1}^{N} sin(νkᵀ(x − µk)/λ) exp(−∥x − µk∥²/(2σ²)) / ∑_{k=1}^{N} cos(νkᵀ(x − µk)/λ) exp(−∥x − µk∥²/(2σ²)) )
is evaluated, the contribution of each of the cluster centers to the sum decays exponentially;
the slow growth of the arctangent yields stability to small contributions. To see what this
means for superposition, consider an oriented point-set C1 and let C2 be a new oriented
point-set to be superimposed. Let q1 be the zero level set of the unwrapped d(x;C1) and q2
be the zero level set of the unwrapped d(x;C2). Provided that p(x;C2)≪ p(x;C1), ∀x ∈ q1
and p(x;C1) ≪ p(x;C2), ∀x ∈ q2, the superposition of C1 and C2 is stable: the resulting
zero level sets approximately match q1 ∪ q2. This superposition principle allows us to
compose shapes additively.
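The MD linearity row of Table 2-1 can be observed directly: when two oriented point-sets are far apart relative to σ, adding their fields leaves each one's modular distance intact near its own points. A small check (our own construction, with two distant horizontal segments):

```python
import numpy as np

sigma, lam = 0.5, 0.3

def psi(x, mus, nus):
    d = x - mus
    return np.exp(-(d**2).sum(-1) / (2 * sigma**2)
                  + 1j * (d * nus).sum(-1) / lam).sum()

def md(z):
    """Modular distance from a complex field value."""
    return lam * np.arctan2(z.imag, z.real)

# C1: a segment on the x-axis near the origin; C2: the same segment shifted far away.
xs = np.linspace(-1.0, 1.0, 21)
mus1 = np.stack([xs, np.zeros_like(xs)], axis=1)
nus1 = np.tile([0.0, 1.0], (21, 1))
mus2 = mus1 + np.array([10.0, 0.0])
nus2 = nus1.copy()

x = np.array([0.0, 0.1])                           # a probe point near C1
d1 = md(psi(x, mus1, nus1))                        # distance from C1 alone
d3 = md(psi(x, mus1, nus1) + psi(x, mus2, nus2))   # superposed field C1 ∪ C2
# near C1 the far-away C2 contributes exp(−100/(2σ²)) ≈ 0, so d3 ≈ d1
```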
The takeaway from this is that multiple curves can often be “added” easily: if one has
multiple CWRs, then provided that the properties detailed above hold, one can compute
the field by simply adding their fields together. Under this operation, the stability of
the level sets depends on the distance to the initial set and the free parameters. When
fields interact with each other and the above fails, then point discontinuities can arise in
the phase field of ψ. However, provided that the abutting shapes have agreeing normal
information (as in Figure 2-2), the resulting superposition can maintain the desirable
features of each of the underlying sets.
The frequency of the oscillatory part and spatial accuracy of the density play a
key role in the level sets of d. If the sampling of the curve location or normal data is
insufficient, the superposition limitation mentioned above kicks in and the curve may
be grouped incorrectly. As superimposed shape boundaries abut, the phases begin to
interfere. Eventually, once the abutting shapes fall close enough within the receptive
fields of one another, the curves cancel each other out.

Figure 2-3. Merging of two curves as oriented points move closer together. Since the normals are aligned opposite to each other near the center of the image, the curve portions originating from those points cancel each other out.

The parameters σ and λ act
as uncertainty parameters between the multi-curve and high curvature paradigms of
shapes. A route to mitigating the abutment issue (a universal phenomenon in multi-curve
representations) is allowing non-uniform frequency and spatial parameters to control the
degree of precision of d.
2.2 Wave Mixtures as Geometric Primitives
Here, we state additional interesting properties of this shape representation for the
purposes of curve reconstruction and as a feature function before proceeding to showcasing
simultaneous matching and reconstruction (Chapter 5 contains proofs). The analysis
(mathematical and experimental) is extended beyond the 2-D setting.
Consider a point-set augmented with directional information at each point. That is,
let S = {(m_a, ν_a)}_{a=1}^M, where ν_a is a normal associated with the point m_a. We use S to
denote the set underlying the oriented point-set S, with each m_a ∈ ∂S and ν_a pointing
in the outward direction from S. The complex field we use extends the standard Gaussian
Parzen window density to a square-root of a density by using the normal information,
written (unnormalized) as

ψ_S(x) = Σ_{a=1}^M exp( −||x − m_a||²/(2σ²) + i ν_a^T (x − m_a)/λ ). (2–5)
λ controls the frequency of the wave: the lower the value of λ the higher the spatial
frequency. The wave oscillates along the normal near a point feature but integrates
information from different wavefronts in the far field (near and far are a function of σ, λ).
The squared magnitude of ψ(x) encodes probability density information. Zero level-sets of
the phase now carry shape geometry information.
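To make Equation (2–5) concrete, the sketch below evaluates ψ_S on query points for oriented points sampled from the unit circle (an illustrative NumPy configuration; the parameter values are arbitrary, not those used in later experiments) and checks that the magnitude concentrates near the curve.

```python
import numpy as np

def cwr(x, centers, normals, sigma, lam):
    """Evaluate the complex wave representation (Eq. 2-5) at query points x (K, 2)."""
    diff = x[:, None, :] - centers[None, :, :]            # (K, M, 2)
    sq = np.sum(diff ** 2, axis=-1)                       # ||x - m_a||^2
    phase = np.einsum('kmd,md->km', diff, normals) / lam  # nu_a^T (x - m_a) / lambda
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase), axis=1)

# Oriented points on the unit circle; the outward normal equals the position.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
centers = np.stack([np.cos(t), np.sin(t)], axis=1)
normals = centers.copy()

mag_on = abs(cwr(np.array([[1.0, 0.0]]), centers, normals, 0.1, 0.05)[0])
mag_far = abs(cwr(np.array([[3.0, 3.0]]), centers, normals, 0.1, 0.05)[0])
```

Nearby atoms add nearly coherently on the curve, while the Gaussian modulus suppresses everything in the far field.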
The mixture in Equation (2–5) has similarities to the venerable Gabor filter or
wavelet—well known to vision researchers and mathematicians [24, 48, 59]. This allows us
to leverage the mathematical literature to prove useful properties of this representation,
such as the proof of injectivity below, which follows a similar argument for a related Gabor
system [48]. Gabor systems are families of time-frequency translates of an admissible
function. The primary difference between the choice of atom in Equation (2–5) and
the kind used in signal processing [59] is that we do not enforce biologically motivated
constraints. The connection to signed distance functions (and static Hamilton-Jacobi
equations) is more subtle [2].
Figure 2-4. Zero level-sets of the phase of ψ for subject 1 of FAUST under several different values of σ. A uniform value of σ is used across all of the oriented points. σ ranges from 1 × 10^{−4} to 1 × 10^{−2} from top to bottom, left to right. Note the stippling in the upper-right rendering. This shows the region of influence of each oriented point.
In contrast to the unsigned distance function, the signed distance is smooth across
shape boundaries (providing a stable reconstruction) with the sign of the distance
indicating whether a location is inside or outside the shape. When we fit Parzen
window density estimators to a point-set, we can obtain an approximate unsigned
distance function at every point. The relation G(x) ≈ C_R e^{−R²(x)/(2σ²)} holds (with C_R being
a normalization constant), where the approximate unsigned distance function R(x)
approaches the true distance pointwise as σ decreases toward zero [90]. For oriented
point-sets the relation is

ψ_S(x) ≈ Ψ_S = exp( −b_S²(x)/(2σ²) + i b_S(x)/λ ), (2–6)
where bS(x) is the SDF. We refer to ΨS as the modular distance function or MDF for a
set S. For a fixed S, the approximation becomes more accurate as σ, λ → 0. Note that
the magnitude is agnostic to the sign of the distance whereas the phase carries the sign
but is modular due to the wrapped nature of the phase. Note that we do not require or
use phase unwrapping; all of the analysis will be carried out in the wrapped setting (phase
unwrapping occurs only when surface reconstruction is executed, in Chapter 6). When
referring to the modular distance function, we will make clear whether we mean the
abstract function in Equation (2–6) or the modular distance that arises from the CWR.
A few key advantages to using the modular distance function in lieu of the signed
distance function are: i) the modulus decays as we approach the far field, handling the
far-field issue mentioned above; ii) Equation (2–6) allows us to derive distances in closed
form (Equation (3–8)); iii) we avoid the concerns with region vs. boundary representation
(in exchange for modularity); iv) it can be approximated with a parametric mixture as laid
out in this thesis document.
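The phase/signed-distance relationship in Equation (2–6) can be checked in a minimal setting: for collinear oriented points that share the normal (0, 1), every atom contributes the identical phase y/λ at a query point (x, y), so the phase of ψ equals the signed distance divided by λ exactly. A small sketch (illustrative parameters):

```python
import numpy as np

# A straight "wall" of oriented points on the x-axis, all normals (0, 1).
xs = np.linspace(-2, 2, 401)
centers = np.stack([xs, np.zeros_like(xs)], axis=1)
normal = np.array([0.0, 1.0])
sigma, lam = 0.2, 0.05

def psi(x):
    diff = x - centers                    # (M, 2)
    sq = np.sum(diff ** 2, axis=1)
    phase = diff @ normal / lam           # equals x[1] / lam for every atom
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase))

y0 = 0.1                                  # signed distance of the query point
val = psi(np.array([0.0, y0]))
phase_err = abs(np.angle(val) - y0 / lam) # y0 / lam = 2 rad, inside (-pi, pi]
```

For curved boundaries the agreement is only approximate (and modular), as discussed above.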
2.2.1 A Note on Gabor Analysis
While this work is primarily applied, we think it is worth mentioning the history
of the Gabor or Weyl-Heisenberg representation to put some of the mathematics
in context. Gabor originally proposed the time-frequency [35] representation as an
uncertainty-minimizing family of functions that act as a frame from which to encode
signals. The family of time-frequency shifts

{T_m E_v g : v ∈ R, m ∈ R}

implements the Weyl-Heisenberg or Gabor transform from L²(R) → L²(R²). That is, the
operator

Ψ_g : L²(R) → L²(R²) : f ↦ f̂, with f̂(a, b) = ⟨T_a E_b g, f⟩,
is a faithful square-integrable representation of the Weyl-Heisenberg group implementing
an isometry between the domain and range [49]. While the advent of Wavelets has
refocused the interests of many in the Vision and Electrical Engineering community
towards approximate phase-space packet design, the Gabor Transform still has a number
of interesting related open problems (from a pure and applied standpoint). For instance:
which g lead to sums Σ_{(a,b)∈Λ} c_{a,b} T_a E_b g being linearly independent for arbitrary
finite Λ ⊂ R^{2d}? [48]. The proof of the embedding of sets of oriented points relies on
arguments that lead to this question. Though we have not addressed it in this work, the
use of a discrete Gabor Frame offers the promise of reconstruction error bounds [24, 59].
Finally, uncertainty relations (the support of Ψg is never a set of finite measure) and
stable reconstruction (the support is never too localized) are both intrinsic features of the
Gabor transform [105].
2.2.2 Square-root Densities and Probabilistic Interpretation of the Complex Wave Mixture
The CWR provides a geometric completion field or implicit shape representation
in the phase, as shown above visually and explained in further detail below theoretically.
Another perspective on the CWR is as a density. Square-root densities have been used in
the shape literature before, and can offer some advantages over classical densities from the
standpoint of having an elegant analytical expression for common information-theoretic
objectives [80]. From Table 2-1 we see that the normalized CWR is the square-root
of a density. We now briefly discuss some useful features of the CWR, and also Gabor
wavelets, as a square-root density.
First, from an inference standpoint the CWR can provide a better density estimate
for points drawn along curves, provided that the normal estimates are correct and, for
neighboring points on the curve, the line perpendicular to each oriented point intersects
the other along their Voronoi boundary. In Chapter 5 we discuss the significance of
this requirement in more detail. The effect is that the density estimates develop a
nonuniformity that reflects the coherence of the atoms: when the normal vectors are
aligned correctly this is often called coherence in the physics literature [89]. A principal
feature of coherence between atoms is the superposition aspect: when a well defined phase
can be observed the atoms are said to be coherent, but an indistinguishability between
the two emerges as a byproduct. In density estimation we observe the same effect: when
the oriented points or atoms are “coherent” we see an increase in density along a ridge
that follows the parameters, but we cannot say which one of the atoms is responsible for it
since both are necessary to observe a departure from normal Gaussian density values.
2.3 Relationship Between Signed Distances and Complex Wave Mixtures
To solidify the claim made in Equation (2–6), first note that ||Ψ_S||₂ < ∞: |Ψ_S(x)|² =
|exp( −b_S²(x)/(2σ²) + i b_S(x)/λ )|² is dominated by its concave envelope Ψ̄_S, which has

||Ψ_S||₂² ≤ ||Ψ̄_S||₂ ≤ ((2πσ²)^{d/2} + 1) π^{d/2} diam(S)^d / Γ(d/2 + 1), (2–7)

by an application of volumes of revolution. Furthermore, note that

||Ψ_S||₂² ≥ (2πσ²)^{d/2} (2–8)

as d(x, p) ≥ d(x, S) for all p ∈ S.
Then, note that as σ → 0, ⟨exp( −||x−m||²/(2σ²) + i ν^T(x−m)/λ ), Ψ_{σ,λ}⟩ → 0 whenever m ∉ ∂S.
And as λ → 0, destructive interference causes ⟨exp( −||x−m||²/(2σ²) + i ν^T(x−m)/λ ), Ψ_{σ,λ}⟩ → 0 by an
application of the stationary phase expansion [106]. This means that as σ, λ shrink, the
only significant coefficients of the Gabor Transform of Ψ_S come from atoms centered on
the boundary, oriented in the outward normal direction.
This result implies that the CWR parameters are essential to the representation of an
MDF, with increasing relative weight in the Gabor expansion of the MDF as the variance
parameters shrink. The takeaway from this is that the CWR approximately recovers the
MDF if we can take the parameters to be sufficiently small—which is fine to do provided
we can sample densely enough with sufficient precision. In Chapter 6 the proximity of the
CWR to the MDF is explored empirically. More evidence supporting the substitution of
the signed distance by the complex wave mixture is provided in Section 3.3. Theoretical
results on the relationship are established in Chapter 5.
2.4 An Embedding Theorem for Complex Wave Mixtures
In some contexts, invariance of representation is desirable [58, 76]. For the purposes of
deformable matching, however, having a 1-to-1 mapping between the point features and
function representation is a prerequisite for employing distances as objective functions: if
a feature function is not injective, it is possible that two non-registered point-sets result
in the same feature functions, with zero distance between them. This is precluded in the
complex wave representation. Note that this injectivity was not furnished in the original
work [20]. The proof below follows Heil, Ramanathan, and Topiwala’s paper [48].
Theorem 2.1. ψ_(·) is an injective map from finite sets of oriented points to L². Any metric
on L² distinguishes oriented point-sets under this representation.
Proof. Let A = {(m_a, ν_a)}_{a=1}^A, B = {(q_b, ω_b)}_{b=1}^B be distinct oriented point-sets. We will
show that ψ_A − ψ_B is not identically zero. Suppose that m_1 (a location in A, with index 1
by reordering) is on the convex hull of K = {m_a}_{a=1}^A ∪ {q_b}_{b=1}^B. Without loss of generality
assume m_1 = 0. Let C = A ∪ B \ {(m_1, ν_1)}. Then

ψ_A − ψ_B = exp( −||x||²/(2σ²) + i ν_1^T x/λ ) + Σ_{(r,γ)∈C} h_{(r,γ)}(x) exp( −(||x||² − 2x^T r)/(2σ²) )
          = exp( −||x||²/(2σ²) ) [ exp( i ν_1^T x/λ ) + Σ_{(r,γ)∈C} h_{(r,γ)}(x) exp( x^T r/σ² ) ],

where each h_{(r,γ)} = (−1)^{[(r,γ)∈B]} exp( −||r||²/(2σ²) + i γ^T(x−r)/λ ). Since m_1 is on the convex hull, there
is a ray {κp}_{κ>0} in the Voronoi cell of m_1 (relative to K). Fix ϵ ∈ (0, 1). Then there is a κ
sufficiently large that

| Σ_{(r,γ)∈C} h_{(r,γ)}(κp) exp( κ p^T r/σ² ) | < ϵ/2,
so |ψ_A(κp) − ψ_B(κp)|² > exp( −||κp||²/σ² )(1 − ϵ) > 0. If an oriented point-set has multiple
oriented points with the same location (but distinct normals at these oriented points) we
can use the injectivity of the Fourier Transform [34] to show that the sum of trigonometric
polynomials (for the duplicated locations) is nonvanishing. Thus, the above argument
holds even in that case. If d is a metric on L² it is nonzero on pairs of distinct functions,
distinguishing oriented point-sets.
Note that in this proof the coefficients (which we have not mentioned in this work,
besides as normalization coefficients when projecting to the unit Hilbert sphere) in front of
the mixture elements of the CWR are immaterial since they are constants. So the intuition
of the resulting proof is: no two distinct mixtures (possibly with identical coefficients)
evaluate to the same phase and the same density as functions on the unit Hilbert
sphere.
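Theorem 2.1 can be sanity-checked numerically (a discretized L2 distance on a grid rather than the exact integral; illustrative parameters): two oriented point-sets with identical locations but one flipped normal yield a clearly positive distance.

```python
import numpy as np

def field(grid, centers, normals, sigma=0.2, lam=0.1):
    diff = grid[:, None, :] - centers[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)
    phase = np.einsum('kmd,md->km', diff, normals) / lam
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase), axis=1)

# Identical locations; B flips a single normal.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
normals_a = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
normals_b = normals_a.copy()
normals_b[0] *= -1.0

g = np.linspace(-2, 3, 120)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
h = g[1] - g[0]

l2_sq = np.sum(np.abs(field(grid, centers, normals_a)
                      - field(grid, centers, normals_b)) ** 2) * h * h
```

The difference field reduces to the flipped atom's interference term 2i sin(ν^T(x−m)/λ) times its Gaussian modulus, which has strictly positive L2 norm.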
CHAPTER 3
REGISTRATION: RESONANT DEFORMABLE MATCHING
In registration, we seek a transformation of the template objects onto the target
objects. We denote the transformation of the positions as {ϕ(m_a)}_{a=1}^M where ϕ ∈ H is an
element of a class of hypothesized transformations. We will also estimate the
normal vectors on the target curve or surface as we perform registration. Once models for
registration and organization are chosen, a smooth model fit measure or distance between
shapes is developed. We choose one that is also expressible analytically and C∞ smooth.
Finally, theoretical relationships with other approaches to deformable registration are
studied.
3.1 Hypothesis Classes for Registration
We depart from standard registration techniques as in our case the transformation of
not only the template centers {m_a}_{a=1}^M, but also the template normals {ν_a}_{a=1}^M, is carried
out under the action of ϕ. The appropriate transformation is the Jacobian, ϕ′, of the
deformation ϕ. ϕ′ evaluated at the point m_a acts linearly on the template normal ν_a. Note
that ϕ′ : R^d → R^d is the derivative of the deformation with respect to the spatial variable,
not the parameters of the transformation. ϕ acts on (m_a, ν_a) by

ϕ · (m_a, ν_a) = (ϕ(m_a), ϕ′|_{m_a} ν_a), with (ϕ′|_{m_a})_{i,j} = ∂ϕ^{(i)}/∂x_j |_{m_a},

and ϕ · S = {ϕ · (m_a, ν_a)}_{a=1}^M. We can write the transformed template as

ψ_{ϕ·S}(x) = Σ_{a=1}^M exp( −||x − ϕ(m_a)||²/(2σ²) + i (ϕ′|_{m_a} ν_a)^T (x − ϕ(m_a))/λ ). (3–1)
Note that the centers and normals have been transformed via the action of the deformation
ϕ but the location variable x remains intact. This allows us to define a distance between
template and target functions in terms of a feature-space domain integral, which we will
minimize w.r.t. ϕ. Before going into more detail about the distance, we discuss several
hypothesis classes commonly used in registration.
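The action ϕ · (m, ν) = (ϕ(m), ϕ′|_m ν) works for any smooth map whose Jacobian is available. The sketch below uses an illustrative quadratic deformation (not one from the thesis) and verifies its analytic Jacobian against central finite differences before applying it to an oriented point.

```python
import numpy as np

def phi(x):
    """An illustrative smooth planar deformation (not one used in the thesis)."""
    return np.array([x[0] + 0.1 * x[1] ** 2,
                     x[1] + 0.05 * np.sin(x[0])])

def phi_jac(x):
    """Analytic spatial Jacobian phi'|_x."""
    return np.array([[1.0, 0.2 * x[1]],
                     [0.05 * np.cos(x[0]), 1.0]])

def act(m, nu):
    """phi . (m, nu) = (phi(m), phi'|_m nu)."""
    return phi(m), phi_jac(m) @ nu

m, nu = np.array([0.3, -0.7]), np.array([0.0, 1.0])
new_m, new_nu = act(m, nu)

# Verify the analytic Jacobian with central finite differences.
h = 1e-6
J_fd = np.zeros((2, 2))
for j in range(2):
    e = np.zeros(2)
    e[j] = h
    J_fd[:, j] = (phi(m + e) - phi(m - e)) / (2 * h)
jac_err = np.max(np.abs(J_fd - phi_jac(m)))
```

Note that the transported normal ϕ′|_m ν is not unit length in general; renormalization or a restricted action (Sections 3.1.1–3.1.2) handles this.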
Figure 3-1. Tait-Bryan angles. ψ provides the rotation about the initial z-axis, θ the rotation about the subsequent y-axis, and Φ the rotation about the subsequent x-axis.¹
3.1.1 Euclidean Transformations
Also referred to as rigid or rigid-body transformations, the Euclidean Transformations
comprise a group given by
E(n) = O(n) ⋉ T(n).

The semidirect-product structure is borne out by observing that applying a translation T
and then rotating by R is equivalent to applying R and then translating by R T R⁻¹, the
conjugated translation. Recall that O(n) includes reflections and rotations. Note that E(n) is a Lie group.
The Euclidean group can be parameterized in several ways. Due to the normality of
T (n) ◁ E(n), we can factor out T (n). T (n) is clearly smoothly isometrically isomorphic
to Rn and therefore is parameterized automatically. O(n) is the catch. There are two
standard parameterizations: Euler (or Tait-Bryan) angles and Quaternions. Before
1 This image was taken from Wikimedia Commons. [19]
proceeding we note that O(n) is not connected—there is no path from the identity to
a reflection about an axis. Therefore the parameterizations we discuss are of SO(n), and
the optimization over reflections must be handled separately.
3.1.1.1 Euler angles
Perhaps the most common parameterization of SO(3) is using yaw-pitch-roll [71],
Tait-Bryan, or Euler angles. The three angles (Φ, θ, ψ) act on a rigid body v by
(Φ, θ, ψ) · v = R^x_Φ R^y_θ R^z_ψ v, where R^a_· indicates a rotation about the a-axis. Note that in higher
dimensions O(n) has the added complication of requiring n(n − 1)/2 angles to parameterize a
general rotation.
To derive updates to the estimate of a rotation for a non-convex problem we will need
to compare (at least) the derivatives of the parameters through the objective function. An
example of this follows
∂_ψ [(Φ, θ, ψ) · v] = R^x_Φ R^y_θ (∂_ψ R^z_ψ) v, with

∂_ψ R^z_ψ =
[ −sin(ψ)  −cos(ψ)  0 ]
[  cos(ψ)  −sin(ψ)  0 ]
[    0         0    0 ].
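Derivatives of rotation parameterizations are easy to get wrong by a sign, so it is worth verifying them numerically. The sketch below assumes the standard right-handed z-rotation convention (an assumption; the thesis does not pin down the convention) and checks the analytic derivative against central finite differences.

```python
import numpy as np

def Rz(psi):
    """Rotation about the z-axis, standard right-handed convention."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def dRz(psi):
    """Analytic derivative of Rz with respect to psi."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[-s, -c, 0.0], [c, -s, 0.0], [0.0, 0.0, 0.0]])

psi = 0.9
h = 1e-6
fd = (Rz(psi + h) - Rz(psi - h)) / (2 * h)   # central finite difference
err = np.max(np.abs(fd - dRz(psi)))
```

The same check extends to the full composition R^x_Φ R^y_θ R^z_ψ by the product rule.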
3.1.1.2 Quaternions
One disadvantage to the Euler-angle representation of SO(n) is gimbal-lock. This
occurs when ψ approaches ±π/2 at which point to achieve a rotation of θ = 0 in the
original coordinate system, θ′ = π/2 is required—leading to a large discontinuity as
ψ → ±π/2. The formal geometric reason for this is that there is no smooth cover of SO(3)
by (S¹)³.
Quaternions offer a solution by parameterizing SO(3) as a double cover. The
representation encodes the transformation as q = (qx, qy, qz, qw) and represents a rotation
of θ = 2 cos⁻¹(q_w) radians about the axis (q_x, q_y, q_z)/sin(cos⁻¹(q_w)). Checking the algebra
of Q verifies the composition of rotations. Geometric analogy with S³ embedded in R⁴
illustrates the double cover: qw = 0 defines a subset of the unit quaternions covering SO(3)
and rotating qw about the origin provides a second cover.
Note that the double cover is only an analogy unless q is fixed to unit length. This
is usually done in an ad hoc manner, rendering the (unit) quaternions still imperfect.
In practice, we use quaternions.
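A minimal sketch of the quaternion parameterization (the standard quaternion-to-rotation formula; the axis and angle are illustrative): the resulting matrix is orthogonal with unit determinant, and the angle θ = 2 cos⁻¹(q_w) is recovered from its trace.

```python
import numpy as np

def quat_to_rot(q):
    """Unit quaternion (qx, qy, qz, qw) -> 3x3 rotation matrix (standard formula)."""
    x, y, z, w = q / np.linalg.norm(q)          # enforce unit length
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])

theta = 0.8
axis = np.array([1.0, 2.0, 2.0]) / 3.0          # unit rotation axis
q = np.concatenate([np.sin(theta / 2) * axis, [np.cos(theta / 2)]])
R = quat_to_rot(q)

orth_err = np.max(np.abs(R @ R.T - np.eye(3)))
det_err = abs(np.linalg.det(R) - 1.0)
# theta = 2 * arccos(q_w) is recovered from the trace: tr(R) = 1 + 2 cos(theta).
angle_err = abs(np.arccos((np.trace(R) - 1.0) / 2.0) - theta)
```

Normalizing q before conversion is the ad hoc unit-length fix mentioned above.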
3.1.1.3 Action of Euclidean transformations on the normal vector
Rigid transformations act on the normal vector by dropping the translation. This
maintains unit length of the normal vector. The differentiation of this transformation with
respect to parameters mimics that performed above (in the Euler angle setting).
3.1.2 Affine Transformations
Affine transformations arise frequently in registration in the imaging field as different
intrinsic parameters for different devices lead to different scales [95]. The parameterization
of affine transformations is somewhat simpler than Euclidean.
We use the augmented matrix representation

M · v = [ A  b ]
        [ 0  1 ] v

applied to homogeneous coordinates (x, y, z, 1) of the points. Therefore, nine parameters
(a, b, c, d, e, f, g, h, i) are used for the linear part and can be differentiated through the
objective function directly by reading off the action above.
Action of Affine Transformations on the Normal Vector.
As above, affine transformations act on normal vectors by dropping the translation.
This leads to stretching of normal vectors. The straightforward derivative with respect to
the affine transformation acting on the normal vector w is

∂_{M_{ij}}(M · w) = E_{ij} w,

where E_{ij} denotes the matrix with a one in entry (i, j) and zeros elsewhere.
It is also possible to implement an action that preserves the unit magnitude of the
normal vectors. The natural way to make the affine group act as SO(d) is to use the
polar decomposition. Here we consider only the behavior of the linear part A of the affine
transformation M . The polar decomposition of A, A = PQ, yields a positive and an
orthogonal part P and Q, respectively. We simply let M act on the normal vectors by
implementing the action for the orthogonal part Q of A as outlined above. This introduces
a complication when updating the estimated transformation: we must scale the derivative
above by the inverse of the positive part.
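The polar decomposition A = PQ can be computed from the SVD A = UΣVᵀ by taking Q = UVᵀ (orthogonal) and P = UΣUᵀ (symmetric positive semidefinite). A sketch of the resulting normal-vector action, which preserves unit length (the matrix is illustrative):

```python
import numpy as np

def left_polar(A):
    """Left polar decomposition A = P Q via the SVD A = U S V^T:
    Q = U V^T is orthogonal, P = U diag(S) U^T is symmetric positive semidefinite."""
    U, S, Vt = np.linalg.svd(A)
    return U @ np.diag(S) @ U.T, U @ Vt

A = np.array([[2.0, 0.5, 0.0],
              [0.1, 1.5, 0.3],
              [0.0, 0.2, 0.8]])
P, Q = left_polar(A)

recon_err = np.max(np.abs(P @ Q - A))                 # P Q recovers A
orth_err = np.max(np.abs(Q @ Q.T - np.eye(3)))        # Q is orthogonal

# Acting on a normal with Q alone preserves unit length.
n = np.array([0.0, 0.0, 1.0])
len_err = abs(np.linalg.norm(Q @ n) - 1.0)
```

Scaling the parameter derivative by P⁻¹, as described above, accounts for the factored-out positive part.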
Note that in the case of the thin plate spline an affine transformation must be
factored out. Therefore, this section also applies to the transformation involved in the
thin plate spline. In that setting we use the former transformation model for the normal
vectors.
3.1.3 Nonrigid Transformations and Regularization
The most common parameterizations of nonrigid transformation are the Thin-plate
spline (TPS) and the Gaussian Radial Basis Function (GRBF) parameterizations.
Both arise from a Reproducing Kernel Hilbert Space with a norm furnishing a natural
regularization of the transformation. Before going forward with the specific kernels used
here, we provide an explanation of how and why these spaces are sufficient for certain
spline problems.
Suppose we start with an arbitrary r.k.H.s. V that is canonically embedded in
L²(Ω, R^d) ∩ C⁰(Ω, R^d). Denote by w ⊗ δ_x the linear form that takes v ∈ V and outputs
(w ⊗ δ_x | v) = w^T v(x). From Riesz's Theorem there exists K^w_x ∈ V such that
(K^w_x | v) = (w ⊗ δ_x | v) for any v ∈ V. Since w ↦ K^w_x is linear, K(y, x) is a
matrix such that K(y, x) a = K^a_x(y). This typifies a matrix-valued kernel, necessary for the
vector-valued spline problems considered here.
We put the following assumption on V: for any given set x_1, ..., x_n ∈ Ω and any given
α_1, ..., α_n ∈ R^d, if for all v ∈ V we have Σ_{i=1}^n α_i^T v(x_i) = 0, then α_1 = ... = α_n = 0. This is
essentially an injectivity requirement.
Now consider the following problem: given x_1, ..., x_n ∈ Ω and λ_1, ..., λ_n ∈ R^d, find v ∈ V
minimizing ||v||_V such that v(x_i) = λ_i. What we would like to know is that the span of
the K(·, x_i) is sufficient for modeling the deformations required by such spline problems. What
can be shown is:

Lemma 1. If there exists a solution v of the problem above, then v ∈ {u ∈ V : u(x_i) =
0 ∀i}^⊥ = V_0^⊥. Furthermore, if v ∈ V_0^⊥ is a minimizer of the norm restricted to this set, it is
also a minimizer on all of V. Finally, V_0^⊥ = {v = Σ_{i=1}^n K(·, x_i) α_i : α_i ∈ R^d}.
A detailed presentation of the preceding argument (for one dimension, as well as a
sketch for d dimensions) can be found in the shape literature [107].
One takeaway from this argument is that the coefficients α (which consist of the
row-vectors α_1, ..., α_n) can be solved for linearly given the knots x_1, ..., x_n and the
displacements λ_1, ..., λ_n for the exact spline problem. The solution will simply be
α = K⁻¹λ. When the approximate spline problem replaces the problem above, we
can substitute (K + I/C) for K in the equation above—where C is the penalty applied to
the ℓ₂ distance between v(x_i) and λ_i.
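A small sketch of the exact and approximate spline solves (a scalar Gaussian kernel standing in for the matrix-valued K; knots, targets, and the penalty C are illustrative): α = K⁻¹λ interpolates exactly, while (K + I/C)⁻¹λ trades a small interpolation residual for stability.

```python
import numpy as np

# Knots on a 3x3 grid with illustrative displacement targets at each knot.
knots = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
targets = np.sin(knots)                 # lambda_1 .. lambda_n in R^2
s = 0.7                                 # Gaussian kernel width

def gram(X, Y):
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * s ** 2))

K = gram(knots, knots)
alpha_exact = np.linalg.solve(K, targets)                    # alpha = K^{-1} lambda
C = 1e3                                                      # approximation penalty
alpha_reg = np.linalg.solve(K + np.eye(len(K)) / C, targets)

def v(x, alpha):
    """Spline v(x) = sum_i K(x, x_i) alpha_i."""
    return gram(x, knots) @ alpha

interp_err = np.max(np.abs(v(knots, alpha_exact) - targets))   # exact interpolation
reg_err = np.max(np.abs(v(knots, alpha_reg) - targets))        # small residual
```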
3.1.3.1 Thin-plate spline radial basis functions
Thin-plate splines arise as the mathematical model for deformation of a thin plate of
uniform elasticity. It is most common to approach this problem of plate deformation as
a spline problem. The thin-plate model represents a specific regularizer on the (ill-posed)
spline problem for deforming a finite set of points onto corresponding target locations
[cite]. The corresponding objective for this spline problem is

E(ϕ) = Σ_{i=1}^N ||y_i − ϕ(x_i)||² + ∫_{R^d} ||Hϕ(x)||²_F dx, (3–2)
where {x_i, y_i}_{i=1}^N is a set of correspondences pre- and post-deformation and || · ||²_F is the
squared Frobenius norm (applied to the Hessian of ϕ). If we compute the Euler-Lagrange
equation for the norm in the objective E then we get

δ||ϕ||_H = 0  ⟹  0 = 2∂_{xx}(∂_{xx}ϕ) + 2∂_{yy}(∂_{yy}ϕ) + 4∂_{xy}(∂_{xy}ϕ)  ⟹  0 = Δ²ϕ.
Then we can view Equation (3–2) as a forced Euler equation. This involves computing the
Green's function for the biharmonic equation.
The observation that the norm of ϕ in Equation (3–2) is invariant to affine
transformations and therefore rotations of the original coordinate frame suggests that
a shift-invariant kernel (corresponding to an appropriately chosen RKHS) will lead to
a fundamental solution for the spline problem. In 2-d and 3-d there are simple radial
solutions for this:

G₂(x) = g₂(r) = r² log(r),  G₃(x) = g₃(r) = r,

where r = ||x||. We further note that the spatial derivatives of G are

G₂′(x) = x (2 log(||x||) + 1),  G₃′(x) = x / ||x||.
The resulting action on points in Rd (for d = 2, 3) is thus
ϕ(x) = Σ_{i=1}^n α_i K(x_i, x) + Ax + b,

where α_i, i = 1, ..., n is a collection of coefficient vectors in R^d, (A, b) forms an affine
transformation, and K(x, x_i) = G_d(x − x_i) = g_d(||x − x_i||). Note that the biharmonic
equation can be taken as the defining property of the r.k.H.s. by building the hypothesis
space up from an operator standpoint [107]. From this perspective, the polyharmonic
splines—which use ∆k where k is chosen appropriately based on the dimension of the
problem—can be constructed.
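The gradient formula G₂′(x) = x(2 log ||x|| + 1) can be verified directly (an illustrative check, evaluated away from the singularity at the origin):

```python
import numpy as np

def g2(x):
    """2-d thin-plate kernel G2(x) = r^2 log(r), r = ||x||."""
    r = np.linalg.norm(x)
    return r ** 2 * np.log(r)

def g2_grad(x):
    """Analytic gradient: x (2 log ||x|| + 1)."""
    return x * (2 * np.log(np.linalg.norm(x)) + 1)

x = np.array([0.6, -0.4])                    # away from the r = 0 singularity
h = 1e-6
fd = np.array([(g2(x + h * e) - g2(x - h * e)) / (2 * h) for e in np.eye(2)])
grad_err = np.max(np.abs(fd - g2_grad(x)))
```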
To understand the behavior of ϕ on oriented points, we recall that the spatial
derivative of a map is also related to the tangent map or pushforward, which acts on
tangent vectors. To understand the use of this formalism, we consider the diffeomorphism
ϕ as acting on a submanifold of codimension 1 embedded in the ambient space Rd and
mapping it onto its image. Thus the deformation ϕ acts on the oriented point-set (m, ν)
like
ϕ · (m, ν) = (ϕ(m), ϕ′mν),
where ϕ′m is the spatial derivative of ϕ evaluated at m.
3.1.3.2 Gaussian radial basis functions
In the discussion at the beginning of this subsection we pointed out that essentially
the only requirement for defining a nonlinear deformation space for splines is a proper
kernel. One of the most common kernels in use is the Gaussian kernel

G(x, y) = I_d exp( −||x − y||²/(2σ²) ),

which has the variance parameter σ, tunable for the specific problem. G is
clearly radial in this formulation. In this setup, the direct solution

ϕ(x) = Σ_{i=1}^n G(x, x_i) α_i
is applied. Again, the problem is to solve for α that minimize the appropriately chosen
objective function (discussed below).
The action of the Gaussian r.b.f. on oriented point-sets is evaluated in the exact same
fashion as above. With this kernel, the spatial derivative is

G′(x, x_i) = −( (x − x_i)/σ² ) G(x, x_i).
Note that in this setting the parameter σ occurs at multiple scales: a linear scaling of the
vector in the derivative and an exponential scaling of the influence of a control point on
the deformation of its neighborhood.
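The kernel derivative can be checked against finite differences; note that the chain rule on exp(−||x − x_i||²/(2σ²)) produces the factor −(x − x_i)/σ². A sketch with illustrative points:

```python
import numpy as np

sigma = 0.5

def G(x, xi):
    """Scalar Gaussian kernel exp(-||x - xi||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - xi) ** 2) / (2 * sigma ** 2))

def G_grad(x, xi):
    """Chain rule: d/dx G = -(x - xi) / sigma^2 * G."""
    return -(x - xi) / sigma ** 2 * G(x, xi)

x, xi = np.array([0.4, -0.2]), np.array([0.1, 0.3])
h = 1e-6
fd = np.array([(G(x + h * e, xi) - G(x - h * e, xi)) / (2 * h) for e in np.eye(2)])
grad_err = np.max(np.abs(fd - G_grad(x, xi)))
```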
3.2 Introducing Normal Variables for the Target Oriented Point-set
Oriented point-set matching assumes an additional feature: normal directions for each
point of the template and the target. Template normals are estimated offline: a standard
approach to estimation involves the fitting of curves and surfaces to the template features
followed by a sampling of the curves (or surfaces) into an oriented point-set. We assume
template curves (surfaces) do not self-intersect in order to preserve normal uniqueness.
This leaves the target normals. To recover a reconstruction of the surface underlying
the target point-set, we augment the objective with variables for the target normals
W = {ω_i}_{i=1}^N. This normal estimation component has no counterpart in the density
matching literature. Adding these parameters does not increase overfitting of ϕ, since the
parameterization of ϕ is independent of the normals. However, interaction of the normal
vectors in the template and target provides an additional regularization on ϕ. We give
conceptual and empirical justification of this below.
To summarize, first we assume that we are in possession of an oriented template
point-set S. This template point-set is deformed onto an un-oriented (and un-organized)
target point-set T via the action of a non-rigid deformation minimizing Equation (3–4).
Since the target point-set is un-organized, we estimate a set of target normals at each
point during the matching process, thereby obtaining an oriented target point-set denoted
T(W) = {(q_i, ω_i)}_{i=1}^N. This simultaneous matching and reconstruction approach is enabled
by a closed-form distance measure between template and target complex wave mixtures.
Care must be taken here to deform normals accordingly with the deformation of the target
centers.
3.3 Choosing a Suitable Distance Function
Minimizing D(ψϕ·S , ψT (W )) w.r.t. ϕ and W is a difficult optimization problem
regardless of the choice of D—symmetries and local minima stand in the way. In
the literature, we have seen different choices (geodesic [28], Cauchy-Schwarz [46],
Kullback-Leibler [104]) as well as different choices for the Parzen kernel (Gaussian [52],
Schrödinger [90]). This cross-product space of distances, kernels, and algorithms is an
active area of research.
We use the L2 distance. The L2 distance for density function registration was studied
in [52] as a specialization of the density power divergence [3]. It strikes a balance between
robustness to sampling and computability. L2 is robust to small Gaussian perturbations in
the location parameters: E_δ[ ||ψ_S − ψ_{S+δ}||₂ ] → 0 as var(δ) → 0 by Fubini's theorem [34].
While behavior under resampling is harder to examine theoretically, a certain amount
of robustness is borne out in Section 4.4. Now, note that if ||ψ_S − ψ_T||₂ < ϵ then
||ψ_S/C_S − ψ_T/C_T||₂ < ϵ′ (C_S, C_T are normalization constants), and so
1 − ϵ′/2 < |⟨ψ_S/C_S, ψ_T/C_T⟩|. Continuing the line of reasoning in Section 2.3, if we pass to the
normalized versions of ΨS and ΨT then we see that the signed distances bS and bT must be
approximately aligned in the near field (of S and T ). Otherwise, destructive interference
would cause cancellations in the product field, decreasing the correlation.
We evaluate the squared L2 distance between the deformed template and target
complex wave mixtures, subsequently minimized w.r.t. the unknown matching and normal
parameters. The action of the spatial non-rigid deformation results in deformed template
points and normals. Contrast this to the typical density matching situation in which
only the template points are deformed. The squared L2 distance between the deformed
Figure 3-2. An example of surface reconstruction by RDM. (a) and (b) are the inputs to RDM; the points in (a) are the unorganized target and (b) is the OPS template. (c) shows the estimated normal vectors from RDM and the true normal vectors. 99% of the normal vectors are recovered to within π/4 angular error. (d) shows the reconstructed surface (the zero level set of the phase of ψ) from the true normal vectors and (e) the reconstructed surface from RDM. The protrusions from the ears are due to mis-oriented normals in the high curvature area near the ear lobe.
Figure 3-3. An example of curve reconstruction by RDM. (a) The target points are shown as black ×'s, with the level-sets of the unsigned distance function shown as contours. (b) After RDM, the level sets of the target set using the estimated normal vectors are shown. Abutting point-sets make this particular reconstruction problem difficult (Section 4.4).
template ψϕ·S and target ψT (W ), D(ψϕ·S , ψT (W )), is given by
∫_{R^D} | Σ_{a=1}^M exp( −||x − ϕ(m_a)||²/(2σ²) + i (ϕ′|_{m_a} ν_a)^T (x − ϕ(m_a))/λ )
        − Σ_{b=1}^N exp( −||x − q_b||²/(2σ²) + i ω_b^T (x − q_b)/λ ) |² dx, (3–3)

where the target wave mixture has been specified for the oriented point-set T(W) =
{(q_b, ω_b)}_{b=1}^N. Note that the cardinalities M and N can differ. When evaluating the L2
distance, we are required to determine the inner product between terms which may differ
in their location and frequency (with common scale and frequency parameters σ and λ
respectively).
The inner product, denoted I^{(m,ν)}_{(q,ω)} = ⟨ψ_{(m,ν)}, ψ_{(q,ω)}⟩ (conjugating the first
argument), is given by the integral

∫_{R^D} e^{ −(||x−m||² + ||x−q||²)/(2σ²) + i(ω^T(x−q) − ν^T(x−m))/λ } dx
  = (πσ²)^{D/2} e^{ −||m−q||²/(4σ²) − σ²||ν−ω||²/(4λ²) + i (ν+ω)^T(m−q)/(2λ) }.
If m = q, then the spatial term goes to 1 and weights the Gaussian corresponding to
the frequency term heavily. If m ≈ q + δω⊥ this weighting is dampened, but we obtain
constructive interference provided the normals ν and ω are aligned. When the normals are
not aligned, we get destructive interference. This can either force the normal estimates in
line with the template or influence the template movement, and prevent unnecessary local
rotation of the template normals.
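The closed-form inner product can be verified against brute-force numerical integration (illustrative parameters; using the convention ⟨f, g⟩ = ∫ f̄ g, under which the normalization works out to (πσ²)^{D/2} in D dimensions):

```python
import numpy as np

sigma, lam = 0.5, 0.4
m, nu = np.array([0.3, -0.1]), np.array([1.0, 0.0])
q, om = np.array([-0.2, 0.4]), np.array([0.6, 0.8])

def atom(x, c, n):
    """One term of the complex wave mixture, evaluated on points x (K, 2)."""
    d = x - c
    return np.exp(-np.sum(d * d, axis=-1) / (2 * sigma ** 2) + 1j * (d @ n) / lam)

# Brute-force <psi_(m,nu), psi_(q,om)> = integral of conj(f) * g over a grid.
h = 0.01
g = np.arange(-4.0, 4.0, h)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
num = np.sum(np.conj(atom(grid, m, nu)) * atom(grid, q, om)) * h * h

# Closed form: (pi sigma^2)^{D/2} with D = 2 here.
closed = (np.pi * sigma ** 2) * np.exp(
    -np.sum((m - q) ** 2) / (4 * sigma ** 2)
    - sigma ** 2 * np.sum((nu - om) ** 2) / (4 * lam ** 2)
    + 1j * (nu + om) @ (m - q) / (2 * lam))
rel_err = abs(num - closed) / abs(closed)
```

The frequency term penalizes misaligned normals exponentially, which is exactly the interference effect described above.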
The objective function minimized in this work is therefore
E(ϕ,W ) = D(ψϕ·S , ψT (W )) + βL(ϕ). (3–4)
In Equation (3–4), ϕ and W are the desired spatial deformation and target normal
set respectively. Additionally, β is a regularization parameter and L a suitable spline
regularization (chosen to be the thin plate spline bending energy). Assuming a set of fixed
centers {p_b}_{b=1}^P on the template, the thin plate spline [8, 103] maps the location x ∈ R^D to
the location A(x) + Σ_{b=1}^P C_b^T K(x − p_b), where A is an affine transformation, K is a radial
basis spline kernel and {C_b}_{b=1}^P is the set of spline parameters. The mapping is linear in
each Cb and A and therefore so is ϕ′. The regularization term depends on the choice of
kernel.
We can characterize the asymptotic behavior of our matching objective. Examining
the wave mixture, we see that the wave flattens out as λ → ∞—eventually approaching 1.
This intuitively results in the Gabor atom tending to the Gaussian. This is made more precise in
the following Proposition, essentially a consequence of the dominated convergence theorem
[34].
Proposition 3.1. Let {(m_a, ν_a)}_{a=1}^M, {(q_b, ω_b)}_{b=1}^N be a pair of oriented point-sets. As λ → ∞
the distance in Equation (3–3) converges to

∫_{R^D} | Σ_{a=1}^M e^{−||x−m_a||²/(2σ²)} − Σ_{b=1}^N e^{−||x−q_b||²/(2σ²)} |² dx.
3.4 Gradient Computation and Optimization Details
We derive the gradient for the TPS parameterization discussed above. The penalty
term is easy to differentiate with respect to C:
∂_C [ β tr(C^T K C) ] = 2βKC
Figure 3-4. Profile of the L2 distance function over several transformations and choices of the parameters σ, λ. The original set of oriented points is deformed and the distance is computed. (a), (b), (c) show the pose of the original (black) and deformed (red) point-sets under the translational, rotational, and shear deformations. (d) and (e) show the behavior of the distance under X-translations and variations in σ and λ, respectively. (f) and (g) show the same under rotations, and (h) and (i) under X-shear. For (d), (f), and (h), λ = 0.25; for (e), (g), and (i), σ = 0.1.
by differentiating the trace and using the symmetry of K. The derivative of the inner product with respect to the parameters C is
\[
\partial_C I^{\phi_C\cdot(m,\nu)}_{(q,\omega)} = \partial_{\phi_C(m)} I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}\, \frac{\partial \phi_C(m)}{\partial C} + \partial_{\phi_C\cdot\nu}\, I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}\, \frac{\partial [\phi_C\cdot\nu]}{\partial C}. \tag{3–5}
\]
Recall that ϕ_C acts on the normal ν at the point m by ϕ_C · ν = ϕ′_C|_m ν [93], where ϕ′_C|_m is the Jacobian at m. Note that ∂/∂C ϕ_C now denotes the derivative with respect to the TPS parameters, not the spatial variable. We use ∂_· and ∂/∂· interchangeably. Let R ∈ R^{P×N} be given by R_{ij} = K(p_i − m_j), the kernel matrix pairing template and control points. Then ∂/∂C [ϕ_C(m_j)]^a = R_j e^a (the superscript a indicates the a-th coordinate, e^a ∈ R^d the a-th basis row vector), with R_j the j-th column of R. Differentiating,
\[
\frac{\partial [\phi_C(m_j)]^a}{\partial C} = R_j e^a, \qquad
\Big[ \frac{\partial I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}}{\partial \phi_C(m)} \Big]^a = \Big[ -\frac{\phi_C(m) - q}{2\sigma^2} - i\, \frac{\phi_C\cdot\nu + \omega}{2\lambda} \Big]^a I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}.
\]
When applying the entire gradient update (through all points), this is simply an outer
product of the derivatives of the inner product and R.
The latter factor in Equation (3–5) is not typically seen in the point matching literature, and arises due to the transformation in Equation (3–1). We must differentiate ϕ′_C|_{m_j} ν with respect to C, where ′ denotes differentiation with respect to the domain variable. First, to see how ϕ′_C|_{m_j} acts in the TPS parameterization, denote by [R′]^k the matrix of derivatives in the k-th coordinate of the kernel function at each point in the template set. Then
\[
\big[ \phi'_C|_{m_j}\, \nu_j \big]^a = \Big[ \big([R'_j]^a\big)^T C \Big] \nu_j, \quad\text{and so}\quad \partial_C \big[ \phi'_C|_{m_j}\, \nu_j \big]^a = \big([R'_j]^a\big)\, \nu_j^T,
\]
by treating C as a scalar form acting on ([R′_j]^a, ν). So the second term in Equation (3–5) is
\[
\frac{\partial I^{\phi_C(m_j,\nu_j)}_{(q,\omega)}}{\partial C} = I^{\phi_C(m_j,\nu_j)}_{(q,\omega)} \sum_{a=1}^{D} \Big( \Big[ -\sigma^2\, \frac{\phi_C\cdot\nu_j - \omega}{2\lambda^2} + i\, \frac{q - m}{2\lambda} \Big]^a [R'_j]^a \Big)\, \nu_j^T.
\]
Therefore, the descent direction for the TPS parameters is given by
\[
\nabla_C D = 2 \sum_{i=1}^{M} \sum_{j=1}^{M} \Big[ \partial_{\phi_C(m_i)} I^{\phi_C(m_i,\nu_i)}_{(m_j,\nu_j)}\, \frac{\partial \phi_C(m_i)}{\partial C} + \partial_{\phi_C\cdot\nu_i}\, I^{\phi_C(m_i,\nu_i)}_{(m_j,\nu_j)}\, \frac{\partial [\phi_C\cdot\nu_i]}{\partial C} \Big]
- 2 \sum_{i=1}^{M} \sum_{j=1}^{N} \Big[ \partial_{\phi_C(m_i)} I^{\phi_C(m_i,\nu_i)}_{(q_j,\omega_j)}\, \frac{\partial \phi_C(m_i)}{\partial C} + \partial_{\phi_C\cdot\nu_i}\, I^{\phi_C(m_i,\nu_i)}_{(q_j,\omega_j)}\, \frac{\partial [\phi_C\cdot\nu_i]}{\partial C} \Big].
\]
To complete the picture, we return to the principal themes of this work—simultaneous
registration and reconstruction. Recall that we began by pointing out that there was
a paucity of literature on non-rigid SDF matching in comparison to density matching.
We zeroed in on the difficulty of estimating SDFs as the principal reason. Rather than
estimate an SDF for the target point-set with the aid of a deformed template, we chose
to estimate target normals as we deformed the template. To do this, we apply the descent direction for each ω_i, expressed in terms of combinations of ∂_{ω_i} I^{ϕ_C(m_i,ν_i)}_{(q_j,ω_j)}, during each round. Further
details of the optimization algorithm are provided below. To obtain the signed distance
from these normals one may use previously developed methods [53, 63] or use the phase
of the resulting wave-function directly (Figure 3-2 and Figure 3-3). The result is an
integrated probability density and SDF approach to simultaneous deformable template
matching and multiple curve (or surface) reconstruction.
3.5 A Brief Comparison with Currents
In the introduction the method of curve matching based on the geometric measure
theoretic concept of currents was mentioned. Here I give a slightly more detailed analysis of this framework, argue that the representation field underlying the currents model is a spline-based vector field representation, and contrast the resulting registration algorithms.
The mathematical theory of currents extends the theory of generalized functions
into the representation of surfaces embedded in an ambient space [29]. The space of
2-dimensional currents on R³ is the dual W* to W = Ω²_c(R³), with an element [S] ∈ W* representing a surface S ↪ R³ by acting on ω ∈ W as
\[
[S](\omega) = \int_S \omega(x)(u, v)\, dA(x),
\]
where u, v span T_x S. The continuity condition for [S] requires that a sequence of 2-forms converging to 0 (in the norm topology of all derivatives of the 2-forms) vanishes in the limit under [S]. The topology on W* is the weak-* (Schwartz) topology. Note that since ω is a 2-form on R³ we can associate it at each point with a vector w(x) by w(x) · (u × v) = ω(x)(u, v). In this sense, the space of currents can represent a broad class of embedded surfaces in Euclidean space.
In previous work on diffeomorphic measure matching [100], the fact that the push-forward of a current agrees with the current associated to the deformation of the underlying surface (again, everything is embedded, so we are considering the action of diffeomorphisms of R^d on the surface),
\[
\phi_*[S](\omega) = [\phi(S)](\omega) = [S](\phi^*\omega),
\]
is exploited for registration. The pointwise vectorial representation of a 2-form on R³, with the normal vectors of the underlying surface as a parameterization, can be used. For elements of the r.k.H.s. we identify the space of 2-forms with the space of compactly supported vector-valued functions (hence the kernel will be matrix-valued) with smoothly varying derivatives, and the natural inner product. The evaluation functional δ^ξ_x(ω) = w(x) · ξ belongs to W* and we obtain the formula
\[
\langle \delta^\xi_x, \delta^\eta_y \rangle = \eta^T k_W(x, y)\, \xi.
\]
The currents model for surface registration implies that the norm between the oriented point-set A = \{(m_a, ξ_a)\}_{a=1}^{N} and another oriented point-set B = \{(q_b, η_b)\}_{b=1}^{M} should be evaluated in this r.k.H.s., and it has the form
\[
\|\varphi(A) - \varphi(B)\|_W^2 = \sum_{a,b} \xi_b^T k_W(m_a, m_b)\, \xi_a + \sum_{c,d} \eta_d^T k_W(q_c, q_d)\, \eta_c - 2 \sum_{a,c} \eta_c^T k_W(m_a, q_c)\, \xi_a. \tag{3–6}
\]
The authors of [100] also develop a deformation formalism based on the vector space
of C∞ vector fields, which is unnecessary for understanding the analogy between the
two methods under consideration. The authors use isotropic, shift-invariation kernels in
their experimental evaluation (kW (ma, qc) = g(|ma − qc|/σ) id). In terms of Equation
(3–6), we see that the standard norm on the space of vector fields on Rd evaluated on
spline-interpolants of the form described above in Equation (1–2) leads to a similar
data-fidelity term.
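Assuming the isotropic Gaussian kernel used in the experimental evaluation of [100], the currents data-fidelity term of Equation (3–6) can be sketched in a few lines (function names are illustrative, not from any particular library):

```python
import numpy as np

def gauss_kernel(X, Y, sigma):
    # k_W(x, y) = exp(-||x - y||^2 / (2 sigma^2)) id, evaluated pairwise
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def currents_discrepancy(A_pts, A_vecs, B_pts, B_vecs, sigma):
    """||phi(A) - phi(B)||_W^2 for oriented point-sets (points, vectors)."""
    def term(Xp, Xv, Yp, Yv):
        # sum over pairs of eta^T k_W(x, y) xi; the isotropic kernel reduces
        # the matrix kernel to a scalar weight times the Euclidean dot product
        return np.sum(gauss_kernel(Xp, Yp, sigma) * (Xv @ Yv.T))
    return (term(A_pts, A_vecs, A_pts, A_vecs)
            + term(B_pts, B_vecs, B_pts, B_vecs)
            - 2.0 * term(A_pts, A_vecs, B_pts, B_vecs))
```

Since the kernel is positive definite, the discrepancy vanishes exactly when the two vector-valued interpolants coincide, and is strictly positive otherwise.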
Our method can be viewed as performing matching of oriented point-sets as
embedded in L2 under the feature map defined by the CWR. Another way to express the
norm is by using the r.k.H.s. associated with the completion of the image H = Ψ_g(L²(R^d)) [105]. The evaluation functional δ^η_x ∈ H* results in
\[
\delta^\eta_x(f) = f(x, \eta).
\]
This results in the reproducing kernel formula
\[
K_H(x, \xi;\, y, \eta) = \langle g_{x,\xi},\, g_{y,\eta} \rangle,
\]
where g is the window function for the Gabor transform under consideration. The
difference is that both the normal and the location vector factor into the feature space
representation of the oriented point-set, unlike with the currents representation where
the kernel between points defines an inner product between the oriented points by a
transformation of the Euclidean inner product.
3.6 Analysis of the RDM Objective Function
In this section we provide the detailed form of the RDM objective function and analyze the asymptotic behavior of the objective. We show that it is related to a correlation-based registration approach as the frequency parameter λ ↑ ∞.
3.6.1 Inner Product of CWRs
Thus far we have focused on isotropic CWRs: waves with a uniform influence field.
Here we outline the difference on the registration objective by evaluating the inner product
between anisotropic CWRs. This allows us to apply non-uniform emphasis on the oriented
points and have distinct directional effects at different points.
3.6.1.1 Isotropic CWRs
First we derive the expression for the inner product between two complex waves, as
an ingredient necessary to compute the distances. Recall the objective function
\[
D(\psi_S, \psi_T) = \int_{\mathbb{R}^d} \Big| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} - \sum_{i=1}^{N} e^{-\frac{\|x-q_i\|^2}{2\sigma^2} + i\frac{\omega_i^T(x-q_i)}{\lambda}} \Big|^2\, dx.
\]
By using the inner product expansion
\[
D(\psi_S, \psi_T) = \Big\| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} \Big\|^2
- 2 \sum_{a=1}^{M} \sum_{b=1}^{N} \mathrm{Re}\, \Big\langle e^{-\frac{\|x-q_b\|^2}{2\sigma^2} + i\frac{\omega_b^T(x-q_b)}{\lambda}},\; e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} \Big\rangle
+ \Big\| \sum_{b=1}^{N} e^{-\frac{\|x-q_b\|^2}{2\sigma^2} + i\frac{\omega_b^T(x-q_b)}{\lambda}} \Big\|^2 \tag{3–7}
\]
we can reduce the integral in Equation (3–3) to a sum of integrals of the form of pairwise inner products. So we must solve
\[
{}^{\sigma,\lambda} I_{(m,\nu)(q,\omega)} = \int_{\mathbb{R}^d} e^{-\frac{\|x-m\|^2}{2\sigma^2} + i\frac{\nu^T(x-m)}{\lambda}}\; e^{-\frac{\|x-q\|^2}{2\sigma^2} - i\frac{\omega^T(x-q)}{\lambda}}\, dx.
\]
We suppress the superscript σ, λ for now. The first step is to group real factors together and complete the square, obtaining
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}} \int_{\mathbb{R}^d} e^{-\frac{\left\|x-\frac{m+q}{2}\right\|^2}{\sigma^2} + i\frac{\nu^T(x-m)}{\lambda} - i\frac{\omega^T(x-q)}{\lambda}}\, dx.
\]
Then we can remove the constant modulation $e^{-i(\nu^T m - \omega^T q)/\lambda}$ and get
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{-i\frac{\nu^T m - \omega^T q}{\lambda}} \int_{\mathbb{R}^d} e^{-\frac{\left\|x-\frac{m+q}{2}\right\|^2}{\sigma^2} - i\frac{(\omega-\nu)^T x}{\lambda}}\, dx.
\]
This clearly corresponds to a Fourier transform of $e^{-\|x-\frac{m+q}{2}\|^2/\sigma^2}$ at $\frac{\omega-\nu}{\lambda}$. We can extract the translation by $\frac{m+q}{2}$ as a modulation and obtain
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{i\frac{(\nu+\omega)^T(q-m)}{2\lambda}} \int_{\mathbb{R}^d} e^{-\frac{\|x\|^2}{\sigma^2} - i\frac{(\omega-\nu)^T x}{\lambda}}\, dx.
\]
The last factor is a Fourier transform of a Gaussian, and so
\[
I_{(m,\nu)(q,\omega)} = (\pi\sigma^2)^{d/2}\, e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{i\frac{(\nu+\omega)^T(q-m)}{2\lambda}}\, e^{-\frac{\sigma^2\|\omega-\nu\|^2}{4\lambda^2}}. \tag{3–8}
\]
Note that only the real part of these expressions enters the objective (see Equation (3–7)), but we work with the complex version throughout.
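As a sanity check on the closed form of Equation (3–8), it can be compared against brute-force quadrature in d = 2. The following sketch (grid size and parameter values are arbitrary choices, and the function names are illustrative) evaluates both sides:

```python
import numpy as np

def cwr_inner_product(m, nu, q, omega, sigma, lam):
    """Closed form of Equation (3-8) for two isotropic CWR components."""
    d = len(m)
    return ((np.pi * sigma ** 2) ** (d / 2)
            * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
            * np.exp(1j * (nu + omega) @ (q - m) / (2 * lam))
            * np.exp(-sigma ** 2 * np.sum((nu - omega) ** 2) / (4 * lam ** 2)))

def cwr_inner_product_quadrature(m, nu, q, omega, sigma, lam, n=400, half=8.0):
    """Riemann sum of psi_(m,nu) * conj(psi_(q,omega)) on a 2-D grid."""
    xs = np.linspace(-half, half, n)
    X, Y = np.meshgrid(xs, xs)
    pts = np.stack([X, Y], axis=-1)                       # (n, n, 2)
    def psi(c, v):
        amp = np.exp(-np.sum((pts - c) ** 2, axis=-1) / (2 * sigma ** 2))
        return amp * np.exp(1j * (pts - c) @ v / lam)
    dx = xs[1] - xs[0]
    return np.sum(psi(m, nu) * np.conj(psi(q, omega))) * dx * dx
```

For Gaussian-windowed integrands a uniform grid sum is extremely accurate, so the two values should agree to many digits well before the grid is refined further.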
3.6.1.2 Anisotropic CWRs
The inner product of the anisotropic Gaussian CWR has a derivation similar to the above, with a change of variables entering the calculation of the normalization coefficient and a more delicate completion of the square. For completeness, we include the inner product here, along with the derivatives:
\[
I^{m,\nu,\Sigma}_{q,\omega,H} = \sqrt{\frac{(2\pi)^d}{|\Sigma+H|}}\, \exp\Big\{ -\frac{Q(A,p)}{2} - \frac{Q(\Sigma,m) + Q(H,q)}{2} + i\,\frac{\nu^T m - \omega^T q}{\lambda} \Big\},
\]
\[
A = \Sigma + H, \qquad p = (\nu - \omega) - i(\Sigma m + H q), \qquad Q(B, u) = u^T B^{-1} u.
\]
The partial derivatives with respect to m and ν are
\[
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial m} = \Big[ i\,\Sigma A^{-1} p - \Sigma m + i\,\frac{\nu}{\lambda} \Big] I^{m,\nu,\Sigma}_{q,\omega,H}, \qquad
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial \nu} = -\Big[ A p + i\,\frac{m}{\lambda} \Big] I^{m,\nu,\Sigma}_{q,\omega,H}.
\]
The derivative with respect to Σ has a subtlety. Often Σ will be considered as a function of ν: since by the definition of ν we expect to observe more points lying along ν⊥, it is appropriate to weight this region more heavily in the model. This also allows us to maintain frame properties when pursuing the Gabor analogy. The complete derivative with respect to ν then includes the term (∂I/∂Σ)(∂Σ/∂ν). Another way to achieve this relationship is to apply the same transformation rule to Σ as to ν. We simply update ν using ∂I/∂ν instead of ∂I/∂ν + (∂I/∂Σ)(∂Σ/∂ν), and then recompute Σ at each iteration based on the current ν (rather than using the action Σ′ = ϕ · Σ, where ϕ acts on S⁺_d by multiplication by J_ϕ). We also include the derivative ∂I/∂Σ here for update purposes:
\[
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial \Sigma} = -\frac{1}{2}\Big[ -(\nu-\omega)(\nu-\omega)^T A^{-2} - m m^T \frac{\partial J}{\partial \Sigma} - m q^T \frac{\partial K}{\partial \Sigma} - \det(\Sigma+H)^{-1} I \Big] I^{m,\nu,\Sigma}_{q,\omega,H},
\]
\[
J = \Sigma(\Sigma+H)^{-1}\Sigma, \qquad K = \Sigma(\Sigma+H)^{-1}H,
\]
\[
\frac{\partial J}{\partial \Sigma} = 2(\Sigma+H)^{-1}\Sigma + \Sigma A^{-2}\Sigma, \qquad \frac{\partial K}{\partial \Sigma} = (\Sigma+H)^{-1}H + \Sigma A^{-2}H.
\]
3.6.2 Asymptotic Behavior of the RDM Objective Function
The limiting case as λ ↑ ∞ results in a wave whose modulation becomes negligible: the oscillation frequency is proportional to 1/λ and vanishes in the limit. The following proposition cements the intuitive result that the objective function tends towards a distance between Gaussian mixtures.
Proposition 3.2. Let S = \{m_a, \nu_a\}_{a=1}^{M}, T = \{q_b, \omega_b\}_{b=1}^{N} be a pair of oriented point-sets. As λ → ∞, Equation (3–3) converges to
\[
\int_{\mathbb{R}^d} \Big| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2}} - \sum_{i=1}^{N} e^{-\frac{\|x-q_i\|^2}{2\sigma^2}} \Big|^2\, dx. \tag{3–9}
\]
Proof. By turning the tables on the parameters we can view the sequence as the function D(λ; ψ^{σ,λ}_S, ψ^{σ,λ}_T). That is, for each pair (m_a, ν_a), (q_b, ω_b) we have
\[
{}^{\sigma,\lambda} I_{(m_a,\nu_a)(q_b,\omega_b)} \;\to\; (\pi\sigma^2)^{d/2} \exp\Big\{ -\frac{\|m_a - q_b\|^2}{4\sigma^2} \Big\} \quad \text{as } \lambda \to \infty.
\]
Comparing Equation (3–8) and Equation (3–9), the result is clear.
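Proposition 3.2 can also be illustrated numerically: using the closed form of Equation (3–8), the gap between the pairwise inner product and its Gaussian limit shrinks as λ grows. A sketch (parameter values are arbitrary choices):

```python
import numpy as np

def inner_product(m, nu, q, omega, sigma, lam):
    # closed form of Equation (3-8) with d = len(m)
    d = len(m)
    return ((np.pi * sigma ** 2) ** (d / 2)
            * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
            * np.exp(1j * (nu + omega) @ (q - m) / (2 * lam))
            * np.exp(-sigma ** 2 * np.sum((nu - omega) ** 2) / (4 * lam ** 2)))

m, q = np.array([0.2, -0.1]), np.array([-0.4, 0.3])
nu, omega = np.array([1.0, 0.0]), np.array([0.0, 1.0])
sigma = 0.5
# the Gaussian-mixture kernel of Equation (3-9) for this pair
limit = (np.pi * sigma ** 2) * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
gaps = [abs(inner_product(m, nu, q, omega, sigma, lam) - limit)
        for lam in (1.0, 10.0, 100.0, 1000.0)]
# gaps decreases monotonically toward zero as lambda increases
```

Both limiting factors (the phase, of order 1/λ, and the decay term, of order 1/λ²) tend to 1, so the gap decays like 1/λ.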
CHAPTER 4
REGISTRATION: EMPIRICAL ANALYSIS
In this Chapter, the performance of the registration algorithm outlined in Chapter 3
is evaluated. We compare with the state of the art in density field matching (such as
gmmreg, abbreviated to GMM) [52], generalized function matching (diffeomorphic measure
matching abbreviated DIFF) [37], point-based matching (CPD) [72], and graph-matching
(FGM-U) [113]. While other methods [77, 92] are appropriate for further comparison,
handling the asymmetry in representation is not possible in their current formulation. The
corresponding results are indicated by the appropriate marker and color combinations (see
legend in Figure 4-7). We investigate the performance of RDM on a variety of datasets and
conditions, outlined below.
We also evaluate the performance of the RDM algorithm on normal recovery against a
commonly used pipelined approach. The results show that RDM is a preferable method for simultaneously aligning and estimating normal parameters for curves and surfaces.
4.1 Experimental Validation
To validate the performance of the RDM registration algorithm we test the accuracy
of recovery of transformation parameters over a range of rigid, affine, and non-rigid
deformations. A series of randomly generated transformations of the appropriate type
lying within a specified range are created and then the oriented point-set is transformed.
In some situations points may be dropped or occluded. For missing points with outliers, points are dropped from the target uniformly at random, whereas for occlusion a region of the shape is dropped from the target. A specified number of trials is performed with each
setting of outliers, occlusion, noise, etc. The absolute error of recovery is reported in a
graph in each section. Errorbar plots represent the standard deviation of the absolute error
of recovery by the height of the bars.
In the non-synthetic experiments, a different measure of recovery must be used. For the Gatorbait dataset we use known curve-wise correspondences and report the total
Frechet distance across all curves. For the CMU House, TOSCA, and FAUST datasets we use known correspondences and report precision-recall curves. For the IBSR subcortical
structure dataset we report the DICE coefficient of the subcortical volumes before the
transformation estimated by RDM as well as after, for each structure. For more details see
the explanation in each section.
4.2 Rigid and Affine Registration
In this section the performance of RDM for rigid and affine registration is tested.
The parameterization of the groups is discussed in Section 3.1. Experiments probing the
robustness to noise, outliers, and range of the greedy descent algorithm are performed and
recorded here.
4.2.1 Range of Rotation
In this experiment a set of curves from the Gatorbait dataset is used as a template
and target. As a target, it is progressively rotated more at each stage (in the same
direction). For each angle the error in recovery of the angle parameter is reported.
We found that RDM performed comparably to GMM and CPD in terms of
convergence basin for rotations. We note here that two simple and straightforward
improvements that can be done for rigid alignment is to try multiple initializations and
to use rigid invariant feature matching as a pre-processing step and then fine-tune the
alignment using an algorithm like RDM.
4.2.2 Gaussian Noise on Points
In this section we experiment with the performance of the rigid registration algorithm
when the target is affected by point-wise Gaussian noise. We do this for both rigid and
affine motion of the target. First we generate a random rigid-body motion, (R∗, t∗), to an
element of the Gatorbait dataset of size N and generate a target oriented point-set of size
N . Then Gaussian noise of the specified variance is added to each point. The tasks are to
recover (R∗, t∗) and the normal vectors on the target. To randomize the process, the trials
at each level of noise consist of random rotations R* with the rotation angle θ ∈ [−π/3, π/3]
Table 4-1. Range of convergence for rotations. (a) shows the rotation (about (0, 0)) of −4π/9 radians in blue quiver arrows, the initial template in red circles, and the registered template in green quiver arrows. (b) shows relative errors for the range of angles; after 2.2 radians the performance degraded. (c) shows the recovered registration for −π/3 radians. (d) depicts relative errors for the range of angles tested. The algorithm performed well up to π/2 radians away from the baseline pose. σ0 = 20 for both experiments.

(a) [plot]

(b)
Angle    λ0 = 100 Error    λ0 = 1000 Error
−2π/3    9E-5              3E-10
−4π/9    2E-4              9E-10
−2π/9    3E-4              1E-8
0        .00               .00
2π/9     3E-4              1E-7
4π/9     2E-4              3E-10
2π/3     1E-4              1E-7

(c) [plot]

(d)
Angle    λ0 = 100 Error    λ0 = 1000 Error
−π/2     1.1               1.0
−π/3     6E-5              6E-5
−π/6     2E-4              1E-4
0        .00               .00
π/2      2E-4              1E-4
π/3      6E-5              6E-5
π/6      1.1               1.1
and a small Gaussian translation of unit variance. In this experiment a coarse level of
registration is performed followed by a fine level. In the coarse level σ = 20 and λ = 200.
In the fine level, σ = 10 and λ = 150.
For the affine experiment, we apply an affine transformation to the template with
linear part having eigenvalues in [.75, 1.25] and translational component up to 100% of
Figure 4-1. Median error and variance for rigid transformation with pointwise Gaussian noise. (a) shows the recovery of the transformation parameters while (b) shows the recovery of the angle of the normal vector. In (a) the blue line that indicates the recovery of the rotation parameters is on the bottom, very close to zero through the range of experiments. (c) and (d) show the same experiment repeated on the TOSCA dataset. 10 trials at each level of noise were performed.
the diameter of the object. Then Gaussian noise of the range shown in the plots is added. The Gatorbait shape spans [−150, 150] × [−150, 150], while the TOSCA dataset is the Centaur, normalized to lie within [−1, 1]³; hence the different parameter ranges.
4.2.3 Missing Points With Outliers
In this section we perform experiments that showcase the robustness of the RDM
registration algorithm. We generate a random rigid-body motion, (R∗, t∗), to an element of
the Gatorbait dataset of size N and generate a target oriented point-set of size N . Then
points are dropped from the target at random, until the remaining number of points is
Nr. We also drop the normal vectors from the target. The tasks are to recover (R∗, t∗) as
well as the normal vectors on the target. Finally, we inject K randomly generated points
into the target so that Nr + K = N . This is the missing points with outliers (MPO)
experiment.
To randomize the process, the trials at each level of noise consist of random rotations
R* with the rotation angle θ ∈ [−π/6, π/6] and a small Gaussian translation of unit variance.
For the 3-D experiments the Euler angles all fall within the interval [−π/8, π/8]. In this
experiment a coarse level of registration is performed followed by a fine level. In the coarse
level σ = 40 and λ = 100. In the fine level, σ = 20 and λ = 40.
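The MPO data-generation protocol described above can be sketched in a few lines. This is a 2-D illustration with assumed names, not the experimental code; the angle range and outlier model follow the description in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mpo_target(points, n_keep, theta_range=np.pi / 6):
    """Missing-points-with-outliers target: rigid motion, dropout, injection.

    Target normals are discarded entirely, as in the experiment; only the
    rotated and translated surviving points plus uniform outliers remain.
    """
    th = rng.uniform(-theta_range, theta_range)
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    t = rng.normal(size=2)                       # small Gaussian translation
    moved = points @ R.T + t
    keep = rng.choice(len(points), size=n_keep, replace=False)
    lo, hi = moved.min(axis=0), moved.max(axis=0)
    outliers = rng.uniform(lo, hi, size=(len(points) - n_keep, 2))
    return np.vstack([moved[keep], outliers]), (R, t)
```

The returned target always has the same cardinality as the template (N_r + K = N), which keeps the mixture masses comparable between the two representations.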
For the affine MPO experiment, we use the same randomly generated translation
parameters and the same inlier/outlier rates. To generate an affine transformation we
decompose it as RSR∗ = A and generate R as above (by a random angle or random unit
quaternion), and S diagonal with entries in [.75, 1.25] so that the scaling component has
a range of .5. In the affine alignment setting we found that normal vectors are harder to
estimate correctly using RDM.
4.3 Synthetic Normal Recovery, Warps, and Occlusions
This section consists of two sets of experiments. First, we compared our algorithm to
a pipeline approach to normal estimation:
Figure 4-2. Median error and variance for rigid transformation with dropped inliers and outliers added. (a) shows the recovery of the transformation parameters while (b) shows the recovery of the angle of the normal vector. In (a) the blue line that indicates the recovery of the rotation parameters is on the bottom, very close to zero through the range of experiments. 10 trials at each level of inlier dropout were performed. (c) and (d) show the results for the 3-D dataset.
Figure 4-3. Median error and variance for affine transformations. (a) shows the recovery of the transformation parameters (shown in different colors) while (b) shows the recovery of the angle of the normal vector for different levels of missing points with outliers. In (c) and (d) the corresponding results for Gaussian noise trials are shown. 10 trials at each level of inlier dropout were performed.
Figure 4-4. Top: the recovered normal vectors for GMM+NN (left) and RDM (right), with the true normals (both) attached at the points. Bottom: average registration error (left) and the median angle error between corresponding normal vectors (right) for RDM and GMM+NN. 50 trials per level are performed. Even when GMM matches equally well, it does not provide robust normal estimates.
1. Register a template point-set to a target by an estimated deformation.
2. Let the deformation act on the normal vectors of the template.
3. Use a nearest-neighbor approach to infer normal vectors onto the corresponding points in the target set.
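Step 3 of this pipeline is a simple nearest-neighbor transfer. A minimal sketch (names are assumed for illustration):

```python
import numpy as np

def transfer_normals(warped_pts, warped_normals, target_pts):
    """Assign to each target point the (renormalized) normal of its nearest
    warped template point, i.e. step 3 of the pipelined baseline."""
    d2 = np.sum((target_pts[:, None, :] - warped_pts[None, :, :]) ** 2, axis=-1)
    n = warped_normals[np.argmin(d2, axis=1)]
    return n / np.linalg.norm(n, axis=1, keepdims=True)
```

Renormalization is needed because the Jacobian of the deformation in step 2 does not preserve vector lengths.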
We used GMM as the matching algorithm. A single curve (a body curve consisting of
80 points) from the multi-curve GatorBait dataset was used. For synthetic deformation,
a diffeomorphism is fit to point perturbations by solving a 3-D flow problem [12]. This
has the advantage that the Jacobian is positive definite everywhere and so relative normal
orientations are preserved. The deformation level corresponds to supx∈RD ||ϕ(x) − x||
(evaluated on the test points). Results are shown in Figure 4-4. In this experiment GMM
and RDM only used a single initialization. This experiment shows that a substantial gain
in normal recovery is obtained by using RDM relative to imposing template structure after
matching.
In the second set of experiments, we tested the performance of our algorithm in the
2-D and 3-D synthetic settings against 4 other methods: CPD, DIFF, FGM-U, and GMM.
For 2-D we used a point-set consisting of 5 curves (Figure 3-3) from the aforementioned
dataset, while for 3-D the Stanford bunny and TOSCA datasets were used. We create
the target by randomly perturbing points lying along a grid and solving for a TPS with
identity affine component. No information about the target normal vectors is known
beforehand. After registration, the mean distance to the corresponding point [average error
(AE)] is recorded. For the occlusion trials, an approximate fixed deformation level is used,
.3 in norm for 2-D and .15 in 3-D. While operating at similar degrees of generalization
to occlusion, RDM performs much better at medium and high levels of deformation than
the competitors. Robustness to outliers and noise is also studied. For these experiments
FGM-U was run at 50 scales, due to runtime limitations. One can see from 4.4 that
FGM-U struggles with nonrigid deformation. A plot showing the percentage of recovered
normal orientations is included as well.
4.4 Non-Synthetic Matching Experiments
We perform intra-class matching experiments on the TOSCA [10], FAUST [7],
and Gatorbait datasets. TOSCA and FAUST represent the 3-D performance gauge
on real matching experiments and GatorBait for real multi-curve datasets. The same
statistic as above—average error to correspondent—is collected for the sets with known
correspondence. We present recall (percentage of correct correspondences within a
threshold) for matching pose 0 to poses 1, 2, 4, and 5 (smaller deformations) of the
FAUST training registrations over all 10 subjects and present comparisons with GMM
and CPD. For TOSCA we match the first cat, dog, and gorilla to the remaining poses.
We have foregone benchmark comparisons here because in the large deformation regime
extrinsic matching is prone to local minima, and we restrict the comparisons to relative
performance among other extrinsic matching techniques. We simply use these datasets as a
baseline for comparison with GMM and CPD.
The GatorBait dataset does not have known correspondences. Furthermore, it consists
of nearly abutting curves (Figure 3-3)—organizing points into their appropriate curve
components is made much harder by the existence of neighborhood points on different
curves. The Frechet Distance [102] between corresponding parts in the final registration
and the target is recorded. This allows us to measure how accurately each part of the
template is matched to the target. The first fish species is used as the template and
matched to 23 other species. We also perturb the fish with noise and add outliers as
uniformly drawn additional points as shown in 4.4. For large 3-D datasets, DIFF and
FGM were found to be impractical from a runtime perspective (for a runtime comparison
see Figure 4-7). For the GatorBait dataset, FGM was not competitive.
4.5 CMU House Dataset
The CMU House dataset consists of a sequence of image frames and keypoints. The
task is to perform point-matching and recover correspondences between points. From a
correspondence standpoint, FGM [113] with Delaunay triangulation (FGM-del) is the
state of the art on this dataset. However, FGM is sensitive to the graph structure—with
2-nearest neighbors FGM’s performance suffers. Should a large set of correspondences be
needed, graph matching becomes impractical—even the 30 correspondences here represent
significant computational effort for graph matching. To initialize normals for RDM, we
extract the gradient of the image I at each of the keypoints in the frames. This is a
departure from the usual consideration of ‘normals’—we sample a vector field (∇I) at
discrete points.
4.6 3-D Subcortical Structure Registration
The IBSR dataset2 consists of a collection of 256 × 256 × 128 MRI volumes scanned
at varying resolutions, including marked analysis images. Each scan comprises over 75
marked regions, corresponding to different neuro-anatomical components. From such
2 http://www.cma.mgh.harvard.edu/ibsr
Table 4-2. Average (standard deviation) initial DICE and final DICE scores over a set of four subcortical structures registered along the boundary. RDM produces a substantial increase in DICE compared to the default alignments. It performs roughly as well as CPD when we use the same transformation basis (Gaussian r.b.f.) and better than GMM in each setting.

Part             Initial DICE   RDM-TPS DICE   RDM-GRBF DICE   GMM-TPS DICE   CPD DICE
Left Thalamus    .80 (.02)      .88 (.02)      .91 (.01)       .86 (.02)      .91 (.02)
Left Putamen     .74 (.04)      .79 (.01)      .77 (.02)       .77 (.05)      .77 (.01)
Right Thalamus   .78 (.04)      .92 (.01)      .95 (.03)       .87 (.02)      .94 (.01)
Right Putamen    .64 (.08)      .81 (.03)      .79 (.02)       .79 (.04)      .81 (.03)
datasets, atlases can be built that encompass the primary modes of variation of common structures in the brain. These can later be used for computer-aided diagnostics.
In this experiment, we selected 8 sub-cortical structures from the mid-brain: the left
and right Thalamus, Putamen, Hippocampus, and Amygdala. A boundary mesh for each
substructure was extracted using marching cubes on the labeled analysis image. Then the
resulting mesh was transformed to oriented points using barycenters and face-centered
normal vectors. We therefore end up with a simple collection of oriented points, rather
than a complex of meshes. To form an aligned set of structures, we register the resulting oriented point-sets with a minimal bending energy. The resulting transformation is used
to deform the MRI volumes, and then the resulting transformed image is treated as a
candidate segmentation from which a DICE score is computed. We report the mean (and
standard deviation) of the DICE coefficients across n = 5 trials of matching for each
structure. The average bending energy (of the estimated TPS) is .6.
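The DICE score used here is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between two binary volumes; a minimal sketch:

```python
import numpy as np

def dice(a, b):
    """DICE = 2|A ∩ B| / (|A| + |B|) for two binary masks of the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```

A score of 1 indicates perfect overlap of the candidate segmentation with the marked region, and 0 indicates disjoint volumes.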
4.7 Maximum Likelihood Registration with |ψ|2 as a Density
In this Section an alternative alignment procedure to RDM is evaluated for affine
alignment. The probabilistic interpretation of ψ naturally has the flavor of a mixture of
Gaussians. However, ψ has curve normal information and the squared magnitude of ψ is
not actually a mixture of Gaussians. The latter indeed has eigenvector information in the
covariance matrix but this cannot be interpreted as the normal to a curve: eigenvectors
have direction but not orientation since ek and −ek are the same eigenvector.
In the complex wave ψ, the normal is directly encoded into the representation, and we
can solve for it in a number of ways. Our preliminary results for curve normal parameter
estimation using maximum likelihood suggest that the interesting (computationally hard)
problem of orienting the normals will be an exciting new route to the signed distance
function problem. We must emphasize here that the CWR lends itself to multiple avenues
of the parameter estimation process: probabilistic, geometric, and data driven (see below).
The unnormalized function |ψ(x)|² is
\[
|\psi(x)|^2 \propto \sum_{j,k=1}^{N} \cos\Big( \frac{\nu_j^T(x-\mu_j) - \nu_k^T(x-\mu_k)}{\lambda} \Big) \exp\Big\{ -\frac{\|x-\mu_j\|^2}{2\sigma^2} - \frac{\|x-\mu_k\|^2}{2\sigma^2} \Big\}. \tag{4–1}
\]
Note that this is not the L2 norm but the squared magnitude of ψ at location x. It is not
obvious from the expression above, but as |ψ(x)|2 is the magnitude squared of a complex
number, it is nonnegative everywhere. When suitably normalized, |ψ(x)|2 can be treated
as a probability density function which immediately connects it to the plethora of shape
density functions used in the literature.
Here we consider the shape registration problem under a maximum likelihood
formulation. C = \{\mu_k, \nu_k\}_{k=1}^{N_1} is given as a template and the task is to find a mapping from P = \{x_j\}_{j=1}^{N_2} to C within a class of admissible maps H. The maximum likelihood optimization problem
\[
\max_{f \in H} \prod_{j=1}^{N_2} |\psi(f(x_j);\, C)|^2 \tag{4–2}
\]
is robust to Gaussian noise on C (Figure 4-8). Here H consists of a rotation followed by a
shear. Note that this is not the same as maximizing the likelihood of a Gaussian mixture
on a test point-set since the cross terms of |ψ|2 interact. Instead, it is uniquely suited to
situations in which an oriented template is registered to an unoriented point-set.
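A minimal sketch of evaluating the objective in (4–2) for a rotation followed by anisotropic scaling along the rotated axes (the parameterization reported in Figure 4-8). The function names, the eps guard, and the specific transform composition are illustrative assumptions; any off-the-shelf optimizer can then be applied to log_likelihood.

```python
import cmath
import math

def psi_sq(x, template, sigma, lam):
    """|psi(x; C)|^2 for a template C of oriented points (mu, nu)."""
    total = 0j
    for mu, nu in template:
        dx, dy = x[0] - mu[0], x[1] - mu[1]
        total += cmath.exp(complex(-(dx * dx + dy * dy) / (2 * sigma ** 2),
                                   (nu[0] * dx + nu[1] * dy) / lam))
    return abs(total) ** 2

def transform(p, theta, s1, s2):
    """Rotate p by theta, then scale along the rotated axes."""
    c, s = math.cos(theta), math.sin(theta)
    xr, yr = c * p[0] - s * p[1], s * p[0] + c * p[1]
    return (s1 * xr, s2 * yr)

def log_likelihood(params, points, template, sigma, lam, eps=1e-300):
    """Log of the product in (4-2); eps guards against log(0)."""
    theta, s1, s2 = params
    return sum(math.log(psi_sq(transform(p, theta, s1, s2),
                               template, sigma, lam) + eps)
               for p in points)
```

Maximizing log_likelihood over (theta, s1, s2) — e.g. by a coarse grid search followed by a local optimizer — registers the point-set to the oriented template.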
[Figure 4-5 plots omitted: (a) error vs. deformation level and error vs. occlusion rate; (b) Frechet distance vs. noise level and vs. outlier/inlier ratio. Methods compared: FGM, DIFF, GMM, CPD, RDM.]
Figure 4-5. Experimental comparison of RDM and several point and graph matching algorithms on a 2-D dataset. a) Average error for 2-D synthetic experiments. The GatorBait Dataset is deformed as explained in Section 4.3. The left plot showcases robustness to moderate deformation levels and the right plot shows robustness to occlusion (dropping points from a randomly placed circular disc). b) Average Frechet Distance for 2-D real experiments. The Frechet distances between the registered template and the target curves are reported. On the left the target has added noise of the indicated standard deviation and on the right outliers are added.
[Figure 4-6 plots omitted: (a) error vs. deformation level, error vs. occlusion rate, and % of recovered orientations vs. deformation level, comparing localPCA and RDM; (b) recall vs. percentage of diameter.]
Figure 4-6. Experimental comparison of RDM and several point and graph matching algorithms on 3-D datasets. (a) The same experiments as 4.4 b) are carried out in 3-D on the Stanford Bunny. The legend is consistent between that figure and this one. Here we also report the percentage of normal vectors recovered to within a cone of π/3 radians. The left plot showcases robustness to moderate deformation levels and the right plot shows robustness to occlusion (dropping points from a randomly placed circular disc). (b) Recall for TOSCA (left) and FAUST (right). Recall graphs for a subset of TOSCA and FAUST are reported. See Section 4.4 for more details.
[Figure 4-7 plots omitted: recall vs. distance for matching to Frames 41, 61, 81, and 101. Methods compared: RDM, CPD, GMM, DIFF, FGM, NONE.]
AUC
Algorithm   Frame 40   60     80     100
FGM-del     1.00       1.00   1.00   1.00
FGM-2NN     1.00       .867   .800   .500
RDM         .931       .871   .857   .833
CPD         .888       .819   .731   .681
GMM         .862       .795   .738   .671
DIFF        .836       .791   .688   .403
Runtimes
Algorithm   Frame 40   60     80     100
FGM-del     8.8s       11s    17s    15s
RDM         5.9s       5.1s   6.5s   5.2s
CPD         .32s       .35s   .44s   .38s
GMM         6.2s       7.5s   8.8s   11s
DIFF        12s        11s    12s    15s
Figure 4-7. Recall graphs and area under the curve for the CMU House. For FGM, triangulation yields excellent matches but nearest-neighbor graphs are poor. All experiments run on an AMD X2 B22 with 8Gb of RAM. Our implementation is not optimized for runtime but is competitive. Although GMM has quicker function evaluation, RDM converges faster for this dataset. We report AUC in the table.
σ_err/σ_data   |θ* − θ|   |s₁* − s₁|   |s₂* − s₂|
.045           0.006      0.076        0.046
.075           0.008      0.088        0.027
.09            0.008      0.052        0.052
Figure 4-8. Maximum likelihood alignment using |ψ(x)|² as a density. The blue circles (○) are a noiseless template with accurate normal data while the red points (×) are points sampled from the template with Gaussian noise added. For a range of noise parameters, an alignment of the noisy data to the template was found by maximizing the likelihood in (4–2). The unknown transformation parameters were drawn uniformly with θ ∈ [0, 2π) and s₁, s₂ ∈ [.5, 2]. θ is the total rotation angle of the template, s₁, s₂ scale in the respective directions of the rotated basis. σ_data is the spatial standard deviation of the points in the template, and σ_err is the standard deviation of the Gaussian noise added to the test point-set. Root mean squared relative error is reported over 25 trials at each noise threshold.
CHAPTER 5
THEORY OF THE REPRESENTATION: CONNECTEDNESS, COMPLETENESS, AND CONTRIBUTIONS TO THE GABOR EXPANSION
In this Chapter theoretical results relating to the CWR as an implicit shape field are
developed along two axes:
1. Direct analysis of the connectedness of the zero level-sets of the phase,
2. Asymptotic approximation of the modular distance field by families of Gabor wavelets.
5.1 Connectedness of Pairs of Complex Waves
This section contains a direct proof of the connectedness of the zero level-sets of
the Complex Wave Representation for symmetrically oriented points. In some cases
explicit specification of the geometry of the level-sets is also available. The proofs of these
theorems only go through when there are two atoms present in the field. In Section 5.2 a
perturbation-based argument proves the existence of an extension to more than two atoms
in certain cases.
The main goal of this section is to show the connectivity of a 0−level-set of the phase
of a superposition of waves. The appropriate set runs through a neighborhood near each
of the wave source locations. Throughout, m1,m2 will refer to the spatial locations in the
Euclidean plane of two oriented point centers and ν1, ν2 will refer to their respective unit
normal vectors. We will use ϕ1, ϕ2 to refer to the respective angles of ν1, ν2. ψ will be the
CWR of the two oriented points. Unless otherwise stated, m₁, m₂ will be assumed to take
the values (1, 0) and (−1, 0). Obtaining the theorem for general symmetric configurations
relies on the behavior of ψ and the parameters under the action of similarities.
5.1.1 Zeros of ψ
The non-vanishing of Re ψ implies the existence of θ_ψ (the phase of ψ), and it will be
important to establish this in the sequel. We will characterize the zeros of θ_ψ in terms of
Im ψ, so we will need to prohibit co-occurring zeros. We first characterize the set where ψ
can vanish for a pair of oriented points.
Lemma 2. Let m₁^{(1)} > 0 and m₂ = −m₁. Then the set {|ψ| = 0} lies entirely within the line {x₁ = 0}.
Proof. We have

ψ(x) = exp(−‖x − m₁‖²/(2σ²) + i ν₁ᵀ(x − m₁)/λ) + exp(−‖x − m₂‖²/(2σ²) + i ν₂ᵀ(x − m₂)/λ),

and |ψ| = 0 ⟺ |ψ|² = 0. Now

|ψ|² = [Re ψ]² + [Im ψ]²
 = exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²)
  + 2[ sin(ν₁ᵀ(x − m₁)/λ) sin(ν₂ᵀ(x + m₁)/λ) + cos(ν₁ᵀ(x − m₁)/λ) cos(ν₂ᵀ(x + m₁)/λ) ] exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²))
 = exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²)
  + 2 cos(ν₁ᵀ(x − m₁)/λ − ν₂ᵀ(x + m₁)/λ) exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²))
 ≥ exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²) − 2 exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²)).

Suppose that x₁* > 0 (the first component of x* is positive). Then ‖x* − m₁‖² < ‖x* + m₁‖²,
and so exp(−‖x* − m₁‖²/(2σ²)) > exp(−‖x* + m₁‖²/(2σ²)). Factoring the lower bound above as

( exp(−‖x − m₁‖²/(2σ²)) − exp(−‖x + m₁‖²/(2σ²)) )²,

which is strictly positive at x* by the preceding observation, we see that |ψ(x*)| > 0.
The same argument shows that |ψ(x)| > 0 for x ∈ R₋ × R. Therefore, zeros can occur only
when ‖x − m₁‖ = ‖x + m₁‖, so x must lie on the line {x₁ = 0}.
It is also clear from the argument above that zeros can only occur when
cos(ν₁ᵀ(x − m₁)/λ − ν₂ᵀ(x + m₁)/λ) = −1, which implies
Theorem 5.1. Zeros of |ψ| can only occur along the Voronoi boundary between m₁, m₂, at
points {x : ν₁ᵀ(x − m₁) − ν₂ᵀ(x − m₂) = λkπ} such that k ∈ 2Z + 1.
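Theorem 5.1 can be spot-checked numerically: on the Voronoi boundary the two Gaussian envelopes agree, so the superposition cancels exactly where the phase difference hits an odd multiple of π. A self-contained sketch (the specific atom layout is an illustrative choice, not from the text):

```python
import cmath
import math

def psi_pair(x, m1, nu1, m2, nu2, sigma, lam):
    """CWR of two oriented points evaluated at x."""
    out = 0j
    for m, nu in ((m1, nu1), (m2, nu2)):
        dx, dy = x[0] - m[0], x[1] - m[1]
        out += cmath.exp(complex(-(dx * dx + dy * dy) / (2 * sigma ** 2),
                                 (nu[0] * dx + nu[1] * dy) / lam))
    return out

# Opposing vertical normals: nu1.(x - m1) - nu2.(x - m2) = 2y on the
# Voronoi boundary x1 = 0, so a zero is predicted at y = lam*pi/2 (k = 1).
m1, m2 = (1.0, 0.0), (-1.0, 0.0)
nu1, nu2 = (0.0, 1.0), (0.0, -1.0)
sigma, lam = 1.0, 0.8
zero_pt = (0.0, lam * math.pi / 2.0)
```

Consistent with Lemma 2, |ψ| stays bounded away from zero off the boundary.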
Zeros of magnitude are one possible form of “disconnection” in the level sets of the
phase of ψ. These correspond to real zeros crossing imaginary zeros. Another type of
disconnection that can occur is a disconnection caused by endpoints in the zero level-set.
Part of any proof of connectedness will involve either excluding these possibilities or
explicitly showing a parametric connected curve arising from the implicit functions. This
subsection divides along these lines. In the symmetric case (when m1 = −m2 = −(1, 0)
and ϕ1 = π − ϕ2) we can provide a closed form parametric curve for the zeros. In the
asymmetric case we can characterize when the disconnections occur as we let ϕ1 or ϕ2
move away from the symmetric configuration.
5.1.2 Connectedness of θψ = 0 for Symmetric Configurations
It is simpler to prove the connectedness of the zero level set of the phase under special
configurations. The approach taken here is to first prove it for a class of configurations and
then expand out from there. The configuration that is most amenable to connectedness
is one in which the normal vectors and the locations “agree.” By agree we mean that the
lines

ℓ₁ = {x : ν₁ · (x − m₁) = 0},
ℓ₂ = {x : ν₂ · (x − m₂) = 0},

intersect at the interface of equal distance between the points. We also call this the
“coherent” case. It is depicted visually by the red and blue lines in 5.1.2. We further
simplify this by assuming symmetry of m₁, m₂ about the y-axis. This assumption does not
lose generality in the conditions for connectedness, as will be shown below. This
case is characterized by the following equations:

m₁^{(1)} = −m₂^{(1)},  ν₁^{(1)} = −ν₂^{(1)},  ν₁^{(2)} = ν₂^{(2)}. (5–1)
Note that the above conditions imply the existence of a point p ∈ {x₁ = 0} such that
d(p, m₁) = d(p, m₂) and ν₁ · (p − m₁) = 0 = ν₂ · (p − m₂).
Theorem 5.2. Under the conditions of Equation (5–1), the zero level-set of the phase of ψ
is connected from a small set near m₁, through p, to a small set near m₂. It is symmetric
about the line of equidistance. If ν^{(1)}/λ ∈ (π/2)Z then the line goes through m₁ and m₂.
Figure 5-1. Visualization of g along a vertical slice of the set containing a zero of Im ψ. The red function is the second term and the blue is the first term of g. The green function is the resulting tempered sinusoid. The tan region contains the zero level-set of g for all values of x ∈ [−1, 0].
The following claim will help to prove the Theorem.
Claim 1. If Im ψ = 0 and Re ψ ≠ 0 on A ⊂ R², then θ_ψ = 0 on A.
Proof. As θ_ψ = tan⁻¹(Im ψ / Re ψ) and the argument to arctan is defined and zero, the claim follows.
Now we are ready to prove Theorem 5.2.
Proof. Factor out a term from Im ψ and define a function f as follows:

f(x, y) = Im ψ / exp(−(x² + y² − 2x + 1)/(2σ²))
 = exp(−2x/σ²) sin(ν₁ · ((x, y) − m₁)/λ) + sin(ν₂ · ((x, y) − m₂)/λ).

Since m₁, m₂, ν₁, ν₂ obey Equation (5–1), we can write

f(x, y) = exp(−2x/σ²) sin((ν^{(1)}(x + 1) + ν^{(2)}y)/λ) + sin((−ν^{(1)}(x − 1) + ν^{(2)}y)/λ).

The zeros of f coincide with the zeros of Im ψ, as the factor removed from Im ψ in
constructing f is nonzero.
Fix x ∈ [−1, 0], so that exp(−2x/σ²) ≥ 1. In the y direction about the point
(x, −ν^{(1)}(x + 1)/ν^{(2)}), which falls on the line ℓ₁ perpendicular to ν₁ going through m₁,
the first term in f acts like a sine without a shift. We can view f about this point by
rewriting f as

g(y) = exp(−2x/σ²) sin(ν^{(2)}y/λ) + sin(ν^{(2)}y/λ + θ). (5–2)
By inspecting Equation (5–2) one can see that in the y direction g is periodic with period
2πλ/ν^{(2)}. Note that

g(y) = f(x, −ν^{(1)}(x + 1)/ν^{(2)} + y).
Now we wish to show that at each x, g has a single zero near ℓ₁.
g has the form A sin(ωt) + sin(ωt + θ), and so it is possible to write g as a single sine:

g(y) = √(1 + exp(−4x/σ²) + 2 exp(−2x/σ²) cos(θ)) sin(ν^{(2)}y/λ + ϕ),
θ = −2ν^{(1)}x/λ,
ϕ = tan⁻¹( sin(θ) / (exp(−2x/σ²) + cos(θ)) ). (5–3)
First note that the factor under the square root is nonnegative and nonzero: if θ = π
then exp(−2x/σ²) > 1, so for some ϵ > 0 we have exp(−4x/σ²) = (1 + ϵ)² and
1 + (1 + ϵ)² − 2(1 + ϵ) = ϵ² > 0. The next important fact is that the denominator of the
argument to tan⁻¹ in the equation for ϕ is nonzero for all x ∈ [−1, 0], and so all points
along a zero level-set of this function belong to the same branch of the arctangent. Thus

γ(x) = (x, −ν^{(1)}(x + 1)/ν^{(2)} − λϕ/ν^{(2)})

defines a smooth curve of zeros of the function f. γ is connected since γ^{(2)} varies smoothly
with x (as exp(−2x/σ²) + cos(θ) is nonzero).
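The rewriting used in (5–3) is the standard identity A sin(ωy) + sin(ωy + θ) = R sin(ωy + ϕ), with R = √(1 + A² + 2A cos θ) and ϕ = tan⁻¹(sin θ / (A + cos θ)). It can be verified numerically with a short sketch (names illustrative); atan2 is used so the arctangent branch is handled automatically:

```python
import math

def as_single_sine(A, theta, omega, y):
    """Rewrite A*sin(omega*y) + sin(omega*y + theta) as R*sin(omega*y + phi)."""
    R = math.sqrt(1.0 + A * A + 2.0 * A * math.cos(theta))
    phi = math.atan2(math.sin(theta), A + math.cos(theta))
    return R * math.sin(omega * y + phi)

def direct_sum(A, theta, omega, y):
    """The left-hand side of the identity, evaluated directly."""
    return A * math.sin(omega * y) + math.sin(omega * y + theta)
```

In the setting of (5–3), A = exp(−2x/σ²) and ω = ν^{(2)}/λ, so A + cos θ > 0 on x ∈ [−1, 0] and the zero offset −λϕ/ν^{(2)} stays on a single branch.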
In this symmetric configuration, ν₁ · ((x, y) − m₁) − ν₂ · ((x, y) − m₂) = 0 along {x = 0},
so it follows from Theorem 5.1 that there are no zeros of ψ along x = 0. Thus, by Claim 1
these zeros of f are also zeros of the phase of ψ. It follows that this line of zeros passes
within λ exp(−2/σ²) / (ν^{(2)}(1 − exp(−2/σ²))) of m₁, as (d/dx) tan⁻¹(x) ≤ 1. This bound is
rough, as it does not take into account the relationship between the sin and cos functions
but simply extremizes both independently; a tighter bound can be obtained by substituting
cos⁻¹(exp(−2/σ²)) into θ.
Since all of the parameters have reflection symmetry about x = 0, so does Im ψ.
Thus the embedded curve γ can be extended from [−1, 0] to [−1, 1] symmetrically, defining
zeros of Im ψ on the interval connecting m₁ to m₂. Once we do this, we can write the
extended curve as

γ(x) = ( x, ( ν^{(1)}(|x| − 1) + λ tan⁻¹( sin(−2ν^{(1)}|x|/λ) / (cos(−2ν^{(1)}|x|/λ) + exp(2|x|/σ²)) ) ) / ν^{(2)} ).

If ν^{(1)}/λ ∈ (π/2)Z then at x = −1, 1 the arctangent is zero, so γ runs through m₁, m₂.
Let q denote a line in R2, and Rq denote the reflection operator about q — a
Euclidean transformation. Recall the action of Euclidean transformations on oriented
points from Chapter 3. Since the preceding argument does not depend on the choice of
σ, λ, the following Theorem holds.
Theorem 5.3. Let m1,m2, ν1, ν2 be a collection of two oriented points such that there
exists a line z about which Rz(m1, ν1) = (m2, ν2). Then the CWR for (m1, ν1), (m2, ν2) has
a connected zero level set going between small neighborhoods of m1,m2 through the point of
intersection of normal lines defined by the oriented points.
Proof. There is a similarity transforming the oriented points in an arbitrary configuration
subject to the conditions above to the configuration of Theorem 5.2. If the action of the
similarity changes the magnitude of ν1, ν2 then adjust λ to make them have unit norm
again—since the Theorem’s conditions do not depend on λ this can be done without
changing the result. In the new coordinate system Theorem 5.2 holds. Similarities are
smooth bijections, so they preserve connectedness of sets. Since the similarity maps the
zeros of the transformed function to the zeros of the original, the result follows.
5.1.3 Connectedness of θψ = 0 for Asymmetric Configurations
The technique of re-writing an equation proportional to Im ψ does not result in an
explicit formula for a curve in the asymmetric case. It is possible to obtain a pure sinusoid
in the direction (ν₁ + ν₂)/2, but this introduces a phase shift with an (x, y) dependence
which does not help to characterize the zeros of the resulting function.
Therefore, we tried two approaches to characterize the zeros in this case, both of which
led to success. First, a numerical approach was used to explore the different configurations
of ν₁, ν₂ and their effect on connectedness. Based on this, a few conjectures about the
behavior of an analytic relationship were established. Then, after deriving a geometric
relationship describing the transition from a connected to a disconnected state, a collection
of nonlinear equations is derived characterizing the transition. From here, an implicit
characterization that is directly interpretable can be derived. The situations can be
characterized when asymptotics in σ, λ are considered. In most non-asymptotic cases the
functional characterization is not directly solvable. However, applying numerical means to
it results in a more efficient and robust characterization procedure than initially devised.
The result is a fast algorithm that can determine connectedness given parameters.
5.1.3.1 Numerical analysis of asymmetric connectedness
A numerical approach was taken to characterize the connectivity, with the hope
that this would lead to a hypothesis about the underlying behavior of the CWR for
2-atom asymmetric configurations. For a range of σ, λ values, we computed the CWR for
m₁ = (−1, 0), m₂ = (1, 0) at every possible configuration of ν₁, ν₂ (up to symmetry for
ν₁). The CWR was computed on a grid of [−3, 3] × [−10, 10] with spacing h₁ = .03, h₂ = .1.
Then, the zero level-sets of the resulting image were extracted using marching squares.
Each of these corresponds to a series of locations in the image plane, and the magnitude
at these locations was summed. This was taken as an approximation to the line integral of
|ψ| along the zero level-sets. There may be several such level-sets, so this is done for each.
Finally, the level-sets are ranked by this line integral. If there is a single dominant value
(at least 10 times the next highest value) then this is taken as a connected configuration;
otherwise it is taken as disconnected. The results are shown in Figure 5-2.
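The ranking step of this procedure can be sketched as follows, substituting simple sign-change detection and a flood fill for a full marching-squares extraction; the grid inputs, function names, and the 10× dominance threshold mirror the description above, but this is an illustrative sketch rather than the implementation used for Figure 5-2.

```python
def component_weights(im_grid, mag_grid):
    """Group grid cells where Im psi changes sign into 4-connected
    components; return the summed |psi| per component, descending."""
    ny, nx = len(im_grid), len(im_grid[0])

    def crossing(i, j):
        # A zero level-set passes near (i, j) if the sign flips toward
        # the right or downward neighbor.
        v = im_grid[i][j]
        for di, dj in ((0, 1), (1, 0)):
            ii, jj = i + di, j + dj
            if ii < ny and jj < nx and v * im_grid[ii][jj] <= 0.0:
                return True
        return False

    cells = {(i, j) for i in range(ny) for j in range(nx) if crossing(i, j)}
    weights = []
    while cells:
        # Flood-fill one connected component and accumulate its weight.
        stack = [cells.pop()]
        total = 0.0
        while stack:
            i, j = stack.pop()
            total += mag_grid[i][j]
            for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                nb = (i + di, j + dj)
                if nb in cells:
                    cells.remove(nb)
                    stack.append(nb)
        weights.append(total)
    return sorted(weights, reverse=True)

def is_connected(weights, ratio=10.0):
    """Single dominant level-set => treated as a connected configuration."""
    if not weights:
        return False
    return len(weights) == 1 or weights[0] >= ratio * weights[1]
```

Here im_grid and mag_grid hold Im ψ and |ψ| sampled on the same grid.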
Based on the results in Figure 5-2, a trend emerges: for a fixed σ-value,
increasing the value of λ eventually results in a connected configuration so long as the
angles of the normal vectors belong to the same interval [0, π]. The following argument
as to why this happens led to subsubsection 5.1.3.2, which provides an analytic
characterization of the event of a disconnection.
Suppose we fix ϕ₁, and view Im ψ as a function of (x, y, ϕ₂) for now. Equivalently, consider
the function f : R² × R → R : (x, y, ϕ₂) ↦ Im ψ(x, y; ϕ₁, ϕ₂). We use the notation ϕ₁, ϕ₂ to
indicate the angle of the normal vector, i.e. ν₁ = (cos(ϕ₁), sin(ϕ₁)).
By Subsection 5.1.2 we know that at ϕ₂ = π − ϕ₁ there is a single connected curve
joining a region near m₁ to a region near m₂. This is the symmetric setup. If we vary
ϕ₂ smoothly on T then we get a smooth change of f. By the way the symmetric argument
is structured, it is clear that if a small variation of one of the normal vectors causes a
disconnection, then that disconnection must occur at the intersection point of the lines
at the zero of Im ψ. If there is an isolated ϕ₂ then ∂f/∂ϕ₂ = 0. However, if we work out the
derivative of f with respect to ϕ₂ we get

∂f/∂ϕ₂ = exp(−((x − 1)² + y²)/(2σ²)) cos(ν₂ · (x − 1, y)/λ) (ν₂^⊥ · (x − 1, y))/λ,

which is nonzero along {ν₂ · (x − 1, y) = 0} except at x = 1, y = 0. Thus, it follows that in a
small neighborhood of π − ϕ₁ the zero level-set remains connected as we vary ϕ₂. Note
that solving for ∂f/∂ϕ₂ = 0 does not depend on σ in the preceding argument. Considering
the surface of the zero level-set of f, it must be shaped like a saddle about a point of
intersection. Thus, we can look for points at which a saddle point occurs to characterize the
disconnection.
5.1.3.2 An analytical condition for asymmetric connectedness
By inspecting visualizations of the imaginary part of ψ under varying ν₂ conditions,
one notices a pattern. It is also possible to predict theoretically that for fixed values of
ν₁, σ, λ, moving the value of ν₂ away from the symmetric case will lead to a disconnection,
and that at the point of disconnection a saddle point emerges. This is clear theoretically
because there are two possibilities for what can happen to the line connecting regions near
m₁, m₂ as we alter ν₂: the line can become part of a ridge of extremal values resulting
in an endpoint, or the line can meet with another zero level-set. We will discuss these
possibilities in greater detail below, but it is easy to see that the former condition should
not be expected from the superposition of two sinusoids.
Thus, the saddle point emerges at the point where a disconnection (or connection)
occurs. Saddle points correspond to ∇Im ψ = 0 with |H Im ψ| < 0, where H denotes the
Hessian. The gradient condition provides us with two equations, ∇Im ψ = 0, or

0 = −((x + 1, y)/σ²) exp(−((x + 1)² + y²)/(2σ²)) sin(ν₁ · (x + 1, y)/λ)
  − ((x − 1, y)/σ²) exp(−((x − 1)² + y²)/(2σ²)) sin(ν₂ · (x − 1, y)/λ)
  + (ν₁/λ) exp(−((x + 1)² + y²)/(2σ²)) cos(ν₁ · (x + 1, y)/λ)
  + (ν₂/λ) exp(−((x − 1)² + y²)/(2σ²)) cos(ν₂ · (x − 1, y)/λ). (5–4)
For exposition we use the notation

θ_j = ν_j · (x − m_j)/λ.
Note that we also expect the saddle point to show up on the Im ψ = 0 level-set. This
means we have a third equation, Im ψ = 0, or, after simplifying a bit,

exp(−2x/σ²) = −sin(θ₂)/sin(θ₁). (5–5)

We seek ν₂, x, y satisfying these conditions: since ν₂ = (cos(ϕ₂), sin(ϕ₂)), these three
equations can be used to predict a point (x, y) and an angle ϕ₂ that produces a
disconnection.
To summarize this approach: we expect all values of ϕ₂ between π − ϕ₁ and the nearest
value of ϕ₂ such that Im ψ = 0, ∇Im ψ = 0 to result in a connected component running
from near m₁ to near m₂. We will solve explicitly for x, y given ϕ₂ to get x(ϕ₂), y(ϕ₂).
Then we will use Im ψ(x(ϕ₂), y(ϕ₂)) = 0 to solve for the nearest value of ϕ₂ which results in
a disconnection.
First, we will manipulate Equation (5–4) into a useful form for further analysis. The
key to doing so will be invoking Equation (5–5). If we multiply both gradient equations by
exp((x² − 2x + y² + 1)/(2σ²)) we get

−((x, y)/σ²) exp(−2x/σ²) sin(θ₁) − ((x, y)/σ²) sin(θ₂) + ((2, 0)/σ²) sin(θ₂)
 + (ν₁/λ) exp(−2x/σ²) cos(θ₁) + (ν₂/λ) cos(θ₂) = 0. (5–6)
Now we invoke Equation (5–5) to replace the exponential decay with a ratio of sines. Note
that this only applies away from zeros of sin(θ₁):

((x, y)/σ²) sin(θ₂) − ((x, y)/σ²) sin(θ₂) + ((2, 0)/σ²) sin(θ₂)
 − (ν₁/λ) sin(θ₂) cot(θ₁) + (ν₂/λ) cos(θ₂) = 0.

The first two terms cancel, and after reorganizing we get the following pair of equations:

cot(θ₁) = (ν₂^{(1)}/ν₁^{(1)}) ( cot(θ₂) + 2λ/(σ²ν₂^{(1)}) ),
cot(θ₁) = (ν₂^{(2)}/ν₁^{(2)}) cot(θ₂).
Now we set the two right-hand sides equal to each other and rearrange:

tan(θ₂) = (ν₂ · ν₁^⊥ / ν₁^{(2)}) (σ²/(2λ)).

Similarly for tan(θ₁):

tan(θ₁) = (ν₂ · ν₁^⊥ / ν₂^{(2)}) (σ²/(2λ)).
So we can solve for x, y by solving

ν₁^{(1)}(x + 1) + ν₁^{(2)} y = λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ),
ν₂^{(1)}(x − 1) + ν₂^{(2)} y = λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₁^{(2)}) ). (5–7)
We obtain

x = [ λ ( ν₂^{(2)} tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ) − ν₁^{(2)} tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₁^{(2)}) ) ) − ν₁^{(1)}ν₂^{(2)} − ν₁^{(2)}ν₂^{(1)} ] / (ν₁^⊥ · ν₂),

y = [ λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ) − ν₁^{(1)}(x + 1) ] / ν₁^{(2)}. (5–8)
There are a few important features to address in this equation. First, the tan⁻¹
has branches spaced at distance π. Determining which branch corresponds to the zero
and saddle point that indicate a disconnection is the first problem we need to solve
to predict parameters that lead to disconnections. Fix a branch of the arctangent and
let x*, y* denote the corresponding solution to Equation (5–8). While it is possible that
Im ψ(x*, y*) = 0 for this solution, some other branch may solve the necessary equation.
Looking at Equation (5–7), it seems that the first few cuts will correspond to the most
significant level set, as the LHS of these equations corresponds to the line perpendicular
to the parameters of the oriented points. This means that taking a greater branch will
push the solution in Equation (5–8) further away from the centroids and decrease the
likelihood of the resulting curve. We have found empirically that the +π branch for the first
and third arctangents results in similar zero crossings to the numerical results reported in
subsubsection 5.1.3.1. To get more insight into this reasoning see Chapter 6 Section 6.1.
Second, we must consider the viability of the equations. The substitution leading to
Equation (5–8) requires sin(θ₁) ≠ 0. Note that if sin(θ₁) = 0, then for Im ψ = 0 it follows
that sin(θ₂) = 0. Also, the cos terms must then both be 1, and the gradient equation
results in ν₁ = −exp(−2x/σ²)ν₂, which can only happen if ν₁ = −ν₂ and x = 0. Equations
(5–7) require that ν₁^{(2)}, ν₂^{(2)} be nonzero, and that ν₁^⊥ · ν₂ be nonzero. These
conditions are each discussed below.
We first focus on the ν₁^⊥ · ν₂ → 0 asymptote. This asymptote occurs when ν₁ = ±ν₂.
If ν₁^{(2)} = 0 then the result is a disconnected pair of vertical curves. Observe from
Equation (5–6) that if we replace ν₂ by ν₁, then the y-equation corresponds to Re ψ = 0.
This leads to the equation ν₁^{(1)} = λkπ for some k ∈ 2Z + 1. Thus if λ > 2/π then these
cases remain connected for ν₁ in the upper half arc.
Finally, we return to the dependence on the parameters σ, λ, and predict the
dynamics of the disconnections as we vary them. First, if we fix everything but λ, we
note that the function λ tan⁻¹(1/λ) has a zero and a point of C¹ discontinuity as λ ↓ 0.
Intuitively, as the wavelength λ ↓ 0 the disconnections happen more rapidly as we vary
ν₂. Indeed, we can see that as λ ↓ 0, unless we are in a symmetric configuration,
eventually the condition in Equation (5–8) will be satisfied for any given branch of arctan.
On the other hand, as λ ↑ ∞ we get

ν₁ · (x − m₁) = σ²(ν₂ · ν₁^⊥)/(2ν₂^{(2)}),  ν₂ · (x − m₂) = σ²(ν₂ · ν₁^⊥)/(2ν₁^{(2)}).
[Figure 5-2 panels omitted: Angle of ν₂ vs. Angle of ν₁ for (a) σ = .45, λ = .45; (b) σ = 1.6, λ = .45; (c) σ = .45, λ = 1.6; (d) σ = 1.6, λ = 1.6; (e) σ = .45, λ = 3.2; (f) σ = 1.6, λ = 3.2.]
Figure 5-2. Numerical experiments showing the connectedness and non-connectedness at different values of parameters. White cells indicate a single connected component in a subdomain with total magnitude at least 10 times the next highest component's magnitude. This is the situation in which the unwrapping algorithm outlined below has a high confidence of success.
[Figure 5-3 panels omitted: Angle of ν₂ vs. Angle of ν₁ for the same (σ, λ) settings as Figure 5-2.]
Figure 5-3. Plots showing the zero crossings of interest for the analytical solution to the disconnection problem. White cells indicate positive values for Im ψ (computed by assuming reflection symmetry about ϕ₁ = π/2 and ϕ₂ = π/2). The dots along the anti-diagonal are due to numerical instability as ν₁ · ν₂ ↓ 0. In subfigure b) a flaw of the numerical experiments is exposed: the ranking algorithm becomes less effective when the spatial covariance variable is very high and so ordering the significance of curves becomes harder. Anecdotally, the result indicated by these plots is more suggestive of the real situation.
If we then apply a small angle approximation to the sines in the Im ψ = 0 equation, we
get

exp(−2x/σ²) = −ν₁^{(2)}/ν₂^{(2)}.
This clearly has no solution when ν1 and ν2 both point in the same upper or lower
half-space. This justifies the suggestion that for fixed σ, ν1, ν2 sending λ ↑ ∞ tends
to produce a connected contour as long as the normal vectors are not in a degenerate
configuration.
As σ ↑ ∞ we get θ₁ → π/2, θ₂ → π/2. This leads to exp(−2x/σ²) = 1 as the
equation for Im ψ = 0, since one of θ₁, θ₂ must be negative and one must be positive for a
zero to occur.
As σ ↓ 0 both equations go to zero, and we expect the break to happen near the line
x = 0, as the exponential in the imaginary zero condition blows up away from this line.
This results in

sin(θ₁) + sin(θ₂) = 0,  θ₁ = θ₂ + π + 2kπ,

since sine is 2π-periodic and flips sign under a shift of π. This recovers the same conditions
of Theorem 5.1 as σ ↓ 0.
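The characterization above — fix arctangent branches, solve the linear system (5–7) for (x, y), then test Im ψ — is straightforward to evaluate numerically. A sketch with m₁ = (−1, 0), m₂ = (1, 0) fixed as in the text; the branch offsets k₁, k₂ and function names are illustrative assumptions:

```python
import math

def predicted_point(nu1, nu2, sigma, lam, k1=0, k2=0):
    """Solve the linear system (5-7) for (x, y); k1, k2 pick the
    arctangent branches (offsets of k*pi)."""
    # nu2 . nu1_perp = nu1_perp . nu2, with nu1_perp = (-nu1[1], nu1[0]).
    cross = nu1[0] * nu2[1] - nu1[1] * nu2[0]
    a = lam * (math.atan((sigma ** 2 / (2 * lam)) * cross / nu2[1]) + k1 * math.pi)
    b = lam * (math.atan((sigma ** 2 / (2 * lam)) * cross / nu1[1]) + k2 * math.pi)
    # Cramer's rule on:
    #   nu1[0]*(x + 1) + nu1[1]*y = a
    #   nu2[0]*(x - 1) + nu2[1]*y = b
    x = (nu2[1] * a - nu1[1] * b - nu1[0] * nu2[1] - nu1[1] * nu2[0]) / cross
    y = (a - nu1[0] * (x + 1.0)) / nu1[1]
    return x, y
```

Sweeping ϕ₂ (hence nu2) and testing Im ψ at the returned point then locates the nearest disconnecting angle, as described in the summary above.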
5.2 Going Beyond Two Atoms with Imψ
In this Section, the subset of the class of curves for which the representation is
complete is outlined. If we relax the representation to have arbitrary R-valued coefficients
(instead of unit-valued coefficients) in front of each wave component then the closure of
the class of curves for which the representation is complete is the set of all curves arising
as the level-sets of phases of functions in the Modulation Space of functions with bounded
Gabor expansions.
However, this result alone is not fully satisfying for several reasons:
1. The unit-valued coefficient can be interpreted better than R-valued coefficients,
2. The type of convergence may not be amenable to engineering techniques or applications,
3. We may need to use an arbitrarily small value of σ,
4. There may not be a straightforward approach to constructing curves in the closure of a family of implicit functions.
We provide a more detailed analysis of this general form of completeness in the following
Section, where we analyze the contributions to the Gabor expansion of an auxiliary field
to the signed distance function. This field represents an element of the equivalence class
of curves embedded in M that corresponds to some fixed curve. It is privileged in some
sense: it has large values of Gabor expansion at points drawn from the boundary with
frequency vectors drawn from the normal bundle at that point.
The class of curves we focus on in this Section arises in a natural way from the proof
of the connectedness in the two-atom case above. Fix a length L > 0. Consider the
following constraints on polygons P = {(n_i, v_i, e_i)}_{i=1}^{N} (where the v_i, e_i represent the
vertices and edges of the polygon and the n_i are nodes representing the midpoints of the e_i):

(D1) The length of each edge of P is an integer multiple of L,
(D2) The set of nodes N defined by midpoints along piece-wise linear components of
length L can be numbered in order (along the boundary of P) as 1, 2, . . . , n, and each
consecutive pair meets at some vertex v_i of P at distance L/2 from both node n_i and node n_{i+1},
(D3) Let (n_i, n_{i+1}) be the nodes adjacent to vertex v_i. No other node is as near to v_i. No
other node is nearer to n_i or n_{i+1} besides n_{i−1} or n_{i+2} respectively.
We will show that the set of polygons P is approximable in the directed Hausdorff sense
by the zero level set of Im ψ_P, where the atoms are
P_i = {(n_i, (v_i − n_i)^⊥/‖v_i − n_i‖)}_{i=1}^{N−1} ∪ {(n_{i+1}, (n_{i+1} − v_i)^⊥/‖n_{i+1} − v_i‖)}. Given a
metric space (M, d), the direct d-Hausdorff distance (or just directed Hausdorff distance)
is a distance between two sets X, Y ⊂ M. It is given by the following sup/inf formula:

d_H(X, Y) = sup_{x∈X} inf_{y∈Y} d(x, y).

We consider the curves in this section as embedded in (R², ‖·‖_∞), and so the supremum
norm induces the inner distance function in our definition of the directed Hausdorff
distance.
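Since the directed Hausdorff distance is used repeatedly below, a concrete sketch for finite point samples of the two sets may help (names illustrative):

```python
def sup_norm(p, q):
    """l-infinity distance on R^2."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def directed_hausdorff(X, Y, d=sup_norm):
    """sup over x in X of inf over y in Y of d(x, y); note the
    asymmetry: directed_hausdorff(X, Y) != directed_hausdorff(Y, X)."""
    return max(min(d(x, y) for y in Y) for x in X)
```

For continuous curves, X and Y would be dense samplings of the level-set and the polygon respectively.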
The proof of this is sketched as follows:
1. The 2-atom case is shown above, corresponding to a pair of nodes and a vertex at
the intersection point of their normal lines.
2. Each successive pair of atoms in P obeys the conditions for the 2-atom case,
provided P obeys Conditions (D1), (D2) and (D3), since the locations are the
midpoints of straight line segments and the normal vectors are the corresponding
perpendiculars.
3. We show that given a 2-atom configuration obeying the conditions for the 2-atom
case, adding a perturbation of sufficiently small magnitude and gradient magnitude
results in a zero level-set close to the original in the H∞ metric.
4. We show that for a collection of atoms drawn from the polygon described above, for
each pair of consecutive atoms representing two nodes joined at vertex i there is a
function η_i that is an upper bound for the magnitude of the remaining atoms.
5. Therefore, there exists a σ sufficiently small (since it controls the magnitude of
ψ_{P∖P_i}) so that the contribution from all oriented points beyond a fixed 2-atom case is
small.
6. Since there are a finite number of edges to consider in the polygon, and each one
has a σ such that the remaining part of Im ψ, namely Im ψ_{P∖P_i}, causes a small
perturbation near it, taking the minimal σ over all successive pairs bounds the overall
directed Hausdorff distance between the new curve(s) and the polygon. Since Im ψ is
smooth, the curves transition smoothly between pairs, and the entire polygon is
approximated by this (these) curve(s).
Note that we do not prove the connectedness of the resulting curve. We also give anecdotal
verification of this class with examples, and provide examples of degenerate configurations.
5.2.1 Stability of Level-sets of Imψ
In this subsection it is shown that under the symmetric case and appropriate
additional conditions ψ admits a family of small perturbations that do not excessively
shift the zero crossings of Imψ. The main condition used here is Condition (D3). We will
show that for δ > 0 the directed Hausdorff distance from the polygon to the level-set is
less than δ.
Take (mi, νi) and (mi+1, νi+1) as the pair of atoms under consideration, call them
Q collectively. As outlined above, we will consider each consecutive pair of atoms as if
they were in symmetric configuration. This can be done by a similarity transformation
of the atoms. The resulting function will have scaled values of σ, λ corresponding to the
similarity. The similarity has scale factor between 2/L and 4/L, since a pair of atoms is between L/2 and L apart and we will map them to 2 apart.
With the appropriate similarity transformation applied, we first make an observation
about the polygonal arc joining mi, vi,mi+1. Due to Condition (D3) no nodes are as near
to vi as mi,mi+1. These points have distance L/2 to vi since they are midpoints on a
linear segment ending at vi. Since Voronoi cells are convex, it follows that the whole curve
mi → vi → mi+1 falls in the union of the Voronoi cells of mi and mi+1. Call this curve Vi.
So there is some ϵ > 0 such that
L/2 + ϵ < min_{j ∈ {1,...,i−1, i+2,...,N}} d(m_j, V_i).
By the triangle inequality, there are no other nodes within ϵ of mi,mi+1. So at each
point x within ϵ of Vi one of mi or mi+1 is nearer to x than any other point. Recall that
the sinusoidal part of the imaginary part of the symmetric configuration has constant
[Figure 5-4 plot: Imψ = 0 (γ_i); the upper and lower boundaries of γ_i^T; the atoms; the nearest other atom, at distance d_min > L/2 + ϵ; the Voronoi boundary between m_i and m_{i+1}; and V_i.]

Figure 5-4. An explanatory figure to accompany the proof of approximation for the multi-atom case. By taking T = (2λ/ν2) sin⁻¹(κ) for κ ≤ sin(ϵν2/(2λ)), it follows that T < ϵ. This is an important step in choosing σ, λ resulting in stable level-sets.
phase at all points: ν2/λ in the direction of the Voronoi boundary (recall ν2_i = ν2_{i+1} in this configuration). We will show that there is a value of σ such that after the perturbation by Imψ_{P\P_i} the zero level set remains within distance δ of V_i.
For some value of the sinusoidal part, κ, there is a fixed T such that for every point z along γ_i (the zero level-set of the resulting Imψ for these atoms) the value at z ± T(0, 1) is ±κ, provided that κ ∈ [0, 1]. The set γ_i^T = {(x, y + z) | (x, y) ∈ γ_i, z ∈ [−T, T]} falls within δ′ = min(δ, ϵ) of V_i if we take

κ = sin(ν2 T/(2λ)) and λδ′ > σ²/2:

we then get that sup_{x∈[−1,0]} sin(θ_2(x))/(exp(−2x/σ²) + cos(θ_2(x))) < δ′/2. Since T is at most δ′/2, and going from V_i to γ_i by the overestimate of the arctangent yields a bound of δ′/2 on the distance of points in γ_i^T to V_i, the total distance is at most δ′. The denominator of the overestimate of the arctan does not get too small because of the bound on λ (see below). This provides a T such that γ_i^T falls within a dilation of V_i by δ′. We will bound the values of Imψ_{P_i} from below and bound the values of Imψ_{P\P_i} from above in this region.
Imψ is increasing through the zero level set by definition, so if necessary we can find a sufficiently small value of κ such that Imψ is increasing in each vertical slice of γ_i^T. We need to find a lower bound for Imψ along the boundary. It is helpful to rewrite Imψ. We can write Imψ as

Imψ(x, y) = F(x, y) sin(ν2 y/λ + ϕ(x, y)),

where F is defined as

F(x, y) = ( exp(−((x+1)² + y²)/σ²) + exp(−((x−1)² + y²)/σ²) + 2 exp(−((x+1)² + y²)/(2σ²)) exp(−((x−1)² + y²)/(2σ²)) cos(2ν11 x/λ) )^{1/2}

and ϕ(x, y) is defined as ϕ in Equation (5–3), extended to be even in x, and also factors in the shift associated with the opposing θ_i (depending on which half-space you are in). To derive F, use the angle-sum formula cos(α) cos(β) + sin(α) sin(β) = cos(α − β). If the cosine term in F is positive then F is bounded below by

max{ exp(−((x−1)² + y²)/(2σ²)), exp(−((x+1)² + y²)/(2σ²)) },

obtained by dropping all but the maximal-value term and taking the square root directly.
Suppose that the cosine term in F is negative. Then 2|ν11 x| > λπ/2, as the cosine is nonnegative away from this set. So x² > (λπ/(4ν11))². Suppose without loss of generality that x > 0, so that x > λπ/(4ν11). Thus, −(x+1)² + (x−1)² < −λπ/ν11. F can be written as

F(x, y) = exp(−((x−1)² + y²)/(2σ²)) · ( 1 + exp(−((x+1)² − (x−1)²)/σ²) + 2 cos(2ν11 x/λ) exp(−((x+1)² − (x−1)²)/(2σ²)) )^{1/2}.
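The factorization Imψ = F sin(·) can be spot-checked numerically. The sketch below (our own verification script, with parameter values chosen arbitrarily) places two atoms at (±1, 0) with mirrored first frequency components, as in the symmetric configuration assumed here, and confirms that |ψ|² matches the F² produced by the angle-sum expansion, whose cross term carries cos(2ν11 x/λ):

```python
import numpy as np

# symmetric two-atom configuration: atoms at (-1, 0) and (1, 0) with
# mirrored frequency vectors (nu11, nu2) and (-nu11, nu2)
sigma, lam, nu11, nu2 = 0.5, 0.3, 0.6, 0.8
m1, m2 = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
n1, n2 = np.array([nu11, nu2]), np.array([-nu11, nu2])

def atom(x, m, n):
    # Gabor atom exp(-|x - m|^2/(2 sigma^2) + i n^T (x - m)/lam)
    return np.exp(-np.sum((x - m)**2, axis=-1) / (2 * sigma**2)
                  + 1j * (x - m) @ n / lam)

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=(1000, 2))
psi = atom(x, m1, n1) + atom(x, m2, n2)

# F^2 from the angle-sum identity: the two atoms' phases differ by 2*nu11*x/lam
r1 = np.sum((x - m1)**2, axis=-1)
r2 = np.sum((x - m2)**2, axis=-1)
F2 = (np.exp(-r1 / sigma**2) + np.exp(-r2 / sigma**2)
      + 2 * np.exp(-r1 / (2 * sigma**2)) * np.exp(-r2 / (2 * sigma**2))
        * np.cos(2 * nu11 * x[:, 0] / lam))
assert np.allclose(np.abs(psi)**2, F2)
```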
Suppose we take λ ≥ 2σ²/π. Then, recalling ν11 ≠ 0,

log(3/8) ≥ −1 ≥ −λπ/(2σ²ν11), and so

3/8 ≥ exp(−((x+1)² − (x−1)²)/(2σ²)), and

1/4 ≤ 1 − 2 exp(−((x+1)² − (x−1)²)/(2σ²)).

Thus the lower bound for Imψ along the boundary is

J_i(σ) = min_{x ∈ ∂γ_i^T} Imψ ≥ (κ/2) exp(−(L/2 + T)²/(2σ²)).
Now we will construct a bound for the contributions from the remaining portions of the polygon P \ P_i. We will refer to this function as η_i(x; σ); it will be strictly bounded above by J_i(σ). Perturbing Imψ_{i,i+1} by η_i then does not cause points along the boundary to change sign, and therefore a zero level-set of the resulting perturbation still falls within γ_i^T. This consists of two steps:

1. Bound |Imψ_{P\P_i}| from above by η_i(x; σ),

2. Find a σ such that J_i > max_{x ∈ γ_i^T} η_i(x; σ).
First, we want to build a bound for |Imψ| from the atoms besides (i, i + 1). The following inequalities provide a useful bound:

|Imψ| = |Σ_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²)) sin(ν_j(x − m_j)/λ)|
≤ max_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²)) Σ_{j ≠ i,i+1} |sin(ν_j(x − m_j)/λ)|
≤ (N − 2) max_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²))
≤ (N − 2) exp(−(L/2 + ϵ)²/(2σ²)).

We let

η_i(σ) = (N − 2) exp(−(L/2 + ϵ)²/(2σ²)).
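This chain of bounds is easy to sanity-check numerically; the sketch below (our own spot-check, with arbitrary parameter values) places N − 2 atoms at distances greater than L/2 + ϵ from a test point and confirms |Imψ_{P\P_i}| ≤ η_i(σ):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, lam, L, eps = 0.3, 0.2, 1.0, 0.1
N = 12  # total atoms; N - 2 remain after removing the pair (i, i+1)

# place the N - 2 remaining atoms at distance > L/2 + eps from x = 0
angles = rng.uniform(0, 2 * np.pi, N - 2)
radii = L / 2 + eps + rng.uniform(0, 1, N - 2)
m = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
nu = rng.normal(size=(N - 2, 2))
nu /= np.linalg.norm(nu, axis=1, keepdims=True)  # unit frequency vectors

x = np.zeros(2)
d = np.linalg.norm(x - m, axis=1)
im_psi = np.sum(np.exp(-d**2 / (2 * sigma**2))
                * np.sin(((x - m) * nu).sum(axis=1) / lam))
eta = (N - 2) * np.exp(-(L / 2 + eps)**2 / (2 * sigma**2))
assert abs(im_psi) <= eta  # each term is at most exp(-(L/2+eps)^2/(2 sigma^2))
```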
Now we will bound η_i(x; σ) < J_i on γ_i^T. Suppose that (L/2 + ϵ)² − (L/2 + T)² > 2σ² log(2(N − 2)/κ). Then it follows that

(κ/2) exp(((L/2 + ϵ)² − (L/2 + T)²)/(2σ²)) ≥ N − 2 and J_i(σ) ≥ η_i,

and so J_i > η_i.
By the construction of γ_i^T, it has upper and lower boundaries at γ_i ± T(0, 1), as shown in Figure 5-4. The effect of this result is to show that the values of Imψ along the upper and lower boundaries of γ_i^T do not go below or above zero, respectively (if ν2 > 0; above or below, respectively, if ν2 < 0), when additional points obeying Condition (D3) are added. Therefore, there is some value of σ such that the remaining portion of the imaginary part of ψ does not push the zero level-set corresponding to i, i + 1 outside or to the boundary of γ_i^T.
In review, the following inequalities were required to be satisfied:

ν1 σ²/2 < δ′λ,
κ = sin(ν2 T/(2λ)),
2σ²/π < λ,
ϵ > (4σ²/L) log(2(N − 2)/κ).

These can all be satisfied by taking σ small enough. Let the zero level-set for the whole of ψ be denoted by γ. By the argument above, for every x ∈ V_i there is a y ∈ γ_i (for ψ as a whole) within distance ϵ. Thus

sup_{x ∈ V_i} inf_{y ∈ γ_i} d∞(x, y) ≤ ϵ.

By taking δ′ to be less than the minimal ϵ_j over all j, we can ensure that the inequalities are satisfied for all γ_j, V_j simultaneously. Since the directed Hausdorff distance obeys
d_H(A ∪ B, C ∪ D) ≤ max{d_H(A, C), d_H(B, D)}, we get

d_H(P, γ) ≤ max_{j ∈ {1,2,...,N}} d_H(V_j, γ_j) ≤ δ.

Since δ was arbitrary, we conclude that the zero level-set of the imaginary part of ψ can be made arbitrarily close to the polygon in the H∞ metric.
5.2.2 The Class of Curves Approximated By F

Lemma 3. Let γ : [0, 1] → R² be a C∞ Jordan curve in the plane and ϵ > 0. Then there exists a polygon γ̂ with finitely many sides of rational length, such that

d_H^∞(γ, γ̂) < ϵ.

Claim 2 Polygons obeying Conditions (D1), (D2), (D3) for any L contain all Jordan polygons of rational side lengths.
Proof. Let Q = {(v_i, e_i, n_i)}_{i=1}^q be an arbitrary Jordan polygon of rational side lengths. Let a_i/b_i be the side lengths of the edges e_i. Define L = sup_{p : b_i | p ∀i} 1/p, i.e., L = 1/p for p = lcm(b_1, ..., b_q). Note that we can re-write each of the sides of Q as a collection of sides with step L = 1/p, creating new vertices and nodes at each step, and the new polygon Q′ is equivalent to Q under any standard metric. So now Q′ satisfies Conditions (D1), (D2).

Finally, we need to ensure that Condition (D3) is obeyed. To enforce this condition, compute for each vertex v_i the minimal distance to nodes outside of its neighbors, d_i = min_{j ∉ {i−1,i+1}} d(v_i, n_j). If d_i ≤ L/2 for any i, then repeatedly subdivide the intervals k times until L/2^{k+1} < d_i; call this new length L̂ and call this new polygon Q̂. During this process, if n_p is the offending node and d(e_p, v_i) = d_p, then k = ⌈log₂(L/d_p)⌉ is enough iterations to terminate. Now the nearest that any node outside of its neighbors comes to v_i is strictly greater than L̂, and so Condition (D3) is obtained. Q̂ does not have any vertices or nodes that are not on Q, and so they are equal as curves.
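The resampling step of the proof is easy to make concrete. A minimal sketch (our own helper names; it assumes side lengths are given as exact rationals) computes the common step L = 1/lcm(b_i) and subdivides an edge into pieces of that length:

```python
from fractions import Fraction
from math import hypot, lcm

def common_step(side_lengths):
    """Common step L = 1/lcm(b_i) that divides every rational side a_i/b_i."""
    return Fraction(1, lcm(*[Fraction(s).denominator for s in side_lengths]))

def subdivide(p, q, step):
    """New nodes splitting the segment p -> q into pieces of length `step`
    (assumes `step` divides the segment length exactly)."""
    n = round(hypot(q[0] - p[0], q[1] - p[1]) / float(step))
    return [(p[0] + (q[0] - p[0]) * t / n, p[1] + (q[1] - p[1]) * t / n)
            for t in range(n)]

# sides of lengths 1/2 and 3/4 share the common step L = 1/4
L = common_step([Fraction(1, 2), Fraction(3, 4)])
print(L)  # 1/4
nodes = subdivide((0.0, 0.0), (0.5, 0.0), L)
print(nodes)  # [(0.0, 0.0), (0.25, 0.0)]
```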
5.3 Asymptotic Approximation of Modular Distance Fields by Gabor Atoms
In this section we provide a proof of the claims in Section 2.3. That is, we show that the point-wise behavior of the Gabor transform of the modular distance field decays as the parameters σ, λ ↓ 0. Furthermore, if σ² = O(λ) then we provide a sketch that suggests that |G_{gσ}Ψ^{σ,λ}(m, ν/λ)| = O(1).
Claim 3 Suppose that S is a nonempty, bounded open set in R^d such that ∂S is a smooth, connected, orientable manifold. Let (m, ν) ∈ R^d × S^{d−1} be an oriented point. Let

J(σ, λ) = |[G_{gσ}(Ψ_S^{σ,λ}/||Ψ_S^{σ,λ}||)](m, ν/λ)| = |⟨ exp(−||x − m||²/(2σ²) + iνᵀ(x − m)/λ)/(2πσ²)^{d/2}, Ψ_S^{σ,λ}/||Ψ_S^{σ,λ}|| ⟩|.

If (m, ν) is not drawn from the boundary of S with ν = n_{∂S}, then |J(σ, λ)| → 0 super-polynomially as σ, λ → 0 with λ = poly(σ).
Remark 1 Although this example does not quite fit into the statement of the Claim, an analytical example of this type of decay is the integral in Equation (3–8). Clearly, when λ ≥ O(σ²) this integral decays rapidly away from m = q, ν = ω.
Proof. The proof works in two cases. Let (m, ν) be an oriented point. First, suppose m ∉ ∂S. The resulting integral decays exponentially quickly (regardless of λ/σ) by the following bounds:

|J(σ, λ)| = |∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²) + i(νᵀ(x − m) − b_S(x))/λ)/(||Ψ||(2πσ²)^{d/2}) dx|

≤ ∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²))/(||Ψ||(2πσ²)^{d/2}) dx ≤ O(exp(−1/σ²)/σ^{2d}). (5–9)
The first bound follows from taking absolute values inside the integral. The second bound arises from the bound on ||Ψ|| from the text (both Equation (2–7) and Equation (2–8)) and the following bound:

∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²)) dx ≤ ∫_{R^d \ B_δ(m)} exp(−||x − m||²/(2σ²)) dx + ∫_{R^d} exp(−(||x − m||² + c)/(2σ²)) dx, (5–10)
where c is a constant. The first term decays exponentially with σ by the rate of decay of the erfc function (taking δ/σ to be sufficiently large). For the second term, we can bound the contribution from exp(−b_S(x)²/(2σ²)) by the maximum value over the boundary of the ball B_δ(m), which is of order exp(−c/(2σ²)). The resulting integrals produce the normalization coefficient which cancels the coefficient above, leaving the exp(−c/(2σ²)) term, which clearly decays exponentially quickly as σ ↓ 0.
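The exponential collapse for m ∉ ∂S can be illustrated with a toy 1-D computation (our own setup, not from the text): take S = (−1, 1) so that the signed distance is b_S(x) = |x| − 1, form Ψ = exp(−b_S²/(2σ²) + i b_S/λ) as suggested by the integrand of Equation (5–9), and evaluate the normalized Gabor coefficient at the interior point m = 0 for shrinking σ with λ = σ²:

```python
import numpy as np

def gabor_coeff(sigma, lam, m=0.0, nu=1.0):
    # 1-D toy field: S = (-1, 1), signed distance b(x) = |x| - 1
    x = np.linspace(-3.0, 3.0, 120001)
    dx = x[1] - x[0]
    b = np.abs(x) - 1.0
    psi = np.exp(-b**2 / (2 * sigma**2) + 1j * b / lam)
    psi = psi / np.sqrt((np.abs(psi)**2).sum() * dx)   # L2-normalize
    # Gabor window with the normalization used in the Claim (d = 1)
    g = (np.exp(-(x - m)**2 / (2 * sigma**2) + 1j * nu * (x - m) / lam)
         / np.sqrt(2 * np.pi * sigma**2))
    return abs((np.conj(g) * psi).sum() * dx)

# the coefficient at the interior point m = 0 collapses as sigma -> 0
vals = [gabor_coeff(s, s**2) for s in (0.4, 0.2, 0.1)]
assert vals[0] > vals[1] > vals[2]
```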
When m ∈ ∂S the previous bounds are insufficient to make the integral vanish, as the second term in Equation (5–10) remains O(poly(σ)). To employ stationary phase [106], we first need to cut down on the domain of integration by choosing a sufficiently small σ so that the integral is approximated reasonably well by restriction to a compact subset E that depends on all of S. First, let H be the largest open ball about m that contains no elements of the singular set, whereby b_S is C∞ on H [25]. Since ∂S is C∞, H is nonempty, and diam(H) < diam(S). Note that this implies that H ∩ ∂S consists of a single connected component: otherwise a midpoint between components would be contained in H, and thus a skeletal point. Note that H may contain points x such that ∇b_S(x) = ν. The last restriction we require removes this case, ensuring non-stationarity of the phase: since ∇b_S|_m ≠ ν, consider the preimage under ∇b_S of an open neighborhood around ∇b_S|_m whose closure does not contain ν; call it G. If G contains multiple connected components, pare down to the one containing m. Then let F be G intersected with H. F is open, so close it: let E = F̄. As it is clearly bounded, E is compact. Finally, we replace the Gaussian factor by the Gaussian times a bump function that takes the value 1 on V ⊂ E. When d is small (finite) we can choose a radius for V that is sufficiently snug to E so that the product is an arbitrarily close estimate to the Gaussian that vanishes smoothly on the boundary of E. Note that the error between the mollified integral and the actual integral is bounded by the mass of the Gaussian in a thin shell, which decays like exp(−δ²/σ²)(δ/σ)^d |δ − η|, where η is the radius of V and δ = rad(E). Depending on the diameter
of E we must choose the appropriate σ, and we assume that O(λ/poly(σ)) = 1. Then the following lemma from [106] applies.

Lemma 4. If ∇f ≠ 0 in the compact domain D, and if g vanishes C∞-smoothly on ∂D, then

I(λ) = ∫_D g(x) exp(if(x)/λ) dx = O(λ^N) as λ → 0, for all N ≥ 1. (5–11)

So the integral J(σ, λ) = O(λ^N) for all N. The integral can be bounded by the appropriate Gaussian integral, which contributes at most a polynomial factor in σ^{−1}. Hence the integral decays super-polynomially.
Contributions for the Remaining Portion of Phase-Space.

While it is not proved here, based on the stationary phase approximation it seems that the contributions from terms arising from the boundary with frequency vectors pointing in the normal direction remain O(1) as the ratio O(λ/σ²) = 1 is maintained. Note that the following argument is not rigorous, but provides a sketch of the behavior of points on the boundary with appropriately facing normal vectors. This is the appropriate ratio for producing non-vanishing stationary-phase contributions at points in neighborhoods of zero curvature. In 2-D the ratio is O(λ/σ³) = 1 for non-saddle-type points with nonzero curvature. In 3-D the ratio depends on the point type: the hypersurface of stationary points is 1- or 2-dimensional, and so the approximation of the Gaussian integral has a different contribution. Carrying this analysis through is more difficult, since rather than upper bounds we need to compute lower bounds for the stationary phase coefficients. For a signed distance function (even with the caveats we have added in our Claim above) degenerate stationary points and surfaces of stationary points can both occur; therefore, computing the integrals is a very delicate task. We simply comment that, under the appropriate considerations for using the stationary phase
approximation, the expansion looks like

J(σ, λ) ≈ ∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²) + i(νᵀ(x − m) − b_S(x))/λ)/(||Ψ||(2πσ²)^{d/2}) dx

≈ ((2πλ)^{d/2}/(2πσ²)^d) Σ_{z ∈ Z} exp(−(||z − m||² + b_S(z)²)/(2σ²) + (i/λ)g(z)) / √|det(Hb_S(z))|

+ ((2πλ)^{d/2}/(2πσ²)^d) Σ_{k=1}^K ∫_{M_k} exp(−(||γ − m||² + b_S(γ)²)/(2σ²)) / √tr(Hb_S(γ)) dσ_{M_k}(γ)

+ O((2πλ)^{(d+1)/2}/(2πσ²)^d), (5–12)

where Z = {z : ∇b_S(z) = ν}, and the exponentially decaying terms must fall on a point z or across a curve (or hypersurface) M_k such that the product of the Gaussian decay factors hits 1. The hypersurface M_k = {x ∈ R^d : ∇b_S(x) = ν}. Note that the curve or hypersurface will be of diameter O(1) relative to the shrinking σ, and so the contribution from the numerator of the integral in the second term of the expansion of Equation (5–12) will be O(1). Since b_S is a signed distance, it also follows that the hypersurface will be hyperplanar within a sufficiently small radius around the point m. Furthermore, under certain conditions we expect √tr(Hb_S(γ)) = √κ(P(γ)), where κ(P(γ)) is the mean curvature at the projection of γ to the near point on the curve ∂S. Note that the non-degeneracy of the denominators of the terms corresponds to a meaningful geometric condition for the surface, which we do not delve into here. We have avoided claiming this as a proof because the management of the degeneracy condition for a signed distance must be done very carefully to ensure correctness of the stationary phase. Integrals of this sort remain an active area of applied mathematical research [4].
CHAPTER 6
FURTHER EXPLORATIONS AND APPLICATIONS
In this Chapter we explore how the representation can be used for shape statistics.
The approximate linearity should allow approximate shape averaging, principal component
analysis, and more applications that can aid in shape modeling and analysis pipelines.
We also explore applications to graphics and image processing. In graphics, rendering 3-D models from a sparse set of observations is extremely useful. To this end we provide a fast and accurate algorithm for surface reconstruction from partial observation. We also explore the possibility of using Gabor approximation of an appropriately transformed image for shape extraction from images.
6.1 Curve Extraction from ψ
Since the modular distance function is phase-wrapped, the zero level-set consists of
an infinite collection of connected components: the pattern repeats at coarser and coarser
scales. Therefore, extracting a single zero level-set entails choosing among these scales the one which best represents the shape. Through the analogy with the CWR, we want the level-set that best represents the oriented point-set. Only once we establish a clear mechanism for choosing the best level-set can we say that the CWR bridges a gap between mathematical shape representations and perceptual grouping.

In the setting of the MDF, the correct zero level-set is obvious: the magnitude of the correct zero level-set will be the highest of all of the magnitudes. Since the argument to the Gaussian is the same as the argument to the complex factor, the zero of the complex wave that indexes into the true zero will have the least argument to the Gaussian. Thus, unwrapping the modular distance function is as easy as ordering the zero level-sets and adding an appropriate offset of 2πk for the kth largest value of the magnitude.
In the setting of discrete samples from the normal bundle, the CWR does not
maintain the correspondence between the magnitude and the phase (plus some wrapping
offset). However, provided that the samples are dense enough we can model the discrepancy
of the magnitude of the MDF compared with the CWR by a small perturbation. Then,
we can apply an appropriate estimator across pixels or points that index into the same
level-set of the phase to come up with an approximate value of the magnitude associated
with a given level-set.
We associate each level-set ℓi of the phase of the CWR with some level-set γi of the
MDF as follows. First we extract level-sets of the phase of the CWR and label them. If
we assume Gaussian error of the CWR (relative to the MDF) then the MSE estimate for
the magnitude-squared associated with a given level-set ℓi will be the expected value of
the magnitude squared of the CWR over the level-set. This is a line- or surface-integral
depending on the number of dimensions involved. Note that this procedure can be carried
out in a continuous fashion whenever ℓi is given by a spline-curve and in a discrete fashion
by performing marching squares or cubes. The MSE estimate of the magnitude for ℓi, ρi,
provides a statistic that is extremely useful for ordering the connected curves in the zero
level-sets of the phase of the CWR. We expect the kth largest value of the estimated MDF
magnitudes to correspond to the kth wrapped repetition of the true level-set.
Note that even in the case of the MDF it is possible that the kth level-set has a larger
integral value due to the fact that the length may be larger. Thus, a bound on the length
of the kth level-set is necessary to derive a value of σ that yields a fast enough decay so that the integral estimate provides the correct ranking. The key observation for making this work is to note that the "offset curve (surface)" at distance k to the 0-level-set will have an arc-length (surface area) that is greater than the arc-length (surface area) of the k-level-set. This is because the level-sets are contained in the offset curves (surfaces) [25]. So a bound on the length of the offset curve provides a bound on the length of the corresponding level-set. Let γ be the 0-level-set of the single connected component of the shape. The k-offset curve γ̃ to γ is defined as

γ̃(t) = γ(t) + kN(t),
where N(t) is the normal vector to γ at t. The arc-length element at t is the norm of the t-derivative of the curve, and

∂_t γ̃(t) = T(t) + kκ(t)T(t)

by the definition of curvature. Thus

||∂_t γ̃(t)||² ≤ (1 + kκ(t))² ||∂_t γ(t)||²

by the triangle inequality. This pointwise bound provides the following global bound on the arc-length:

∫_0^1 ||∂_t γ̃(t)|| dt ≤ (1 + kκ_max) ∫_0^1 ||∂_t γ(t)|| dt.

So to ensure that the wrapped level-sets have appropriately ordered integral magnitudes, estimate the magnitudes by the Gaussian evaluated at the distance λ and bound the integral:

∫_0^1 |ψ(γ̃(t))|² dt ≤ L(γ̃) |ψ(γ̃(0))|² ≤ (1 + λκ_max) L_0 |ψ(γ̃(0))|².

Using the previous bound, we get the following bound for admissible values of σ, λ for unwrapping:

λ ≥ σ√(2 log((1 + λκ_max) L_0)). (6–1)
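Since λ appears on both sides of Equation (6–1), the bound is implicit; in practice it can be solved by a simple fixed-point iteration. A minimal sketch (function and argument names are ours; it assumes (1 + λκ_max)L_0 > 1 so the logarithm is positive):

```python
import math

def admissible_lambda(sigma, kappa_max, L0, iters=50):
    # iterate lam <- sigma * sqrt(2 * log((1 + lam * kappa_max) * L0));
    # a fixed point satisfies Equation (6-1) with equality
    lam = sigma  # any positive start
    for _ in range(iters):
        lam = sigma * math.sqrt(2 * math.log((1 + lam * kappa_max) * L0))
    return lam

lam = admissible_lambda(sigma=0.05, kappa_max=2.0, L0=10.0)
# the returned lambda satisfies the unwrapping inequality (6-1)
assert lam >= 0.05 * math.sqrt(2 * math.log((1 + lam * 2.0) * 10.0)) - 1e-9
```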
6.1.1 Mean Shortest-Path Error Evaluation on 2-D Data: MPEG7 Dataset
In this subsection we evaluate the performance of the surface reconstruction using
a surface reconstruction measure based on matching a proposed surface to the true
underlying mesh. The surface is proposed by the algorithm outlined above. Then points
are taken at random from the initial mesh and a shortest path between the points is
Table 6-1. An algorithm for extracting the shape corresponding to a collection of oriented points.

Require: S = {(m_a, ν_a)}_{a=1}^N an oriented point-set; k a number of contours.
   function Reconstruct(S, k)
2:     σ, λ ← admissible values based on Equation (6–1)
       ψ ← Σ_a exp(−||x − m_a||²/(2σ²) + iν_aᵀ(x − m_a)/λ)   ▷ Build the CWR
4:     C ← {ℓ_a : ℓ_a is a connected component of θ_ψ = 0}   ▷ Get these with e.g. marching cubes/squares
       for i ← 1 to |C| do
6:         α_i ← ∫_{ℓ_i} |ψ(x)|² dσ(x)   ▷ Either pixel-wise estimate or by actually computing CWR on ℓ_i
       end for
8:     return {ℓ_{i_m}}_{m=1}^k such that α_{i_p} > α_{i_{p+1}}
   end function
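A numpy-only toy run of the ranking idea in Table 6-1 (our own script; the thesis implementation extracts contours with marching squares or cubes, while here we simply threshold grid pixels) builds the CWR for oriented points on the unit circle and checks that the high-magnitude, near-zero-phase pixels trace the circle rather than one of its wrapped copies:

```python
import numpy as np

# oriented points sampled from the unit circle, with outward normals
N, sigma, lam = 64, 0.1, 0.05
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
m = np.stack([np.cos(t), np.sin(t)], axis=1)
nu = m.copy()

# evaluate the CWR psi on a grid
xs = np.linspace(-1.6, 1.6, 161)
X, Y = np.meshgrid(xs, xs)
P = np.stack([X, Y], axis=-1)                 # (H, W, 2)
diff = P[:, :, None, :] - m                   # (H, W, N, 2)
r2 = (diff**2).sum(-1)
phase = (diff * nu).sum(-1) / lam
psi = (np.exp(-r2 / (2 * sigma**2)) * np.exp(1j * phase)).sum(-1)

# keep near-zero-phase pixels with large magnitude: the principal contour;
# wrapped copies at distance ~2*pi*lam have exponentially smaller |psi|
principal = (np.abs(np.angle(psi)) < 0.25) & (np.abs(psi) > 0.5 * np.abs(psi).max())
radii = np.sqrt(X[principal]**2 + Y[principal]**2)
assert radii.size > 0 and np.all(np.abs(radii - 1.0) < 0.1)
```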
Figure 6-1. Zero level-sets of subject 5 of the FAUST sequence. The range of the implicit function is [−.03, .17] × [.44, .79] × [−.16, .14] and σ = .0003, λ = .0175. This results in 3 wrapped zero level-sets appearing in the frame. They are ordered in increasing order by surface integral, with values ranging from 1 × 10⁻¹⁰ to 1 × 10⁻¹.
computed. Corresponding points on the proposed surface are found by minimizing distance
in the ambient space over points in the proposal, and the path along the proposed surface
is computed. The difference in the length of the paths is reported.
In this subsection we provide a suggestive empirical result on the validity of a reconstruction algorithm that leverages the rapid decay of the magnitude together with the limited growth of the length (or area, in the surface case) of the contours as the phase wraps. The curve reconstruction algorithm is shown in Table 6-1.
The validation methodology is as follows. The dataset used is the MPEG-7 dataset,
with shapes taken from the Bats, Birds, and Chickens subsets. Each consists of 20
silhouettes. A single contour was extracted from the images by choosing the boundary of
the indicator function that corresponds to the nominal curve. The resulting curves have
approximately 1000 points each and represent an outline of the pictured bird, chicken,
or bat. For each such image, 250 random pairs of points are chosen from the bounding
curve and the shortest distance between the two is computed. Then we subsample the curve at the appropriate sampling rate and extract 1 zero level-contour from the CWR. The corresponding points chosen in the previous step are then projected down to the contour and the shortest-distance path on the contour is computed. We report the average error between the two over 150,000 trials spanning 60 curves, sampled at 10 rates, with 250 trials at each rate.
6.1.2 Hausdorff Distance-based Evaluation of 3-D Data: Spheres, Bunny, FAUST Datasets
is generated, and so the arc-length measurement may not be as robust. Therefore, we
also compare the Hausdorff distances (or H-distances) [1, 32, 102] between the CWR
reconstruction and the Poisson surface reconstruction for a sphere, heads in the FAUST
dataset, and the Bunny dataset. We compare the reconstruction from the unwrapped
[Figure 6-2 plot: error on shortest path (% of diameter) versus sampling frequency, for the Birds, Bats, and Chickens subsets.]
Figure 6-2. Average error on shortest path between 250 pairs of points (randomly chosen) in the estimated mesh at 10 sampling rates. σ = 170, λ = 50, and the average diameter is 250. Note that both the bats and chickens have regions of very high curvature (e.g., in the wings of the bats or the feathers of the chickens).
CWR with Poisson surface reconstruction [53] on the basis of Hausdorff distance as a
percentage of diameter.
A standard measure for the performance of a spline is the “circle test”. This refers
to the principle that adding points on a circle to the data should lead to more accurate
estimates of the underlying circle. In other words, the resulting curve should have constant
curvature between points being connected. We verify this performance for the CWR and
compare with the behavior of Poisson surface reconstruction. Since the underlying surface
is the 2−sphere in this case, the shortest distance between a point on the estimated
surface and the underlying surface S2 will be the projection of the estimated point to
the unit norm vector lying in the same direction. So we report the Hausdorff distance
computed analytically in this case. We note that since Poisson surface reconstruction is designed to handle outliers and low sampling rates, it does not perform as well under this standard spline test, but achieves reasonably good reconstruction metrics throughout the lower sample regime.
Sampling Rate  CWR H-distance  Poisson H-distance
.9             0.01            0.081
.74            0.023           0.053
.57            0.031           0.040
.43            0.063           0.040
.26            0.145           0.044
Figure 6-3. Sphere reconstruction over different sampling rates. From top to bottom, the set of 926 oriented points is sampled at a rate of .9, .74, .57, .43, and .26 of the initial point-set. Note that Poisson surface reconstruction handles the formation of edges better in the low sampling-rate case, but tends to estimate a slightly inflated sphere. The table shows the Hausdorff distances as a fraction of radius averaged over 10 trials at each rate.
Sampling Rate  CWR H-distance  Poisson H-distance
.9             0.070           0.073
.74            0.070           0.074
.57            0.070           0.076
.43            0.070           0.077
.26            0.070           0.077
Figure 6-4. Face reconstruction over different sampling rates. From top to bottom, the set of 1470 oriented points is sampled at a rate of .9, .74, .57, .43, and .26 of the initial point-set. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate.
We also compared the performance of the CWR reconstruction to the Poisson
surface reconstruction on the FAUST and Bunny datasets. FAUST and the sphere
above are both relatively low sampling-rate datasets. The bunny is a high-rate dataset,
with 37K points. Since the FAUST dataset considered here is restricted to heads (for
rendering and precision purposes) the boundary conditions are important. We clip the
reconstructed surface just below the chin level for metric computation, since Poisson
surface reconstruction estimates a closed surface by design. We point out here that this
is a weakness of Poisson surface reconstruction, since there is no natural way to extract
multiple connected components from this method using a setting that enforces closed surfaces: there is no "privileged level-set".
We performed 2 sets of experiments with the bunny. First, we experiment with adding
noise to the original mesh points. We added pointwise Gaussian noise with standard
deviation ranging from 1% to 25% of the diameter of the mesh. Then the algorithms are
run on the resulting perturbed oriented point-clouds. The results are shown below, with
the average Hausdorff distance over 10 trials at each noise level. We find that Poisson
reconstruction is much more robust to noise, but at low and moderate levels of noise the
two perform comparably.
Next, we compare the reconstruction performance at different sampling rates, as
detailed above. We find that Poisson and CWR reconstruction perform comparably for
high to moderate sampling rates.
Noise  CWR H-distance  Poisson H-distance
.01    0.043           0.038
.09    0.048           0.039
.17    0.050           0.045
.25    0.282           0.046
Figure 6-5. Bunny (closed surface) reconstruction over different noise levels. From top to bottom, the set of 6200 oriented points has noise added at σ = .01, .09, .17, .25 times the diameter of the mesh. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate. Note the surface artifact on the last shape in the CWR column, which causes a high Hausdorff distance between the reconstruction and the original.
Sample rate  CWR H-distance  Poisson H-distance
.9           0.045           0.040
.57          0.045           0.043
.43          0.049           0.044
.26          0.052           0.045
Figure 6-6. Bunny (closed surface) reconstruction over different sampling rates. From top to bottom, the set of 31,335 oriented points is sampled at a rate of .9, .57, .43, and .26 of the initial point-set. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate.
6.2 ψ for kPCA on Curves
In kernel PCA (kPCA) [88], the goal is to build a linear basis of functions, B = {e_i}_{i=1}^{N−1}, out of the features observed during a training phase that minimizes the reconstruction error of the observations. The difference from PCA is that the basis may not be linear in the underlying space where the observations are points, but rather in the space corresponding to the feature functions. Often, evaluating inner products directly in high-dimensional feature spaces is difficult, due to the size of the dimensionality or numerical precision. In lieu of this brute-force approach to implementing PCA in feature space, kPCA proposes using a reproducing kernel instead of directly evaluating the inner products. While kPCA suffers from some computational setbacks (see below), it offers firm theoretical footing for performing nonlinear dimensionality reduction for moderately sized data sets.
The theory of Reproducing Kernel Hilbert Spaces provides sufficient conditions for posing the problem correctly, specifically Mercer's Theorem, which establishes conditions on the validity of the choice of kernel.
To perform kPCA, first the mean in the ambient (not feature) space is removed from the data set. Since the kernel performs a linear evaluation, this transfers to a linear centering in feature space. The new feature vectors are denoted ψ̃. Next, the Gram matrix K_ij = (1/n)⟨ψ̃(x; m_i, ν_i), ψ̃(x; m_j, ν_j)⟩ is formed. Forming K requires n² evaluations of the kernel. The final step in kernel PCA is eigendecomposition of K. The reason why this corresponds to forming a basis of maximum covariance is as follows. If we write the eigenvalue problem for the covariance matrix as λV = CV, where C is the feature covariance matrix, then note that V lies in the span of the features observed. Therefore, we may write V = Σ_{i=1}^n α_i ψ̃(x; m_i, ν_i). Lastly, we may consider the "weak" version of the equations (by applying the linear evaluation functional at (m_k, ν_k) for all k ∈ {1, ..., n}) and form a system of n equations that reads

K²α = nλKα. (6–2)
If α′ is an eigenvector of K then it also satisfies Equation (6–2).
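This relationship is easy to verify numerically when the features are available explicitly (a sanity-check sketch with random finite-dimensional features standing in for ψ̃; real kPCA would only touch K through kernel evaluations):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 8
Phi = rng.normal(size=(n, d))     # rows: explicit stand-ins for the features
Phi = Phi - Phi.mean(axis=0)      # centering
K = Phi @ Phi.T / n               # Gram matrix K_ij = <phi_i, phi_j> / n
C = Phi.T @ Phi / n               # feature covariance

mu, A = np.linalg.eigh(K)         # eigenpairs of K, ascending order
for m_val, alpha in zip(mu[-3:], A.T[-3:]):   # top three components
    # each eigenpair of K satisfies the weak equation K^2 a = n lam K a,
    # with covariance eigenvalue lam = mu / n
    assert np.allclose(K @ K @ alpha, n * (m_val / n) * K @ alpha)
    # and V = sum_i alpha_i phi_i is the matching covariance eigenvector
    V = Phi.T @ alpha
    assert np.allclose(C @ V, m_val * V)
```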
The result of diagonalizing K produces {α_i}_{i=1}^{N−1} containing the kernel principal components. Then the collection of α_i corresponding to nonzero λ_i are normalized and used as a basis for test patterns.
The first problem we propose to solve with kPCA is to estimate the underlying point
density model for a shape, through the magnitude squared of its expansion on a linear
space of basis functions provided as training data. As an added and surprising benefit, we
show that the expansion itself contains closed curves. The perceptual gains of the CWR
are conserved under kPCA encoding. Moreover, since we are working on $L^2$ rather than
$PL^2$, the CWR is uniquely suited to PCA-based compression, unlike probability
densities, which are positive and integrate to one. In summary, we can start with a set of
oriented point-sets, go to our feature basis and build a linear subspace, and accurately
approximate a closed curve and a probability density corresponding to an unseen, test,
oriented point-set in terms of a few basis coefficients. In Figure 6-7 the closed curve
estimated by the kernel is shown on the left of the figure. We construct this closed curve
as explained in Section 6.1. This is a novel aspect of the CWR directly leveraging the
properties of linearity and superposition to construct a basis representation. We show that
the absolute error of this representation serves as a discriminative measure for classifying
oriented point-sets. An additional novel aspect is a framework that also yields a generative
approximation corresponding to the classification—the wave function that emerges from
the approximation of the data in the kPCA basis.
We also provide anecdotal evidence of the performance of kPCA for reconstructing
closed curves on the Gatorbait dataset. We used 20 oriented point-sets as training data
for kPCA and held back 5 test samples. Each involved about 300 oriented points. The
(a) (b) (c) (d) [figure panels]

(e) Reconstruction error, Class 1:
    pattern 1: err1 = 1.7,  err2 = 2.4
    pattern 2: err1 = 0.73, err2 = 2.0
    pattern 3: err1 = 0.67, err2 = 2.9
    pattern 4: err1 = 0.42, err2 = 2.7
    pattern 5: err1 = 0.54, err2 = 2.7

(f) Reconstruction error, Class 2:
    pattern 1: err1 = 2.4, err2 = 0.72
    pattern 2: err1 = 2.2, err2 = 0.74
    pattern 3: err1 = 2.0, err2 = 1.55
    pattern 4: err1 = 2.1, err2 = 0.34
    pattern 5: err1 = 2.3, err2 = 1.59

Figure 6-7. (a) and (b) Closed curves from linear combinations of closed curves. Linear bases $B_1 = \{e_i^1\}_{i=1}^{25}$ and $B_2 = \{e_i^2\}_{i=1}^{25}$, each consisting of linear combinations of 25 patterns, were used to predict the features for 5 unseen patterns. The first 24 principal components were used as a basis for the point-sets shown. (c) and (d) Density estimation from a learned dictionary through PCA. We simultaneously estimated a point density and a closed curve from the span of 24 training patterns with coefficients derived from ≈ 100 elements of the oriented point-set. (e) and (f) Reconstruction error. The PCA bases perform well as a subspace classifier evaluated on the MPEG-7 dataset [96].
reconstruction of a training sample and several test samples shows recovery of closed
curves from mixtures of CWRs.
(a)
(b)
Figure 6-8. Recovery of closed curves in training and testing samples for the Gatorbait dataset. a) Recovery of a training sample from all of the principal components. The subframes show the progress of the reconstruction as more principal components are added. b) Recovery of a test sample.
6.3 Generalization of the CWR to Embedded Surfaces
Since the majority of the work in this Thesis is of an applied flavor, and since the
majority of the applications for shape registration and statistics occur in ambient space,
the generalizations in this section have been relegated to the end of this document.
These generalizations are important since they provide some key features that most
representations lack. Two key features, which we will discuss briefly below, are that one can
use these generalizations to generate manifolds with co-dimension greater than one, such
as a curve in 3-D, and to perform registration of rigid bodies by registering curves on the
bodies.
Recall the relationship between the Schrodinger equation and the signed distance
featured in Equation (2–2). Here we show how this equation generalizes to the sphere.
Recall that the sphere is a Riemannian manifold with metric

$$g_{S^2} = \begin{pmatrix} 1 & 0 \\ 0 & \sin(\theta)^2 \end{pmatrix},$$

so that arc length is given by

$$ds^2 = (d\theta, d\phi)^T g\,(d\theta, d\phi) = d\theta^2 + \sin(\theta)^2\, d\phi^2.$$
We will need some concepts from Riemannian geometry. A metric is a pointwise
smooth assignment of positive definite matrices to a manifold M , usually denoted g,
where smoothness is understood to be componentwise in the matrix. This metric matrix
at p acts on vectors in the tangent space TpM to allow length measurements in the
tangent space. We also need the definition of the gradient, connection, divergence, and
the Laplace-Beltrami operator. We view the gradient $\nabla$ as a linear map from $C^\infty(M) \to
\Gamma^\infty(M)$, where $\Gamma^\infty(M)$ denotes the set of smooth vector fields on $M$. In local coordinates,

$$\nabla f = g^{ik}\,\frac{\partial f}{\partial x^k}\,\frac{\partial}{\partial x^i},$$

where $g^{ik}$ is the $ik$-th entry of the inverse of $g$. The connection on $(M, g)$,
D is a bilinear map Γ(M) × Γ(M) → Γ(M) such that D is C∞ homogeneous in the first
125
entry and obeys a Leibniz property in the second entry. The divergence operator div acts
on a vector field V by inserting it into the second entry of D and evaluating the trace of
the remaining linear operator. Finally, the Laplace-Beltrami operator is given by
$$\Delta f = -\operatorname{div}\nabla f = -\operatorname{tr} D\nabla f.$$
The negative sign appears to keep the operator positive semi-definite. For more details on
these definitions see [36].
If we consider the Laplace-Beltrami operator on the sphere then we get the equations,
following the development of Chapter 2,

$$-\hbar^2 \Delta_{S^2}\psi = \psi,$$

$$-\hbar^2\left(\frac{1}{\sin(\theta)}\frac{\partial}{\partial\theta}\left(\sin(\theta)\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin(\theta)^2}\frac{\partial^2\psi}{\partial\phi^2}\right) = \psi,$$

$$-\hbar^2\left(O(1) + O\!\left(\frac{1}{\hbar}\right) - \frac{1}{\hbar^2}\left(\frac{1}{\sin(\theta)^2}\left(\frac{\partial S}{\partial\phi}\right)^2 + \left(\frac{\partial S}{\partial\theta}\right)^2\right)\right) R\, e^{iS/\hbar} = R\, e^{iS/\hbar},$$

where $\psi = R\, e^{iS/\hbar}$. From the last line we deduce the asymptotic equality

$$\|\nabla_g S\|^2 \approx 1,$$

which is exactly the eikonal equation on $S^2$ [57].
A general definition of the CWR on a compact, simply connected Riemannian
manifold of genus 0 will be given below. First, the definition for $S^2$ is given to make things
concrete. Let $\gamma$ be a curve lying along a sphere sitting inside $\mathbb{R}^3$. We will model $\gamma \in \mathbb{R}^3$
as $\gamma \in S^2$. Let $\{m_i\}_{i=1}^N \subset S^2$ be locations on the unit sphere $S^2 \subset \mathbb{R}^3$. Let $\nu_i \in T_{m_i}S^2$
be tangent vectors at $m_i$, defining normal vectors for a codimension-2 curve $\gamma$ in $\mathbb{R}^3$ and
a codimension-1 curve $\gamma$ in $S^2$. Note that $m_i$ defines a binormal for $\gamma$ at $m_i$. Call this
collection $\mathcal{A} = \{m_i, \nu_i\}_{i=1}^N$. Then the spherical-CWR for $\mathcal{A}$ on $S^2$ is

$$\psi_{\mathcal{A}}(\theta, \phi; \sigma, \lambda) = \sum_{i=1}^N \exp\left(-\frac{d(m_i, (\theta,\phi))^2}{2\sigma^2} + i\,\frac{(m_i, \nu_i)\cdot(\theta,\phi)}{\lambda}\right)$$
Figure 6-9. A fish drawn on the surface of a sphere using the spherical CWR. On the left we see the magnitude of the CWR, and on the right the phase plot decorated by different lines. The red line indicates the initial data. Blue and gray lines show various level-sets of the resulting phase of the CWR.
where $(m, \nu)$ acts by integration of the inner product along the parallel transport

$$(m, \nu)\cdot(\theta, \phi) = \int_{\gamma_m^{(\theta,\phi)}} D_\gamma \nu \cdot \dot{\gamma}\, d\sigma,$$

with $\gamma_m^{(\theta,\phi)}$ the geodesic arc joining $m$ to $(\theta, \phi)$ and $d$ the distance along this arc:

$$d(m, (\theta, \phi)) = \cos^{-1}\big(\sin(\phi)\sin(m_2) + \cos(\phi)\cos(m_2)\cos(|m_1 - \theta|)\big).$$

In computational settings, $d$ can be computed by going to extrinsic coordinates and using
the definition of the inner product $v \cdot w = |v||w|\cos(\angle vw)$, or by various formulas that
depend only on intrinsic coordinates such as the one shown above. Note that $D_\gamma\nu$ is the
parallel transport of $\nu$ along $\gamma$, which for the sphere acts like rotation around the origin of
the vector $\nu$ affixed at $m$ (or whatever point $D_\gamma$ is evaluated at). Note that this also
corresponds to rotating the sphere so that $m$ is at $(0, 0)$ and then rotating along the
$z$-axis to align $\nu$ and $(1, 1)/\sqrt{2}$ in $T_m S^2$.
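As a sanity check on the two ways of computing the geodesic distance, the following NumPy sketch compares the intrinsic spherical law of cosines against the extrinsic computation via $v \cdot w = |v||w|\cos(\angle vw)$. The longitude/latitude convention and function names here are my own, chosen for clarity rather than to match the $(\theta, \phi)$ convention above:

```python
import numpy as np

def sph_to_vec(lon, lat):
    # Embed (longitude, latitude) as a unit vector in R^3.
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def geodesic_intrinsic(p, q):
    # Spherical law of cosines on intrinsic coordinates p = (lon, lat).
    (l1, t1), (l2, t2) = p, q
    c = np.sin(t1) * np.sin(t2) + np.cos(t1) * np.cos(t2) * np.cos(abs(l1 - l2))
    return np.arccos(np.clip(c, -1.0, 1.0))

def geodesic_extrinsic(p, q):
    # Angle between the embedded unit vectors: v.w = |v||w| cos(angle).
    v, w = sph_to_vec(*p), sph_to_vec(*q)
    return np.arccos(np.clip(np.dot(v, w), -1.0, 1.0))
```

The two agree to machine precision away from antipodal pairs, where the arccosine becomes ill-conditioned.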
The previous definition does not depend on the structure of the sphere beyond two
things:
1. Geodesic completeness,
2. The formula for d.
On a Riemannian manifold $(M, g)$ the Laplace-Beltrami operator is given in coordinates as

$$\Delta f = \frac{1}{\sqrt{|g|}}\,\partial_i\left(\sqrt{|g|}\, g^{ij}\, \partial_j f\right).$$
Thus

$$-\lambda^2\Delta_g\psi = -\frac{\lambda^2}{\sqrt{|g|}}\,\partial_i\left(\sqrt{|g|}\,g^{ij}\left(\partial_j R + iR\,\partial_j S/\lambda\right)e^{iS/\lambda}\right)$$
$$\approx -\frac{1}{\sqrt{|g|}}\left(O(\lambda^2) + \lambda^2\sqrt{|g|}\,g^{ij}\left(\partial_{ij}R + i\,\partial_i R\,\partial_j S/\lambda + iR\,\partial_{ij}S/\lambda - R\,\partial_i S\,\partial_j S/\lambda^2\right)e^{iS/\lambda}\right)$$
$$\approx O(\lambda) + g(\nabla_g S, \nabla_g S)\,R\,e^{iS/\lambda}.$$
If we are willing to sacrifice closed-form access to the construction of d then we can define the
CWR on any mesh by either computing geodesics or computing the heat kernel and using this as
an approximation to the Gaussian factor. We still need to compute the parallel transport, which
is in general ill-conditioned on an arbitrary triangulated surface. However, theoretically we can
now write down the CWR on a geodesically complete, smooth Riemannian manifold $(M, g)$:

$$\psi_M(x;\, \mathcal{C} = \{m_i, \nu_i\}) = \sum_{i=1}^N \exp\left(-\frac{d_M(x, m_i)^2}{2\sigma^2} + i\,\frac{(m_i, \nu_i)\cdot x}{\lambda}\right).$$
One can use Dijkstra’s algorithm for compute dM (x,mi)2 and use the resulting shortest paths
as geodesics γxmifor computing (mi, νi) · x. Note that the result is an approach that computes
the signed distance function on a mesh as opposed to the unsigned distance. Complexity-
wise, this approach is dominated by the need to compute geodesics from m source points,
which requires O(mN log(N)) calculations. One can also use Varadhan’s formula [22, 101] for
computing dM (x,mi)2, but this does not allow one to compute the signed distance. In practice,
we find using the approach of [22] when many source points are needed is faster than Dijkstra’s
algorithm. This method requires a linear solve for the backward Euler equation, which has quick
solutions for sparse matrices [16].
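A minimal sketch of the Dijkstra step over a mesh's edge graph follows. The adjacency format is my own, and this is a stand-in for any graph-based geodesic approximation, not the exact implementation used here:

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths on an edge-weighted graph.

    adj maps vertex -> list of (neighbor, edge_length) pairs, e.g. the
    1-skeleton of a triangle mesh with Euclidean edge lengths. Returns a
    dict of approximate geodesic distances from `source`; running this once
    per source point m_i gives the O(mN log N) cost cited above.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

To recover the geodesic paths themselves, one would also record a predecessor for each vertex when it is relaxed.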
Figure 6-10. A diamond drawn on a mesh from FAUST using 4 oriented points. The red lines indicate zero level sets. The red lines on the arms are due to phase wrapping. The zero level set on the torso can be selected by extracting connected components and ordering them by line integral of the modulus of ψ.
Registration of Spherical Data Using the Spherical CWR.
Here we explore two techniques for registering oriented point-sets on the sphere. The first
approach is essentially RDM restricted to the sphere: the oriented points are viewed as points
and normals on the sphere. This results in a technique based on numerical integration of the
difference of spherical CWRs. The second approach is based on Karcher means of the oriented
points viewed as points on $(S^2)^2$. This results in an approach that is robust to noise but suffers
in the presence of missing points.
First we consider spherical RDM. A naive optimization approach would simply be to use
brute force and explore the whole space of rotations, re-computing the spherical CWR of the
template at each iteration and taking as the minimizing argument the rotation angles minimizing
the distance. However, since the domain in question ($S^2$) is fixed, it is possible to do better than
this. The approach uses the symmetry of the sphere: first, we compute the Frechet mean by
minimizing $\sum_{i=1}^N d^2(m, x_i)$ over $m$. One can do this with a simple recursive procedure

$$M_1 = x_1, \qquad M_n = M_{n-1} \oplus_{1/n} x_n,$$

where $\oplus_{1/n}$ denotes the point $1/n$ of the way along the geodesic joining $M_{n-1}$ to
$x_n$. We implement this with spherical linear interpolation. In certain contexts this algorithm
has been shown to provide good estimates for the Frechet mean [50]. From this point, we
subdivide the interval of the remaining degree of freedom into small portions and re-compute the
template spherical CWR at each setting. The result is an algorithm that aligns spherically-bound
curves represented by a sampling of oriented points. This can be compared to the results in
Subsection 4.2.2, see Figure 4.2.2. Note, however, that this algorithm only applies to spherical
data.
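The recursive mean update $M_n = M_{n-1} \oplus_{1/n} x_n$ can be sketched with unit vectors in $\mathbb{R}^3$ and spherical linear interpolation. This is a sketch of the incremental estimator, with function names of my choosing, rather than the exact implementation used in our experiments:

```python
import numpy as np

def slerp(a, b, t):
    # Point t of the way along the geodesic from unit vector a to b.
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if omega < 1e-12:
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def recursive_frechet_mean(points):
    # M_1 = x_1;  M_n = M_{n-1} (+)_{1/n} x_n.
    m = np.asarray(points[0], dtype=float)
    for n, x in enumerate(points[1:], start=2):
        m = slerp(m, np.asarray(x, dtype=float), 1.0 / n)
        m /= np.linalg.norm(m)  # guard against numerical drift
    return m
```

For two points the update reduces to the geodesic midpoint, as one would expect of a mean.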
The second algorithm is dubbed bi-means. The approach involves computing the means
of the point locations and the means of the normal locations, both on the sphere, and solving
a least squares equation for the full rotation. The problem with applying this algorithm to
unorganized oriented points is that Karcher mean computation can depend on the ordering of the
points.
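The bi-means idea can be sketched as follows, with the rotation solved in the least-squares sense via the SVD-based orthogonal Procrustes (Kabsch) method. As a simplifying assumption, this sketch summarizes each set by its extrinsic (chordal) means rather than the Karcher means used above, and all names are illustrative:

```python
import numpy as np

def fit_rotation(A, B):
    # Least-squares rotation R minimizing ||R A - B||_F (columns are
    # vectors), via the SVD-based orthogonal Procrustes solution.
    U, _, Vt = np.linalg.svd(B @ A.T)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det(R) = +1
    return U @ D @ Vt

def bi_means_rotation(src_pts, src_nrm, dst_pts, dst_nrm):
    # Summarize each oriented point-set by its point mean and normal mean,
    # then solve for the rotation aligning the two summaries.
    A = np.stack([src_pts.mean(axis=0), src_nrm.mean(axis=0)], axis=1)
    B = np.stack([dst_pts.mean(axis=0), dst_nrm.mean(axis=0)], axis=1)
    return fit_rotation(A, B)
```

Two independent vector correspondences determine a 3-D rotation, which is why the point mean and normal mean together suffice; the ordering sensitivity noted above enters through the mean computation on the sphere, not through this least-squares step.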
[Figure 6-11 plots (a)–(d): Relative Error on Transformation versus Stdev of Noise in (a) and (c), and versus Number of Points in Target in (b) and (d), comparing spherical RDM against bi-means.]
Figure 6-11. Registration of spherical oriented point-sets. In a) the robustness to von Mises-Fisher noise is tested and in b) the robustness to missing points is tested. The bi-means algorithm is shown to be sensitive to ordering. When ordering is the same between both point-sets the result is more comparable to spherical RDM, as shown in c) and d).
CHAPTER 7
CONCLUSION AND FUTURE WORK
7.1 Contributions
In this thesis a new implicit function representation for embedded curves and surfaces
was developed. Based on the literature review in Chapter 1, the representation is the
first linearly composable representation to tout a direct proof of connectedness under
appropriate conditions. We emphasized applications for embeddings into Euclidean space.
The representation is modular, in that it contains as zero level-set several copies of the
curve or surface. As an application, an algorithm for extracting curves from this modular
representation was developed. In addition, an approach to shape statistics and a new
algorithm for simultaneous registration and reconstruction of curves and surfaces was
developed.
The Complex Wave Representation (CWR) uses superposition of local linear
approximations to a curve or surface to stitch together a global implicit function. We
showed that the representation corresponds to the non-vanishing phase-space atoms
of the Modular Distance Function (MDF), a function from which one can recover the
signed distance function, as the parameters of the MDF grow asymptotically. This allows
one to write down a simple parametric equation for a local approximation of the signed
distance—a new contribution to implicit function modeling. Furthermore, we showed that,
provided the samples are structured appropriately, one can recover a closed curve
or set of closed curves that approximates the original curve from the representation.
Beyond these theoretical contributions we developed several useful applications of
the CWR. Shape statistics can be done more easily with the CWR than with classical implicit
function representations. Towards this end, we applied the kernel arising from the Gabor
frame for kPCA. This kPCA between oriented point-sets is equivalent to doing PCA on
the corresponding CWRs. We showed that one can build a useful simple classifier from
the first few principal components of a shape set and showed how to reconstruct new
point-sets from the principal components.
The focus of this thesis is Resonant Deformable Matching (RDM). RDM uses the
closed form L2 distance between CWRs for oriented point-sets to do registration. During
the registration process, additional normal variables can be estimated on the target set.
This allows for simultaneous registration and reconstruction—where the registration
model feeds back in the current reconstruction directly. This subtlety is important for any
approach fusing reconstruction and registration and often leads to pipelined approaches,
but here we use a single framework and the reconstruction happens for free using the
CWR. We showed reconstruction results for surfaces and collections of curves—including
challenging sets with abutting curves. We also developed an approach for registration
that uses maximum likelihood on the joint oriented point variables. While only partially
developed here, it is worth noting that the MLE approach to nonrigid registration suffers
from an $O(k^2 n)$ runtime whereas the $L^2$ approach is $O(nk + n^2 + k^2)$. This work, along with other
related work performed during my Ph.D., can be found in papers published in major
conferences (ECCV, ICPR, and QIP) and journals (submitted to PAMI and SIIMS)
[20, 21, 42, 44, 68].
Finally, generalizations of the CWR were developed. By extending beyond linear
modulation factors we can obtain a more expressive frame of shape atoms. In the case of
quadrics, it turns out that the inner products between the resulting atoms are available
in closed form. This can be leveraged for fitting. Finally, the entire framework can be
implemented on geodesically complete embedded Riemannian surfaces. The relationship
between the eikonal and the Schrodinger equation was revisited from this standpoint and
practical uses, like registering shapes on a sphere and representing co-dimension 2 curves,
were developed.
7.2 Future Work
There are many avenues of research that we did not have time to explore during my
thesis. Below are a few directions that we think are worthwhile for future work by others in the
shape analysis community.
The models of deformation used throughout this thesis have been traditional,
spline-based approaches. The deformation of the normal feature arises as a byproduct
of deforming the point-set, but can be implemented in closed form. This is done by using a
first-order model that uses the derivative of the transformation map at the point where the
normal is anchored. While this method ensures compatibility between the two components
of the deformation, it privileges the points over the normals. This is because the knot
points are in the embedding space. One might try to develop a model that deals with
this asymmetry. The simplest approach is to allow the normals to deform free of the
spatial model, or possibly according to the spatial model plus a free-form deformation in
the normal space. The goal of using this type of model is to allow the normal to deform
free of influence of the spatial transformation while maintaining a certain amount of
fidelity to the spatial transformation. A more sophisticated approach would be to lift the
transformation to a deformation on $\mathbb{R}^d \times S^{d-1}$ and add the constraints

$$\|\phi'^{(1)}\big|_m\,\nu\|^2 = 1, \qquad \phi'^{(1)}\big|_m\,\nu = \phi^{(2)}\nu, \qquad (7–1)$$
for each oriented point. Neither approach was explored directly in this thesis, and
exploring and comparing both would be a nice contribution to oriented point-set
registration and spline deformation fitting. We note that previous work in kriging [64],
spline modeling with uncertainty intervals, has explored a similar set of constraints as
Equations (7–1).
Finally, a note on a simple engineering problem and a very low-hanging avenue
of development. As mentioned above, the CWR model is easily parallelizable. Since
there is no inherent ordering of the terms in the summation they can be recorded in
arbitrary order. Indeed, as suggested by the work in Chapter 6, only a narrow band should
be necessary for reconstructing a surface. The narrow band should be drawn from a
neighborhood near the observations. Further, a non-rectangular mesh would probably
provide better resolving ability to a surface reconstruction pipeline. Indeed, a simple
filtering of mesh faces based on the ideas of Chapter 6 would be to drop mesh faces with
low surface integral of the magnitude of the CWR. This would allow one to avoid some
of the artifacts observed in the noisy cases studied above. However, this could result
in surfaces with unexpected boundaries and holes. It may be possible to use the kPCA
approach developed in this work to help patch the holes. Yet another approach is to look
for coherence and reject as outliers, or as excessively noisy, those points which do not exhibit a
certain threshold of coherence. Based on the theoretical work of Chapter 5, connectivity
is a predictable feature of the CWR. This suggests that the coherence feature (how well
the normals of a point line up with nearby normals) should provide a clue of how well
an oriented point will serve as a surface point. However, in regions of high curvature we
expect this measure to break down and so a similar problem to the above arises.
REFERENCES
[1] N. Aspert, D. S. Cruz, and T. Ebrahimi, MESH: measuring errors between surfaces using the Hausdorff distance, in IEEE International Conference on Multimedia and Expo, 2002, pp. 705–708, http://dx.doi.org/10.1109/ICME.2002.1035879.
[2] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations, no. 147 in Applied Mathematical Sciences, Springer, 2006.
[3] A. Basu, I. R. Harris, N. L. Hjort, and M. Jones, Robust and efficient estimation by minimising a density power divergence, Biometrika, 85 (1998), pp. 549–559.
[4] A. Benaissa and C. Roger, Asymptotic expansion of multiple oscillatory integrals with a hypersurface of stationary points of the phase, in Proc. R. Soc. A, vol. 469, The Royal Society, 2013, p. 20130109.
[5] P. J. Besl and N. D. McKay, A method for registration of 3D shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 14 (1992), pp. 239–256.
[6] J. F. Blinn, A generalization of algebraic surface drawing, ACM Transactions on Graphics (TOG), 1 (1982), pp. 235–256.
[7] F. Bogo, J. Romero, M. Loper, and M. J. Black, FAUST: Dataset and evaluation for 3D mesh registration, in Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Piscataway, NJ, USA, June 2014, IEEE.
[8] F. Bookstein, Principal warps: Thin-plate splines and decompositions of deformations, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 11 (1989), pp. 567–585.
[9] K. L. Boyer and S. Sarkar, Perceptual Organization for Artificial Vision Systems, vol. 546, Springer Science & Business Media, 2012.
[10] A. M. Bronstein, M. M. Bronstein, and R. Kimmel, Numerical Geometry of Non-Rigid Shapes, Monographs in Computer Science, Springer, 2009, http://dx.doi.org/10.1007/978-0-387-73301-2.
[11] J. Butterfield, On Hamilton-Jacobi theory as a classical root of quantum theory, in Quo Vadis Quantum Mechanics?, Springer, 2005, pp. 239–273.
[12] V. Camion and L. Younes, Geodesic interpolating splines, in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Springer, 2001, pp. 513–527.
[13] E. Candes, L. Demanet, D. Donoho, and L. Ying, Fast discrete curvelet transforms, Multiscale Modeling & Simulation, 5 (2006), pp. 861–899.
[14] J. C. Carr, R. K. Beatson, J. B. Cherrie, T. J. Mitchell, W. R. Fright, B. C. McCallum, and T. R. Evans, Reconstruction and representation of 3D objects with radial basis functions, in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, ACM, 2001, pp. 67–76.
[15] N. Charon and A. Trouve, The varifold representation of nonoriented shapes for diffeomorphic registration, SIAM Journal on Imaging Sciences, 6 (2013), pp. 2547–2580, http://dx.doi.org/10.1137/130918885.
[16] Y. Chen, T. A. Davis, W. W. Hager, and S. Rajamanickam, Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate, ACM Transactions on Mathematical Software (TOMS), 35 (2008), p. 22.
[17] M. Cho and K. M. Lee, Progressive graph matching: Making a move of graphs via probabilistic voting, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 398–405.
[18] H. Chui and A. Rangarajan, A new point matching algorithm for non-rigid registration, Computer Vision and Image Understanding (CVIU), 89 (2003), pp. 114–141.
[19] W. Commons, Tait-Bryan angles, zyx convention, 2013, https://en.wikipedia.org/wiki/File:Taitbrianzyx.svg.
[20] J. Corring and A. Rangarajan, Shape from phase: An integrated level-set and probability density shape representation, in International Conference on Pattern Recognition (ICPR), IAPR, 2014, pp. 46–51.
[21] J. Corring and A. Rangarajan, Resonant deformable matching: Simultaneous registration and reconstruction, in European Conference on Computer Vision (ECCV), Springer, 2016, pp. 51–68.
[22] K. Crane, C. Weischedel, and M. Wardetzky, Geodesics in heat: A new approach to computing distance based on heat flow, ACM Transactions on Graphics (TOG), 32 (2013), p. 152.
[23] D. Cremers, S. J. Osher, and S. Soatto, Kernel density estimation and intrinsic alignment for shape priors in level set segmentation, International Journal of Computer Vision, 69 (2006), pp. 335–351.
[24] I. Daubechies, The wavelet transform, time-frequency localization and signal analysis, IEEE Transactions on Information Theory (IT), 36 (1990), pp. 961–1005.
[25] M. Delfour and J. Zolesio, Shapes and Geometries, Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA, USA, second ed., 2011, http://epubs.siam.org/doi/abs/10.1137/1.9780898719826.
[26] M. Delfour and J.-P. Zolesio, Shapes and Geometries: Metrics, Analysis, Differential Calculus and Optimization, Advances in Design and Control, Springer, 2011.
[27] A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), (1977), pp. 1–38.
[28] Y. Deng, A. Rangarajan, S. Eisenschenk, and B. C. Vemuri, A Riemannian framework for matching point clouds represented by the Schrodinger distance transform, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2014, pp. 3756–3761.
[29] G. de Rham, Varietes differentiables, formes, courants, formes harmoniques, Act. Sci. Indust., 1222 (1955).
[30] E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik, 1 (1959), pp. 269–271.
[31] H. Edelsbrunner, Shape reconstruction with Delaunay complex, in LATIN'98: Theoretical Informatics, Springer, 1998, pp. 119–132.
[32] V. Estellers, M. Scott, and S. Soatto, Robust surface reconstruction, SIAM Journal on Imaging Sciences, 9 (2016), pp. 2073–2098.
[33] V. Estellers, D. Zosso, R. Lai, S. Osher, J. Thiran, and X. Bresson, An efficient algorithm for level-set method preserving distance function, IEEE Transactions on Image Processing (TIP), 21 (2012), pp. 4722–4734.
[34] G. B. Folland, Real Analysis, Pure and Applied Mathematics (New York), John Wiley & Sons, Inc., New York, second ed., 1999. Modern techniques and their applications, A Wiley-Interscience Publication.
[35] D. Gabor, Theory of communication. Part 1: the analysis of information, Journal of the Institution of Electrical Engineers-Part III: Radio and Communication Engineering, 93 (1946), pp. 429–441.
[36] S. Gallot, D. Hulin, and J. Lafontaine, Riemannian Geometry, Springer-Verlag, Heidelberg, Germany, 3 ed., 2004.
[37] J. Glaunes, A. Trouve, and L. Younes, Diffeomorphic matching of distributions: A new approach for unlabelled point-sets and sub-manifolds matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, IEEE, 2004, pp. 712–718.
[38] S. Gold and A. Rangarajan, A graduated assignment algorithm for graph matching, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 18 (1996), pp. 377–388.
[39] J. Gomes and O. Faugeras, Reconciling distance functions and level-sets, Journal of Visual Communication and Image Representation, 11 (2000), pp. 209–223.
[40] L. Gorelick, M. Galun, E. Sharon, R. Basri, and A. Brandt, Shape representation and classification using the Poisson equation, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 1991–2005.
[41] U. Grenander, A unified approach to pattern analysis, Advances in Computers, 10 (1970), pp. 175–216, http://dx.doi.org/10.1016/S0065-2458(08)60436-2.
[42] B. H. Guan, J. Corring, M. Sethi, S. Ranka, and A. Rangarajan, Image stack surface area minimization for groupwise and multimodal affine registration, in International Conference on Pattern Recognition (ICPR), IAPR, 2016, pp. 250–256.
[43] R. Guler, S. Tari, and G. Unal, Screened Poisson hyperfields for shape coding, SIAM Journal on Imaging Sciences, 7 (2014), pp. 2558–2590.
[44] K. S. Gurumoorthy, A. Rangarajan, and J. Corring, Gradient density estimation in arbitrary finite dimensions using the method of stationary phase, arXiv preprint arXiv:1211.3038, (2012).
[45] G. Guy and G. Medioni, Inferring global perceptual contours from local features, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 1993, pp. 786–787.
[46] E. Hasanbelliu, L. Sanchez Giraldo, and J. C. Principe, Information theoretic shape matching, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 36 (2014), pp. 2436–2451.
[47] R. J. Hathaway, Another interpretation of the EM algorithm for mixture distributions, Statistics & Probability Letters, 4 (1986), pp. 53–56.
[48] C. Heil, J. Ramanathan, and P. Topiwala, Linear independence of time-frequency translates, Proceedings of the American Mathematical Society (AMS), 124 (1996), pp. 2787–2795.
[49] C. E. Heil and D. F. Walnut, Continuous and discrete wavelet transforms, SIAM Review, 31 (1989), pp. 628–666.
[50] J. Ho, G. Cheng, H. Salehian, and B. C. Vemuri, Recursive Karcher expectation estimators and geometric law of large numbers, in Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2013, Scottsdale, AZ, USA, April 29 - May 1, 2013, 2013, pp. 325–332.
[51] X. Huang, N. Paragios, and D. Metaxas, Shape registration in implicit spaces using information theory and freeform deformations, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 1303–1318.
[52] B. Jian and B. C. Vemuri, Robust point set registration using Gaussian mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 33 (2011), pp. 1633–1645.
[53] M. Kazhdan, M. Bolitho, and H. Hoppe, Poisson surface reconstruction, in Proceedings of the Fourth Eurographics Symposium on Geometry Processing, vol. 7, 2006.
[54] M. Kazhdan and H. Hoppe, Screened Poisson surface reconstruction, ACM Transactions on Graphics (TOG), 32 (2013), p. 29.
[55] I. Kezurer, S. Z. Kovalsky, R. Basri, and Y. Lipman, Tight relaxation of quadratic matching, Comput. Graph. Forum, 34 (2015), pp. 115–128, http://dx.doi.org/10.1111/cgf.12701.
[56] B. B. Kimia, A. R. Tannenbaum, and S. W. Zucker, Shapes, shocks, and deformations I: the components of two-dimensional shape and the reaction-diffusion space, International Journal of Computer Vision (IJCV), 15 (1995), pp. 189–224.
[57] R. Kimmel and J. A. Sethian, Computing geodesic paths on manifolds, Proceedings of the National Academy of Sciences, 95 (1998), pp. 8431–8435.
[58] A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, K. Glashoff, and R. Kimmel, Coupled quasi-harmonic bases, Comput. Graph. Forum, 32 (2013), pp. 439–448, http://dx.doi.org/10.1111/cgf.12064.
[59] T. S. Lee, Image representation using 2D Gabor wavelets, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 18 (1996), pp. 959–971.
[60] B. Levy, Laplace-Beltrami eigenfunctions: Towards an algorithm that "understands" geometry, in IEEE International Conference on Shape Modeling and Applications, IEEE, 2006, pp. 13–21.
[61] W. E. Lorensen and H. E. Cline, Marching cubes: A high resolution 3D surface construction algorithm, SIGGRAPH Computer Graphics, 21 (1987), pp. 163–169, http://doi.acm.org/10.1145/37402.37422.
[62] S. G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 11 (1989), pp. 674–693.
[63] J. Manson, G. Petrova, and S. Schaefer, Streaming surface reconstruction using wavelets, Comput. Graph. Forum, 27 (2008), pp. 1411–1420, http://dx.doi.org/10.1111/j.1467-8659.2008.01281.x.
[64] K. V. Mardia, J. Kent, C. Goodall, and J. Little, Kriging and splines with derivative information, Biometrika, (1996), pp. 207–221.
[65] S. Miraku, Volumentric shape description of range data using “blobby model”,vol. 25, july 1991, pp. 227–235.
[66] E. Mjolsness, G. Gindi, and P. Anandan, Optimization in model matching and perceptual organization, Neural Computation, 1 (1989), pp. 218–229.
[67] P. Mordohai and G. Medioni, Tensor voting: a perceptual organization approach to computer vision and machine learning, Synthesis Lectures on Image, Video, and Multimedia Processing, 2 (2006), pp. 1–136.
[68] M. Moyou, J. Corring, A. M. Peter, and A. Rangarajan, A Grassmannian graph approach to affine invariant feature matching, CoRR, abs/1601.07648 (2016), http://arxiv.org/abs/1601.07648.
[69] P. Mullen, F. de Goes, M. Desbrun, D. Cohen-Steiner, and P. Alliez, Signing the unsigned: Robust surface reconstruction from raw pointsets, Computer Graphics Forum, 29 (2010), pp. 1733–1741, http://dx.doi.org/10.1111/j.1467-8659.2010.01782.x.
[70] J. R. Munkres, Elementary differential topology, vol. 54, Princeton University Press, 1966.
[71] R. M. Murray, Z. Li, and S. S. Sastry, A mathematical introduction to robotic manipulation, CRC Press, 1994.
[72] A. Myronenko and X. Song, Point set registration: Coherent point drift, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32 (2010), pp. 2262–2275.
[73] Y. Ohtake, A. Belyaev, M. Alexa, G. Turk, and H.-P. Seidel, Multi-level partition of unity implicits, in ACM SIGGRAPH 2005 Courses, ACM, 2005, p. 173.
[74] S. Osher and R. Fedkiw, Level set methods and dynamic implicit surfaces, vol. 153, Springer Science & Business Media, 2006.
[75] S. Osher and J. A. Sethian, Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations, Journal of Computational Physics, 79 (1988), pp. 12–49.
[76] M. Ovsjanikov, M. Ben-Chen, J. Solomon, A. Butscher, and L. Guibas, Functional maps: a flexible representation of maps between shapes, ACM Transactions on Graphics (TOG), 31 (2012), p. 30.
[77] N. Paragios, M. Rousson, and V. Ramesh, Non-rigid registration using distance functions, Computer Vision and Image Understanding (CVIU), 89 (2003), pp. 142–165.
[78] E. Parzen, On estimation of a probability density function and mode, The Annals of Mathematical Statistics, (1962), pp. 1065–1076.
[79] M. Pauly, M. H. Gross, and L. Kobbelt, Efficient simplification of point-sampled surfaces, in IEEE Visualization, 2002, pp. 163–170, http://dx.doi.org/10.1109/VISUAL.2002.1183771.
[80] A. Peter, A. Rangarajan, and J. Ho, Shape l’Ane Rouge: sliding wavelets for indexing and retrieval, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2008, pp. 1–8.
[81] A. M. Peter and A. Rangarajan, Maximum likelihood wavelet density estimation with applications to image and shape matching, IEEE Transactions on Image Processing, 17 (2008), pp. 458–468.
[82] A. Rangarajan, Revisioning the unification of syntax, semantics and statistics in shape analysis, Pattern Recognition Letters, 43 (2014), pp. 39–46.
[83] C. E. Rasmussen, The infinite Gaussian mixture model, in Advances in Neural Information Processing Systems (NIPS), vol. 12, 1999, pp. 554–560.
[84] K. Reda, A. Febretti, A. Knoll, J. Aurisano, J. Leigh, A. E. Johnson, M. E. Papka, and M. Hereld, Visualizing large, heterogeneous data in hybrid-reality environments, IEEE Computer Graphics and Applications, 33 (2013), pp. 38–48.
[85] S. Rusinkiewicz and M. Levoy, Efficient variants of the ICP algorithm, in Third International Conference on 3-D Digital Imaging and Modeling, IEEE, 2001, pp. 145–152.
[86] S. Sarkar and K. L. Boyer, Perceptual organization in computer vision: A review and a proposal for a classificatory structure, IEEE Transactions on Systems, Man and Cybernetics, 23 (1993), pp. 382–399.
[87] F. R. Schmidt, D. Farin, and D. Cremers, Fast matching of planar shapes in sub-cubic runtime, in IEEE International Conference on Computer Vision (ICCV), IEEE, 2007, pp. 1–6.
[88] B. Schölkopf, A. Smola, and K.-R. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10 (1998), pp. 1299–1319.
[89] M. O. Scully and M. S. Zubairy, Quantum optics, Cambridge University Press, 1997.
[90] M. Sethi, A. Rangarajan, and K. Gurumoorthy, The Schrödinger distance transform (SDT) for point-sets and curves, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 198–205.
[91] J. A. Sethian, A fast marching level set method for monotonically advancing fronts, Proceedings of the National Academy of Sciences, 93 (1996), pp. 1591–1595.
[92] K. Siddiqi, A. Shokoufandeh, S. J. Dickinson, and S. W. Zucker, Shock graphs and shape matching, International Journal of Computer Vision (IJCV), 35 (1999), pp. 13–32.
[93] M. Spivak, A comprehensive introduction to differential geometry, vol. 1-5, Publish or Perish, 3rd ed., 1999.
[94] M. Sussman, P. Smereka, and S. Osher, A level set approach for computing solutions to incompressible two-phase flow, Journal of Computational Physics, 114 (1994), pp. 146–159.
[95] R. Szeliski, Image alignment and stitching: A tutorial, Tech. Report MSR-TR-2004-92, Microsoft Research, September 2004.
[96] N. Thakoor, J. Gao, and S. Jung, Hidden Markov model-based weighted likelihood discriminant for 2D shape classification, IEEE Transactions on Image Processing, 16 (2007), pp. 2707–2719.
[97] D. W. Thompson, On growth and form, Cambridge University Press, 1917, 1945,http://www.biodiversitylibrary.org/item/28884.
[98] Y. Tsin and T. Kanade, A correlation-based approach to robust point set registration, in European Conference on Computer Vision (ECCV), Springer, 2004, pp. 558–569.
[99] J. N. Tsitsiklis, Efficient algorithms for globally optimal trajectories, IEEETransactions on Automatic Control, 40 (1995), pp. 1528–1538.
[100] M. Vaillant and J. Glaunes, Surface matching via currents, in Information Processing in Medical Imaging, Springer, 2005, pp. 381–392.
[101] S. R. S. Varadhan, On the behavior of the fundamental solution of the heat equation with variable coefficients, Communications on Pure and Applied Mathematics, 20 (1967), pp. 431–455.
[102] R. C. Veltkamp, Shape matching: similarity measures and algorithms, in International Conference on Shape Modeling and Applications (SMI), IEEE, 2001, pp. 188–197.
[103] G. Wahba, Spline models for observational data, vol. 59 of Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, Pennsylvania, 1990.
[104] Y. Wang, K. Woods, and M. McClain, Information-theoretic matching of twopoint sets, IEEE Transactions on Image Processing (TIP), 11 (2002), pp. 868–872.
[105] E. Wilczok, New uncertainty principles for the continuous Gabor transform and the continuous wavelet transform, Documenta Mathematica, 5 (2000), pp. 201–226.
[106] R. Wong, Asymptotic approximations of integrals, SIAM, Philadelphia, Pennsylvania, 1st ed., 2001, http://epubs.siam.org/doi/abs/10.1137/1.9780898719260.
[107] L. Younes, Shapes and diffeomorphisms, vol. 171 of Applied Mathematical Sciences, Springer, New York, 2010.
[108] L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, in Advances inNeural Information Processing Systems (NIPS), 2004, pp. 1601–1608.
[109] H. Zhao, A fast sweeping method for eikonal equations, Mathematics of Computation, 74 (2005), pp. 603–627.
[110] H.-K. Zhao, S. Osher, B. Merriman, and M. Kang, Implicit and nonparametric shape reconstruction from unorganized data using a variational level-set method, Computer Vision and Image Understanding (CVIU), 80 (2000), pp. 295–314.
[111] Y. Zheng and D. Doermann, Robust point matching for non-rigid shapes by preserving local neighborhood structures, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 643–649.
[112] F. Zhou and F. de la Torre, Factorized graph matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 127–134, http://dx.doi.org/10.1109/CVPR.2012.6247667.
[113] F. Zhou and F. de la Torre, Deformable graph matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2013, pp. 2922–2929.
BIOGRAPHICAL SKETCH
John Corring attended the Mississippi School for Mathematics and Science and was named a National Merit Finalist before earning bachelor's degrees in mathematics and computer science at the University of Southern Mississippi.
He began working on image processing problems as an undergraduate, including medical and geospatial remote sensing applications, which eventually led to the present
work. He has published in the International Conference on Pattern Recognition and the
European Conference on Computer Vision during his Ph.D., and is currently working on
journal publications in IEEE Transactions on Pattern Analysis and Machine Intelligence and
SIAM Journal on Imaging Sciences.