A COMPLEX-VALUED FIELD MODEL FOR SHAPE REPRESENTATION
WITH APPLICATIONS IN COMPUTER VISION AND GRAPHICS
By
JOHN R. CORRING
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2017
© 2017 John R. Corring
For Alexander
ACKNOWLEDGMENTS
Thanks to my wife, Emily, for her patience, love, support, and input. She listened to
me go on for hours about everything ranging from the theoretical aspects of my thesis to
specific engineering problems. For that, and her hard work, I will always be grateful.
Thanks to Anand Rangarajan for creative motivation. Anand always believed in
me throughout this long process. His breadth of knowledge helped me to craft a thesis
that reflects the scope of my own interests in Shape Analysis. I’ll never forget this, or our
extra-curricular conversations.
Thanks to Arunava Banerjee, Alireza Entezari, Michael Jury, and Paul Robinson. I
learned a lot from each member of my committee. The depth of theoretical and practical
knowledge gained from you all was invaluable.
Thanks to Joe Wilson for supporting me during the last four years. By working in the
CSI Lab I gained a lot of practical knowledge. My colleagues from the CSI Lab — Brandon
Smock, Ferit Toska, Gus Munoz, Maks Levental, Pete Dobbins — were great sources of
friendship and support over the years. Thanks to Nuri Yeralan, Karthik Gurumoorthy,
Subhojit Sengupta, Jan Stuehmer, Thomas Mollenhof. Stimulating conversations with
brilliant people are the principal form of payment provided to a PhD student and you were
the best sources.
Thanks to Dr. Frank Schmidt and Dr. Daniel Cremers for inviting me to Munich,
Germany to experience working in your lab. The time spent there was inspiring and
invaluable for deciding what to do after graduation.
Thanks to my mom and dad for their encouragement. Thanks to my brothers for their
support and love.
Finally, I want to remember Grandmother Maryann Corring who passed away while I
was working on my PhD. She introduced me to great literature, architecture, history, and
arts at a young age and instilled a passion for learning in me that helped define who I am
today. She is missed.
TABLE OF CONTENTS
page
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER

1 INTRODUCTION
   1.1 Prior Work on Shape Models
       1.1.1 Implicit Shape Representations
       1.1.2 Explicit Shape Representations
   1.2 Prior Work on Registration and Matching
   1.3 Prior Work on Surface Reconstruction
   1.4 Outline of this Document

2 REPRESENTING SHAPE WITH PHASE
   2.1 The Complex Wave Representation
       2.1.1 Analysis of ψ
       2.1.2 ψ for Oriented Multi-curve Shapes
   2.2 Wave Mixtures as Geometric Primitives
       2.2.1 A Note on Gabor Analysis
       2.2.2 Square-root Densities and Probabilistic Interpretation of the Complex Wave Mixture
   2.3 Relationship Between Signed Distances and Complex Wave Mixtures
   2.4 An Embedding Theorem for Complex Wave Mixtures

3 REGISTRATION: RESONANT DEFORMABLE MATCHING
   3.1 Hypothesis Classes for Registration
       3.1.1 Euclidean Transformations
           3.1.1.1 Euler angles
           3.1.1.2 Quaternions
           3.1.1.3 Action of Euclidean transformations on the normal vector
       3.1.2 Affine Transformations
       3.1.3 Nonrigid Transformations and Regularization
           3.1.3.1 Thin-plate spline radial basis functions
           3.1.3.2 Gaussian radial basis functions
   3.2 Introducing Normal Variables for the Target Oriented Point-set
   3.3 Choosing a Suitable Distance Function
   3.4 Gradient Computation and Optimization Details
   3.5 A Brief Comparison with Currents
   3.6 Analysis of the RDM Objective Function
       3.6.1 Inner Product of CWRs
           3.6.1.1 Isotropic CWRs
           3.6.1.2 Anisotropic CWRs
       3.6.2 Asymptotic Behavior of the RDM Objective Function

4 REGISTRATION: EMPIRICAL ANALYSIS
   4.1 Experimental Validation
   4.2 Rigid and Affine Registration
       4.2.1 Range of Rotation
       4.2.2 Gaussian Noise on Points
       4.2.3 Missing Points With Outliers
   4.3 Synthetic Normal Recovery, Warps, and Occlusions
   4.4 Non-Synthetic Matching Experiments
   4.5 CMU House Dataset
   4.6 3-D Subcortical Structure Registration
   4.7 Maximum Likelihood Registration with |ψ|² as a Density

5 THEORY OF THE REPRESENTATION: CONNECTEDNESS, COMPLETENESS, AND CONTRIBUTIONS TO THE GABOR EXPANSION
   5.1 Connectedness of Pairs of Complex Waves
       5.1.1 Zeros of ψ
       5.1.2 Connectedness of θψ = 0 for Symmetric Configurations
       5.1.3 Connectedness of θψ = 0 for Asymmetric Configurations
           5.1.3.1 Numerical analysis of asymmetric connectedness
           5.1.3.2 An analytical condition for asymmetric connectedness
   5.2 Going Beyond Two Atoms with Im ψ
       5.2.1 Stability of Level-sets of Im ψ
       5.2.2 The Class of Curves Approximated By F
   5.3 Asymptotic Approximation of Modular Distance Fields by Gabor Atoms

6 FURTHER EXPLORATIONS AND APPLICATIONS
   6.1 Curve Extraction from ψ
       6.1.1 Mean Shortest-Path Error Evaluation on 2-D Data: MPEG7 Dataset
       6.1.2 Hausdorff Distance-based Evaluation of 3-D Data: Spheres, Bunny, FAUST Datasets
   6.2 ψ for kPCA on Curves
   6.3 Generalization of the CWR to Embedded Surfaces

7 CONCLUSION AND FUTURE WORK
   7.1 Contributions
   7.2 Future Work
REFERENCES

BIOGRAPHICAL SKETCH
LIST OF TABLES

1-1 The Fast Marching algorithm for constructing signed distance functions.
2-1 Technical Layout of the Operations on ψ.
4-1 Range of convergence for rotations.
4-2 Average (standard deviation) initial and final DICE scores over a set of four subcortical structures registered along the boundary.
6-1 An algorithm for extracting the shape corresponding to a collection of oriented points.
LIST OF FIGURES

1-1 Visualization of the fast marching process.
2-1 A simple example of composition of a distance function from oriented points using the Complex Wave Representation (CWR).
2-2 Visualization of the phase of ψ.
2-3 Merging of two curves as oriented points move closer together.
2-4 Zero level-sets of the phase of ψ for subject 1 of FAUST under several different values of σ.
3-1 Tait-Bryan angles. ψ provides the rotation about the initial z-axis, θ the rotation about the subsequent y-axis, and Φ the rotation about the subsequent x-axis.
3-2 An example of surface reconstruction by RDM.
3-3 An example of curve reconstruction by RDM.
3-4 Profile of the L2 distance function over several transformations and choices of parameters σ, λ.
4-1 Median error and variance for rigid transformation with pointwise Gaussian noise.
4-2 Median error and variance for rigid transformation with dropped inliers and outliers added.
4-3 Median error and variance for affine transformations.
4-4 Comparison of different techniques for estimating normal vectors from an organized point-set and an oriented point-set.
4-5 Experimental comparison of RDM and other matching algorithms on a 2-D dataset.
4-6 Experimental comparison of RDM and other matching algorithms on 3-D datasets.
4-7 Recall graphs and area under the curve for the CMU House.
4-8 Maximum likelihood alignment using |ψ(x)|² as a density.
5-1 Visualization of g along a vertical slice of the set containing a zero of Im ψ.
5-2 Numerical experiments showing the connectedness and non-connectedness at different values of parameters.
5-3 Plots showing the zero crossings of interest for the analytical solution to the disconnection problem.
5-4 An explanatory figure to accompany the proof of approximation for the multi-atom case.
6-1 Zero level-sets of subject 5 of the FAUST sequence.
6-2 Average error on shortest path between 250 randomly chosen pairs of points in the estimated mesh at 10 sampling rates.
6-3 Sphere reconstruction over different sampling rates.
6-4 Face reconstruction over different sampling rates.
6-5 Bunny (closed surface) reconstruction over different noise levels.
6-6 Bunny (closed surface) reconstruction over different sampling rates.
6-7 Closed curves and density estimates from linear combinations of CWRs. kPCA basis of CWRs as a subspace classifier.
6-8 Recovery of closed curves in training and testing samples for the Gatorbait dataset.
6-9 A fish drawn on the surface of a sphere using the spherical CWR.
6-10 A diamond drawn on a mesh from FAUST using 4 oriented points.
6-11 Registration of spherical oriented point-sets.
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

A COMPLEX-VALUED FIELD MODEL FOR SHAPE REPRESENTATION
WITH APPLICATIONS IN COMPUTER VISION AND GRAPHICS

By

John R. Corring

May 2017

Chair: Anand Rangarajan
Major: Computer Engineering
Shape processing and analysis is a growing area of research straddling computer
vision and computer graphics. Probability density and signed distance representations
have become central to the field. While probability densities model uncertainty about
observations and provide a convex representation, they lack geometric precision and often
exhibit topological inaccuracies. Signed distances lack the robustness of densities, and the
geometry of their feature space is very complicated.
In this thesis, we develop a parametric model for approximating signed distance
functions. This allows us to build up a useful shape representation with accurate
topological and geometric data from point and normal estimates. The representation
is approximately linear under union of parameters, thus allowing for easy computation and
conventional statistical methods to be leveraged for shape processing.
We develop an algorithm for registering oriented point-sets that leverages the
representation. We compare this algorithm with various contenders in registration. We
also develop algorithms for mesh extraction from sparsely scattered oriented points,
dimensionality reduction on a collection of shapes, and approximation of distance
functions on Riemannian surfaces. Empirical validation of all of the approaches outlined
in this work is performed. Theoretical contributions include the description of a family of
curves for which the representation converges uniformly, clarification of the regularizing
principle on the geometric component of the representation, and an analysis of the
asymptotic behavior of the representation with respect to the magnitude variance and
frequency variance parameters.
CHAPTER 1
INTRODUCTION
Shape modeling is a core area of research in Computer Vision and Graphics. The
principal problems in shape modeling are:

• Representation: the parameters stored to reconstruct or retrieve the shape.
  Choosing a representation entails a variety of limitations and requirements; there is
  no one-size-fits-all representation.
  – Representations can be focused on visualization,
  – Easy retrieval or lookup may be a major component,
  – Ease of estimation from noisy measurements and sparsity may be a major feature.

• Deformation: this component implements the morphology of the shapes. It is
  usually represented as the class of maps used to map one shape object into another.
  Depending on the representation, the deformations may need to have certain
  invariant subgroups.
  – Deformations may be rigid, limiting the degrees of freedom of the shape to an
    extrinsic pose,
  – Deformations may also be nonrigid, with a variety of different penalties available
    to restrict the range of nonrigid motions.

• Incorporation: once a hypothesis on the type of morphology the shape will undergo
  is established, external influences that implement task-specific processes need to be
  imposed on the shape. These are usually a component in inferring an appropriate
  deformation.
  – By representing the shape as a field, the class of deformations acting on the field
    typically must be evaluated everywhere, but the modeling can be very precise,
  – Representations that take samples from the field allow quicker evaluation of
    incorporated effects but may suffer from accuracy issues.
Once these problems are addressed the shape framework for Computer Vision is available:
using prior shape knowledge to regularize ill-posed problems of Vision. Problems such as
image segmentation can be conditioned by incorporating prior shape knowledge about
the subject to be segmented. This thesis is about a new approach to implementing this
framework.
Shape models are useful in their own right as symbolic virtual objects, representative
of a class or a collection of observations of a form. Concretely, they can be used to
estimate otherwise difficult to measure statistics: such as range of body dimensions for a
given organism. While the mathematical study of shape spans multiple fields and has a
deep history [41, 70, 93, 97], we have a more narrow focus in this work. We will emphasize
distinctions between shape representations that influence the range of modeling situations
handled by the representations. We will also review the mathematics underlying the
representation when necessary.
This thesis outlines a new representation of shape. The representation has both
implicit components, a Complex-valued function on an ambient embedding space, and
explicit components, sampled from the normal bundle of a co-dimension 1 manifold.
It has a meaningful extension away from the set or surface of interest: the magnitude
stores the probability of observing a point while the phase is approximately the signed
distance in a narrow band around the surface. We have included several useful properties
of the representation and prove useful theorems regarding the class of shapes that can
be represented. In this thesis document we will lay out the progress on studying the
representation and describe how we have approached the problems of shape modeling
using the representation. We also present an empirical study of registration of curves
as oriented point-sets, performing statistics on collections of shapes, and extending the
representation in various ways.
1.1 Prior Work on Shape Models
The representation problem from above can be approached from an abstract
standpoint first and specified after decisions are made regarding the task type and
subservient modeling. In general, a shape representation for a set A ⊂ Rd consists of a
domain Ω (possibly an embedding space, but often simply an index set) mapped to range
R (which may in general consist of objects) by fA such that one can reconstruct A up
to an equivalence class. There are many vagaries in this abstract definition: what does it
mean to “reconstruct A up to an equivalence class?” what are the structures of Ω, R? etc.
These all depend on the representation. I’ll give examples below, as we explain specific
representations.
1.1.1 Implicit Shape Representations
Implicit shape representations portray embedded sets and shapes. Suppose that we
are given A ⊂ R2, with A open. Furthermore, suppose that A has a boundary given
by a family of C2 closed curves. Often, from the shape standpoint, we are interested in
representing these closed curves ∂A. An implicit representation fA : R2 → F should have
some distinguished property along ∂A that allows us to “reconstruct A”. For example, we
could say that fA should have a maximum value along ∂A. Then we could reconstruct A
by considering the maximal α such that {fA ≥ α} ≠ ∅. There are a few hiccups with this
representation, and that makes it a good example to start from. First off, you couldn’t
reconstruct A in general from this since we have no further properties of fA to exploit:
we wouldn’t be able to distinguish A from Ac. Second, this representation is not injective:
there just aren’t enough constraints on fA, and there are a continuum of equivalent
functions representing A this way. Finally, very importantly for computer applications,
this representation is not very stable to small perturbations in f . I’ll return to a specific
implementation of a shape representation that deals with these problems after motivating
the registration problem.
Implicit representations abound in the literature [15, 40, 43]. The signed distance
function (SDF) is an example in which the sign encodes interior/exterior properties with
the absolute value encoding the distance to the nearest point in the set of curves (surfaces)
[74, 75, 77]. Contrast this with the unsigned distance function which lacks interior/exterior
information. Surprisingly, there is little work on registering template and target SDFs. We
address the technical reasons for this below.
Figure 1-1. Visualization of the fast marching process. a) shows the field after 20 iterations while b) shows the field after 40 iterations.
The signed distance b_S : R^d → R for an open set S (I'll assume S has a smooth
boundary consisting of a collection of simply connected hypersurfaces) satisfies

|∇b_S| = 1,    (1–1)
b_S|_∂S = 0,   b_S ∈ C^1(∂S).
[26] offers a full accounting of the analysis of SDFs as a shape representation. Signed
distance functions are typically constructed via the fast marching or fast sweeping
algorithm [91, 99, 109]. All of these methods appear to be related to Dijkstra’s algorithm
[30] in that they propagate distance information outward from the seeding zone of the
boundary. Note that in general fast marching etc. may be used for computing things other
than distances. Furthermore, the unsigned distance function—in which the values are all
non-negative and typically the seeding zone is a set of points—can also be constructed by
fast marching. Here we give an overview of the fast marching algorithm as a reference for
the rest of the paper.
Table 1-1. The Fast Marching algorithm for constructing signed distance functions. This is the standard way to construct implicit representations of curves and surfaces. In this thesis we will explore an alternative.

Require: I a domain of nodes, S ⊂ I an initial seeding zone.
 1: function FastMarching(I, S)
 2:   l(i) ← Unvisited for all i ∈ I \ S
 3:   b(i) ← ∞ for all i ∈ I \ S
 4:   l(i) ← Visited for all i ∈ S
 5:   b(i) ← 0 for all i ∈ S
 6:   for i with l(i) = Unvisited do
 7:     b(i) ← update by numeric approximation of (1–1); if the new value decreases b(i), then l(i) ← Visited
 8:   end for
 9:   q ← arg min over {i : l(i) = Visited} of b(i)
10:   l(q) ← Accepted
11:   for j in the neighbors of q such that l(j) is not Accepted do
12:     b(j) ← re-estimate by (1–1)
13:     if b(j) decreased in the previous step then
14:       l(j) ← Visited
15:     end if
16:   end for
17:   if there is a node k with l(k) = Visited then
18:     return to step 9
19:   end if
20:   return b
21: end function
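A runnable sketch of the procedure in Table 1-1, under illustrative assumptions not taken from the dissertation: unit grid spacing, 4-neighborhoods, and a lazy-deletion heap standing in for the Visited set (the common Dijkstra-like implementation the text alludes to). The `solve` step is a first-order upwind approximation of (1–1).

```python
import heapq
import math

def fast_marching(shape, seeds):
    """Propagate distance outward from a seeding zone on a unit grid,
    in the Dijkstra-like style of Table 1-1 (an illustrative sketch)."""
    rows, cols = shape
    INF = math.inf
    b = [[INF] * cols for _ in range(rows)]          # b(i) <- infinity off S
    accepted = [[False] * cols for _ in range(rows)]
    heap = []
    for i, j in seeds:                               # b(i) <- 0 on S
        b[i][j] = 0.0
        heapq.heappush(heap, (0.0, i, j))

    def solve(i, j):
        # Upwind solve of |grad b| = 1: smallest neighbor value per axis.
        a = min(b[i - 1][j] if i > 0 else INF, b[i + 1][j] if i + 1 < rows else INF)
        c = min(b[i][j - 1] if j > 0 else INF, b[i][j + 1] if j + 1 < cols else INF)
        lo, hi = min(a, c), max(a, c)
        if hi - lo >= 1.0:                           # one-sided update
            return lo + 1.0
        return 0.5 * (lo + hi + math.sqrt(2.0 - (hi - lo) ** 2))

    while heap:
        _, i, j = heapq.heappop(heap)
        if accepted[i][j]:
            continue
        accepted[i][j] = True                        # l(q) <- Accepted
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and not accepted[ni][nj]:
                new = solve(ni, nj)                  # re-estimate by (1-1)
                if new < b[ni][nj]:                  # keep only decreases
                    b[ni][nj] = new
                    heapq.heappush(heap, (new, ni, nj))
    return b
```

Seeding the center of an 11×11 grid gives exact distances along the axes and a slight first-order overestimate of the Euclidean distance toward the corners, the expected behavior of this scheme.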
The signed distance function is considered a staple of shape modeling. Here are a few
reasons why.
Stability of reconstruction: since the function goes smoothly through 0 along theshape, the representation is robust to small, smooth additive noise. Standardalgorithms, such as marching squares [61], can be used for recovering the shape.
Computability: by the algorithm above the representation is constructible.
Methodology: there is a broad literature of algorithms that leverage variationalmethods to compute solutions to problems ranging from graphics to vision [74].
However, we will encounter several problems with the signed distance function when
we discuss registration below. The more common implicit representation to use for
registration is a probability density.
The probability density representation of shapes is very simple: given a set A with
smooth boundary ∂A, draw samples {m_a}_{a=1}^N from ∂A that cover it evenly. Then form a
density estimate from these samples. The simplest estimate is the Parzen window estimate
[78]. Given the samples, the Parzen window estimate is the function

f(x | {m_a}_{a=1}^N) = (1 / (σ^d N)) Σ_{a=1}^N g((x − m_a) / σ),
where the g are probability density functions or kernels. σ is a spatial variance or radius
associated with the kernel g, for which there are reasonable estimates based on the original
data set.
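The estimate above can be written out directly; a minimal sketch with an isotropic Gaussian kernel g (the function name and argument handling are my own, not from the text):

```python
import numpy as np

def parzen_density(x, samples, sigma):
    """Parzen window estimate f(x | {m_a}) with an isotropic Gaussian
    kernel g of bandwidth sigma (illustrative sketch)."""
    samples = np.atleast_2d(samples).astype(float)   # N x d boundary samples
    x = np.atleast_1d(x).astype(float)
    N, d = samples.shape
    z = (x - samples) / sigma                        # (x - m_a) / sigma
    g = np.exp(-0.5 * (z ** 2).sum(axis=1)) / (2 * np.pi) ** (d / 2)
    return g.sum() / (sigma ** d * N)
```

In 1-D with a single sample at the origin and sigma = 1, the estimate at 0 is the standard normal peak 1/sqrt(2π) ≈ 0.3989, and the mixture integrates to 1 as a density should.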
Other techniques for estimating densities include K-Means, EM [27, 47], and Bayesian
methods [83]. Recall the example of the implicit shape representation with the shape
encoded in the largest level-set. Probability density estimates suffer from the same
problems that this representation does. Since peaked values of functions are not stable,
recovering geometry from densities is complicated. The marching squares approach that
can be used to recover a mesh from the signed distance function no longer applies since
the level-sets are now shaped like the cross-sectional profile of the kernel g. Note that
unimodal kernels cannot have broad and smooth profiles for the level-sets of large values
that cross through the centroid: then the expected value under the model density g would
not represent the centroid. Bi-modal and multi-modal densities frequently arise in the
wavelet literature in the context of approximating curves [13].
There is a close relationship between Gaussian Parzen density estimators and
unsigned distance functions: as σ → 0 the logarithm of the mixture approaches the
unsigned distance. This fact as well as the ease of computation and robustness of the
densities biases one towards this representation in applications. On the other hand, the
unsigned distance lacks the crucial topological information that the signed distance has.
In this work, we try to bridge the gap between signed distances and densities by breaking
the symmetry of the Gaussian density and using Gabor functions (wavelets) as square-root
density estimators. We find that square-root densities are an interesting new area ripe for
exploration in the field of shape representation.
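The σ → 0 relationship mentioned above can be checked numerically. The precise sense I assume here (the standard Laplace-approximation reading, not a statement from the text) is that −σ² times the log of the unnormalized Gaussian mixture tends to half the squared unsigned distance to the sample set:

```python
import numpy as np

def scaled_neg_log_mixture(x, m, sigma):
    """-sigma^2 * log sum_a exp(-(x - m_a)^2 / (2 sigma^2)),
    computed with the usual max-shift for numerical stability."""
    e = -(x - m) ** 2 / (2 * sigma ** 2)
    return -sigma ** 2 * (e.max() + np.log(np.exp(e - e.max()).sum()))

# Illustrative samples on the "shape" and a query point.
m = np.array([0.0, 1.0, 3.0])
x = 2.2
for sigma in (0.5, 0.1, 0.02):
    print(sigma, scaled_neg_log_mixture(x, m, sigma))
print(np.min((x - m) ** 2) / 2)   # the sigma -> 0 limit: half the squared distance
```

As sigma shrinks, the nearest sample dominates the sum and the scaled negative log approaches min_a (x − m_a)² / 2, recovering the (squared) unsigned distance but, as the text notes, none of the interior/exterior information of a signed distance.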
1.1.2 Explicit Shape Representations
Explicit shape representations differ from implicit representations in that the
domain of the function which is used to recover the shape is a set of indices that refers to
parameters. So explicit representations include sets of points, meshes, and graphs. This
work focuses on the implicit shape representation that emerges from a particular choice
of representing function, given an explicit set of locations and directions. This section
serves to prime the reader on competing methods for solving the problems that we use to
validate the quality of the representation.
At the very simplest end we have point-sets: Ω = I = {1, 2, . . . , n} and R = R^d.
In this setting reconstruction of an embedded shape is ill-posed, but we can hypothesize
polygonal reconstructions up to a rigid transformation by choosing an ordering on the
points. Unfortunately, there is no universally accepted hypothesis test or metric for
choosing such a surface. If we suppose that we also have explicit neighborhood information
we could let Ω = I = {1, 2, . . . , n} and R be a set of neighbors and distances to the
neighbors. In this setting we are specifying the shape by an adjacency list, which we
could convert into a matrix. However, there are many ways to embed a set of points
corresponding to indices 1, . . . , n into e.g. R^3. For instance, if we have an embedding
of points m_1, . . . , m_n ∈ R^3 and then we apply a rotation R and a translation T, then
{T R m_i}_i ∼ {m_i}_i under this representation. Moreover, the entire class of isometries
(distance preserving maps) is an invariance class for this representation. This can be a
good thing: we may want to encode some invariance in our representation, so that we
identify shapes based on intrinsic properties.
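The isometry invariance just described is easy to verify numerically: the pairwise-distance (adjacency) representation is unchanged by any rotation R and translation T. The random transform below is illustrative (QR of a random matrix gives an orthogonal map, possibly a reflection, which is still an isometry):

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.standard_normal((6, 3))              # embedded points m_1, ..., m_6

def pairwise(points):
    """The intrinsic representation: matrix of pairwise distances."""
    d = points[:, None, :] - points[None, :, :]
    return np.sqrt((d ** 2).sum(-1))

q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthogonal map R
t = rng.standard_normal(3)                          # translation T
moved = m @ q.T + t                                 # T R m_i

# {T R m_i} ~ {m_i}: identical pairwise distances, different coordinates.
assert np.allclose(pairwise(m), pairwise(moved))
```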
Another example of an explicit shape representation is a mesh. Meshes can be
represented in an abstract way without specifying any locations for the simplices.
However, to do so we need to specify additional intrinsic properties: one can specify a
set of surrounding triangles (as embedded triangles in R3, not abstract simplicial features)
and their associated vertices in a particular order.
Neither meshes nor adjacency matrices are the primary explicit shape representation
of interest for comparison with this work. Rather, an intermediate representation with
Ω = I and R = (Rd)2 is considered, with the first entry being a point and the second entry
a normal vector to an underlying curve or surface. We refer to these as oriented point-sets.
An example of an oriented point set is the set of barycenters and normal vectors from a
mesh. Note that an oriented point may have a normal vector component with magnitude
zero, in which case it effectively acts as an un-oriented point. While the curve
estimation problem remains ill-posed with oriented point-sets, we now have a first-order
condition for C^1 surfaces that allows us to better evaluate a hypothesized sequence of points, curve, or
spline.
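The mesh example above (barycenters and normals) can be sketched for a single embedded triangle; the function name and the zero-normal convention for degenerate triangles are mine, chosen to match the un-oriented-point remark:

```python
import numpy as np

def oriented_point(tri):
    """Barycenter and unit normal of an embedded triangle in R^3,
    giving one (point, normal) pair of an oriented point-set."""
    a, b, c = np.asarray(tri, dtype=float)
    center = (a + b + c) / 3.0
    n = np.cross(b - a, c - a)            # orientation from vertex order
    norm = np.linalg.norm(n)
    if norm == 0.0:                       # degenerate: acts as an un-oriented point
        return center, np.zeros(3)
    return center, n / norm
```

A triangle in the z = 0 plane with counter-clockwise vertex order yields the normal (0, 0, 1); reversing the vertex order flips the orientation.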
1.2 Prior Work on Registration and Matching
Many problems in computer vision require us to determine correspondences between
similar sets of features. Matching generally refers to these types of problems. Despite
the frequency with which these types of problems arise, researchers are often faced with
scenarios where it is very difficult to even define what a correspondence between two
objects should be—no natural map, moreover bijection, may exist at all. This is often due
to mismatched representations. Work focused on determining point correspondences for
matching organized features has been abundant, as is highlighted below, but there remains
a clear need for handling mismatched representations. With this limitation in mind, given
feature sets A = {f_i}_{i=1}^N and B = {g_i}_{i=1}^M, a correspondence between A and B is a function
h : {1, . . . , N} → {1, . . . , M} ∪ {∅}.
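For point features, one concrete realization of such a partial map is thresholded nearest-neighbor assignment, with ∅ modeled as None for unmatched features; the threshold and data below are illustrative, not from the text:

```python
import numpy as np

def correspondence(A, B, max_dist=1.0):
    """h : {1,...,N} -> {1,...,M} u {None}: nearest neighbor in B,
    or None when no neighbor lies within max_dist (unmatched)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    h = {}
    for i, f in enumerate(A):
        d = np.linalg.norm(B - f, axis=1)
        j = int(d.argmin())
        h[i] = j if d[j] <= max_dist else None
    return h

A = [[0.0, 0.0], [5.0, 5.0], [40.0, 0.0]]
B = [[0.1, 0.0], [5.0, 4.8]]
print(correspondence(A, B))   # the far point maps to None
```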
First note the subtle use of matching vs. registering: when working with an implicit
representation of shape registering is more appropriate since accounting for a “matching”
between two fields is intuitively difficult to check while a “registration” implies an
underlying point-wise alignment. A registration can be used to determine a matching
using a nearest-neighbor assignment; a matching can determine a registration once a
space of transformations is determined and a choice of estimator for the selection of a
transformation is chosen. This work provides a method for producing a registration for the
mismatched case where the template consists of oriented points and the target consists of
points, under the assumption that both template and target are drawn from outlines of
shapes coming from the same class.
Given a collection of features describing a set or object embedded in Rd, correspondences
can be obtained via registration and vice versa. One of the first broadly successful
approaches to registering point-sets was Iterated Closest Point (ICP) alignment [5].
ICP is a simple algorithm that alternates between two stages:

1. Find the nearest neighbors to points in the moving template,

2. Fit a transformation to the correspondences generated from these neighbors.
ICP terminates when the change in the transformation is sufficiently small or none of
the neighbors change. ICP uses both registration and correspondence in the alignment
process. There are many variants of ICP [85]. Some of the most popular employ
restrictions on the types of matches considered or use sampling strategies to improve
the robustness of the final fit. We highlight ICP because it is one of the few approaches to
registration or alignment in which the authors emphasized the algorithm's ability to handle
mismatched representations: they explain how to compute point-to-parametric-entity
distances in their paper [5]. Note that ICP can be used for a variety of transformation
types, just like the algorithms presented below. The field of registration has gone in two
very different directions since the development of ICP: measure-based registration and
correspondence-based registration. We will discuss both now.
In medical image processing, special emphasis is placed on the modeling of the features
and objects being registered. A major component in the modeling is the use of density-
or measure-based shape models, such as discussed above. Registering these models is
often done with measure-based registration approaches. In these approaches, sparse
feature sets are first converted into scalar field representations. Then the fields are
aligned under some metric on the representations. Finally, non-rigid registration of the
template field with that of the target yields dense point-to-point correspondences after
post-processing. In this representation we expect high values to be associated with
the object (set or its boundary), require that our functions are positive valued and
integrate to 1 on Ω, and can only reconstruct points on the object by sampling. When
point-features or images are converted into non-parametric densities then information- and
estimator-theoretic distances can be employed. Kernel Correlation [98] and gmmreg [52]
both use Parzen-window densities, relying on correlation and L2-distance for registration
objectives, respectively. The method of matching distributions, or currents, [37] allows
singular measures to be matched. A crossover between the density and distance fields
[deng2014riemannian] utilized distance transforms, yielding a density field which is
matched by a geodesic distance. In these works the unifying theme is a field that organizes
the point-features in terms of uncertainty.
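To make such objectives concrete: for equal-weight isotropic Gaussian mixtures, an L2 registration objective of the kind used in gmmreg [52] is available in closed form, because the integral of a product of two Gaussians is itself a Gaussian evaluation. A sketch under those assumptions (our own, not the reference implementation):

```python
import numpy as np

def mixture_overlap(A, B, sigma):
    """∫ p_A(x) p_B(x) dx for equal-weight isotropic Gaussian mixtures with
    centers in the rows of A and B and common bandwidth sigma. Uses the
    identity ∫ N(x; a, σ²I) N(x; b, σ²I) dx = (4πσ²)^(−D/2) exp(−‖a−b‖²/(4σ²))."""
    D = A.shape[1]
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise ‖a−b‖²
    return (4 * np.pi * sigma**2) ** (-D / 2) * np.exp(-d2 / (4 * sigma**2)).mean()

def l2_distance_sq(A, B, sigma):
    """Closed-form ∫ (p_A − p_B)² dx, a gmmreg-style registration objective."""
    return (mixture_overlap(A, A, sigma)
            - 2 * mixture_overlap(A, B, sigma)
            + mixture_overlap(B, B, sigma))
```

Minimizing `l2_distance_sq` over a transformation applied to A registers the two point-sets; Kernel Correlation [98] instead maximizes the cross term.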
SDFs organize features in terms of geometry. Registering SDFs has also been
attempted using an approach [51, 77] based on modeling pixel-wise behavior in a
pre-computed distance transform. The first technical problem encountered in registering
SDFs is the choice of a distance measure between them. The aforementioned methods [51, 77]
use likelihood and mutual information based approaches on the values of the signed
distances treated as per-pixel random variables. While this is effective for the problem
at hand, it leaves much to be desired: these methods only consider small ranges of deformation,
are very slow, and do not address some of the problems that we point out below. In any
framework for registering transforms, such as distance functions, one must address the
problems inherent to the transform functions themselves which we now list. Far away
from the shape boundary (in the far field) SDFs take large values. This renders many
standard distances useless, like Lp or W p, unless one restricts to a compact domain
beforehand. Choosing this domain in a general way to allow arbitrary transformations
registering template to target requires the selection of invalid location values or the
repeated computation of intersections (and hence changing of the objective domain). Not
only is this awkward, it is also mathematically inconsistent. The second problem (referenced
previously) is that SDFs are usually not available in closed form, in sharp contrast to
parametric density representations. This implies that closed form distances between SDFs
are elusive. Third, note that registering is extremely difficult to perform within the space
of SDFs. For ϕ ∈ H to maintain the properties of SDFs, H must be included in the
rigid transformations; to go beyond this, some tampering with the values of the function
is required. The difficulty of managing this constraint is related to the reinitialization
problem in level-set methods [33, 39, 94] where ϕ is the (instantaneous) motion of an
interface represented by a level set function.
Point-set based methods usually feature explicit estimation of correspondences,
possibly in a soft or probabilistic fashion. Coherent Point Drift (CPD) [72] and TPS-RPM [18]
are two standard-bearers. TPS-RPM alternates between estimating the (soft) correspondence
and a TPS deformation. CPD uses a similar formulation, but also imposes additional
constraints (arising from motion coherence theory) on the deformation. RPM-LNS [111]
imposes symmetric neighborhood structures to preserve local shape while allowing global
deformation.
Graph matching methods have also been employed for point registration [38].
Local and global relations can be encoded in graphs, yielding a powerful structure for
correspondence estimation. While graph matching is a computationally hard problem,
algorithms for structured graphs and relaxation techniques show promise for point
matching [55, 112, 113]. When a planar shape is available as a cyclic graph elastic
matching can be done quickly given a choice of point descriptor (such as curvature)
[87]. Manifolds induce Laplace-Beltrami eigenfunctions [60], providing a canonical basis
from which to perform matching from a joint coordinate perspective [58] or a function
mapping perspective [76]. These methods all rely on equivalent organization of source
and target. While organization elevates the richness of the matching techniques available,
it also presents a difficulty: these methods require a level footing between template and
target. Estimating a graph or mesh from points can be very challenging.
Point feature organization can be viewed from many perspectives: computational
geometric methods [31], psychological gestalt principles [9, 45, 67], clustering [79, 108],
and level-set methods [69, 110] all organize points in some sense. Shape representations
are typically chosen to engender a desired organizational aspect of shapes [56]. Through
a multi-valued function or a distributional representation, different aspects of shapes
can be embodied in fields that interact predictably [15, 20, 43]. These works provide a
spectrum of organizational principles that can be used to temper the difficulty of the
point matching problem. In this work we obtain a reconstruction while registering, which
means that no target structure needs to be estimated before registering. Few works touting
simultaneous registration and reconstruction are currently available [17, 66]. Next we
discuss an important class of geometry processing algorithms that obtain structure from
unstructured observations.
1.3 Prior Work on Surface Reconstruction
Surface reconstruction is the problem of choosing a surface to represent a collection of
partial observations in 3-D (points, oriented points, contours, and patches). This problem
is clearly ill-posed as an infinite number of solutions exist for almost any collection of
features in space. Most of the research on surface reconstruction consists of choosing
a form of regularization, engineering methods to improve the speed of inference from
features to an implicit field and then to a mesh, and data-driven methods. In the first
two cases, surfaces are taken to belong to general classes of smooth shapes. We will
focus on these methods in this review, but will mention how data-driven methods can be
incorporated into our approach as well.
Note that in the following we are focused on reconstructing compact surfaces without
boundary. They have no holes or openings and therefore no boundaries. Representing
boundaries cannot be done easily with a single, continuous implicit function. Also, we
note that most of the methods we will discuss reconstruct implicit fields. Thus, meshes are
obtained by extracting a level-set by marching cubes.
Early in the development of implicit field models the “blobby model”, metaball, or
mixture of densities model became popular for visualizing potential fields of molecules in
Quantum Mechanics [6]. This model is similar to our own and serves as a good starting
point for introduction to the field. The basic idea is to represent a surface as a level-set
of a mixture of Gaussian components. This approach takes advantage of the fact that a
T-level-set of isotropic elements with different variance values can be re-expressed as a
0.5-level-set, and formulates the free parameters in terms of the variance and location
of each element. General quadrics featured in the exponent can be used to model
non-physical shapes including people and natural scenes. While the fitting process is not a
focus of this work, it was studied subsequently [65]. While this metaball approach is less
popular today, it is still commonly used for modeling particle-based matter interactions
(such as fluids flowing against each other) and for modeling molecules [84].
Simplistic atom-fitting approaches slowly gave way to more robust and faster
approaches based on splines and radial basis functions (r.b.f.’s) used in interpolation and
approximation theory [14, 32, 53, 73]. In this approach a norm on the smoothness of the
approximating implicit function is used along with some spline constraints. Researchers
frequently encode normal vectors as “normal points” which have a pre-defined distance
from the surface [14]. Rising to popularity before the wavelet revolution [24, 62], modeling
with metaballs missed out on the scale-space approach of function approximation. As
spline-wavelets [32, 53, 73] and other methods from computational physics [14] came
to maturity these approaches to function approximation gave rise to new surface
reconstruction algorithms. The standard approach to fast fitting of B-splines is using
a k-d-tree approach to localization and fitting a level-set of a simple class of functions
(typically algebraic) to the points in the neighborhood. Care must be taken to maintain a
neighborhood structure and weighting scheme enforcing smoothness of the final function
[73]. Fitting objectives range from least-squares based approaches to physically and
variationally inspired approaches [53, 54].
In the Poisson-based surface reconstruction [53, 54], a set of oriented points (an OPS)
S = {(mi, νi) : i = 1, …, N} is provided and the question of estimating an underlying indicator field
for the implied surface is addressed. Let the indicator field be represented by the unknown
function χ : Rd → R+. The authors of [53] make the observation that the gradient of a
smoothed χ, ∇χ, will be approximated well by a (carefully chosen) interpolated vector
field V . Specifically, they employ the Gaussian interpolant and then solve the Poisson
equation (instead of the vector differential equation ∇χ = V due to integrability concerns)
∆χ = ∇ · V = ∇ · ( (1/N) ∑_{i=1}^{N} g(x − mi) νi ).   (1–2)
The authors address scalability and compare with several standard-bearers on the basis of
computation-time and memory requirements.
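On a periodic grid, the essence of Equation (1–2) can be reproduced in a few lines: smear the normals into a vector field V, take its divergence spectrally, and invert the Laplacian in Fourier space. This is only a toy sketch of the idea (uniform weights, a Gaussian interpolant, an FFT Poisson solve, and normals oriented into the interior so that V approximates ∇χ), not the adaptive octree solver of [53, 54]:

```python
import numpy as np

n = 64
xs = np.linspace(0.0, 1.0, n, endpoint=False)
X, Y = np.meshgrid(xs, xs, indexing="ij")

# Oriented points on a circle; normals chosen to point INTO the interior
# so that V approximates the gradient of the (smoothed) indicator.
M = 64
t = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
mx, my = 0.5 + 0.25 * np.cos(t), 0.5 + 0.25 * np.sin(t)
nx, ny = -np.cos(t), -np.sin(t)

# Smear the normals into a vector field V with a Gaussian interpolant g.
sigma = 0.03
Vx = np.zeros((n, n))
Vy = np.zeros((n, n))
for a in range(M):
    g = np.exp(-((X - mx[a]) ** 2 + (Y - my[a]) ** 2) / (2.0 * sigma**2)) / M
    Vx += g * nx[a]
    Vy += g * ny[a]

# Solve ∆χ = ∇·V spectrally on the periodic grid.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / n)
kx, ky = k[:, None], k[None, :]
div_hat = 1j * kx * np.fft.fft2(Vx) + 1j * ky * np.fft.fft2(Vy)
k2 = kx**2 + ky**2
k2[0, 0] = 1.0            # avoid division by zero; pin the mean of χ to zero
chi_hat = -div_hat / k2
chi_hat[0, 0] = 0.0
chi = np.fft.ifft2(chi_hat).real  # indicator-like field: higher inside the circle
```

Thresholding `chi` (e.g., by marching cubes/squares in 3-D/2-D) would then yield the reconstructed boundary, mirroring the level-set extraction step described above.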
Our own approach can be viewed as similar to this last one in that we exploit a
continuation field. Note that the following reasoning is not rigorous, but provides
a heuristic for the suggested analogy with the Poisson-based reconstruction. The
continuation field that we use comes from a Hamilton-Jacobi equation that only yields a
linear equation under the correct parameterization (Chapter 2). One way to parameterize
it comes from the classical superposition principle: if ψ1, ψ2 ∈ C²(Rd, C) decay quickly
enough and |ψ1*(x) ψ2(x)| = ϵ at x ∈ supp(|ψ1|), then

Arg(ψ1 + αψ2)(x) = tan⁻¹( (|ψ1(x)|² sin(θ1(x)) + αϵ sin(θ2(x))) / (|ψ1(x)|² cos(θ1(x)) + αϵ cos(θ2(x))) ).
Now we can contrast this with the RHS of Equation (1–2) (the argument to the divergence
operator). First off, we require a d-dimensional field to store the spline-based vector
field; the phase-based field only requires the 2-D complex representation. Another issue
is that this vector field is clearly not always integrable and so estimating a distance field
from it is not easy. Indeed, identifying a contour associated with the oriented point-set
can run into stability issues since following streamlines going perpendicular to the normal
direction will lead into regions with very small magnitudes—not to mention vanishing of
the vector field. Finally, the representation by spline-based vector fields is not injective
since {(m, ν1), (m, −ν1)} leads to a zero field. These technical issues suggest that there
is room in the shape canon for yet another continuation-based representation. While this
thesis focuses on the atomic representation explained below in Chapter 2, at the end of
the thesis we outline a direction towards exploiting the superposition principle above for a
general approach.
1.4 Outline of this Document
In Chapter 2 we present the basic genesis of this idea, drawn from the 2014 paper
[20]. This chapter provides a top-down view of the motivation behind this representation.
It explains how the CWR can be thought of as a linearization of the solution space for
Equation (1–1). An anecdotal surface reconstruction example is featured, the modular
distance field is defined, and some theoretical claims are made concerning how well
the finite approximation preserves properties of the modular distance field as well as a
projective embedding theorem.
In Chapter 3 we introduce a model and an algorithm for simultaneous point-set
registration and normal estimation. The normal estimation can be used to extract a
surface reconstruction (as mentioned above) but this is not the focus of this work. We
do compare the model to an algorithm that uses the normal-field interpolant provided
by the action of the deformation on the original oriented point-set, but no extensive
surface-reconstruction metrics are devised.
In Chapter 4 experiments validating the model and algorithm are presented. We also
include an early algorithm for likelihood-based registration and show results from applying
kPCA (using the RKHS corresponding to the image of L2(Rd) under the appropriate
Gabor Transform) to a subset of the MPEG7 dataset (planes) for surface reconstruction,
density estimation, and curve classification.
Chapter 5 outlines some of the theoretical work we have done so far. We
have proved a theorem on the connectedness of the zero level-set of the phase of a mixture
of two atoms under special conditions. We have also shown the conditions under which one
can extend beyond 2 atoms, and the overall class of curves that can be approximated
under these conditions. We also provide the theoretical justification for the representation
which goes through the modular distance field estimation.
In Chapter 6 we show several applications of the idea that go beyond registration.
One can more readily perform shape statistics using the CWR and we discuss this in a
section. We also show how to extract curves from the implicit function given by the phase
of the CWR. Beyond that, we extend the representation in two new directions: nonlinear
phase components and embedding on general geodesically complete, closed, Riemannian
manifolds. This allows one to represent co-dimension > 1 shapes with the CWR—a
distinguishing characteristic among implicit shape representations.
Finally, in Chapter 7 we review the thesis and discuss future directions of research.
CHAPTER 2
REPRESENTING SHAPE WITH PHASE
In this chapter we seek to address the lack of rapprochement between distance
transforms and density functions. The main advantage of the distance transform is its
implicit curve representation whereas that of the density function is its representation of
uncertainty and noise. A unified representation would be beneficial to shape analysis
provided the respective advantages of both the distance and density functions are
preserved. To set the stage, we first turn to the mathematical underpinnings of the
distance function since they hold the key to subsequent integration.
Distance transforms satisfy the static Hamilton-Jacobi equation ∥∇S(x)∥ = 1 where
S(x) is the distance function. If the signed distance function is sought, the zero level set
of S(x) is a set of curves embedded in 2-D (or a set of surfaces embedded in 3-D). The
typical way to compute these fields given smooth initial conditions S|∂A = 0 is to use
the fast marching method [91]. However, in real-world settings the problem of interest is
typically a harder one:
Given partial boundary data ∂S = P, compute the distance function of S.   (2–1)
In this case the problem is ill-posed: many potential solutions can exist depending on the
structure of P (it could be partial curves, a point-set, an oriented point-set consisting of
locations and normal vectors, etc.). It is worth pointing out that even the fast-marching
method uses an initialization heuristic since the contours are often given as an explicit
sequence of points. The difficulty of computing the signed distance function from an
unorganized set of points is well known (see the massive body of work on reconstruction).
What is not so well known is the curious fact that the static Hamilton-Jacobi equation—a
nonlinear differential equation—is closely related to the static Schrödinger equation—a
linear differential equation [11]. It turns out that the Hamilton-Jacobi scalar field S(x)
is approximately the phase of the complex Schrodinger wave function ψ(x) which is the
solution to the wave equation.
As pointed out in previous work [20, 90], if we parameterize ψ = R exp(iS/ℏ), the
phase that we estimate approximately obeys the eikonal equation:

−ℏ²∆ψ = ψ
−ℏ²∆(R exp(iS/ℏ)) = ψ
−ℏ²(∆R + 2i(∇R · ∇S)/ℏ + iR∆S/ℏ − R(∇S · ∇S)/ℏ²) exp(iS/ℏ) = ψ
⟹ ∥∇S∥² ≈ 1,   (2–2)
with Equation (2–2) converging to the local condition of Equation (1–1) uniformly as ℏ ↓
0. In previous work this relation was used to motivate a distance function representation
using point- and line-potentials and the K0 Green's function for the Schrödinger equation
[90].
Here we are interested in the signed distance. It obeys the same eikonal equation but
also has a continuity condition for the boundary, employing the boundary and domain
condition of Equation (1–1). This is difficult to model with a point-based Green function
for two reasons.
• A Green function g for L will solve Lg = δ, where δ ∈ S′ is the delta distribution. Such a g will have some discontinuity in it (as we see with the Green functions of the Helmholtz and Schrödinger operators). On the other hand, signed distances are smooth through points on the boundary.
• We will need to have different directions associated with different points, to reflect the local direction of the eikonal field.
We do not pursue the Green function formalism in this work but provide a heuristic
parametric solution as a finite mixture model of complex ‘atoms’ (to use a suggestive
term). Since a wave function magnitude is related to a normalizable density function, it is
natural to ask a follow-up question: can probabilistic information concerning a shape
be embedded in the wave function magnitude? To answer this question, we turn to the
consideration of density functions next.
Density functions estimated from unorganized point-sets come in both parametric and
non-parametric flavors. Shape densities in the form of histograms, mixtures of Gaussians,
wavelets and kernel expansions are all used in the literature [23, 81]. If a mixture of
Gaussians is sought, the density function p(x) is peaked at a set of 2-D (or 3-D) “cluster
centers” with the degree of “peakedness” depending on the variance of the underlying
cluster. The difficulty of computing the mixture density function from an unorganized set
of points is well known—involving a search for cluster exemplars and associated covariance
matrices. The question at hand is whether one can associate a density function p(x)
with the squared magnitude of the wave function with the phase of the wave function
continuing to play the role of the distance function. This will boil down to normalizing the
Complex-valued continuation field that results from our choice of mixture below. By doing
so we have a candidate for an integrated shape representation with the wave function
magnitude and phase representing location uncertainty and curve geometry respectively.
2.1 The Complex Wave Representation
We begin by summarizing previous work which introduced an approximation to the
unsigned distance function [90] by solving the Schrödinger equation corresponding to the
static Hamilton-Jacobi equation ∥∇S∥ = 1:
Sτ(x) ≈ −τ log ϕτ(x; µ) = −τ log ∑_{k=1}^{N} exp(−∥x − µk∥/τ),   (2–3)

where µ = {µk : k = 1, …, N} is a collection of locations and the scalar field ϕτ(x; µ) is the solution to
the linear differential equation

−τ²∇²ϕτ(x; µ) + ϕτ(x; µ) = ∑_{k=1}^{N} δ(x − µk).   (2–4)
In Equation (2–4), τ is a free parameter and the approximation Sτ becomes increasingly
accurate as τ → 0. Superpositions of solutions are allowed, in sharp contrast to standard
distance transforms, which break down under addition. In the present work, we seek to go
beyond the unsigned distance transform (and linear differential equation approximations
thereof). In shape analysis, connectedness is fundamental to applications, but is rarely
available explicitly from the representation. Unsigned distance transforms solve a
wave-front equation that is not suited to dealing with issues of connectedness. The
approximation in Equation (2–4) does not fare any better since it is based on an isotropic
Green’s function evaluated at a point-set. To embody connectedness as a feature, ϕτ
must be modified. Drawing inspiration from the complex nature of wave functions in
physics, we introduce a complex modulation factor to Equation (2–3) that encodes normal
information. Intuitively, we can extend the geometric information contained in the normal
of a shape by propagating the phase as suggested by the (classical) superposition principle.
This leads to a phase factor exp(i νkᵀ(x − µk)/λ) modulating the real function ϕτ above [82].
The modulation that results from the newly introduced phase factor [20] acts as a
local spline, borne out in level sets of the phase of the complex wave. The magnitude
then acts as a regularizer to enable joining and hand-off of the spline through neighboring
kernels. Note that the normal information νk is attached to the center µk, and therefore
the statistics needed for this representation are oriented point-sets. The resulting
representation, in its simplest form, is
ψσ,λ(x; µ, ν) = ∑_{k=1}^{N} exp(−∥x − µk∥²/(2σ²) + i νkᵀ(x − µk)/λ),

where ν = {νk : k = 1, …, N} is a collection of normal vectors. N will be used to represent the
number of oriented points in C, with Ni referring to the points in Ci when considering
multiple oriented point-sets. Henceforth, oriented point-sets have unit normals unless
otherwise specified. We will write ψ(x; C) as shorthand for ψσ,λ(x;µ, ν) with σ and
λ suppressed wherever not explicitly needed. The wave function ψ(x; C) contains
geometry information of the curve through the level sets of the phase, probability
Figure 2-1. A simple example of composition of a distance function from oriented points using the Complex Wave Representation (CWR). a) The level-sets of the angle are shown with higher values in red and lower in blue. b) The level-sets of the density are shown. Each oriented point on its own forms a linear wave propagating in the direction of the normal vector. The density associated with each atom alone is Gaussian. By superimposing several waves we get a non-Gaussian density and the phase acts like an approximate signed distance function.
density information via the squared magnitude, and distance information [90] through
the logarithm of the magnitude (as λ→∞ and σ → 0).
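The formulas above are straightforward to evaluate; the following minimal NumPy sketch (our own rendering, with the arctan taken as the four-quadrant arctan2 so the modular distance lands in (−πλ, πλ]) checks the single-atom behavior noted in Figure 2-1: one oriented point produces a linear wave whose phase recovers the signed offset along its normal.

```python
import numpy as np

def cwr(x, mus, nus, sigma, lam):
    """Unnormalized ψ_{σ,λ}(x; µ, ν): a sum of Gaussian-windowed plane waves."""
    diff = x - mus                                   # (N, d) offsets x − µk
    return np.exp(-(diff**2).sum(-1) / (2 * sigma**2)
                  + 1j * (diff * nus).sum(-1) / lam).sum()

def modular_distance(x, mus, nus, sigma, lam):
    """d(x; C) = λ arctan(Im ψ / Re ψ), via the four-quadrant arctan2."""
    psi = cwr(x, mus, nus, sigma, lam)
    return lam * np.arctan2(psi.imag, psi.real)

# A single atom at the origin with normal (1, 0): the phase is x1/λ, so the
# modular distance recovers the offset along the normal (for |x1| < πλ).
mu = np.zeros((1, 2))
nu = np.array([[1.0, 0.0]])
d = modular_distance(np.array([0.2, 0.5]), mu, nu, sigma=1.0, lam=1.0)
# d ≈ 0.2, the signed offset along the normal
```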
A technical issue arises due to the wrapped nature of the 2-D wave function phase. At
any location x, we obtain the modular distance along the normal vector to the zero level
set of the phase
d(x; C) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)]).
Note that the phase (carrying orientation data) is now a property of the field (Figure 2-2),
and is therefore defined everywhere. The unsigned distance transform obtained from
a point-set, despite also being defined everywhere, lacks the crucial connectedness
information, causing its zero level-sets to be broken islands marooning the original
points. The connectedness component afforded by the phase is critical to shape boundary
representation and perceptual grouping. In Chapter 5 proof of the continuation and
connectedness properties are carried out.
A principal advantage of using distance transforms is the integration of point
information via a field. Robust analysis of shapes is enabled by this property. Unfortunately,
the tight constraints imposed by the distance transform (such as ∥∇S∥ = 1) do not permit
averaging, component analysis and the like, thereby limiting the effectiveness of the
representation. The wave representation ψ allows for superposition and other operations
(Table 2-1) enabling a richer variety of potential applications than standard distance
transforms. With distance information in the magnitude along with connectedness and
orientation information provided by the phase, ψ preserves the attractive properties of the
signed distance function.
1 http://www.cise.ufl.edu/~anand/GatorBait 100.tgz
Table 2-1. Technical Layout of the Operations on ψ. This table shows the diversity of the representation ψ for shape analysis. To the best of our knowledge, we have not seen a field representation for shapes with both connectedness and probability information over the whole of the embedding space. Note that the inner product depends on the normalization, by Equation (3–8).
Unsigned Distance:            d²(x; C) ≈ −2σ² log(ψψ*)
Modular Distance (MD):        d(x; C) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
Curve Geometry:               n(x; C) = ∇ λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
Sampling Probability:         p(x; C) = |ψ|² / ∥ψ∥₂²
Spatial, Frequency Variance:  σ, λ
MD Linearity:                 ψ3(x; C3) = ψ1(x; C1) + ψ2(x; C2) → d(x; C3) ≈ d(x; C1 ∪ C2)
Kernel:                       k((µj, νj), (µk, νk)) = exp(−∥µj − µk∥²/(4σ²) − σ²∥νj − νk∥²/(4λ²) + i(νj + νk)ᵀ(µj − µk)/(2λ)) / ((2π)^{D/2} σ^D)
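The kernel row can be sanity-checked by quadrature, since the inner product of two unnormalized atoms is a Gaussian in both the center difference and the normal difference with a linear phase. The sketch below is our own check with arbitrary parameters; our atoms are unnormalized, which produces a (πσ²)^{D/2} factor in place of the table's (2π)^{D/2}σ^D denominator, and the sign of the phase depends on which atom is conjugated:

```python
import numpy as np

sigma, lam = 0.5, 0.5
mu_j, nu_j = np.array([0.0, 0.0]), np.array([1.0, 0.0])
mu_k, nu_k = np.array([0.3, 0.1]), np.array([0.0, 1.0])

# 2-D quadrature grid (the integrand decays like a Gaussian, so [−4, 4]² suffices).
xs = np.linspace(-4.0, 4.0, 400)
h = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")
P = np.stack([X, Y], axis=-1)

def atom(mu, nu):
    d = P - mu
    return np.exp(-(d**2).sum(-1) / (2 * sigma**2) + 1j * (d @ nu) / lam)

# Numerical inner product ∫ ψ_j(x) conj(ψ_k(x)) dx.
inner = (atom(mu_j, nu_j) * np.conj(atom(mu_k, nu_k))).sum() * h * h

dmu, dnu = mu_j - mu_k, nu_j - nu_k
pred_mag = (np.pi * sigma**2) * np.exp(-(dmu @ dmu) / (4 * sigma**2)
                                       - sigma**2 * (dnu @ dnu) / (4 * lam**2))
pred_phase = (nu_j + nu_k) @ dmu / (2 * lam)  # sign depends on conjugation convention
```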
2.1.1 Analysis of ψ
There are some similarities between the wave function ψ (with a Gaussian kernel) and
a Gabor wavelet. The latter has been extensively studied and used with great success in
pattern recognition [59]. One interpretation of ψ is as a square-root density. The study
of Gabor frames for square-root density approximation is basically nonexistent, and is a
possible direction leading out of this research. The unnormalized function |ψ(x)|² is

|ψ(x)|² ∝ ∑_{j=1}^{N} ∑_{k=j}^{N} cos((νjᵀ(x − µj) − νkᵀ(x − µk))/λ) exp(−∥x − µj∥²/(2σ²) − ∥x − µk∥²/(2σ²)).
Note that this is not the L2 norm but the squared magnitude of ψ at location x. It is not
obvious from the expression above, but as |ψ(x)|2 is the magnitude squared of a complex
number, it is nonnegative everywhere. However, unlike Parzen estimates using kernels
with support equal to the entire domain (such as Gaussian or exponential kernels) zeros
may occur when a mixture is formed. When suitably normalized, |ψ(x)|2 can be treated
as a probability density function which immediately connects it to the plethora of shape
density functions used in the literature.
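The k ≥ j form above folds symmetric terms together; if one instead sums over all ordered pairs (j, k), so that each off-diagonal term appears twice, the expansion becomes an exact identity for |ψ(x)|², which is easy to confirm numerically (a spot-check of our own with arbitrary atoms):

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma, lam = 5, 0.8, 0.5
mus = rng.normal(size=(N, 2))
nus = rng.normal(size=(N, 2))
x = np.array([0.3, -0.2])

d = x - mus
amps = np.exp(-(d**2).sum(-1) / (2 * sigma**2))        # Gaussian envelopes
phases = (d * nus).sum(-1) / lam                       # phases νkᵀ(x−µk)/λ
direct = abs((amps * np.exp(1j * phases)).sum()) ** 2  # |ψ(x)|² directly

# Expansion over all ordered pairs:
# Σ_j Σ_k cos(θj − θk) exp(−‖x−µj‖²/(2σ²) − ‖x−µk‖²/(2σ²))
expansion = sum(amps[j] * amps[k] * np.cos(phases[j] - phases[k])
                for j in range(N) for k in range(N))
```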
Figure 2-2. Visualization of the phase of ψ. Left: level sets of the unsigned distance transform. Center: oriented point-set. Right: level sets of the phase of the CWR. In the second row, the scanlines indicated by the red lines are shown. Near the point locations, the level set of the phase is much clearer, while for the unsigned distance function the normal information is totally inaccurate near the data points. Conforming to the continuity requirement for perceptual grouping [86] is a primary attribute of ψ. In the unsigned distance transform, the pectoral fin (the small fin below the gill fin) is basically indiscernible whereas in the phase it is very clear. The point-set used is a sampling from the GatorBait data set1, specifically Acanthuridae Acanthurus Chronixis, composed of a linear superposition of shapes that individually form closed curves. See also Figure 3-3.
2.1.2 ψ for Oriented Multi-curve Shapes
As discussed above, ψ has unique properties (relative to distance functions) stemming
from the additivity of the representation, i.e., its “superimposability”.
Depending on the choices of the free parameters σ and λ, modifying a shape with new
position and orientation data can be very easy. We briefly justify the viability of this
attractive property, and its limitations, below.
When

d(x) = λ arctan(Im[ψ(x; C)] / Re[ψ(x; C)])
     = λ arctan( ∑_{k=1}^{N} sin(νkᵀ(x − µk)/λ) exp(−∥x − µk∥²/(2σ²)) / ∑_{k=1}^{N} cos(νkᵀ(x − µk)/λ) exp(−∥x − µk∥²/(2σ²)) )
is evaluated, the contribution of each of the cluster centers to the sum decays exponentially;
the slow growth of the arctangent yields stability to small contributions. To see what this
means for superposition, consider an oriented point-set C1 and let C2 be a new oriented
point-set to be superimposed. Let q1 be the zero level set of the unwrapped d(x;C1) and q2
be the zero level set of the unwrapped d(x;C2). Provided that p(x;C2)≪ p(x;C1), ∀x ∈ q1
and p(x;C1) ≪ p(x;C2), ∀x ∈ q2, the superposition of C1 and C2 is stable: the resulting
zero level sets approximately match q1 ∪ q2. This superposition principle allows us to
compose shapes additively.
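The MD linearity row of Table 2-1 can be observed directly: when two oriented point-sets are far apart relative to σ, adding their fields leaves each one's modular distance intact near its own points. A small check (our own construction, with two distant horizontal segments):

```python
import numpy as np

sigma, lam = 0.5, 0.3

def psi(x, mus, nus):
    d = x - mus
    return np.exp(-(d**2).sum(-1) / (2 * sigma**2)
                  + 1j * (d * nus).sum(-1) / lam).sum()

def md(z):
    """Modular distance from a complex field value."""
    return lam * np.arctan2(z.imag, z.real)

# C1: a segment on the x-axis near the origin; C2: the same segment shifted far away.
xs = np.linspace(-1.0, 1.0, 21)
mus1 = np.stack([xs, np.zeros_like(xs)], axis=1)
nus1 = np.tile([0.0, 1.0], (21, 1))
mus2 = mus1 + np.array([10.0, 0.0])
nus2 = nus1.copy()

x = np.array([0.0, 0.1])                           # a probe point near C1
d1 = md(psi(x, mus1, nus1))                        # distance from C1 alone
d3 = md(psi(x, mus1, nus1) + psi(x, mus2, nus2))   # superposed field C1 ∪ C2
# near C1 the far-away C2 contributes exp(−100/(2σ²)) ≈ 0, so d3 ≈ d1
```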
The takeaway from this is that multiple curves can often be “added” easily: if one has
multiple CWRs, then provided that the properties detailed above hold, one can compute
the field by simply adding their fields together. Under this operation, the stability of
the level sets depends on the distance to the initial set and the free parameters. When
fields interact with each other and the above fails, then point discontinuities can arise in
the phase field of ψ. However, provided that the abutting shapes have agreeing normal
information (as in Figure 2-2), the resulting superposition can maintain the desirable
features of each of the underlying sets.
The frequency of the oscillatory part and spatial accuracy of the density play a
key role in the level sets of d. If the sampling of the curve location or normal data is
insufficient, the superposition limitation mentioned above kicks in and the curve may
be grouped incorrectly. As superimposed shape boundaries abut, the phases begin to
interfere. Eventually, once the abutting shapes fall close enough within the receptive
fields of one another, the curves cancel each other out.

Figure 2-3. Merging of two curves as oriented points move closer together. Since the normals are aligned opposite to each other near the center of the image, the curve portions originating from those points cancel each other out.

The parameters σ and λ act
as uncertainty parameters between the multi-curve and high curvature paradigms of
shapes. A route to mitigating the abutment issue (a universal phenomenon in multi-curve
representations) is allowing non-uniform frequency and spatial parameters to control the
degree of precision of d.
2.2 Wave Mixtures as Geometric Primitives
Here, we state additional interesting properties of this shape representation for the
purposes of curve reconstruction and as a feature function before proceeding to showcasing
simultaneous matching and reconstruction (Chapter 5 contains proofs). The analysis
(mathematical and experimental) is extended beyond the 2-D setting.
Consider a point-set augmented with directional information at each point. That is,
let S = {(m_a, ν_a)}_{a=1}^M, where ν_a is a normal associated with the point m_a. We use S to
denote the set underlying the oriented point-set S, with each m_a ∈ ∂S and ν_a pointing
in the outward direction from S. The complex field we use extends the standard Gaussian
Parzen window density to a square-root of a density by using the normal information,
written (unnormalized) as

ψ_S(x) = Σ_{a=1}^M exp( −||x − m_a||²/(2σ²) + i ν_a^T (x − m_a)/λ ). (2–5)
λ controls the frequency of the wave: the lower the value of λ the higher the spatial
frequency. The wave oscillates along the normal near a point feature but integrates
information from different wavefronts in the far field (near and far are a function of σ, λ).
The squared magnitude of ψ(x) encodes probability density information. Zero level-sets of
the phase now carry shape geometry information.
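To make Equation (2–5) concrete, the sketch below evaluates ψ_S on query points for oriented points sampled from the unit circle (an illustrative NumPy configuration; the parameter values are arbitrary, not those used in later experiments) and checks that the magnitude concentrates near the curve.

```python
import numpy as np

def cwr(x, centers, normals, sigma, lam):
    """Evaluate the complex wave representation (Eq. 2-5) at query points x (K, 2)."""
    diff = x[:, None, :] - centers[None, :, :]            # (K, M, 2)
    sq = np.sum(diff ** 2, axis=-1)                       # ||x - m_a||^2
    phase = np.einsum('kmd,md->km', diff, normals) / lam  # nu_a^T (x - m_a) / lambda
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase), axis=1)

# Oriented points on the unit circle; the outward normal equals the position.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
centers = np.stack([np.cos(t), np.sin(t)], axis=1)
normals = centers.copy()

mag_on = abs(cwr(np.array([[1.0, 0.0]]), centers, normals, 0.1, 0.05)[0])
mag_far = abs(cwr(np.array([[3.0, 3.0]]), centers, normals, 0.1, 0.05)[0])
```

Nearby atoms add nearly coherently on the curve, while the Gaussian modulus suppresses everything in the far field.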
The mixture in Equation (2–5) has similarities to the venerable Gabor filter or
wavelet—well known to vision researchers and mathematicians [24, 48, 59]. This allows us
to leverage the mathematical literature to prove useful properties of this representation,
such as the proof of injectivity below, which follows a similar argument for a related Gabor
system [48]. Gabor systems are families of time-frequency translates of an admissible
function. The primary difference between the choice of atom in Equation (2–5) and
the kind used in signal processing [59] is that we do not enforce biologically motivated
constraints. The connection to signed distance functions (and static Hamilton-Jacobi
equations) is more subtle [2].
Figure 2-4. Zero level-sets of the phase of ψ for subject 1 of FAUST under several different values of σ. A uniform value of σ is used across all of the oriented points. σ ranges from 1 × 10^{−4} to 1 × 10^{−2} from top to bottom, left to right. Note the stippling in the upper-right rendering. This shows the region of influence of each oriented point.
In contrast to the unsigned distance function, the signed distance is smooth across
shape boundaries (providing a stable reconstruction) with the sign of the distance
indicating whether a location is inside or outside the shape. When we fit Parzen
window density estimators to a point-set, we can obtain an approximate unsigned
distance function at every point. The relation G(x) ≈ C_R e^{−R²(x)/(2σ²)} holds (with C_R being
a normalization constant), where the approximate unsigned distance function R(x)
approaches the true distance pointwise as σ decreases toward zero [90]. For oriented
point-sets the relation is

ψ_S(x) ≈ Ψ_S = exp( −b_S²(x)/(2σ²) + i b_S(x)/λ ), (2–6)
where bS(x) is the SDF. We refer to ΨS as the modular distance function or MDF for a
set S. For a fixed S, the approximation becomes more accurate as σ, λ → 0. Note that
the magnitude is agnostic to the sign of the distance whereas the phase carries the sign
but is modular due to the wrapped nature of the phase. Note that we do not require or
use phase unwrapping; all of the analysis will be carried out in the wrapped setting (phase
unwrapping occurs only when surface reconstruction is executed, in Chapter 6). When
referring to the modular distance function, we will make clear whether we mean the
abstract function in Equation (2–6) or the modular distance that arises from the CWR.
A few key advantages to using the modular distance function in lieu of the signed
distance function are: i) the modulus decays as we approach the far field, handling the
far-field issue mentioned above; ii) Equation (2–6) allows us to derive distances in closed
form (Equation (3–8)); iii) we avoid the concerns with region vs. boundary representation
(in exchange for modularity); iv) it can be approximated with a parametric mixture as laid
out in this thesis document.
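The phase/signed-distance relationship in Equation (2–6) can be checked in a minimal setting: for collinear oriented points that share the normal (0, 1), every atom contributes the identical phase y/λ at a query point (x, y), so the phase of ψ equals the signed distance divided by λ exactly. A small sketch (illustrative parameters):

```python
import numpy as np

# A straight "wall" of oriented points on the x-axis, all normals (0, 1).
xs = np.linspace(-2, 2, 401)
centers = np.stack([xs, np.zeros_like(xs)], axis=1)
normal = np.array([0.0, 1.0])
sigma, lam = 0.2, 0.05

def psi(x):
    diff = x - centers                    # (M, 2)
    sq = np.sum(diff ** 2, axis=1)
    phase = diff @ normal / lam           # equals x[1] / lam for every atom
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase))

y0 = 0.1                                  # signed distance of the query point
val = psi(np.array([0.0, y0]))
phase_err = abs(np.angle(val) - y0 / lam) # y0 / lam = 2 rad, inside (-pi, pi]
```

For curved boundaries the agreement is only approximate (and modular), as discussed above.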
2.2.1 A Note on Gabor Analysis
While this work is primarily applied, we think it is worth mentioning the history
of the Gabor or Weyl-Heisenberg representation to put some of the mathematics
in context. Gabor originally proposed the time-frequency [35] representation as an
uncertainty-minimizing family of functions that act as a frame from which to encode
signals. The family of time-frequency shifts

{T_m E_v g : v ∈ R, m ∈ R}

implements the Weyl-Heisenberg or Gabor transform from L²(R) → L²(R²). That is, the
operator

Ψ_g : L²(R) → L²(R²) : f ↦ f̂, with f̂(a, b) = ⟨T_a E_b g, f⟩,
is a faithful square-integrable representation of the Weyl-Heisenberg group implementing
an isometry between the domain and range [49]. While the advent of Wavelets has
refocused the interests of many in the Vision and Electrical Engineering community
towards approximate phase-space packet design, the Gabor Transform still has a number
of interesting related open problems (from a pure and applied standpoint). For instance:
which g lead to sums Σ_{(a,b)∈Λ} c_{a,b} T_a E_b g being linearly independent for arbitrary
finite Λ ⊂ R^{2d}? [48]. The proof of the embedding of sets of oriented points relies on
arguments that lead to this question. Though we have not addressed it in this work, the
use of a discrete Gabor Frame offers the promise of reconstruction error bounds [24, 59].
Finally, uncertainty relations (the support of Ψg is never a set of finite measure) and
stable reconstruction (the support is never too localized) are both intrinsic features of the
Gabor transform [105].
2.2.2 Square-root Densities and Probabilistic Interpretation of the Complex Wave Mixture
The CWR provides a geometric completion field or implicit shape representation
in the phase, as shown above visually and explained in further detail below theoretically.
Another perspective on the CWR is as a density. Square-root densities have been used in
the shape literature before, and can offer some advantages over classical densities from the
standpoint of having an elegant analytical expression for common information-theoretic
objectives [80]. From Table 2-1 we see that the normalized CWR is the square-root
of a density. We now briefly discuss some useful features of the CWR, and also Gabor
wavelets, as a square-root density.
First, from an inference standpoint the CWR can provide a better density estimate
for points drawn along curves, provided that the normal estimates are correct and, for
neighboring points on the curve, the line perpendicular to each oriented point intersects
the other along their Voronoi boundary. In Chapter 5 we discuss the significance of
this requirement in more detail. The effect is that the density estimates develop a
nonuniformity that reflects the coherence of the atoms: when the normal vectors are
aligned correctly this is often called coherence in the physics literature [89]. A principal
feature of coherence between atoms is the superposition aspect: when a well defined phase
can be observed the atoms are said to be coherent, but an indistinguishability between
the two emerges as a byproduct. In density estimation we observe the same effect: when
the oriented points or atoms are “coherent” we see an increase in density along a ridge
that follows the parameters, but we cannot say which one of the atoms is responsible for it
since both are necessary to observe a departure from normal Gaussian density values.
2.3 Relationship Between Signed Distances and Complex Wave Mixtures
To solidify the claim made in Equation (2–6), first note that ||Ψ_S||₂ < ∞: |Ψ_S(x)|² =
|exp( −b_S²(x)/(2σ²) + i b_S(x)/λ )|² is dominated by its concave envelope Ψ̄_S, which has

||Ψ_S||₂² ≤ ||Ψ̄_S||₂ ≤ ((2πσ²)^{d/2} + 1) π^{d/2} diam(S)^d / Γ(d/2 + 1), (2–7)

by an application of volumes of revolution. Furthermore, note that

||Ψ_S||₂² ≥ (2πσ²)^{d/2} (2–8)

as d(x, p) ≥ d(x, S) for all p ∈ S.
Then, note that as σ → 0, ⟨exp( −||x−m||²/(2σ²) + i ν^T(x−m)/λ ), Ψ_{σ,λ}⟩ → 0 whenever m ∉ ∂S.
And as λ → 0, destructive interference causes ⟨exp( −||x−m||²/(2σ²) + i ν^T(x−m)/λ ), Ψ_{σ,λ}⟩ → 0 by an
application of the stationary phase expansion [106]. This means that as σ, λ shrink, the
only significant coefficients of the Gabor Transform of Ψ_S come from atoms centered on
the boundary, oriented in the outward normal direction.
This result implies that the CWR parameters are essential to the representation of an
MDF, with increasing relative weight in the Gabor expansion of the MDF as the variance
parameters shrink. The takeaway from this is that the CWR approximately recovers the
MDF if we can take the parameters to be sufficiently small—which is fine to do provided
we can sample densely enough with sufficient precision. In Chapter 6 the proximity of the
CWR to the MDF is explored empirically. More evidence supporting the substitution of
the signed distance by the complex wave mixture is provided in Section 3.3. Theoretical
results on the relationship are established in Chapter 5.
2.4 An Embedding Theorem for Complex Wave Mixtures
In some contexts, invariance of representation is desirable [58, 76]. For the purposes of
deformable matching, however, having a 1-to-1 mapping between the point features and
function representation is a prerequisite for employing distances as objective functions: if
a feature function is not injective, it is possible that two non-registered point-sets result
in the same feature functions, with zero distance between them. This is precluded in the
complex wave representation. Note that this injectivity was not furnished in the original
work [20]. The proof below follows Heil, Ramanathan, and Topiwala’s paper [48].
Theorem 2.1. ψ_(·) is an injective map from finite sets of oriented points to L². Any metric
on L² distinguishes oriented point-sets under this representation.
Proof. Let A = {(m_a, ν_a)}_{a=1}^A, B = {(q_b, ω_b)}_{b=1}^B be distinct oriented point-sets. We will
show that ψ_A − ψ_B is not identically zero. Suppose that m_1 (a location in A, with index 1
by reordering) is on the convex hull of K = {m_a}_{a=1}^A ∪ {q_b}_{b=1}^B. Without loss of generality
assume m_1 = 0. Let C = A ∪ B \ {(m_1, ν_1)}. Then

ψ_A − ψ_B = exp( −||x||²/(2σ²) + i ν_1^T x/λ ) + Σ_{(r,γ)∈C} h_{(r,γ)}(x) exp( −(||x||² − 2x^T r)/(2σ²) )
          = exp( −||x||²/(2σ²) ) [ exp( i ν_1^T x/λ ) + Σ_{(r,γ)∈C} h_{(r,γ)}(x) exp( x^T r/σ² ) ],

where each h_{(r,γ)} = (−1)^{[(r,γ)∈B]} exp( −||r||²/(2σ²) + i γ^T(x−r)/λ ). Since m_1 is on the convex hull, there
is a ray {κp}_{κ>0} in the Voronoi cell of m_1 (relative to K). Fix ϵ ∈ (0, 1). Then there is a κ
sufficiently large that

| Σ_{(r,γ)∈C} h_{(r,γ)}(κp) exp( κ p^T r/σ² ) | < ϵ/2,
so |ψ_A(κp) − ψ_B(κp)|² > exp( −||κp||²/σ² )(1 − ϵ) > 0. If an oriented point-set has multiple
oriented points with the same location (but distinct normals at these oriented points) we
can use the injectivity of the Fourier Transform [34] to show that the sum of trigonometric
polynomials (for the duplicated locations) is nonvanishing. Thus, the above argument
holds even in that case. If d is a metric on L² it is nonzero on pairs of distinct functions,
distinguishing oriented point-sets.
Note that in this proof the coefficients (which we have not mentioned in this work,
besides as normalization coefficients when projecting to the unit Hilbert sphere) in front of
the mixture elements of the CWR are immaterial since they are constants. So the intuition
of the resulting proof is: no two distinct mixtures (possibly with identical coefficients)
evaluate to the same phase and the same density as functions on the unit Hilbert
sphere.
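Theorem 2.1 can be sanity-checked numerically (a discretized L2 distance on a grid rather than the exact integral; illustrative parameters): two oriented point-sets with identical locations but one flipped normal yield a clearly positive distance.

```python
import numpy as np

def field(grid, centers, normals, sigma=0.2, lam=0.1):
    diff = grid[:, None, :] - centers[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)
    phase = np.einsum('kmd,md->km', diff, normals) / lam
    return np.sum(np.exp(-sq / (2 * sigma ** 2) + 1j * phase), axis=1)

# Identical locations; B flips a single normal.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
normals_a = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
normals_b = normals_a.copy()
normals_b[0] *= -1.0

g = np.linspace(-2, 3, 120)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
h = g[1] - g[0]

l2_sq = np.sum(np.abs(field(grid, centers, normals_a)
                      - field(grid, centers, normals_b)) ** 2) * h * h
```

The difference field reduces to the flipped atom's interference term 2i sin(ν^T(x−m)/λ) times its Gaussian modulus, which has strictly positive L2 norm.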
CHAPTER 3
REGISTRATION: RESONANT DEFORMABLE MATCHING
In registration, we seek a transformation of the template objects onto the target
objects. We denote the transformation of the positions as {ϕ(m_a)}_{a=1}^M where ϕ ∈ H is an
element of a class of hypothesized transformations. We will also estimate the
normal vectors on the target curve or surface as we perform registration. Once models for
registration and organization are chosen, a smooth model fit measure or distance between
shapes is developed. We choose one that is also expressible analytically and C∞ smooth.
Finally, theoretical relationships with other approaches to deformable registration are
studied.
3.1 Hypothesis Classes for Registration
We depart from standard registration techniques as in our case the transformation of
not only the template centers {m_a}_{a=1}^M, but also the template normals {ν_a}_{a=1}^M, is carried
out under the action of ϕ. The appropriate transformation is the Jacobian, ϕ′, of the
deformation ϕ. ϕ′ evaluated at the point m_a acts linearly on the template normal ν_a. Note
that ϕ′ : R^d → R^d is the derivative of the deformation with respect to the spatial variable,
not the parameters of the transformation. ϕ acts on (m_a, ν_a) by

ϕ · (m_a, ν_a) = (ϕ(m_a), ϕ′|_{m_a} ν_a), with (ϕ′|_{m_a})_{i,j} = ∂ϕ^{(i)}/∂x_j |_{m_a},

and ϕ · S = {ϕ · (m_a, ν_a)}_{a=1}^M. We can write the transformed template as

ψ_{ϕ·S}(x) = Σ_{a=1}^M exp( −||x − ϕ(m_a)||²/(2σ²) + i (ϕ′|_{m_a} ν_a)^T (x − ϕ(m_a))/λ ). (3–1)
Note that the centers and normals have been transformed via the action of the deformation
ϕ but the location variable x remains intact. This allows us to define a distance between
template and target functions in terms of a feature-space domain integral, which we will
minimize w.r.t. ϕ. Before going into more detail about the distance, we discuss several
hypothesis classes commonly used in registration.
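The action ϕ · (m, ν) = (ϕ(m), ϕ′|_m ν) works for any smooth map whose Jacobian is available. The sketch below uses an illustrative quadratic deformation (not one from the thesis) and verifies its analytic Jacobian against central finite differences before applying it to an oriented point.

```python
import numpy as np

def phi(x):
    """An illustrative smooth planar deformation (not one used in the thesis)."""
    return np.array([x[0] + 0.1 * x[1] ** 2,
                     x[1] + 0.05 * np.sin(x[0])])

def phi_jac(x):
    """Analytic spatial Jacobian phi'|_x."""
    return np.array([[1.0, 0.2 * x[1]],
                     [0.05 * np.cos(x[0]), 1.0]])

def act(m, nu):
    """phi . (m, nu) = (phi(m), phi'|_m nu)."""
    return phi(m), phi_jac(m) @ nu

m, nu = np.array([0.3, -0.7]), np.array([0.0, 1.0])
new_m, new_nu = act(m, nu)

# Verify the analytic Jacobian with central finite differences.
h = 1e-6
J_fd = np.zeros((2, 2))
for j in range(2):
    e = np.zeros(2)
    e[j] = h
    J_fd[:, j] = (phi(m + e) - phi(m - e)) / (2 * h)
jac_err = np.max(np.abs(J_fd - phi_jac(m)))
```

Note that the transported normal ϕ′|_m ν is not unit length in general; renormalization or a restricted action (Sections 3.1.1–3.1.2) handles this.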
Figure 3-1. Tait-Bryan angles. ψ provides the rotation about the initial z-axis, θ the rotation about the subsequent y-axis, and Φ the rotation about the subsequent x-axis.¹
3.1.1 Euclidean Transformations
Also referred to as rigid or rigid-body transformations, the Euclidean Transformations
comprise a group given by
E(n) = O(n) ⋉ T(n).

The semidirect-product structure is borne out by observing that applying a translation T
and then rotating by R is equivalent to applying R and then translating by R T R⁻¹, the
conjugated translation. Recall that O(n) includes reflections and rotations. Note that E(n) is a Lie group.
The Euclidean group can be parameterized in several ways. Due to the normality of
T (n) ◁ E(n), we can factor out T (n). T (n) is clearly smoothly isometrically isomorphic
to Rn and therefore is parameterized automatically. O(n) is the catch. There are two
standard parameterizations: Euler (or Tait-Bryan) angles and Quaternions. Before
1 This image was taken from Wikimedia Commons. [19]
proceeding we note that O(n) is not connected—there is no path from the identity to
a reflection about an axis. Therefore the parameterizations we discuss are of SO(n), and
the optimization over reflections must be handled separately.
3.1.1.1 Euler angles
Perhaps the most common parameterization of SO(3) is using yaw-pitch-roll [71],
Tait-Bryan, or Euler angles. The three angles (Φ, θ, ψ) act on a rigid body v by
(Φ, θ, ψ) · v = R^x_Φ R^y_θ R^z_ψ v, where R^a_· indicates a rotation about the a-axis. Note that in higher
dimensions O(n) has the added complication of requiring n(n − 1)/2 angles to parameterize a
general rotation.
To derive updates to the estimate of a rotation for a non-convex problem we will need
to compare (at least) the derivatives of the parameters through the objective function. An
example of this follows
∂_ψ [(Φ, θ, ψ) · v] = R^x_Φ R^y_θ (∂_ψ R^z_ψ) v, with

∂_ψ R^z_ψ =
[ −sin(ψ)  −cos(ψ)  0 ]
[  cos(ψ)  −sin(ψ)  0 ]
[    0         0    0 ].
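Derivatives of rotation parameterizations are easy to get wrong by a sign, so it is worth verifying them numerically. The sketch below assumes the standard right-handed z-rotation convention (an assumption; the thesis does not pin down the convention) and checks the analytic derivative against central finite differences.

```python
import numpy as np

def Rz(psi):
    """Rotation about the z-axis, standard right-handed convention."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def dRz(psi):
    """Analytic derivative of Rz with respect to psi."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[-s, -c, 0.0], [c, -s, 0.0], [0.0, 0.0, 0.0]])

psi = 0.9
h = 1e-6
fd = (Rz(psi + h) - Rz(psi - h)) / (2 * h)   # central finite difference
err = np.max(np.abs(fd - dRz(psi)))
```

The same check extends to the full composition R^x_Φ R^y_θ R^z_ψ by the product rule.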
3.1.1.2 Quaternions
One disadvantage to the Euler-angle representation of SO(n) is gimbal-lock. This
occurs when ψ approaches ±π/2 at which point to achieve a rotation of θ = 0 in the
original coordinate system, θ′ = π/2 is required—leading to a large discontinuity as
ψ → ±π/2. The formal geometric reason for this is that there is no smooth cover of SO(3)
by (S¹)³.
Quaternions offer a solution by parameterizing SO(3) as a double cover. The
representation encodes the transformation as q = (qx, qy, qz, qw) and represents a rotation
of θ = 2 cos⁻¹(q_w) radians about the axis (q_x, q_y, q_z)/sin(cos⁻¹(q_w)). Checking the algebra
of Q verifies the composition of rotations. Geometric analogy with S³ embedded in R⁴
illustrates the double cover: qw = 0 defines a subset of the unit quaternions covering SO(3)
and rotating qw about the origin provides a second cover.
Note that the double cover is only an analogy unless q is fixed to unit length. This
is usually done in an ad hoc manner, rendering the (unit) quaternions still imperfect.
In practice, we use quaternions.
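A minimal sketch of the quaternion parameterization (the standard quaternion-to-rotation formula; the axis and angle are illustrative): the resulting matrix is orthogonal with unit determinant, and the angle θ = 2 cos⁻¹(q_w) is recovered from its trace.

```python
import numpy as np

def quat_to_rot(q):
    """Unit quaternion (qx, qy, qz, qw) -> 3x3 rotation matrix (standard formula)."""
    x, y, z, w = q / np.linalg.norm(q)          # enforce unit length
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])

theta = 0.8
axis = np.array([1.0, 2.0, 2.0]) / 3.0          # unit rotation axis
q = np.concatenate([np.sin(theta / 2) * axis, [np.cos(theta / 2)]])
R = quat_to_rot(q)

orth_err = np.max(np.abs(R @ R.T - np.eye(3)))
det_err = abs(np.linalg.det(R) - 1.0)
# theta = 2 * arccos(q_w) is recovered from the trace: tr(R) = 1 + 2 cos(theta).
angle_err = abs(np.arccos((np.trace(R) - 1.0) / 2.0) - theta)
```

Normalizing q before conversion is the ad hoc unit-length fix mentioned above.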
3.1.1.3 Action of Euclidean transformations on the normal vector
Rigid transformations act on the normal vector by dropping the translation. This
maintains unit length of the normal vector. The differentiation of this transformation with
respect to parameters mimics that performed above (in the Euler angle setting).
3.1.2 Affine Transformations
Affine transformations arise frequently in registration in the imaging field as different
intrinsic parameters for different devices lead to different scales [95]. The parameterization
of affine transformations is somewhat simpler than Euclidean.
We use the augmented matrix representation

M · v = [ A  b ]
        [ 0  1 ] v

applied to homogeneous coordinates (x, y, z, 1) of the points. Therefore, nine parameters
(a, b, c, d, e, f, g, h, i) are used for the linear part and can be differentiated through the
objective function directly by reading off the action above.
Action of Affine Transformations on the Normal Vector.
As above, affine transformations act on normal vectors by dropping the translation.
This leads to stretching of normal vectors. The straightforward derivative with respect to
the affine transformation acting on the normal vector w is

∂_{M_{ij}}(M · w) = E_{ij} w,

where E_{ij} denotes the matrix with a one in entry (i, j) and zeros elsewhere.
It is also possible to implement an action that preserves the unit magnitude of the
normal vectors. The natural way to make the affine group act as SO(d) is to use the
polar decomposition. Here we consider only the behavior of the linear part A of the affine
transformation M . The polar decomposition of A, A = PQ, yields a positive and an
orthogonal part P and Q, respectively. We simply let M act on the normal vectors by
implementing the action for the orthogonal part Q of A as outlined above. This introduces
a complication when updating the estimated transformation: we must scale the derivative
above by the inverse of the positive part.
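The polar decomposition A = PQ can be computed from the SVD A = UΣVᵀ by taking Q = UVᵀ (orthogonal) and P = UΣUᵀ (symmetric positive semidefinite). A sketch of the resulting normal-vector action, which preserves unit length (the matrix is illustrative):

```python
import numpy as np

def left_polar(A):
    """Left polar decomposition A = P Q via the SVD A = U S V^T:
    Q = U V^T is orthogonal, P = U diag(S) U^T is symmetric positive semidefinite."""
    U, S, Vt = np.linalg.svd(A)
    return U @ np.diag(S) @ U.T, U @ Vt

A = np.array([[2.0, 0.5, 0.0],
              [0.1, 1.5, 0.3],
              [0.0, 0.2, 0.8]])
P, Q = left_polar(A)

recon_err = np.max(np.abs(P @ Q - A))                 # P Q recovers A
orth_err = np.max(np.abs(Q @ Q.T - np.eye(3)))        # Q is orthogonal

# Acting on a normal with Q alone preserves unit length.
n = np.array([0.0, 0.0, 1.0])
len_err = abs(np.linalg.norm(Q @ n) - 1.0)
```

Scaling the parameter derivative by P⁻¹, as described above, accounts for the factored-out positive part.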
Note that in the case of the thin plate spline an affine transformation must be
factored out. Therefore, this section also applies to the transformation involved in the
thin plate spline. In that setting we use the former transformation model for the normal
vectors.
3.1.3 Nonrigid Transformations and Regularization
The most common parameterizations of nonrigid transformation are the Thin-plate
spline (TPS) and the Gaussian Radial Basis Function (GRBF) parameterizations.
Both arise from a Reproducing Kernel Hilbert Space with a norm furnishing a natural
regularization of the transformation. Before going forward with the specific kernels used
here, we provide an explanation of how and why these spaces are sufficient for certain
spline problems.
Suppose we start with an arbitrary r.k.H.s. V that is canonically embedded in
L²(Ω, R^d) ∩ C⁰(Ω, R^d). Denote by w ⊗ δ_x the linear form that takes v ∈ V and outputs
(w ⊗ δ_x | v) = w^T v(x). From Riesz's Theorem there exists K^w_x ∈ V such that
(K^w_x | v) = (w ⊗ δ_x | v) for any v ∈ V. Since w ↦ K^w_x is linear, K(y, x) is a
matrix such that K(y, x) a = K^a_x(y). This typifies a matrix-valued kernel, necessary for the
vector-valued spline problems considered here.
We put the following assumption on V: for any given set x_1, ..., x_n ∈ Ω and any given
α_1, ..., α_n ∈ R^d, if for all v ∈ V we have Σ_{i=1}^n α_i^T v(x_i) = 0, then α_1 = ... = α_n = 0. This is
essentially an injectivity requirement.
Now consider the following problem: given x_1, ..., x_n ∈ Ω and λ_1, ..., λ_n ∈ R^d, find v ∈ V
minimizing ||v||_V such that v(x_i) = λ_i. What we would like to know is that the span of
the K(·, x_i) is sufficient for modeling the deformations required by such spline problems. What
can be shown is:

Lemma 1. If there exists a solution v of the problem above, then v ∈ {u ∈ V : u(x_i) =
0 ∀i}^⊥ = V_0^⊥. Furthermore, if v ∈ V_0^⊥ is a minimizer of the norm restricted to this set, it is
also a minimizer on all of V. Finally, V_0^⊥ = {v = Σ_{i=1}^n K(·, x_i) α_i : α_i ∈ R^d}.
A detailed presentation of the preceding argument (for one dimension, as well as a
sketch for d dimensions) can be found in the shape literature [107].
One takeaway from this argument is that the coefficients α (which consist of the
row-vectors α_1, ..., α_n) can be solved for linearly given the knots x_1, ..., x_n and the
displacements λ_1, ..., λ_n for the exact spline problem. The solution will simply be
α = K⁻¹λ. When the approximate spline problem replaces the problem above, we
can substitute (K + I/C) for K in the equation above—where C is the penalty applied to
the ℓ₂ distance between v(x_i) and λ_i.
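A small sketch of the exact and approximate spline solves (a scalar Gaussian kernel standing in for the matrix-valued K; knots, targets, and the penalty C are illustrative): α = K⁻¹λ interpolates exactly, while (K + I/C)⁻¹λ trades a small interpolation residual for stability.

```python
import numpy as np

# Knots on a 3x3 grid with illustrative displacement targets at each knot.
knots = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
targets = np.sin(knots)                 # lambda_1 .. lambda_n in R^2
s = 0.7                                 # Gaussian kernel width

def gram(X, Y):
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * s ** 2))

K = gram(knots, knots)
alpha_exact = np.linalg.solve(K, targets)                    # alpha = K^{-1} lambda
C = 1e3                                                      # approximation penalty
alpha_reg = np.linalg.solve(K + np.eye(len(K)) / C, targets)

def v(x, alpha):
    """Spline v(x) = sum_i K(x, x_i) alpha_i."""
    return gram(x, knots) @ alpha

interp_err = np.max(np.abs(v(knots, alpha_exact) - targets))   # exact interpolation
reg_err = np.max(np.abs(v(knots, alpha_reg) - targets))        # small residual
```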
3.1.3.1 Thin-plate spline radial basis functions
Thin-plate splines arise as the mathematical model for deformation of a thin plate of
uniform elasticity. It is most common to approach this problem of plate deformation as
a spline problem. The thin-plate model represents a specific regularizer on the (ill-posed)
spline problem for deforming a finite set of points onto corresponding target locations
[cite]. The corresponding objective for this spline problem is

E(ϕ) = Σ_{i=1}^N ||y_i − ϕ(x_i)||² + ∫_{R^d} ||Hϕ(x)||²_F dx, (3–2)
where {x_i, y_i}_{i=1}^N is a set of correspondences pre- and post-deformation and || · ||²_F is the
squared Frobenius norm (applied to the Hessian of ϕ). If we compute the Euler-Lagrange
equation for the norm in the objective E then we get

δ||ϕ||_H = 0  ⟹  0 = 2∂_{xx}(∂_{xx}ϕ) + 2∂_{yy}(∂_{yy}ϕ) + 4∂_{xy}(∂_{xy}ϕ)  ⟹  0 = Δ²ϕ.
Then we can view Equation (3–2) as a forced Euler equation. This involves computing the
Green's function for the biharmonic equation.
The observation that the norm of ϕ in Equation (3–2) is invariant to affine
transformations and therefore rotations of the original coordinate frame suggests that
a shift-invariant kernel (corresponding to an appropriately chosen RKHS) will lead to
a fundamental solution for the spline problem. In 2-d and 3-d there are simple radial
solutions for this:

G₂(x) = g₂(r) = r² log(r),  G₃(x) = g₃(r) = r,

where r = ||x||. We further note that the spatial derivatives of G are

G₂′(x) = x (2 log(||x||) + 1),  G₃′(x) = x / ||x||.
The resulting action on points in Rd (for d = 2, 3) is thus
ϕ(x) = Σ_{i=1}^n α_i K(x_i, x) + Ax + b,

where α_i, i = 1, ..., n is a collection of coefficient vectors in R^d, (A, b) forms an affine
transformation, and K(x, x_i) = G_d(x − x_i) = g_d(||x − x_i||). Note that the biharmonic
equation can be taken as the defining property of the r.k.H.s. by building the hypothesis
space up from an operator standpoint [107]. From this perspective, the polyharmonic
splines—which use ∆k where k is chosen appropriately based on the dimension of the
problem—can be constructed.
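The gradient formula G₂′(x) = x(2 log ||x|| + 1) can be verified directly (an illustrative check, evaluated away from the singularity at the origin):

```python
import numpy as np

def g2(x):
    """2-d thin-plate kernel G2(x) = r^2 log(r), r = ||x||."""
    r = np.linalg.norm(x)
    return r ** 2 * np.log(r)

def g2_grad(x):
    """Analytic gradient: x (2 log ||x|| + 1)."""
    return x * (2 * np.log(np.linalg.norm(x)) + 1)

x = np.array([0.6, -0.4])                    # away from the r = 0 singularity
h = 1e-6
fd = np.array([(g2(x + h * e) - g2(x - h * e)) / (2 * h) for e in np.eye(2)])
grad_err = np.max(np.abs(fd - g2_grad(x)))
```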
To understand the behavior of ϕ on oriented points, we recall that the spatial
derivative of a map is also related to the tangent map or pushforward, which acts on
tangent vectors. To understand the use of this formalism, we consider the diffeomorphism
ϕ as acting on a submanifold of codimension 1 embedded in the ambient space Rd and
mapping it onto its image. Thus the deformation ϕ acts on the oriented point-set (m, ν)
like
ϕ · (m, ν) = (ϕ(m), ϕ′mν),
where ϕ′m is the spatial derivative of ϕ evaluated at m.
3.1.3.2 Gaussian radial basis functions
In the discussion at the beginning of this subsection we pointed out that essentially
the only requirement for defining a nonlinear deformation space for splines is a proper
kernel. One of the most common kernels in use is the Gaussian kernel

G(x, y) = I_d exp( −||x − y||²/(2σ²) ),

which has the variance parameter σ, tunable for the specific problem. G is
clearly radial in this formulation. In this setup, the direct solution

ϕ(x) = Σ_{i=1}^n G(x, x_i) α_i
is applied. Again, the problem is to solve for α that minimize the appropriately chosen
objective function (discussed below).
The action of the Gaussian r.b.f. on oriented point-sets is evaluated in the exact same
fashion as above. With this kernel, the spatial derivative is

G′(x, x_i) = −( (x − x_i)/σ² ) G(x, x_i).
Note that in this setting the parameter σ occurs at multiple scales: a linear scaling of the
vector in the derivative and an exponential scaling of the influence of a control point on
the deformation of its neighborhood.
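The kernel derivative can be checked against finite differences; note that the chain rule on exp(−||x − x_i||²/(2σ²)) produces the factor −(x − x_i)/σ². A sketch with illustrative points:

```python
import numpy as np

sigma = 0.5

def G(x, xi):
    """Scalar Gaussian kernel exp(-||x - xi||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - xi) ** 2) / (2 * sigma ** 2))

def G_grad(x, xi):
    """Chain rule: d/dx G = -(x - xi) / sigma^2 * G."""
    return -(x - xi) / sigma ** 2 * G(x, xi)

x, xi = np.array([0.4, -0.2]), np.array([0.1, 0.3])
h = 1e-6
fd = np.array([(G(x + h * e, xi) - G(x - h * e, xi)) / (2 * h) for e in np.eye(2)])
grad_err = np.max(np.abs(fd - G_grad(x, xi)))
```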
3.2 Introducing Normal Variables for the Target Oriented Point-set
Oriented point-set matching assumes an additional feature: normal directions for each
point of the template and the target. Template normals are estimated offline: a standard
approach to estimation involves the fitting of curves and surfaces to the template features
followed by a sampling of the curves (or surfaces) into an oriented point-set. We assume
template curves (surfaces) do not self-intersect in order to preserve normal uniqueness.
This leaves the target normals. To recover a reconstruction of the surface underlying
the target point-set, we augment the objective with variables for the target normals
W = {ω_i}_{i=1}^N. This normal estimation component has no counterpart in the density
matching literature. Adding these parameters does not increase overfitting of ϕ, since the
parameterization of ϕ is independent of the normals. However, interaction of the normal
vectors in the template and target provides an additional regularization on ϕ. We give
conceptual and empirical justification of this below.
To summarize, first we assume that we are in possession of an oriented template
point-set S. This template point-set is deformed onto an un-oriented (and un-organized)
target point-set T via the action of a non-rigid deformation minimizing Equation (3–4).
Since the target point-set is un-organized, we estimate a set of target normals at each
point during the matching process, thereby obtaining an oriented target point-set denoted
T(W) = {(q_i, ω_i)}_{i=1}^N. This simultaneous matching and reconstruction approach is enabled
by a closed-form distance measure between template and target complex wave mixtures.
Care must be taken here to deform normals accordingly with the deformation of the target
centers.
3.3 Choosing a Suitable Distance Function
Minimizing D(ψϕ·S , ψT (W )) w.r.t. ϕ and W is a difficult optimization problem
regardless of the choice of D—symmetries and local minima stand in the way. In
the literature, we have seen different choices (geodesic [28], Cauchy-Schwarz [46],
Kullback-Leibler [104]) as well as different choices for the Parzen kernel (Gaussian [52],
Schrödinger [90]). This cross-product space of distances, kernels, and algorithms is an
active area of research.
We use the L2 distance. The L2 distance for density function registration was studied
in [52] as a specialization of the density power divergence [3]. It strikes a balance between
robustness to sampling and computability. L2 is robust to small Gaussian perturbations in
the location parameters: E_δ[ ||ψ_S − ψ_{S+δ}||₂ ] → 0 as var(δ) → 0 by Fubini's theorem [34].
While behavior under resampling is harder to examine theoretically, a certain amount
of robustness is borne out in Section 4.4. Now, note that if ||ψ_S − ψ_T||₂ < ϵ then
||ψ_S/C_S − ψ_T/C_T||₂ < ϵ′ (C_S, C_T are normalization constants), and so
1 − ϵ′/2 < |⟨ψ_S/C_S, ψ_T/C_T⟩|. Continuing the line of reasoning in Section 2.3, if we pass to the
normalized versions of ΨS and ΨT then we see that the signed distances bS and bT must be
approximately aligned in the near field (of S and T ). Otherwise, destructive interference
would cause cancellations in the product field, decreasing the correlation.
We evaluate the squared L2 distance between the deformed template and target
complex wave mixtures, subsequently minimized w.r.t. the unknown matching and normal
parameters. The action of the spatial non-rigid deformation results in deformed template
points and normals. Contrast this to the typical density matching situation in which
only the template points are deformed. The squared L2 distance between the deformed
Figure 3-2. An example of surface reconstruction by RDM. (a) and (b) are the inputs to RDM; the points in (a) are the unorganized target and (b) is the OPS template. (c) shows the estimated normal vectors from RDM and the true normal vectors. 99% of the normal vectors are recovered to within π/4 angular error. (d) shows the reconstructed surface (the zero level set of the phase of ψ) from the true normal vectors and (e) the reconstructed surface from RDM. The protrusions from the ears are due to mis-oriented normals in the high curvature area near the ear lobe.
Figure 3-3. An example of curve reconstruction by RDM. (a) The target points are shown as black ×'s, with the level-sets of the unsigned distance function shown as contours. (b) After RDM, the level sets of the target set using the estimated normal vectors are shown. Abutting point-sets make this particular reconstruction problem difficult (Section 4.4).
template ψϕ·S and target ψT (W ), D(ψϕ·S , ψT (W )), is given by
∫_{R^D} | Σ_{a=1}^M exp( −||x − ϕ(m_a)||²/(2σ²) + i (ϕ′|_{m_a} ν_a)^T (x − ϕ(m_a))/λ )
        − Σ_{b=1}^N exp( −||x − q_b||²/(2σ²) + i ω_b^T (x − q_b)/λ ) |² dx, (3–3)

where the target wave mixture has been specified for the oriented point-set T(W) =
{(q_b, ω_b)}_{b=1}^N. Note that the cardinalities M and N can differ. When evaluating the L2
distance, we are required to determine the inner product between terms which may differ
in their location and frequency (with common scale and frequency parameters σ and λ
respectively).
The inner product, denoted I^{(m,ν)}_{(q,ω)} = ⟨ψ_{(m,ν)}, ψ_{(q,ω)}⟩ (conjugating the first
argument), is given by the integral

∫_{R^D} e^{ −(||x−m||² + ||x−q||²)/(2σ²) + i(ω^T(x−q) − ν^T(x−m))/λ } dx
  = (πσ²)^{D/2} e^{ −||m−q||²/(4σ²) − σ²||ν−ω||²/(4λ²) + i (ν+ω)^T(m−q)/(2λ) }.
If m = q, then the spatial term goes to 1 and weights the Gaussian corresponding to
the frequency term heavily. If m ≈ q + δω⊥ this weighting is dampened, but we obtain
constructive interference provided the normals ν and ω are aligned. When the normals are
not aligned, we get destructive interference. This can either force the normal estimates in
line with the template or influence the template movement, and prevent unnecessary local
rotation of the template normals.
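The closed-form inner product can be verified against brute-force numerical integration (illustrative parameters; using the convention ⟨f, g⟩ = ∫ f̄ g, under which the normalization works out to (πσ²)^{D/2} in D dimensions):

```python
import numpy as np

sigma, lam = 0.5, 0.4
m, nu = np.array([0.3, -0.1]), np.array([1.0, 0.0])
q, om = np.array([-0.2, 0.4]), np.array([0.6, 0.8])

def atom(x, c, n):
    """One term of the complex wave mixture, evaluated on points x (K, 2)."""
    d = x - c
    return np.exp(-np.sum(d * d, axis=-1) / (2 * sigma ** 2) + 1j * (d @ n) / lam)

# Brute-force <psi_(m,nu), psi_(q,om)> = integral of conj(f) * g over a grid.
h = 0.01
g = np.arange(-4.0, 4.0, h)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
num = np.sum(np.conj(atom(grid, m, nu)) * atom(grid, q, om)) * h * h

# Closed form: (pi sigma^2)^{D/2} with D = 2 here.
closed = (np.pi * sigma ** 2) * np.exp(
    -np.sum((m - q) ** 2) / (4 * sigma ** 2)
    - sigma ** 2 * np.sum((nu - om) ** 2) / (4 * lam ** 2)
    + 1j * (nu + om) @ (m - q) / (2 * lam))
rel_err = abs(num - closed) / abs(closed)
```

The frequency term penalizes misaligned normals exponentially, which is exactly the interference effect described above.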
The objective function minimized in this work is therefore
E(ϕ,W ) = D(ψϕ·S , ψT (W )) + βL(ϕ). (3–4)
In Equation (3–4), ϕ and W are the desired spatial deformation and target normal
set respectively. Additionally, β is a regularization parameter and L a suitable spline
regularization (chosen to be the thin plate spline bending energy). Assuming a set of fixed
centers {p_b}_{b=1}^P on the template, the thin plate spline [8, 103] maps the location x ∈ R^D to
the location A(x) + Σ_{b=1}^P C_b^T K(x − p_b), where A is an affine transformation, K is a radial
basis spline kernel and {C_b}_{b=1}^P is the set of spline parameters. The mapping is linear in
each Cb and A and therefore so is ϕ′. The regularization term depends on the choice of
kernel.
We can characterize the asymptotic behavior of our matching objective. Examining
the wave mixture, we see that the wave flattens out as λ → ∞—eventually approaching 1.
This intuitively results in the Gabor atom tending to the Gaussian. This is made more precise in
the following Proposition, essentially a consequence of the dominated convergence theorem
[34].
Proposition 3.1. Let {(m_a, ν_a)}_{a=1}^M, {(q_b, ω_b)}_{b=1}^N be a pair of oriented point-sets. As λ → ∞
the distance in Equation (3–3) converges to

∫_{R^D} | Σ_{a=1}^M e^{−||x−m_a||²/(2σ²)} − Σ_{b=1}^N e^{−||x−q_b||²/(2σ²)} |² dx.
3.4 Gradient Computation and Optimization Details
We derive the gradient for the TPS parameterization discussed above. The penalty
term is easy to differentiate with respect to C:
∂_C [ β tr(C^T K C) ] = 2βKC
Figure 3-4. Profile of the L2 distance function over several transformations and choices of the parameters σ, λ. The original set of oriented points is deformed and the distance is computed. (a), (b), (c) show the pose of the original (black) and deformed (red) point-sets under the translational, rotational, and shear deformations. (d) and (e) show the behavior of the distance under X-translations and variations in σ and λ, respectively. (f) and (g) show the same under rotations, and (h) and (i) under X-shear. For (d), (f), and (h), λ = 0.25; for (e), (g), and (i), σ = 0.1.
by differentiating the trace and using the symmetry of K. The derivative of the inner product with respect to the parameters C is
\[
\partial_C I^{\phi_C\cdot(m,\nu)}_{(q,\omega)} = \partial_{\phi_C(m)} I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}\, \frac{\partial \phi_C(m)}{\partial C} + \partial_{\phi_C\cdot\nu}\, I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}\, \frac{\partial [\phi_C\cdot\nu]}{\partial C}. \tag{3–5}
\]
Recall that ϕ_C acts on the normal ν at the point m by ϕ_C · ν = ϕ′_C|_m ν [93], where ϕ′_C|_m is the Jacobian at m. Note that ∂/∂C ϕ_C now denotes the derivative with respect to the TPS parameters, not the spatial variable. We use ∂_· and ∂/∂· interchangeably. Let R ∈ R^{P×N} be given by R_{ij} = K(p_i − m_j), the kernel matrix pairing template and control points. Then ∂/∂C [ϕ_C(m_j)]^a = R_j e^a (the superscript a indicates the a-th coordinate, e^a ∈ R^d the a-th basis row vector), with R_j the j-th column of R. Differentiating,
\[
\frac{\partial [\phi_C(m_j)]^a}{\partial C} = R_j e^a, \qquad
\Big[ \frac{\partial I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}}{\partial \phi_C(m)} \Big]^a = \Big[ -\frac{\phi_C(m) - q}{2\sigma^2} - i\, \frac{\phi_C\cdot\nu + \omega}{2\lambda} \Big]^a I^{\phi_C\cdot(m,\nu)}_{(q,\omega)}.
\]
When applying the entire gradient update (through all points), this is simply an outer
product of the derivatives of the inner product and R.
The latter factor in Equation (3–5) is not typically seen in the point matching literature, and arises due to the transformation in Equation (3–1). We must differentiate ϕ′_C|_{m_j} ν with respect to C, where ′ denotes differentiation with respect to the domain variable. First, to see how ϕ′_C|_{m_j} acts in the TPS parameterization, denote by [R′]^k the matrix of derivatives in the k-th coordinate of the kernel function at each point in the template set. Then
\[
\big[ \phi'_C|_{m_j}\, \nu_j \big]^a = \Big[ \big([R'_j]^a\big)^T C \Big] \nu_j, \quad\text{and so}\quad \partial_C \big[ \phi'_C|_{m_j}\, \nu_j \big]^a = \big([R'_j]^a\big)\, \nu_j^T,
\]
by treating C as a scalar form acting on ([R′_j]^a, ν). So the second term in Equation (3–5) is
\[
\frac{\partial I^{\phi_C(m_j,\nu_j)}_{(q,\omega)}}{\partial C} = I^{\phi_C(m_j,\nu_j)}_{(q,\omega)} \sum_{a=1}^{D} \Big( \Big[ -\sigma^2\, \frac{\phi_C\cdot\nu_j - \omega}{2\lambda^2} + i\, \frac{q - m}{2\lambda} \Big]^a [R'_j]^a \Big)\, \nu_j^T.
\]
Therefore, the descent direction for the TPS parameters is given by
\[
\nabla_C D = 2 \sum_{i=1}^{M} \sum_{j=1}^{M} \Big[ \partial_{\phi_C(m_i)} I^{\phi_C(m_i,\nu_i)}_{(m_j,\nu_j)}\, \frac{\partial \phi_C(m_i)}{\partial C} + \partial_{\phi_C\cdot\nu_i}\, I^{\phi_C(m_i,\nu_i)}_{(m_j,\nu_j)}\, \frac{\partial [\phi_C\cdot\nu_i]}{\partial C} \Big]
- 2 \sum_{i=1}^{M} \sum_{j=1}^{N} \Big[ \partial_{\phi_C(m_i)} I^{\phi_C(m_i,\nu_i)}_{(q_j,\omega_j)}\, \frac{\partial \phi_C(m_i)}{\partial C} + \partial_{\phi_C\cdot\nu_i}\, I^{\phi_C(m_i,\nu_i)}_{(q_j,\omega_j)}\, \frac{\partial [\phi_C\cdot\nu_i]}{\partial C} \Big].
\]
To complete the picture, we return to the principal themes of this work—simultaneous
registration and reconstruction. Recall that we began by pointing out that there was
a paucity of literature on non-rigid SDF matching in comparison to density matching.
We zeroed in on the difficulty of estimating SDFs as the principal reason. Rather than
estimate an SDF for the target point-set with the aid of a deformed template, we chose
to estimate target normals as we deformed the template. To do this, we apply the descent direction for each ω_i, expressed in terms of combinations of ∂_{ω_i} I^{ϕ_C(m_i,ν_i)}_{(q_j,ω_j)}, during each round. Further
details of the optimization algorithm are provided below. To obtain the signed distance
from these normals one may use previously developed methods [53, 63] or use the phase
of the resulting wave-function directly (Figure 3-2 and Figure 3-3). The result is an
integrated probability density and SDF approach to simultaneous deformable template
matching and multiple curve (or surface) reconstruction.
3.5 A Brief Comparison with Currents
In the introduction the method of curve matching based on the geometric measure
theoretic concept of currents was mentioned. Here I give a slightly more detailed analysis of this framework, argue that the representation field underlying the currents model is a spline-based vector field representation, and contrast the resulting registration algorithms.
The mathematical theory of currents extends the theory of generalized functions
into the representation of surfaces embedded in an ambient space [29]. The space of
2-dimensional currents on R³ is the dual W* to W = Ω²_c(R³), with an element [S] ∈ W* representing a surface S ↪ R³ by acting on ω ∈ W as
\[
[S](\omega) = \int_S \omega(x)(u, v)\, dA(x),
\]
where u, v span T_x S. The continuity condition for [S] requires that a sequence of 2-forms converging to 0 (in the norm topology of all derivatives of the 2-forms) vanishes in the limit under [S]. The topology on W* is the weak-* (Schwartz) topology. Note that since ω is a 2-form on R³ we can associate it at each point with a vector w(x) by w(x) · (u × v) = ω(x)(u, v). In this sense, the space of currents can represent a broad class of embedded surfaces in Euclidean space.
In previous work on diffeomorphic measure matching [100], the fact that the push-forward of a current agrees with the current associated to the deformation of the underlying surface (again, everything is embedded, so we are considering the action of diffeomorphisms of R^d on the surface),
\[
\phi_*[S](\omega) = [\phi(S)](\omega) = [S](\phi^*\omega),
\]
is exploited for registration. The pointwise vectorial representation of a 2-form on R³, with the normal vectors of the underlying surface as a parameterization, can be used. For elements of the r.k.H.s. we identify the space of 2-forms with the space of compactly supported vector-valued functions (hence the kernel will be matrix-valued) with smoothly varying derivatives, and the natural inner product. The evaluation functional δ^ξ_x(ω) = w(x) · ξ belongs to W* and we obtain the formula
\[
\langle \delta^\xi_x, \delta^\eta_y \rangle = \eta^T k_W(x, y)\, \xi.
\]
The currents model for surface registration implies that the norm between the oriented point-set A = \{(m_a, ξ_a)\}_{a=1}^{N} and another oriented point-set B = \{(q_b, η_b)\}_{b=1}^{M} should be evaluated in this r.k.H.s., and it has the form
\[
\|\varphi(A) - \varphi(B)\|_W^2 = \sum_{a,b} \xi_b^T k_W(m_a, m_b)\, \xi_a + \sum_{c,d} \eta_d^T k_W(q_c, q_d)\, \eta_c - 2 \sum_{a,c} \eta_c^T k_W(m_a, q_c)\, \xi_a. \tag{3–6}
\]
The authors of [100] also develop a deformation formalism based on the vector space
of C∞ vector fields, which is unnecessary for understanding the analogy between the
two methods under consideration. The authors use isotropic, shift-invariation kernels in
their experimental evaluation (kW (ma, qc) = g(|ma − qc|/σ) id). In terms of Equation
(3–6), we see that the standard norm on the space of vector fields on Rd evaluated on
spline-interpolants of the form described above in Equation (1–2) leads to a similar
data-fidelity term.
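Assuming the isotropic Gaussian kernel used in the experimental evaluation of [100], the currents data-fidelity term of Equation (3–6) can be sketched in a few lines (function names are illustrative, not from any particular library):

```python
import numpy as np

def gauss_kernel(X, Y, sigma):
    # k_W(x, y) = exp(-||x - y||^2 / (2 sigma^2)) id, evaluated pairwise
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def currents_discrepancy(A_pts, A_vecs, B_pts, B_vecs, sigma):
    """||phi(A) - phi(B)||_W^2 for oriented point-sets (points, vectors)."""
    def term(Xp, Xv, Yp, Yv):
        # sum over pairs of eta^T k_W(x, y) xi; the isotropic kernel reduces
        # the matrix kernel to a scalar weight times the Euclidean dot product
        return np.sum(gauss_kernel(Xp, Yp, sigma) * (Xv @ Yv.T))
    return (term(A_pts, A_vecs, A_pts, A_vecs)
            + term(B_pts, B_vecs, B_pts, B_vecs)
            - 2.0 * term(A_pts, A_vecs, B_pts, B_vecs))
```

Since the kernel is positive definite, the discrepancy vanishes exactly when the two vector-valued interpolants coincide, and is strictly positive otherwise.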
Our method can be viewed as performing matching of oriented point-sets as
embedded in L2 under the feature map defined by the CWR. Another way to express the
norm is by using the r.k.H.s. associated with the completion of the image H = Ψ_g(L²(R^d)) [105]. The evaluation functional δ^η_x ∈ H* results in
\[
\delta^\eta_x(f) = f(x, \eta).
\]
This results in the reproducing kernel formula
\[
K_H(x, \xi;\, y, \eta) = \langle g_{x,\xi},\, g_{y,\eta} \rangle,
\]
where g is the window function for the Gabor transform under consideration. The
difference is that both the normal and the location vector factor into the feature space
representation of the oriented point-set, unlike with the currents representation where
the kernel between points defines an inner product between the oriented points by a
transformation of the Euclidean inner product.
3.6 Analysis of the RDM Objective Function
In this section we provide the detailed form of the RDM objective function and analyze the asymptotic behavior of the objective. We show that it is related to a correlation-based registration approach as the frequency parameter λ ↑ ∞.
3.6.1 Inner Product of CWRs
Thus far we have focused on isotropic CWRs: waves with a uniform influence field.
Here we outline the difference on the registration objective by evaluating the inner product
between anisotropic CWRs. This allows us to apply non-uniform emphasis on the oriented
points and have distinct directional effects at different points.
3.6.1.1 Isotropic CWRs
First we derive the expression for the inner product between two complex waves, as
an ingredient necessary to compute the distances. Recall the objective function
\[
D(\psi_S, \psi_T) = \int_{\mathbb{R}^d} \Big| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} - \sum_{i=1}^{N} e^{-\frac{\|x-q_i\|^2}{2\sigma^2} + i\frac{\omega_i^T(x-q_i)}{\lambda}} \Big|^2\, dx.
\]
By using the inner product expansion
\[
D(\psi_S, \psi_T) = \Big\| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} \Big\|^2
- 2 \sum_{a=1}^{M} \sum_{b=1}^{N} \mathrm{Re}\, \Big\langle e^{-\frac{\|x-q_b\|^2}{2\sigma^2} + i\frac{\omega_b^T(x-q_b)}{\lambda}},\; e^{-\frac{\|x-m_a\|^2}{2\sigma^2} + i\frac{\nu_a^T(x-m_a)}{\lambda}} \Big\rangle
+ \Big\| \sum_{b=1}^{N} e^{-\frac{\|x-q_b\|^2}{2\sigma^2} + i\frac{\omega_b^T(x-q_b)}{\lambda}} \Big\|^2 \tag{3–7}
\]
we can reduce the integral in Equation (3–3) to a sum of integrals of the form of pairwise inner products. So we must solve
\[
{}^{\sigma,\lambda} I_{(m,\nu)(q,\omega)} = \int_{\mathbb{R}^d} e^{-\frac{\|x-m\|^2}{2\sigma^2} + i\frac{\nu^T(x-m)}{\lambda}}\; e^{-\frac{\|x-q\|^2}{2\sigma^2} - i\frac{\omega^T(x-q)}{\lambda}}\, dx.
\]
We suppress the superscript σ, λ for now. The first step is to group real factors together and complete the square, obtaining
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}} \int_{\mathbb{R}^d} e^{-\frac{\left\|x-\frac{m+q}{2}\right\|^2}{\sigma^2} + i\frac{\nu^T(x-m)}{\lambda} - i\frac{\omega^T(x-q)}{\lambda}}\, dx.
\]
Then we can remove the constant modulation $e^{-i(\nu^T m - \omega^T q)/\lambda}$ and get
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{-i\frac{\nu^T m - \omega^T q}{\lambda}} \int_{\mathbb{R}^d} e^{-\frac{\left\|x-\frac{m+q}{2}\right\|^2}{\sigma^2} - i\frac{(\omega-\nu)^T x}{\lambda}}\, dx.
\]
This clearly corresponds to a Fourier transform of $e^{-\|x-\frac{m+q}{2}\|^2/\sigma^2}$ at $\frac{\omega-\nu}{\lambda}$. We can extract the translation by $\frac{m+q}{2}$ as a modulation and obtain
\[
I_{(m,\nu)(q,\omega)} = e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{i\frac{(\nu+\omega)^T(q-m)}{2\lambda}} \int_{\mathbb{R}^d} e^{-\frac{\|x\|^2}{\sigma^2} - i\frac{(\omega-\nu)^T x}{\lambda}}\, dx.
\]
The last factor is a Fourier transform of a Gaussian, and so
\[
I_{(m,\nu)(q,\omega)} = (\pi\sigma^2)^{d/2}\, e^{-\frac{\|m-q\|^2}{4\sigma^2}}\, e^{i\frac{(\nu+\omega)^T(q-m)}{2\lambda}}\, e^{-\frac{\sigma^2\|\omega-\nu\|^2}{4\lambda^2}}. \tag{3–8}
\]
Note that only the real part of these expressions enters the objective (see Equation (3–7)), but we work with the complex version throughout.
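As a sanity check on the closed form of Equation (3–8), it can be compared against brute-force quadrature in d = 2. The following sketch (grid size and parameter values are arbitrary choices, and the function names are illustrative) evaluates both sides:

```python
import numpy as np

def cwr_inner_product(m, nu, q, omega, sigma, lam):
    """Closed form of Equation (3-8) for two isotropic CWR components."""
    d = len(m)
    return ((np.pi * sigma ** 2) ** (d / 2)
            * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
            * np.exp(1j * (nu + omega) @ (q - m) / (2 * lam))
            * np.exp(-sigma ** 2 * np.sum((nu - omega) ** 2) / (4 * lam ** 2)))

def cwr_inner_product_quadrature(m, nu, q, omega, sigma, lam, n=400, half=8.0):
    """Riemann sum of psi_(m,nu) * conj(psi_(q,omega)) on a 2-D grid."""
    xs = np.linspace(-half, half, n)
    X, Y = np.meshgrid(xs, xs)
    pts = np.stack([X, Y], axis=-1)                       # (n, n, 2)
    def psi(c, v):
        amp = np.exp(-np.sum((pts - c) ** 2, axis=-1) / (2 * sigma ** 2))
        return amp * np.exp(1j * (pts - c) @ v / lam)
    dx = xs[1] - xs[0]
    return np.sum(psi(m, nu) * np.conj(psi(q, omega))) * dx * dx
```

For Gaussian-windowed integrands a uniform grid sum is extremely accurate, so the two values should agree to many digits well before the grid is refined further.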
3.6.1.2 Anisotropic CWRs
The inner product of the anisotropic Gaussian CWR has a derivation similar to the above, with a change of variables entering the calculation of the normalization coefficient and a more delicate completion of the square. For completeness, we include the inner product here, along with the derivatives:
\[
I^{m,\nu,\Sigma}_{q,\omega,H} = \sqrt{\frac{(2\pi)^d}{|\Sigma+H|}}\, \exp\Big\{ -\frac{Q(A,p)}{2} - \frac{Q(\Sigma,m) + Q(H,q)}{2} + i\,\frac{\nu^T m - \omega^T q}{\lambda} \Big\},
\]
\[
A = \Sigma + H, \qquad p = (\nu - \omega) - i(\Sigma m + H q), \qquad Q(B, u) = u^T B^{-1} u.
\]
The partial derivatives with respect to m and ν are
\[
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial m} = \Big[ i\,\Sigma A^{-1} p - \Sigma m + i\,\frac{\nu}{\lambda} \Big] I^{m,\nu,\Sigma}_{q,\omega,H}, \qquad
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial \nu} = -\Big[ A p + i\,\frac{m}{\lambda} \Big] I^{m,\nu,\Sigma}_{q,\omega,H}.
\]
The derivative with respect to Σ has a subtlety. Often Σ will be considered as a function of ν: since by the definition of ν we expect to observe more points lying along ν⊥, it is appropriate to weight this region more heavily in the model. This also allows us to maintain frame properties when pursuing the Gabor analogy. The complete derivative with respect to ν then includes the term (∂I/∂Σ)(∂Σ/∂ν). Another way to achieve this relationship is to apply the same transformation rule to Σ as to ν. We simply update ν using ∂I/∂ν instead of ∂I/∂ν + (∂I/∂Σ)(∂Σ/∂ν), and then recompute Σ at each iteration based on the current ν (rather than using the action Σ′ = ϕ · Σ, where ϕ acts on S⁺_d by multiplication by J_ϕ). We also include the derivative ∂I/∂Σ here for update purposes:
\[
\frac{\partial I^{m,\nu,\Sigma}_{q,\omega,H}}{\partial \Sigma} = -\frac{1}{2}\Big[ -(\nu-\omega)(\nu-\omega)^T A^{-2} - m m^T \frac{\partial J}{\partial \Sigma} - m q^T \frac{\partial K}{\partial \Sigma} - \det(\Sigma+H)^{-1} I \Big] I^{m,\nu,\Sigma}_{q,\omega,H},
\]
\[
J = \Sigma(\Sigma+H)^{-1}\Sigma, \qquad K = \Sigma(\Sigma+H)^{-1}H,
\]
\[
\frac{\partial J}{\partial \Sigma} = 2(\Sigma+H)^{-1}\Sigma + \Sigma A^{-2}\Sigma, \qquad \frac{\partial K}{\partial \Sigma} = (\Sigma+H)^{-1}H + \Sigma A^{-2}H.
\]
3.6.2 Asymptotic Behavior of the RDM Objective Function
The limiting case as λ ↑ ∞ results in a wave whose modulation becomes negligible: the oscillation frequency is proportional to 1/λ and vanishes in the limit. The following proposition cements the intuitive result that the objective function tends towards a distance between Gaussian mixtures.
Proposition 3.2. Let S = \{m_a, \nu_a\}_{a=1}^{M}, T = \{q_b, \omega_b\}_{b=1}^{N} be a pair of oriented point-sets. As λ → ∞, Equation (3–3) converges to
\[
\int_{\mathbb{R}^d} \Big| \sum_{a=1}^{M} e^{-\frac{\|x-m_a\|^2}{2\sigma^2}} - \sum_{i=1}^{N} e^{-\frac{\|x-q_i\|^2}{2\sigma^2}} \Big|^2\, dx. \tag{3–9}
\]
Proof. By turning the tables on the parameters we can view the sequence as the function D(λ; ψ^{σ,λ}_S, ψ^{σ,λ}_T). That is, for each pair (m_a, ν_a), (q_b, ω_b) we have
\[
{}^{\sigma,\lambda} I_{(m_a,\nu_a)(q_b,\omega_b)} \;\to\; (\pi\sigma^2)^{d/2} \exp\Big\{ -\frac{\|m_a - q_b\|^2}{4\sigma^2} \Big\} \quad \text{as } \lambda \to \infty.
\]
Comparing Equation (3–8) and Equation (3–9), the result is clear.
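Proposition 3.2 can also be illustrated numerically: using the closed form of Equation (3–8), the gap between the pairwise inner product and its Gaussian limit shrinks as λ grows. A sketch (parameter values are arbitrary choices):

```python
import numpy as np

def inner_product(m, nu, q, omega, sigma, lam):
    # closed form of Equation (3-8) with d = len(m)
    d = len(m)
    return ((np.pi * sigma ** 2) ** (d / 2)
            * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
            * np.exp(1j * (nu + omega) @ (q - m) / (2 * lam))
            * np.exp(-sigma ** 2 * np.sum((nu - omega) ** 2) / (4 * lam ** 2)))

m, q = np.array([0.2, -0.1]), np.array([-0.4, 0.3])
nu, omega = np.array([1.0, 0.0]), np.array([0.0, 1.0])
sigma = 0.5
# the Gaussian-mixture kernel of Equation (3-9) for this pair
limit = (np.pi * sigma ** 2) * np.exp(-np.sum((m - q) ** 2) / (4 * sigma ** 2))
gaps = [abs(inner_product(m, nu, q, omega, sigma, lam) - limit)
        for lam in (1.0, 10.0, 100.0, 1000.0)]
# gaps decreases monotonically toward zero as lambda increases
```

Both limiting factors (the phase, of order 1/λ, and the decay term, of order 1/λ²) tend to 1, so the gap decays like 1/λ.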
CHAPTER 4
REGISTRATION: EMPIRICAL ANALYSIS
In this Chapter, the performance of the registration algorithm outlined in Chapter 3
is evaluated. We compare with the state of the art in density field matching (such as
gmmreg, abbreviated to GMM) [52], generalized function matching (diffeomorphic measure
matching abbreviated DIFF) [37], point-based matching (CPD) [72], and graph-matching
(FGM-U) [113]. While other methods [77, 92] are appropriate for further comparison,
handling the asymmetry in representation is not possible in their current formulation. The
corresponding results are indicated by the appropriate marker and color combinations (see
legend in Figure 4-7). We investigate the performance of RDM on a variety of datasets and
conditions, outlined below.
We also evaluate the performance of the RDM algorithm on normal recovery against a
commonly used pipelined approach. The results show that RDM is a preferable method for simultaneously aligning and estimating normal parameters for curves and surfaces.
4.1 Experimental Validation
To validate the performance of the RDM registration algorithm we test the accuracy
of recovery of transformation parameters over a range of rigid, affine, and non-rigid
deformations. A series of randomly generated transformations of the appropriate type
lying within a specified range are created and then the oriented point-set is transformed.
In some situations points may be dropped or occluded. For missing points with outliers, points are dropped from the target uniformly at random, whereas for occlusion a region of the shape is dropped from the target. A specified number of trials is performed with each
setting of outliers, occlusion, noise, etc. The absolute error of recovery is reported in a
graph in each section. Errorbar plots represent the standard deviation of the absolute error
of recovery by the height of the bars.
In the non-synthetic experiments, a different measure of recovery must be used. For the Gatorbait dataset we use known curve-wise correspondences and report the total
Frechet distance across all curves. For the CMU House, TOSCA, and FAUST datasets we use known correspondences and report precision-recall curves. For the IBSR subcortical
structure dataset we report the DICE coefficient of the subcortical volumes before the
transformation estimated by RDM as well as after, for each structure. For more details see
the explanation in each section.
4.2 Rigid and Affine Registration
In this section the performance of RDM for rigid and affine registration is tested.
The parameterization of the groups is discussed in Section 3.1. Experiments probing the
robustness to noise, outliers, and range of the greedy descent algorithm are performed and
recorded here.
4.2.1 Range of Rotation
In this experiment a set of curves from the Gatorbait dataset is used as a template
and target. As a target, it is progressively rotated more at each stage (in the same
direction). For each angle the error in recovery of the angle parameter is reported.
We found that RDM performed comparably to GMM and CPD in terms of
convergence basin for rotations. We note here that two simple and straightforward
improvements that can be done for rigid alignment is to try multiple initializations and
to use rigid invariant feature matching as a pre-processing step and then fine-tune the
alignment using an algorithm like RDM.
4.2.2 Gaussian Noise on Points
In this section we experiment with the performance of the rigid registration algorithm
when the target is affected by point-wise Gaussian noise. We do this for both rigid and
affine motion of the target. First we generate a random rigid-body motion, (R∗, t∗), to an
element of the Gatorbait dataset of size N and generate a target oriented point-set of size
N . Then Gaussian noise of the specified variance is added to each point. The tasks are to
recover (R∗, t∗) and the normal vectors on the target. To randomize the process, the trials
at each level of noise consist of random rotations R* with the rotation angle θ ∈ [−π/3, π/3]
Table 4-1. Range of convergence for rotations. (a) shows the rotation (about (0, 0)) of −4π/9 radians in blue quiver arrows, the initial template in red circles, and the registered template in green quiver arrows. (b) shows relative errors for the range of angles; after 2.2 radians the performance degraded. (c) shows the recovered registration for −π/3 radians. (d) depicts relative errors for the range of angles tested. The algorithm performed well up to π/2 radians away from the baseline pose. σ0 = 20 for both experiments.

(a) [plot]

(b)
Angle    λ0 = 100 Error    λ0 = 1000 Error
−2π/3    9E-5              3E-10
−4π/9    2E-4              9E-10
−2π/9    3E-4              1E-8
0        .00               .00
2π/9     3E-4              1E-7
4π/9     2E-4              3E-10
2π/3     1E-4              1E-7

(c) [plot]

(d)
Angle    λ0 = 100 Error    λ0 = 1000 Error
−π/2     1.1               1.0
−π/3     6E-5              6E-5
−π/6     2E-4              1E-4
0        .00               .00
π/2      2E-4              1E-4
π/3      6E-5              6E-5
π/6      1.1               1.1
and a small Gaussian translation of unit variance. In this experiment a coarse level of
registration is performed followed by a fine level. In the coarse level σ = 20 and λ = 200.
In the fine level, σ = 10 and λ = 150.
For the affine experiment, we apply an affine transformation to the template with
linear part having eigenvalues in [.75, 1.25] and translational component up to 100% of
Figure 4-1. Median error and variance for rigid transformation with pointwise Gaussian noise. (a) shows the recovery of the transformation parameters while (b) shows the recovery of the angle of the normal vector. In (a) the blue line that indicates the recovery of the rotation parameters is on the bottom, very close to zero through the range of experiments. (c) and (d) show the same experiment repeated on the TOSCA dataset. 10 trials at each level of noise were performed.
the diameter of the object. Then Gaussian noise of the range shown in the plots is added. The Gatorbait shape spans [−150, 150] × [−150, 150], while the TOSCA dataset is the Centaur, normalized to lie within [−1, 1]³; hence the different parameter ranges.
4.2.3 Missing Points With Outliers
In this section we perform experiments that showcase the robustness of the RDM
registration algorithm. We generate a random rigid-body motion, (R∗, t∗), to an element of
the Gatorbait dataset of size N and generate a target oriented point-set of size N . Then
points are dropped from the target at random, until the remaining number of points is
Nr. We also drop the normal vectors from the target. The tasks are to recover (R∗, t∗) as
well as the normal vectors on the target. Finally, we inject K randomly generated points
into the target so that Nr + K = N . This is the missing points with outliers (MPO)
experiment.
To randomize the process, the trials at each level of noise consist of random rotations
R* with the rotation angle θ ∈ [−π/6, π/6] and a small Gaussian translation of unit variance.
For the 3-D experiments the Euler angles all fall within the interval [−π/8, π/8]. In this
experiment a coarse level of registration is performed followed by a fine level. In the coarse
level σ = 40 and λ = 100. In the fine level, σ = 20 and λ = 40.
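The MPO data-generation protocol described above can be sketched in a few lines. This is a 2-D illustration with assumed names, not the experimental code; the angle range and outlier model follow the description in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mpo_target(points, n_keep, theta_range=np.pi / 6):
    """Missing-points-with-outliers target: rigid motion, dropout, injection.

    Target normals are discarded entirely, as in the experiment; only the
    rotated and translated surviving points plus uniform outliers remain.
    """
    th = rng.uniform(-theta_range, theta_range)
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    t = rng.normal(size=2)                       # small Gaussian translation
    moved = points @ R.T + t
    keep = rng.choice(len(points), size=n_keep, replace=False)
    lo, hi = moved.min(axis=0), moved.max(axis=0)
    outliers = rng.uniform(lo, hi, size=(len(points) - n_keep, 2))
    return np.vstack([moved[keep], outliers]), (R, t)
```

The returned target always has the same cardinality as the template (N_r + K = N), which keeps the mixture masses comparable between the two representations.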
For the affine MPO experiment, we use the same randomly generated translation
parameters and the same inlier/outlier rates. To generate an affine transformation we
decompose it as RSR∗ = A and generate R as above (by a random angle or random unit
quaternion), and S diagonal with entries in [.75, 1.25] so that the scaling component has
a range of .5. In the affine alignment setting we found that normal vectors are harder to
estimate correctly using RDM.
4.3 Synthetic Normal Recovery, Warps, and Occlusions
This section consists of two sets of experiments. First, we compared our algorithm to
a pipeline approach to normal estimation:
Figure 4-2. Median error and variance for rigid transformation with dropped inliers and outliers added. (a) shows the recovery of the transformation parameters while (b) shows the recovery of the angle of the normal vector. In (a) the blue line that indicates the recovery of the rotation parameters is on the bottom, very close to zero through the range of experiments. 10 trials at each level of inlier dropout were performed. (c) and (d) show the results for the 3-D dataset.
Figure 4-3. Median error and variance for affine transformations. (a) shows the recovery of the transformation parameters (shown in different colors) while (b) shows the recovery of the angle of the normal vector for different levels of missing points with outliers. In (c) and (d) the corresponding results for Gaussian noise trials are shown. 10 trials at each level of inlier dropout were performed.
Figure 4-4. Top: the recovered normal vectors for GMM+NN (left) and RDM (right), with the true normals (both) attached at the points. Bottom: average registration error (left) and the median angle error between corresponding normal vectors (right) for RDM and GMM+NN. 50 trials per level are performed. Even when GMM matches equally well, it does not provide robust normal estimates.
1. Register a template point-set to a target by an estimated deformation.
2. Let the deformation act on the normal vectors of the template.
3. Use a nearest-neighbor approach to infer normal vectors onto the corresponding points in the target set.
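Step 3 of this pipeline is a simple nearest-neighbor transfer. A minimal sketch (names are assumed for illustration):

```python
import numpy as np

def transfer_normals(warped_pts, warped_normals, target_pts):
    """Assign to each target point the (renormalized) normal of its nearest
    warped template point, i.e. step 3 of the pipelined baseline."""
    d2 = np.sum((target_pts[:, None, :] - warped_pts[None, :, :]) ** 2, axis=-1)
    n = warped_normals[np.argmin(d2, axis=1)]
    return n / np.linalg.norm(n, axis=1, keepdims=True)
```

Renormalization is needed because the Jacobian of the deformation in step 2 does not preserve vector lengths.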
We used GMM as the matching algorithm. A single curve (a body curve consisting of
80 points) from the multi-curve GatorBait dataset was used. For synthetic deformation,
a diffeomorphism is fit to point perturbations by solving a 3-D flow problem [12]. This
has the advantage that the Jacobian is positive definite everywhere and so relative normal
orientations are preserved. The deformation level corresponds to supx∈RD ||ϕ(x) − x||
(evaluated on the test points). Results are shown in Figure 4-4. In this experiment GMM
and RDM only used a single initialization. This experiment shows that a substantial gain
in normal recovery is obtained by using RDM relative to imposing template structure after
matching.
In the second set of experiments, we tested the performance of our algorithm in the
2-D and 3-D synthetic settings against 4 other methods: CPD, DIFF, FGM-U, and GMM.
For 2-D we used a point-set consisting of 5 curves (Figure 3-3) from the aforementioned
dataset, while for 3-D the Stanford bunny and TOSCA datasets were used. We create
the target by randomly perturbing points lying along a grid and solving for a TPS with
identity affine component. No information about the target normal vectors is known
beforehand. After registration, the mean distance to the corresponding point [average error
(AE)] is recorded. For the occlusion trials, an approximate fixed deformation level is used,
.3 in norm for 2-D and .15 in 3-D. While operating at similar degrees of generalization
to occlusion, RDM performs much better at medium and high levels of deformation than
the competitors. Robustness to outliers and noise is also studied. For these experiments
FGM-U was run at 50 scales, due to runtime limitations. One can see from 4.4 that
FGM-U struggles with nonrigid deformation. A plot showing the percentage of recovered
normal orientations is included as well.
4.4 Non-Synthetic Matching Experiments
We perform intra-class matching experiments on the TOSCA [10], FAUST [7],
and Gatorbait datasets. TOSCA and FAUST represent the 3-D performance gauge
on real matching experiments and GatorBait for real multi-curve datasets. The same
statistic as above—average error to correspondent—is collected for the sets with known
correspondence. We present recall (percentage of correct correspondences within a
threshold) for matching pose 0 to poses 1, 2, 4, and 5 (smaller deformations) of the
FAUST training registrations over all 10 subjects and present comparisons with GMM
and CPD. For TOSCA we match the first cat, dog, and gorilla to the remaining poses.
We have foregone benchmark comparisons here because in the large deformation regime
extrinsic matching is prone to local minima, and we restrict the comparisons to relative
performance among other extrinsic matching techniques. We simply use these datasets as a
baseline for comparison with GMM and CPD.
The GatorBait dataset does not have known correspondences. Furthermore, it consists
of nearly abutting curves (Figure 3-3)—organizing points into their appropriate curve
components is made much harder by the existence of neighborhood points on different
curves. The Frechet Distance [102] between corresponding parts in the final registration
and the target is recorded. This allows us to measure how accurately each part of the
template is matched to the target. The first fish species is used as the template and
matched to 23 other species. We also perturb the fish with noise and add outliers as
uniformly drawn additional points as shown in 4.4. For large 3-D datasets, DIFF and
FGM were found to be impractical from a runtime perspective (for a runtime comparison
see Figure 4-7). For the GatorBait dataset, FGM was not competitive.
4.5 CMU House Dataset
The CMU House dataset consists of a sequence of image frames and keypoints. The
task is to perform point-matching and recover correspondences between points. From a
correspondence standpoint, FGM [113] with Delaunay triangulation (FGM-del) is the
state of the art on this dataset. However, FGM is sensitive to the graph structure—with
2-nearest neighbors FGM’s performance suffers. Should a large set of correspondences be
needed, graph matching becomes impractical—even the 30 correspondences here represent
significant computational effort for graph matching. To initialize normals for RDM, we
extract the gradient of the image I at each of the keypoints in the frames. This is a
departure from the usual consideration of ‘normals’—we sample a vector field (∇I) at
discrete points.
4.6 3-D Subcortical Structure Registration
The IBSR dataset2 consists of a collection of 256 × 256 × 128 MRI volumes scanned
at varying resolutions, including marked analysis images. Each scan comprises over 75
marked regions, corresponding to different neuro-anatomical components. From such
2 http://www.cma.mgh.harvard.edu/ibsr
Table 4-2. Average (standard deviation) initial DICE and final DICE scores over a set of four subcortical structures registered along the boundary. RDM produces a substantial increase in DICE compared to the default alignments. It performs roughly as well as CPD when we use the same transformation basis (Gaussian r.b.f.) and better than GMM in each setting.

Part             Initial DICE   RDM-TPS DICE   RDM-GRBF DICE   GMM-TPS DICE   CPD DICE
Left Thalamus    .80 (.02)      .88 (.02)      .91 (.01)       .86 (.02)      .91 (.02)
Left Putamen     .74 (.04)      .79 (.01)      .77 (.02)       .77 (.05)      .77 (.01)
Right Thalamus   .78 (.04)      .92 (.01)      .95 (.03)       .87 (.02)      .94 (.01)
Right Putamen    .64 (.08)      .81 (.03)      .79 (.02)       .79 (.04)      .81 (.03)
datasets, atlases can be built that encompass the primary modes of variation of common structures in the brain. These can later be used for computer-aided diagnostics.
In this experiment, we selected 8 sub-cortical structures from the mid-brain: the left
and right Thalamus, Putamen, Hippocampus, and Amygdala. A boundary mesh for each
substructure was extracted using marching cubes on the labeled analysis image. Then the
resulting mesh was transformed to oriented points using barycenters and face-centered
normal vectors. We therefore end up with a simple collection of oriented points, rather
than a complex of meshes. To form an aligned set of structures, we register the resulting oriented point-sets with a minimal bending energy. The resulting transformation is used
to deform the MRI volumes, and then the resulting transformed image is treated as a
candidate segmentation from which a DICE score is computed. We report the mean (and
standard deviation) of the DICE coefficients across n = 5 trials of matching for each
structure. The average bending energy (of the estimated TPS) is .6.
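The DICE score used here is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between two binary volumes; a minimal sketch:

```python
import numpy as np

def dice(a, b):
    """DICE = 2|A ∩ B| / (|A| + |B|) for two binary masks of the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```

A score of 1 indicates perfect overlap of the candidate segmentation with the marked region, and 0 indicates disjoint volumes.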
4.7 Maximum Likelihood Registration with |ψ|2 as a Density
In this Section an alternative alignment procedure to RDM is evaluated for affine
alignment. The probabilistic interpretation of ψ naturally has the flavor of a mixture of
Gaussians. However, ψ has curve normal information and the squared magnitude of ψ is
not actually a mixture of Gaussians. The latter indeed has eigenvector information in the
covariance matrix but this cannot be interpreted as the normal to a curve: eigenvectors
have direction but not orientation since ek and −ek are the same eigenvector.
In the complex wave ψ, the normal is directly encoded into the representation, and we
can solve for it in a number of ways. Our preliminary results for curve normal parameter
estimation using maximum likelihood suggest that the interesting (computationally hard)
problem of orienting the normals will be an exciting new route to the signed distance
function problem. We must emphasize here that the CWR lends itself to multiple avenues
of the parameter estimation process: probabilistic, geometric, and data driven (see below).
The unnormalized function |ψ(x)|² is
\[
|\psi(x)|^2 \propto \sum_{j,k=1}^{N} \cos\Big( \frac{\nu_j^T(x-\mu_j) - \nu_k^T(x-\mu_k)}{\lambda} \Big) \exp\Big\{ -\frac{\|x-\mu_j\|^2}{2\sigma^2} - \frac{\|x-\mu_k\|^2}{2\sigma^2} \Big\}. \tag{4–1}
\]
Note that this is not the L2 norm but the squared magnitude of ψ at location x. It is not
obvious from the expression above, but as |ψ(x)|2 is the magnitude squared of a complex
number, it is nonnegative everywhere. When suitably normalized, |ψ(x)|2 can be treated
as a probability density function which immediately connects it to the plethora of shape
density functions used in the literature.
Here we consider the shape registration problem under a maximum likelihood
formulation. C = \{\mu_k, \nu_k\}_{k=1}^{N_1} is given as a template and the task is to find a mapping from P = \{x_j\}_{j=1}^{N_2} to C within a class of admissible maps H. The maximum likelihood optimization problem
\[
\max_{f \in H} \prod_{j=1}^{N_2} |\psi(f(x_j);\, C)|^2 \tag{4–2}
\]
is robust to Gaussian noise on C (Figure 4-8). Here H consists of a rotation followed by a
shear. Note that this is not the same as maximizing the likelihood of a Gaussian mixture
on a test point-set since the cross terms of |ψ|2 interact. Instead, it is uniquely suited to
situations in which an oriented template is registered to an unoriented point-set.
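A minimal sketch of evaluating the objective in (4–2) for a rotation followed by anisotropic scaling along the rotated axes (the parameterization reported in Figure 4-8). The function names, the eps guard, and the specific transform composition are illustrative assumptions; any off-the-shelf optimizer can then be applied to log_likelihood.

```python
import cmath
import math

def psi_sq(x, template, sigma, lam):
    """|psi(x; C)|^2 for a template C of oriented points (mu, nu)."""
    total = 0j
    for mu, nu in template:
        dx, dy = x[0] - mu[0], x[1] - mu[1]
        total += cmath.exp(complex(-(dx * dx + dy * dy) / (2 * sigma ** 2),
                                   (nu[0] * dx + nu[1] * dy) / lam))
    return abs(total) ** 2

def transform(p, theta, s1, s2):
    """Rotate p by theta, then scale along the rotated axes."""
    c, s = math.cos(theta), math.sin(theta)
    xr, yr = c * p[0] - s * p[1], s * p[0] + c * p[1]
    return (s1 * xr, s2 * yr)

def log_likelihood(params, points, template, sigma, lam, eps=1e-300):
    """Log of the product in (4-2); eps guards against log(0)."""
    theta, s1, s2 = params
    return sum(math.log(psi_sq(transform(p, theta, s1, s2),
                               template, sigma, lam) + eps)
               for p in points)
```

Maximizing log_likelihood over (theta, s1, s2) — e.g. by a coarse grid search followed by a local optimizer — registers the point-set to the oriented template.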
[Figure 4-5 plots omitted: (a) error vs. deformation level and error vs. occlusion rate; (b) Frechet distance vs. noise level and vs. outlier/inlier ratio. Methods compared: FGM, DIFF, GMM, CPD, RDM.]
Figure 4-5. Experimental comparison of RDM and several point and graph matching algorithms on a 2-D dataset. a) Average error for 2-D synthetic experiments. The GatorBait Dataset is deformed as explained in Section 4.3. The left plot showcases robustness to moderate deformation levels and the right plot shows robustness to occlusion (dropping points from a randomly placed circular disc). b) Average Frechet Distance for 2-D real experiments. The Frechet distances between the registered template and the target curves are reported. On the left the target has added noise of the indicated standard deviation and on the right outliers are added.
[Figure 4-6 plots omitted: (a) error vs. deformation level, error vs. occlusion rate, and % of recovered orientations vs. deformation level, comparing localPCA and RDM; (b) recall vs. percentage of diameter.]
Figure 4-6. Experimental comparison of RDM and several point and graph matching algorithms on 3-D datasets. (a) The same experiments as 4.4 b) are carried out in 3-D on the Stanford Bunny. The legend is consistent between that figure and this one. Here we also report the percentage of normal vectors recovered to within a cone of π/3 radians. The left plot showcases robustness to moderate deformation levels and the right plot shows robustness to occlusion (dropping points from a randomly placed circular disc). (b) Recall for TOSCA (left) and FAUST (right). Recall graphs for a subset of TOSCA and FAUST are reported. See Section 4.4 for more details.
[Figure 4-7 plots omitted: recall vs. distance for matching to Frames 41, 61, 81, and 101. Methods compared: RDM, CPD, GMM, DIFF, FGM, NONE.]
AUC
Algorithm   Frame 40   60     80     100
FGM-del     1.00       1.00   1.00   1.00
FGM-2NN     1.00       .867   .800   .500
RDM         .931       .871   .857   .833
CPD         .888       .819   .731   .681
GMM         .862       .795   .738   .671
DIFF        .836       .791   .688   .403
Runtimes
Algorithm   Frame 40   60     80     100
FGM-del     8.8s       11s    17s    15s
RDM         5.9s       5.1s   6.5s   5.2s
CPD         .32s       .35s   .44s   .38s
GMM         6.2s       7.5s   8.8s   11s
DIFF        12s        11s    12s    15s
Figure 4-7. Recall graphs and area under the curve for the CMU House. For FGM, triangulation yields excellent matches but nearest-neighbor graphs are poor. All experiments run on an AMD X2 B22 with 8Gb of RAM. Our implementation is not optimized for runtime but is competitive. Although GMM has quicker function evaluation, RDM converges faster for this dataset. We report AUC in the table.
σ_err/σ_data   |θ* − θ|   |s₁* − s₁|   |s₂* − s₂|
.045           0.006      0.076        0.046
.075           0.008      0.088        0.027
.09            0.008      0.052        0.052
Figure 4-8. Maximum likelihood alignment using |ψ(x)|² as a density. The blue circles (○) are a noiseless template with accurate normal data while the red points (×) are points sampled from the template with Gaussian noise added. For a range of noise parameters, an alignment of the noisy data to the template was found by maximizing the likelihood in (4–2). The unknown transformation parameters were drawn uniformly with θ ∈ [0, 2π) and s₁, s₂ ∈ [.5, 2]. θ is the total rotation angle of the template, s₁, s₂ scale in the respective directions of the rotated basis. σ_data is the spatial standard deviation of the points in the template, and σ_err is the standard deviation of the Gaussian noise added to the test point-set. Root mean squared relative error is reported over 25 trials at each noise threshold.
CHAPTER 5
THEORY OF THE REPRESENTATION: CONNECTEDNESS, COMPLETENESS, AND CONTRIBUTIONS TO THE GABOR EXPANSION
In this Chapter theoretical results relating to the CWR as an implicit shape field are
developed along two axes:
1. Direct analysis of the connectedness of the zero level-sets of the phase,
2. Asymptotic approximation of the modular distance field by families of Gabor wavelets.
5.1 Connectedness of Pairs of Complex Waves
This section contains a direct proof of the connectedness of the zero level-sets of
the Complex Wave Representation for symmetrically oriented points. In some cases
explicit specification of the geometry of the level-sets is also available. The proofs of these
theorems only go through when there are two atoms present in the field. In Section 5.2 a
perturbation-based argument proves the existence of an extension to more than two atoms
in certain cases.
The main goal of this section is to show the connectivity of a 0−level-set of the phase
of a superposition of waves. The appropriate set runs through a neighborhood near each
of the wave source locations. Throughout, m1,m2 will refer to the spatial locations in the
Euclidean plane of two oriented point centers and ν1, ν2 will refer to their respective unit
normal vectors. We will use ϕ1, ϕ2 to refer to the respective angles of ν1, ν2. ψ will be the
CWR of the two oriented points. Unless otherwise stated, m₁, m₂ will be assumed to take
the values (1, 0) and (−1, 0). Obtaining the theorem for general symmetric configurations
relies on the behavior of ψ and the parameters under the action of similarities.
5.1.1 Zeros of ψ
The non-vanishing of Re ψ implies the existence of θ_ψ (the phase of ψ), and it will be
important to establish this in the sequel. We will characterize the zeros of θ_ψ in terms of
Im ψ, so we will need to prohibit co-occurring zeros. We first characterize the set where ψ
can vanish for a pair of oriented points.
Lemma 2. Let m₁^{(1)} > 0 and m₂ = −m₁. Then the set {|ψ| = 0} lies entirely within the line {x₁ = 0}.
Proof. We have

ψ(x) = exp(−‖x − m₁‖²/(2σ²) + i ν₁ᵀ(x − m₁)/λ) + exp(−‖x − m₂‖²/(2σ²) + i ν₂ᵀ(x − m₂)/λ),

and |ψ| = 0 ⟺ |ψ|² = 0. Now

|ψ|² = [Re ψ]² + [Im ψ]²
 = exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²)
  + 2[ sin(ν₁ᵀ(x − m₁)/λ) sin(ν₂ᵀ(x + m₁)/λ) + cos(ν₁ᵀ(x − m₁)/λ) cos(ν₂ᵀ(x + m₁)/λ) ] exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²))
 = exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²)
  + 2 cos(ν₁ᵀ(x − m₁)/λ − ν₂ᵀ(x + m₁)/λ) exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²))
 ≥ exp(−‖x − m₁‖²/σ²) + exp(−‖x + m₁‖²/σ²) − 2 exp(−‖x − m₁‖²/(2σ²) − ‖x + m₁‖²/(2σ²)).

Suppose that x₁* > 0 (the first component of x* is positive). Then ‖x* − m₁‖² < ‖x* + m₁‖²,
and so exp(−‖x* − m₁‖²/(2σ²)) > exp(−‖x* + m₁‖²/(2σ²)). Factoring the lower bound above as

( exp(−‖x − m₁‖²/(2σ²)) − exp(−‖x + m₁‖²/(2σ²)) )²,

which is strictly positive at x* by the preceding observation, we see that |ψ(x*)| > 0.
The same argument shows that |ψ(x)| > 0 for x ∈ R₋ × R. Therefore, zeros can occur only
when ‖x − m₁‖ = ‖x + m₁‖, so x must lie on the line {x₁ = 0}.
It is also clear from the argument above that zeros can only occur when
cos(ν₁ᵀ(x − m₁)/λ − ν₂ᵀ(x + m₁)/λ) = −1, which implies
Theorem 5.1. Zeros of |ψ| can only occur along the Voronoi boundary between m₁, m₂, at
points {x : ν₁ᵀ(x − m₁) − ν₂ᵀ(x − m₂) = λkπ} such that k ∈ 2Z + 1.
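Theorem 5.1 can be spot-checked numerically: on the Voronoi boundary the two Gaussian envelopes agree, so the superposition cancels exactly where the phase difference hits an odd multiple of π. A self-contained sketch (the specific atom layout is an illustrative choice, not from the text):

```python
import cmath
import math

def psi_pair(x, m1, nu1, m2, nu2, sigma, lam):
    """CWR of two oriented points evaluated at x."""
    out = 0j
    for m, nu in ((m1, nu1), (m2, nu2)):
        dx, dy = x[0] - m[0], x[1] - m[1]
        out += cmath.exp(complex(-(dx * dx + dy * dy) / (2 * sigma ** 2),
                                 (nu[0] * dx + nu[1] * dy) / lam))
    return out

# Opposing vertical normals: nu1.(x - m1) - nu2.(x - m2) = 2y on the
# Voronoi boundary x1 = 0, so a zero is predicted at y = lam*pi/2 (k = 1).
m1, m2 = (1.0, 0.0), (-1.0, 0.0)
nu1, nu2 = (0.0, 1.0), (0.0, -1.0)
sigma, lam = 1.0, 0.8
zero_pt = (0.0, lam * math.pi / 2.0)
```

Consistent with Lemma 2, |ψ| stays bounded away from zero off the boundary.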
Zeros of magnitude are one possible form of “disconnection” in the level sets of the
phase of ψ. These correspond to real zeros crossing imaginary zeros. Another type of
disconnection that can occur is a disconnection caused by endpoints in the zero level-set.
Part of any proof of connectedness will involve either excluding these possibilities or
explicitly showing a parametric connected curve arising from the implicit functions. This
subsection divides along these lines. In the symmetric case (when m1 = −m2 = −(1, 0)
and ϕ1 = π − ϕ2) we can provide a closed form parametric curve for the zeros. In the
asymmetric case we can characterize when the disconnections occur as we let ϕ1 or ϕ2
move away from the symmetric configuration.
5.1.2 Connectedness of θψ = 0 for Symmetric Configurations
It is simpler to prove the connectedness of the zero level set of the phase under special
configurations. The approach taken here is to first prove it for a class of configurations and
then expand out from there. The configuration that is most amenable to connectedness
is one in which the normal vectors and the locations “agree.” By agree we mean that the
lines

ℓ₁ = {x : ν₁ · (x − m₁) = 0},
ℓ₂ = {x : ν₂ · (x − m₂) = 0},

intersect at the interface of equal distance between the points. We also call this the
“coherent” case. It is depicted visually by the red and blue lines in 5.1.2. We further
simplify this by assuming symmetry of m₁, m₂ about the y-axis. This assumption does not
lose generality in the conditions for connectedness, as will be shown below. This
case is characterized by the following equations:

m₁^{(1)} = −m₂^{(1)},  ν₁^{(1)} = −ν₂^{(1)},  ν₁^{(2)} = ν₂^{(2)}. (5–1)
Note that the above conditions imply the existence of a point p ∈ {x₁ = 0} such that
d(p, m₁) = d(p, m₂) and ν₁ · (p − m₁) = 0 = ν₂ · (p − m₂).
Theorem 5.2. Under the conditions of Equation (5–1), the zero level-set of the phase of ψ
is connected from a small set near m₁, through p, to a small set near m₂. It is symmetric
about the line of equidistance. If ν^{(1)}/λ ∈ (π/2)Z then the line goes through m₁ and m₂.
Figure 5-1. Visualization of g along a vertical slice of the set containing a zero of Im ψ. The red function is the second term and the blue is the first term of g. The green function is the resulting tempered sinusoid. The tan region contains the zero level-set of g for all values of x ∈ [−1, 0].
The following claim will help to prove the Theorem.
Claim 1. If Im ψ = 0 and Re ψ ≠ 0 on A ⊂ R², then θ_ψ = 0 on A.
Proof. As θ_ψ = tan⁻¹(Im ψ / Re ψ) and the argument to arctan is defined and zero, the claim follows.
Now we are ready to prove Theorem 5.2.
Proof. Factor out a term from Im ψ and define a function f as follows:

f(x, y) = Im ψ / exp(−(x² + y² − 2x + 1)/(2σ²))
 = exp(−2x/σ²) sin(ν₁ · ((x, y) − m₁)/λ) + sin(ν₂ · ((x, y) − m₂)/λ).

Since m₁, m₂, ν₁, ν₂ obey Equation (5–1), we can write

f(x, y) = exp(−2x/σ²) sin((ν^{(1)}(x + 1) + ν^{(2)}y)/λ) + sin((−ν^{(1)}(x − 1) + ν^{(2)}y)/λ).

The zeros of f coincide with the zeros of Im ψ, as the factor removed from Im ψ in
constructing f is nonzero.
Fix x ∈ [−1, 0], so that exp(−2x/σ²) ≥ 1. In the y direction about the point
(x, −ν^{(1)}(x + 1)/ν^{(2)}), which falls on the line ℓ₁ perpendicular to ν₁ going through m₁,
the first term in f acts like a sine without a shift. We can view f about this point by
rewriting f as

g(y) = exp(−2x/σ²) sin(ν^{(2)}y/λ) + sin(ν^{(2)}y/λ + θ). (5–2)
By inspecting Equation (5–2) one can see that in the y direction g is periodic with period
2πλ/ν^{(2)}. Note that

g(y) = f(x, −ν^{(1)}(x + 1)/ν^{(2)} + y).
Now we wish to show that at each x, g has a single zero near ℓ₁.
g has the form A sin(ωt) + sin(ωt + θ), and so it is possible to write g as a single sine:

g(y) = √(1 + exp(−4x/σ²) + 2 exp(−2x/σ²) cos(θ)) sin(ν^{(2)}y/λ + ϕ),
θ = −2ν^{(1)}x/λ,
ϕ = tan⁻¹( sin(θ) / (exp(−2x/σ²) + cos(θ)) ). (5–3)
First note that the factor under the square root is nonnegative and nonzero: if θ = π
then exp(−2x/σ²) > 1, so for some ϵ > 0 we have exp(−4x/σ²) = (1 + ϵ)² and
1 + (1 + ϵ)² − 2(1 + ϵ) = ϵ² > 0. The next important fact is that the denominator of the
argument to tan⁻¹ in the equation for ϕ is nonzero for all x ∈ [−1, 0], and so all points
along a zero level-set of this function belong to the same branch of the arctangent. Thus

γ(x) = (x, −ν^{(1)}(x + 1)/ν^{(2)} − λϕ/ν^{(2)})

defines a smooth curve of zeros of the function f. γ is connected since γ^{(2)} varies smoothly
with x (as exp(−2x/σ²) + cos(θ) is nonzero).
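The rewriting used in (5–3) is the standard identity A sin(ωy) + sin(ωy + θ) = R sin(ωy + ϕ), with R = √(1 + A² + 2A cos θ) and ϕ = tan⁻¹(sin θ / (A + cos θ)). It can be verified numerically with a short sketch (names illustrative); atan2 is used so the arctangent branch is handled automatically:

```python
import math

def as_single_sine(A, theta, omega, y):
    """Rewrite A*sin(omega*y) + sin(omega*y + theta) as R*sin(omega*y + phi)."""
    R = math.sqrt(1.0 + A * A + 2.0 * A * math.cos(theta))
    phi = math.atan2(math.sin(theta), A + math.cos(theta))
    return R * math.sin(omega * y + phi)

def direct_sum(A, theta, omega, y):
    """The left-hand side of the identity, evaluated directly."""
    return A * math.sin(omega * y) + math.sin(omega * y + theta)
```

In the setting of (5–3), A = exp(−2x/σ²) and ω = ν^{(2)}/λ, so A + cos θ > 0 on x ∈ [−1, 0] and the zero offset −λϕ/ν^{(2)} stays on a single branch.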
In this symmetric configuration, ν₁ · ((x, y) − m₁) − ν₂ · ((x, y) − m₂) = 0 along {x = 0},
so it follows from Theorem 5.1 that there are no zeros of ψ along x = 0. Thus, by Claim 1
these zeros of f are also zeros of the phase of ψ. It follows that this line of zeros passes
within λ exp(−2/σ²) / (ν^{(2)}(1 − exp(−2/σ²))) of m₁, as (d/dx) tan⁻¹(x) ≤ 1. This bound is
rough, as it does not take into account the relationship between the sin and cos functions
but simply extremizes both independently; a tighter bound can be obtained by substituting
cos⁻¹(exp(−2/σ²)) into θ.
Since all of the parameters have reflection symmetry about x = 0, so does Im ψ.
Thus the embedded curve γ can be extended from [−1, 0] to [−1, 1] symmetrically, defining
zeros of Im ψ on the interval connecting m₁ to m₂. Once we do this, we can write the
extended curve as

γ(x) = ( x, ( ν^{(1)}(|x| − 1) + λ tan⁻¹( sin(−2ν^{(1)}|x|/λ) / (cos(−2ν^{(1)}|x|/λ) + exp(2|x|/σ²)) ) ) / ν^{(2)} ).

If ν^{(1)}/λ ∈ (π/2)Z then at x = −1, 1 the arctangent is zero, so γ runs through m₁, m₂.
Let q denote a line in R2, and Rq denote the reflection operator about q — a
Euclidean transformation. Recall the action of Euclidean transformations on oriented
points from Chapter 3. Since the preceding argument does not depend on the choice of
σ, λ, the following Theorem holds.
Theorem 5.3. Let m1,m2, ν1, ν2 be a collection of two oriented points such that there
exists a line z about which Rz(m1, ν1) = (m2, ν2). Then the CWR for (m1, ν1), (m2, ν2) has
a connected zero level set going between small neighborhoods of m1,m2 through the point of
intersection of normal lines defined by the oriented points.
Proof. There is a similarity transforming the oriented points in an arbitrary configuration
subject to the conditions above to the configuration of Theorem 5.2. If the action of the
similarity changes the magnitude of ν1, ν2 then adjust λ to make them have unit norm
again—since the Theorem’s conditions do not depend on λ this can be done without
changing the result. In the new coordinate system Theorem 5.2 holds. Similarities are
smooth bijections, so they preserve connectedness of sets. Since the similarity maps the
zeros of the transformed function to the zeros of the original, the result follows.
5.1.3 Connectedness of θψ = 0 for Asymmetric Configurations
The technique of re-writing an equation proportional to Im ψ does not result in an
explicit formula for a curve in the asymmetric case. It is possible to obtain a pure sinusoid
in the direction (ν₁ + ν₂)/2, but this introduces a phase shift with an (x, y) dependence
which does not help to characterize the zeros of the resulting function.
Therefore, we tried two approaches to characterize the zeros in this case, both of which
led to success. First, a numerical approach was used to explore the different configurations
of ν₁, ν₂ and their effect on connectedness. Based on this, a few conjectures about the
behavior of an analytic relationship were established. Then, after deriving a geometric
relationship describing the transition from a connected to a disconnected state, a collection
of nonlinear equations is derived characterizing the transition. From here, an implicit
characterization that is directly interpretable can be derived. The situations can be
characterized when asymptotics in σ, λ are considered. In most non-asymptotic cases the
functional characterization is not directly solvable. However, applying numerical means to
it results in a more efficient and robust characterization procedure than initially devised.
The result is a fast algorithm that can determine connectedness given parameters.
5.1.3.1 Numerical analysis of asymmetric connectedness
A numerical approach was taken to characterize the connectivity, with the hope
that this would lead to a hypothesis about the underlying behavior of the CWR for
2-atom asymmetric configurations. For a range of σ, λ values, we computed the CWR for
m₁ = (−1, 0), m₂ = (1, 0) at every possible configuration of ν₁, ν₂ (up to symmetry for
ν₁). The CWR was computed on a grid of [−3, 3] × [−10, 10] with spacing h₁ = .03, h₂ = .1.
Then, the zero level-sets of the resulting image were extracted using marching squares.
Each of these corresponds to a series of locations in the image plane, and the magnitude
at these locations was summed. This was taken as an approximation to the line integral of
|ψ| along the zero level-sets. There may be several such level-sets, so this is done for each.
Finally, the level-sets are ranked by this line integral. If there is a single dominant value
(at least 10 times the next highest value) then this is taken as a connected configuration;
otherwise it is taken as disconnected. The results are shown in Figure 5-2.
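The ranking step of this procedure can be sketched as follows, substituting simple sign-change detection and a flood fill for a full marching-squares extraction; the grid inputs, function names, and the 10× dominance threshold mirror the description above, but this is an illustrative sketch rather than the implementation used for Figure 5-2.

```python
def component_weights(im_grid, mag_grid):
    """Group grid cells where Im psi changes sign into 4-connected
    components; return the summed |psi| per component, descending."""
    ny, nx = len(im_grid), len(im_grid[0])

    def crossing(i, j):
        # A zero level-set passes near (i, j) if the sign flips toward
        # the right or downward neighbor.
        v = im_grid[i][j]
        for di, dj in ((0, 1), (1, 0)):
            ii, jj = i + di, j + dj
            if ii < ny and jj < nx and v * im_grid[ii][jj] <= 0.0:
                return True
        return False

    cells = {(i, j) for i in range(ny) for j in range(nx) if crossing(i, j)}
    weights = []
    while cells:
        # Flood-fill one connected component and accumulate its weight.
        stack = [cells.pop()]
        total = 0.0
        while stack:
            i, j = stack.pop()
            total += mag_grid[i][j]
            for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                nb = (i + di, j + dj)
                if nb in cells:
                    cells.remove(nb)
                    stack.append(nb)
        weights.append(total)
    return sorted(weights, reverse=True)

def is_connected(weights, ratio=10.0):
    """Single dominant level-set => treated as a connected configuration."""
    if not weights:
        return False
    return len(weights) == 1 or weights[0] >= ratio * weights[1]
```

Here im_grid and mag_grid hold Im ψ and |ψ| sampled on the same grid.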
Based on the results in Figure 5-2, a trend emerges: for a fixed σ-value,
increasing the value of λ eventually results in a connected configuration so long as the
angles of the normal vectors belong to the same interval [0, π]. The following argument
as to why this happens led to subsubsection 5.1.3.2, which provides an analytic
characterization of the event of a disconnection.
Suppose we fix ϕ₁, and view Im ψ as a function of (x, y, ϕ₂) for now. Equivalently, consider
the function f : R² × R → R : (x, y, ϕ₂) ↦ Im ψ(x, y; ϕ₁, ϕ₂). We use the notation ϕ₁, ϕ₂ to
indicate the angle of the normal vector, i.e. ν₁ = (cos(ϕ₁), sin(ϕ₁)).
By Subsection 5.1.2 we know that at ϕ₂ = π − ϕ₁ there is a single connected curve
joining a region near m₁ to a region near m₂. This is the symmetric setup. If we vary
ϕ₂ smoothly on T then we get a smooth change of f. By the way the symmetric argument
is structured, it is clear that if a small variation of one of the normal vectors causes a
disconnection, then that disconnection must occur at the intersection point of the lines
at the zero of Im ψ. If there is an isolated ϕ₂ then ∂f/∂ϕ₂ = 0. However, if we work out the
derivative of f with respect to ϕ₂ we get

∂f/∂ϕ₂ = exp(−((x − 1)² + y²)/(2σ²)) cos(ν₂ · (x − 1, y)/λ) (ν₂^⊥ · (x − 1, y))/λ,

which is nonzero along {ν₂ · (x − 1, y) = 0} except at x = 1, y = 0. Thus, it follows that in a
small neighborhood of π − ϕ₁ the zero level-set remains connected as we vary ϕ₂. Note
that solving for ∂f/∂ϕ₂ = 0 does not depend on σ in the preceding argument. Considering
the surface of the zero level-set of f, it must be shaped like a saddle about a point of
intersection. Thus, we can look for points at which a saddle point occurs to characterize the
disconnection.
5.1.3.2 An analytical condition for asymmetric connectedness
By inspecting visualizations of the imaginary part of ψ under varying ν₂ conditions,
one notices a pattern. It is also possible to predict theoretically that for fixed values of
ν₁, σ, λ, moving the value of ν₂ away from the symmetric case will lead to a disconnection,
and that at the point of disconnection a saddle point emerges. This is clear theoretically
because there are two possibilities for what can happen to the line connecting regions near
m₁, m₂ as we alter ν₂: the line can become part of a ridge of extremal values resulting
in an endpoint, or the line can meet with another zero level-set. We will discuss these
possibilities in greater detail below, but it is easy to see that the former condition should
not be expected from the superposition of two sinusoids.
Thus, the saddle point emerges at the point where a disconnection (or connection)
occurs. Saddle points correspond to ∇Im ψ = 0 with |H Im ψ| < 0, where H denotes the
Hessian. The gradient condition provides us with two equations, ∇Im ψ = 0, or

0 = −((x + 1, y)/σ²) exp(−((x + 1)² + y²)/(2σ²)) sin(ν₁ · (x + 1, y)/λ)
  − ((x − 1, y)/σ²) exp(−((x − 1)² + y²)/(2σ²)) sin(ν₂ · (x − 1, y)/λ)
  + (ν₁/λ) exp(−((x + 1)² + y²)/(2σ²)) cos(ν₁ · (x + 1, y)/λ)
  + (ν₂/λ) exp(−((x − 1)² + y²)/(2σ²)) cos(ν₂ · (x − 1, y)/λ). (5–4)
For exposition we use the notation

θ_j = ν_j · (x − m_j)/λ.
Note that we also expect the saddle point to show up on the Im ψ = 0 level-set. This
means we have a third equation, Im ψ = 0, or, after simplifying a bit,

exp(−2x/σ²) = −sin(θ₂)/sin(θ₁). (5–5)

We seek ν₂, x, y satisfying these conditions: since ν₂ = (cos(ϕ₂), sin(ϕ₂)), these three
equations can be used to predict a point (x, y) and an angle ϕ₂ that produces a
disconnection.
To summarize this approach: we expect all values of ϕ₂ between π − ϕ₁ and the nearest
value of ϕ₂ such that Im ψ = 0, ∇Im ψ = 0 to result in a connected component running
from near m₁ to near m₂. We will solve explicitly for x, y given ϕ₂ to get x(ϕ₂), y(ϕ₂).
Then we will use Im ψ(x(ϕ₂), y(ϕ₂)) = 0 to solve for the nearest value of ϕ₂ which results in
a disconnection.
First, we will manipulate Equation (5–4) into a useful form for further analysis. The
key to doing so will be invoking Equation (5–5). If we multiply both gradient equations by
exp((x² − 2x + y² + 1)/(2σ²)) we get

−((x, y)/σ²) exp(−2x/σ²) sin(θ₁) − ((x, y)/σ²) sin(θ₂) + ((2, 0)/σ²) sin(θ₂)
 + (ν₁/λ) exp(−2x/σ²) cos(θ₁) + (ν₂/λ) cos(θ₂) = 0. (5–6)
Now we invoke Equation (5–5) to replace the exponential decay with a ratio of sines. Note
that this only applies away from zeros of sin(θ₁):

((x, y)/σ²) sin(θ₂) − ((x, y)/σ²) sin(θ₂) + ((2, 0)/σ²) sin(θ₂)
 − (ν₁/λ) sin(θ₂) cot(θ₁) + (ν₂/λ) cos(θ₂) = 0.

The first two terms cancel, and after reorganizing we get the following pair of equations:

cot(θ₁) = (ν₂^{(1)}/ν₁^{(1)}) ( cot(θ₂) + 2λ/(σ²ν₂^{(1)}) ),
cot(θ₁) = (ν₂^{(2)}/ν₁^{(2)}) cot(θ₂).
Now we set the two right-hand sides equal to each other and rearrange:

tan(θ₂) = (ν₂ · ν₁^⊥ / ν₁^{(2)}) (σ²/(2λ)).

Similarly for tan(θ₁):

tan(θ₁) = (ν₂ · ν₁^⊥ / ν₂^{(2)}) (σ²/(2λ)).
So we can solve for x, y by solving

ν₁^{(1)}(x + 1) + ν₁^{(2)} y = λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ),
ν₂^{(1)}(x − 1) + ν₂^{(2)} y = λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₁^{(2)}) ). (5–7)
We obtain

x = [ λ ( ν₂^{(2)} tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ) − ν₁^{(2)} tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₁^{(2)}) ) ) − ν₁^{(1)}ν₂^{(2)} − ν₁^{(2)}ν₂^{(1)} ] / (ν₁^⊥ · ν₂),

y = [ λ tan⁻¹( (σ²/(2λ)) (ν₂ · ν₁^⊥ / ν₂^{(2)}) ) − ν₁^{(1)}(x + 1) ] / ν₁^{(2)}. (5–8)
There are a few important features to address in this equation. First, the tan⁻¹
has branches spaced at distance π. Determining which branch corresponds to the zero
and saddle point that indicate a disconnection is the first problem we need to solve
to predict parameters that lead to disconnections. Fix a branch of the arctangent and
let x*, y* denote the corresponding solution to Equation (5–8). While it is possible that
Im ψ(x*, y*) = 0 for this solution, some other branch may solve the necessary equation.
Looking at Equation (5–7), it seems that the first few cuts will correspond to the most
significant level set, as the LHS of these equations corresponds to the line perpendicular
to the parameters of the oriented points. This means that taking a greater branch will
push the solution in Equation (5–8) further away from the centroids and decrease the
likelihood of the resulting curve. We have found empirically that the +π branch for the first
and third arctangents results in similar zero crossings to the numerical results reported in
subsubsection 5.1.3.1. To get more insight into this reasoning see Chapter 6 Section 6.1.
Second, we must consider the viability of the equations. The substitution leading to
Equation (5–8) requires sin(θ₁) ≠ 0. Note that if sin(θ₁) = 0, then for Im ψ = 0 it follows
that sin(θ₂) = 0. Also, the cos terms must then both be 1, and the gradient equation
results in ν₁ = −exp(−2x/σ²)ν₂, which can only happen if ν₁ = −ν₂ and x = 0. Equations
(5–7) require that ν₁^{(2)}, ν₂^{(2)} be nonzero, and that ν₁^⊥ · ν₂ be nonzero. These
conditions are each discussed below.
We first focus on the ν₁^⊥ · ν₂ → 0 asymptote. This asymptote occurs when ν₁ = ±ν₂.
If ν₁^{(2)} = 0 then the result is a disconnected pair of vertical curves. Observe from
Equation (5–6) that if we replace ν₂ by ν₁, then the y-equation corresponds to Re ψ = 0.
This leads to the equation ν₁^{(1)} = λkπ for some k ∈ 2Z + 1. Thus if λ > 2/π then these
cases remain connected for ν₁ in the upper half arc.
Finally, we return to the dependence on the parameters σ, λ, and predict the
dynamics of the disconnections as we vary them. First, if we fix everything but λ, we
note that the function λ tan⁻¹(1/λ) has a zero and a point of C¹ discontinuity as λ ↓ 0.
Intuitively, as the wavelength λ ↓ 0 the disconnections happen more rapidly as we vary
ν₂. Indeed, we can see that as λ ↓ 0, unless we are in a symmetric configuration,
eventually the condition in Equation (5–8) will be satisfied for any given branch of arctan.
On the other hand, as λ ↑ ∞ we get

ν₁ · (x − m₁) = σ²(ν₂ · ν₁^⊥)/(2ν₂^{(2)}),  ν₂ · (x − m₂) = σ²(ν₂ · ν₁^⊥)/(2ν₁^{(2)}).
[Figure 5-2 panels omitted: Angle of ν₂ vs. Angle of ν₁ for (a) σ = .45, λ = .45; (b) σ = 1.6, λ = .45; (c) σ = .45, λ = 1.6; (d) σ = 1.6, λ = 1.6; (e) σ = .45, λ = 3.2; (f) σ = 1.6, λ = 3.2.]
Figure 5-2. Numerical experiments showing the connectedness and non-connectedness at different values of parameters. White cells indicate a single connected component in a subdomain with total magnitude at least 10 times the next highest component's magnitude. This is the situation in which the unwrapping algorithm outlined below has a high confidence of success.
[Figure 5-3 panels omitted: Angle of ν₂ vs. Angle of ν₁ for the same (σ, λ) settings as Figure 5-2.]
Figure 5-3. Plots showing the zero crossings of interest for the analytical solution to the disconnection problem. White cells indicate positive values for Im ψ (computed by assuming reflection symmetry about ϕ₁ = π/2 and ϕ₂ = π/2). The dots along the anti-diagonal are due to numerical instability as ν₁ · ν₂ ↓ 0. In subfigure b) a flaw of the numerical experiments is exposed: the ranking algorithm becomes less effective when the spatial covariance variable is very high and so ordering the significance of curves becomes harder. Anecdotally, the result indicated by these plots is more suggestive of the real situation.
If we then apply a small angle approximation to the sines in the Im ψ = 0 equation, we
get

exp(−2x/σ²) = −ν₁^{(2)}/ν₂^{(2)}.
This clearly has no solution when ν1 and ν2 both point in the same upper or lower
half-space. This justifies the suggestion that for fixed σ, ν1, ν2 sending λ ↑ ∞ tends
to produce a connected contour as long as the normal vectors are not in a degenerate
configuration.
As σ ↑ ∞ we get θ₁ → π/2, θ₂ → π/2. This leads to exp(−2x/σ²) = 1 as the
equation for Im ψ = 0, since one of θ₁, θ₂ must be negative and one must be positive for a
zero to occur.
As σ ↓ 0 both equations go to zero, and we expect the break to happen near the line
x = 0, as the exponential in the imaginary zero condition blows up away from this line.
This results in

sin(θ₁) + sin(θ₂) = 0,  θ₁ = θ₂ + π + 2kπ,

since sine is 2π-periodic and flips sign under a shift of π. This recovers the same conditions
of Theorem 5.1 as σ ↓ 0.
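The characterization above — fix arctangent branches, solve the linear system (5–7) for (x, y), then test Im ψ — is straightforward to evaluate numerically. A sketch with m₁ = (−1, 0), m₂ = (1, 0) fixed as in the text; the branch offsets k₁, k₂ and function names are illustrative assumptions:

```python
import math

def predicted_point(nu1, nu2, sigma, lam, k1=0, k2=0):
    """Solve the linear system (5-7) for (x, y); k1, k2 pick the
    arctangent branches (offsets of k*pi)."""
    # nu2 . nu1_perp = nu1_perp . nu2, with nu1_perp = (-nu1[1], nu1[0]).
    cross = nu1[0] * nu2[1] - nu1[1] * nu2[0]
    a = lam * (math.atan((sigma ** 2 / (2 * lam)) * cross / nu2[1]) + k1 * math.pi)
    b = lam * (math.atan((sigma ** 2 / (2 * lam)) * cross / nu1[1]) + k2 * math.pi)
    # Cramer's rule on:
    #   nu1[0]*(x + 1) + nu1[1]*y = a
    #   nu2[0]*(x - 1) + nu2[1]*y = b
    x = (nu2[1] * a - nu1[1] * b - nu1[0] * nu2[1] - nu1[1] * nu2[0]) / cross
    y = (a - nu1[0] * (x + 1.0)) / nu1[1]
    return x, y
```

Sweeping ϕ₂ (hence nu2) and testing Im ψ at the returned point then locates the nearest disconnecting angle, as described in the summary above.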
5.2 Going Beyond Two Atoms with Imψ
In this Section, the subset of the class of curves for which the representation is
complete is outlined. If we relax the representation to have arbitrary R-valued coefficients
(instead of unit-valued coefficients) in front of each wave component then the closure of
the class of curves for which the representation is complete is the set of all curves arising
as the level-sets of phases of functions in the Modulation Space of functions with bounded
Gabor expansions.
However, this result alone is not fully satisfying for several reasons:
1. The unit-valued coefficient can be interpreted better than R-valued coefficients,
2. The type of convergence may not be amenable to engineering techniques or applications,
3. We may need to use an arbitrarily small value of σ,
4. There may not be a straightforward approach to constructing curves in the closure of a family of implicit functions.
We provide a more detailed analysis of this general form of completeness in the following
Section, where we analyze the contributions to the Gabor expansion of an auxiliary field
to the signed distance function. This field represents an element of the equivalence class
of curves embedded in M that corresponds to some fixed curve. It is privileged in some
sense: it has large values of Gabor expansion at points drawn from the boundary with
frequency vectors drawn from the normal bundle at that point.
The class of curves we focus on in this Section arises in a natural way from the proof
of the connectedness in the two-atom case above. Fix a length L > 0. Consider the
following constraints on polygons P = {(n_i, v_i, e_i)}_{i=1}^{N} (where the v_i, e_i represent the
vertices and edges of the polygon and the n_i are nodes representing the midpoints of the e_i):

(D1) The length of each edge of P is an integer multiple of L,
(D2) The set of nodes N defined by midpoints along piece-wise linear components of
length L can be numbered in order (along the boundary of P) as 1, 2, . . . , n, and each
consecutive pair meets at some vertex v_i of P at distance L/2 from both node n_i and node n_{i+1},
(D3) Let (n_i, n_{i+1}) be the nodes adjacent to vertex v_i. No other node is as near to v_i. No
other node is nearer to n_i or n_{i+1} besides n_{i−1} or n_{i+2} respectively.
We will show that the set of polygons P is approximable in the directed Hausdorff sense
by the zero level set of Im ψ_P, where the atoms are
P_i = {(n_i, (v_i − n_i)^⊥/‖v_i − n_i‖)}_{i=1}^{N−1} ∪ {(n_{i+1}, (n_{i+1} − v_i)^⊥/‖n_{i+1} − v_i‖)}. Given a
metric space (M, d), the direct d-Hausdorff distance (or just directed Hausdorff distance)
is a distance between two sets X, Y ⊂ M. It is given by the following sup/inf formula:

d_H(X, Y) = sup_{x∈X} inf_{y∈Y} d(x, y).

We consider the curves in this section as embedded in (R², ‖·‖_∞), and so the supremum
norm induces the inner distance function in our definition of the directed Hausdorff
distance.
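Since the directed Hausdorff distance is used repeatedly below, a concrete sketch for finite point samples of the two sets may help (names illustrative):

```python
def sup_norm(p, q):
    """l-infinity distance on R^2."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def directed_hausdorff(X, Y, d=sup_norm):
    """sup over x in X of inf over y in Y of d(x, y); note the
    asymmetry: directed_hausdorff(X, Y) != directed_hausdorff(Y, X)."""
    return max(min(d(x, y) for y in Y) for x in X)
```

For continuous curves, X and Y would be dense samplings of the level-set and the polygon respectively.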
The proof of this is sketched as follows:
1. The 2-atom case is shown above, corresponding to a pair of nodes and a vertex at
the intersection point of their normal lines.
2. Each successive pair of atoms in P obeys the conditions for the 2-atom case,
provided P obeys Conditions (D1), (D2) and (D3), since the locations are the
midpoints of straight line segments and the normal vectors are the corresponding
perpendiculars.
3. We show that given a 2-atom configuration obeying the conditions for the 2-atom
case, adding a perturbation of sufficiently small magnitude and gradient magnitude
results in a zero level-set close to the original in the H∞ metric.
4. We show that for a collection of atoms drawn from the polygon described above, for
each pair of consecutive atoms representing two nodes joined at vertex i there is a
function η_i that is an upper bound for the magnitude of the remaining atoms.
5. Therefore, there exists a σ sufficiently small (since it controls the magnitude of
ψ_{P∖P_i}) so that the contribution from all oriented points beyond a fixed 2-atom case is
small.
6. Since there are a finite number of edges to consider in the polygon, and each one
has a σ such that the remaining part of Im ψ, namely Im ψ_{P∖P_i}, causes a small
perturbation near it, taking the minimal σ over all successive pairs bounds the overall
directed Hausdorff distance between the new curve(s) and the polygon. Since Im ψ is
smooth, the curves transition smoothly between pairs, and the entire polygon is
approximated by this (these) curve(s).
Note that we do not prove the connectedness of the resulting curve. We also give anecdotal
verification of this class with examples, and provide examples of degenerate configurations.
5.2.1 Stability of Level-sets of Imψ
In this subsection it is shown that under the symmetric case and appropriate
additional conditions ψ admits a family of small perturbations that do not excessively
shift the zero crossings of Imψ. The main condition used here is Condition (D3). We will
show that for δ > 0 the directed Hausdorff distance from the polygon to the level-set is
less than δ.
Take (mi, νi) and (mi+1, νi+1) as the pair of atoms under consideration, call them
Q collectively. As outlined above, we will consider each consecutive pair of atoms as if
they were in symmetric configuration. This can be done by a similarity transformation
of the atoms. The resulting function will have scaled values of σ, λ corresponding to the
similarity. The similarity has scale factor between 2/L and 4/L, since a pair of atoms is between L/2 and L apart and we will map them to 2 apart.
With the appropriate similarity transformation applied, we first make an observation
about the polygonal arc joining mi, vi,mi+1. Due to Condition (D3) no nodes are as near
to vi as mi,mi+1. These points have distance L/2 to vi since they are midpoints on a
linear segment ending at vi. Since Voronoi cells are convex, it follows that the whole curve
mi → vi → mi+1 falls in the union of the Voronoi cells of mi and mi+1. Call this curve Vi.
So there is some ϵ > 0 such that
L/2 + ϵ < min_{j ∈ {1,...,i−1, i+2,...,N}} d(m_j, V_i).
By the triangle inequality, there are no other nodes within ϵ of mi,mi+1. So at each
point x within ϵ of Vi one of mi or mi+1 is nearer to x than any other point. Recall that
the sinusoidal part of the imaginary part of the symmetric configuration has constant
[Figure 5-4 plot: Imψ = 0 (γ_i); the upper and lower boundaries of γ_i^T; the atoms; the nearest other atom, at distance d_min > L/2 + ϵ; the Voronoi boundary between m_i and m_{i+1}; and V_i.]

Figure 5-4. An explanatory figure to accompany the proof of approximation for the multi-atom case. By taking T = (2λ/ν2) sin⁻¹(κ) for κ ≤ sin(ϵν2/(2λ)), it follows that T < ϵ. This is an important step in choosing σ, λ resulting in stable level-sets.
phase at all points: ν2/λ in the direction of the Voronoi boundary (recall ν2_i = ν2_{i+1} in this configuration). We will show that there is a value of σ such that after the perturbation by Imψ_{P\P_i} the zero level set remains within distance δ of V_i.
For some value of the sinusoidal part, κ, there is a fixed T such that for every point z along γ_i (the zero level-set of the resulting Imψ for these atoms) the value at z ± T(0, 1) is ±κ, provided that κ ∈ [0, 1]. The set γ_i^T = {(x, y + z) | (x, y) ∈ γ_i, z ∈ [−T, T]} falls within δ′ = min(δ, ϵ) of V_i if we take

κ = sin(ν2 T/(2λ)) and λδ′ > σ²/2:

we then get that sup_{x∈[−1,0]} sin(θ_2(x))/(exp(−2x/σ²) + cos(θ_2(x))) < δ′/2. Since T is at most δ′/2, and going from V_i to γ_i by the overestimate of the arctangent yields a bound of δ′/2 on the distance of points in γ_i^T to V_i, the total distance is at most δ′. The denominator of the overestimate of the arctan does not get too small because of the bound on λ (see below). This provides a T such that γ_i^T falls within a dilation of V_i by δ′. We will bound the values of Imψ_{P_i} from below and bound the values of Imψ_{P\P_i} from above in this region.
Imψ is increasing through the zero level set by definition, so if necessary we can find a sufficiently small value of κ such that Imψ is increasing in each vertical slice of γ_i^T. We need to find a lower bound for Imψ along the boundary. It is helpful to rewrite Imψ. We can write Imψ as

Imψ(x, y) = F(x, y) sin(ν2 y/λ + ϕ(x, y)),

where F is defined as

F(x, y) = ( exp(−((x+1)² + y²)/σ²) + exp(−((x−1)² + y²)/σ²) + 2 exp(−((x+1)² + y²)/(2σ²)) exp(−((x−1)² + y²)/(2σ²)) cos(2ν11 x/λ) )^{1/2}

and ϕ(x, y) is defined as ϕ in Equation (5–3), extended to be even in x, and also factors in the shift associated with the opposing θ_i (depending on which half-space you are in). To derive F, use the angle-sum formula cos(α) cos(β) + sin(α) sin(β) = cos(α − β). If the cosine term in F is positive then F is bounded below by

max{ exp(−((x−1)² + y²)/(2σ²)), exp(−((x+1)² + y²)/(2σ²)) },

obtained by dropping all but the maximal-value term and taking the square root directly.
Suppose that the cosine term in F is negative. Then 2|ν11 x| > λπ/2, as the cosine is nonnegative away from this set. So x² > (λπ/(4ν11))². Suppose without loss of generality that x > 0, so that x > λπ/(4ν11). Thus, −(x+1)² + (x−1)² < −λπ/ν11. F can be written as

F(x, y) = exp(−((x−1)² + y²)/(2σ²)) · ( 1 + exp(−((x+1)² − (x−1)²)/σ²) + 2 cos(2ν11 x/λ) exp(−((x+1)² − (x−1)²)/(2σ²)) )^{1/2}.
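The factorization Imψ = F sin(·) can be spot-checked numerically. The sketch below (our own verification script, with parameter values chosen arbitrarily) places two atoms at (±1, 0) with mirrored first frequency components, as in the symmetric configuration assumed here, and confirms that |ψ|² matches the F² produced by the angle-sum expansion, whose cross term carries cos(2ν11 x/λ):

```python
import numpy as np

# symmetric two-atom configuration: atoms at (-1, 0) and (1, 0) with
# mirrored frequency vectors (nu11, nu2) and (-nu11, nu2)
sigma, lam, nu11, nu2 = 0.5, 0.3, 0.6, 0.8
m1, m2 = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
n1, n2 = np.array([nu11, nu2]), np.array([-nu11, nu2])

def atom(x, m, n):
    # Gabor atom exp(-|x - m|^2/(2 sigma^2) + i n^T (x - m)/lam)
    return np.exp(-np.sum((x - m)**2, axis=-1) / (2 * sigma**2)
                  + 1j * (x - m) @ n / lam)

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=(1000, 2))
psi = atom(x, m1, n1) + atom(x, m2, n2)

# F^2 from the angle-sum identity: the two atoms' phases differ by 2*nu11*x/lam
r1 = np.sum((x - m1)**2, axis=-1)
r2 = np.sum((x - m2)**2, axis=-1)
F2 = (np.exp(-r1 / sigma**2) + np.exp(-r2 / sigma**2)
      + 2 * np.exp(-r1 / (2 * sigma**2)) * np.exp(-r2 / (2 * sigma**2))
        * np.cos(2 * nu11 * x[:, 0] / lam))
assert np.allclose(np.abs(psi)**2, F2)
```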
Suppose we take λ ≥ 2σ²/π. Then, recalling ν11 ≠ 0,

log(3/8) ≥ −1 ≥ −λπ/(2σ²ν11), and so

3/8 ≥ exp(−((x+1)² − (x−1)²)/(2σ²)), and

1/4 ≤ 1 − 2 exp(−((x+1)² − (x−1)²)/(2σ²)).

Thus the lower bound for Imψ along the boundary is

J_i(σ) = min_{x ∈ ∂γ_i^T} Imψ ≥ (κ/2) exp(−(L/2 + T)²/(2σ²)).
Now we will construct a bound for the contributions from the remaining portions of the polygon P \ P_i. We will refer to this function as η_i(x; σ); it will be strictly bounded above by J_i(σ). Perturbing Imψ_{i,i+1} by η_i then does not cause points along the boundary to change sign, and therefore a zero level-set of the resulting perturbation still falls within γ_i^T. This consists of two steps:

1. Bound |Imψ_{P\P_i}| from above by η_i(x; σ),

2. Find a σ such that J_i > max_{x ∈ γ_i^T} η_i(x; σ).
First, we want to build a bound for |Imψ| from the atoms besides (i, i + 1). The following inequalities provide a useful bound:

|Imψ| = |Σ_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²)) sin(ν_j(x − m_j)/λ)|
≤ max_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²)) Σ_{j ≠ i,i+1} |sin(ν_j(x − m_j)/λ)|
≤ (N − 2) max_{j ≠ i,i+1} exp(−d(x, m_j)²/(2σ²))
≤ (N − 2) exp(−(L/2 + ϵ)²/(2σ²)).

We let

η_i(σ) = (N − 2) exp(−(L/2 + ϵ)²/(2σ²)).
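This chain of bounds is easy to sanity-check numerically; the sketch below (our own spot-check, with arbitrary parameter values) places N − 2 atoms at distances greater than L/2 + ϵ from a test point and confirms |Imψ_{P\P_i}| ≤ η_i(σ):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, lam, L, eps = 0.3, 0.2, 1.0, 0.1
N = 12  # total atoms; N - 2 remain after removing the pair (i, i+1)

# place the N - 2 remaining atoms at distance > L/2 + eps from x = 0
angles = rng.uniform(0, 2 * np.pi, N - 2)
radii = L / 2 + eps + rng.uniform(0, 1, N - 2)
m = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
nu = rng.normal(size=(N - 2, 2))
nu /= np.linalg.norm(nu, axis=1, keepdims=True)  # unit frequency vectors

x = np.zeros(2)
d = np.linalg.norm(x - m, axis=1)
im_psi = np.sum(np.exp(-d**2 / (2 * sigma**2))
                * np.sin(((x - m) * nu).sum(axis=1) / lam))
eta = (N - 2) * np.exp(-(L / 2 + eps)**2 / (2 * sigma**2))
assert abs(im_psi) <= eta  # each term is at most exp(-(L/2+eps)^2/(2 sigma^2))
```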
Now we will bound η_i(x; σ) < J_i on γ_i^T. Suppose that (L/2 + ϵ)² − (L/2 + T)² > 2σ² log(2(N − 2)/κ). Then it follows that

(κ/2) exp(((L/2 + ϵ)² − (L/2 + T)²)/(2σ²)) ≥ N − 2 and J_i(σ) ≥ η_i,

and so J_i > η_i.
By the construction of γ_i^T, it has upper and lower boundaries at γ_i ± T(0, 1), as shown in Figure 5-4. The effect of this result is to show that the values of Imψ along the upper and lower boundaries of γ_i^T do not go below or above zero, respectively (if ν2 > 0; above or below, respectively, if ν2 < 0), when additional points obeying Condition (D3) are added. Therefore, there is some value of σ such that the remaining portion of the imaginary part of ψ does not push the zero level-set corresponding to i, i + 1 outside or to the boundary of γ_i^T.
In review, the following inequalities were required to be satisfied:

ν1 σ²/2 < δ′λ,
κ = sin(ν2 T/(2λ)),
2σ²/π < λ,
ϵ > (4σ²/L) log(2(N − 2)/κ).

These can all be satisfied by taking σ small enough. Let the zero level-set for the whole of ψ be denoted by γ. By the argument above, for every x ∈ V_i there is a y ∈ γ_i (for ψ as a whole) within distance ϵ. Thus

sup_{x ∈ V_i} inf_{y ∈ γ_i} d∞(x, y) ≤ ϵ.

By taking δ′ to be less than the minimal ϵ_j over all j, we can ensure that the inequalities are satisfied for all γ_j, V_j simultaneously. Since the directed Hausdorff distance obeys
d_H(A ∪ B, C ∪ D) ≤ max{d_H(A, C), d_H(B, D)}, we get

d_H(P, γ) ≤ max_{j ∈ {1,2,...,N}} d_H(V_j, γ_j) ≤ δ.

Since δ was arbitrary, we conclude that the zero level-set of the imaginary part of ψ can be made arbitrarily close to the polygon in the H∞ metric.
5.2.2 The Class of Curves Approximated By F

Lemma 3. Let γ : [0, 1] → R² be a C∞ Jordan curve in the plane and ϵ > 0. Then there exists a polygon γ̂ with finitely many sides of rational length, such that

d_H^∞(γ, γ̂) < ϵ.

Claim 2 Polygons obeying Conditions (D1), (D2), (D3) for any L contain all Jordan polygons of rational side lengths.
Proof. Let Q = {(v_i, e_i, n_i)}_{i=1}^q be an arbitrary Jordan polygon of rational side lengths. Let a_i/b_i be the side lengths of the edges e_i. Define L = sup_{p : b_i | p ∀i} 1/p, i.e., L = 1/p for p = lcm(b_1, ..., b_q). Note that we can re-write each of the sides of Q as a collection of sides with step L = 1/p, creating new vertices and nodes at each step, and the new polygon Q′ is equivalent to Q under any standard metric. So now Q′ satisfies Conditions (D1), (D2).

Finally, we need to ensure that Condition (D3) is obeyed. To enforce this condition, compute for each vertex v_i the minimal distance to nodes outside of its neighbors, d_i = min_{j ∉ {i−1,i+1}} d(v_i, n_j). If d_i ≤ L/2 for any i, then repeatedly subdivide the intervals k times until L/2^{k+1} < d_i; call this new length L̂ and call this new polygon Q̂. During this process, if n_p is the offending node and d(e_p, v_i) = d_p, then k = ⌈log₂(L/d_p)⌉ is enough iterations to terminate. Now the nearest that any node outside of its neighbors comes to v_i is strictly greater than L̂, and so Condition (D3) is obtained. Q̂ does not have any vertices or nodes that are not on Q, and so they are equal as curves.
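The resampling step of the proof is easy to make concrete. A minimal sketch (our own helper names; it assumes side lengths are given as exact rationals) computes the common step L = 1/lcm(b_i) and subdivides an edge into pieces of that length:

```python
from fractions import Fraction
from math import hypot, lcm

def common_step(side_lengths):
    """Common step L = 1/lcm(b_i) that divides every rational side a_i/b_i."""
    return Fraction(1, lcm(*[Fraction(s).denominator for s in side_lengths]))

def subdivide(p, q, step):
    """New nodes splitting the segment p -> q into pieces of length `step`
    (assumes `step` divides the segment length exactly)."""
    n = round(hypot(q[0] - p[0], q[1] - p[1]) / float(step))
    return [(p[0] + (q[0] - p[0]) * t / n, p[1] + (q[1] - p[1]) * t / n)
            for t in range(n)]

# sides of lengths 1/2 and 3/4 share the common step L = 1/4
L = common_step([Fraction(1, 2), Fraction(3, 4)])
print(L)  # 1/4
nodes = subdivide((0.0, 0.0), (0.5, 0.0), L)
print(nodes)  # [(0.0, 0.0), (0.25, 0.0)]
```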
5.3 Asymptotic Approximation of Modular Distance Fields by Gabor Atoms
In this section we provide a proof of the claims in Section 2.3. That is, we show that the point-wise behavior of the Gabor transform of the modular distance field decays as the parameters σ, λ ↓ 0. Furthermore, if σ² = O(λ) then we provide a sketch that suggests that |G_{gσ}Ψ^{σ,λ}(m, ν/λ)| = O(1).
Claim 3 Suppose that S is a nonempty, bounded open set in R^d such that ∂S is a smooth, connected, orientable manifold. Let (m, ν) ∈ R^d × S^{d−1} be an oriented point. Let

J(σ, λ) = |[G_{gσ}(Ψ_S^{σ,λ}/||Ψ_S^{σ,λ}||)](m, ν/λ)| = |⟨ exp(−||x − m||²/(2σ²) + iνᵀ(x − m)/λ)/(2πσ²)^{d/2}, Ψ_S^{σ,λ}/||Ψ_S^{σ,λ}|| ⟩|.

If (m, ν) is not drawn from the boundary of S with ν = n_{∂S}, then |J(σ, λ)| → 0 super-polynomially as σ, λ → 0 with λ = poly(σ).
Remark 1 Although this example does not quite fit into the statement of the Claim, an analytical example of this type of decay is the integral in Equation (3–8). Clearly, when λ ≥ O(σ²) this integral decays rapidly away from m = q, ν = ω.
Proof. The proof works in two cases. Let (m, ν) be an oriented point. First, suppose m ∉ ∂S. The resulting integral decays exponentially quickly (regardless of λ/σ) by the following bounds:

|J(σ, λ)| = |∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²) + i(νᵀ(x − m) − b_S(x))/λ)/(||Ψ||(2πσ²)^{d/2}) dx|

≤ ∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²))/(||Ψ||(2πσ²)^{d/2}) dx ≤ O(exp(−1/σ²)/σ^{2d}). (5–9)
The first bound follows from taking absolute values inside the integral. The second bound arises from the bound on ||Ψ|| from the text (both Equation (2–7) and Equation (2–8)) and the following bound:

∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²)) dx ≤ ∫_{R^d \ B_δ(m)} exp(−||x − m||²/(2σ²)) dx + ∫_{R^d} exp(−(||x − m||² + c)/(2σ²)) dx, (5–10)
where c is a constant. The first term decays exponentially with σ by the rate of decay of the erfc function (taking δ/σ to be sufficiently large). For the second term, we can bound the contribution from exp(−b_S(x)²/(2σ²)) by the maximum value over the boundary of the ball B_δ(m), which is of order exp(−c/(2σ²)). The resulting integrals produce the normalization coefficient which cancels the coefficient above, leaving the exp(−c/(2σ²)) term, which clearly decays exponentially quickly as σ ↓ 0.
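The exponential collapse for m ∉ ∂S can be illustrated with a toy 1-D computation (our own setup, not from the text): take S = (−1, 1) so that the signed distance is b_S(x) = |x| − 1, form Ψ = exp(−b_S²/(2σ²) + i b_S/λ) as suggested by the integrand of Equation (5–9), and evaluate the normalized Gabor coefficient at the interior point m = 0 for shrinking σ with λ = σ²:

```python
import numpy as np

def gabor_coeff(sigma, lam, m=0.0, nu=1.0):
    # 1-D toy field: S = (-1, 1), signed distance b(x) = |x| - 1
    x = np.linspace(-3.0, 3.0, 120001)
    dx = x[1] - x[0]
    b = np.abs(x) - 1.0
    psi = np.exp(-b**2 / (2 * sigma**2) + 1j * b / lam)
    psi = psi / np.sqrt((np.abs(psi)**2).sum() * dx)   # L2-normalize
    # Gabor window with the normalization used in the Claim (d = 1)
    g = (np.exp(-(x - m)**2 / (2 * sigma**2) + 1j * nu * (x - m) / lam)
         / np.sqrt(2 * np.pi * sigma**2))
    return abs((np.conj(g) * psi).sum() * dx)

# the coefficient at the interior point m = 0 collapses as sigma -> 0
vals = [gabor_coeff(s, s**2) for s in (0.4, 0.2, 0.1)]
assert vals[0] > vals[1] > vals[2]
```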
When m ∈ ∂S the previous bounds are insufficient to make the integral vanish, as the second term in Equation (5–10) remains O(poly(σ)). To employ stationary phase [106], we first need to cut down on the domain of integration by choosing a sufficiently small σ so that the integral is approximated reasonably well by restriction to a compact subset E that depends on all of S. First, let H be the largest open ball about m that contains no elements of the singular set, whereby b_S is C∞ on H [25]. Since ∂S is C∞, H is nonempty, and diam(H) < diam(S). Note that this implies that H ∩ ∂S consists of a single connected component: otherwise a midpoint between components would be contained in H, and thus a skeletal point. Note that H may contain points x such that ∇b_S(x) = ν. The last restriction we require removes this case, ensuring non-stationarity of the phase: since ∇b_S|_m ≠ ν, consider the preimage under ∇b_S of an open neighborhood around ∇b_S|_m whose closure does not contain ν; call it G. If G contains multiple connected components, pare down to the one containing m. Then let F be G intersected with H. F is open, so close it: let E = F̄. As it is clearly bounded, E is compact. Finally, we replace the Gaussian factor by the Gaussian times a bump function that takes the value 1 on V ⊂ E. When d is small (finite) we can choose a radius for V that is sufficiently snug to E so that the product is an arbitrarily close estimate to the Gaussian that vanishes smoothly on the boundary of E. Note that the error between the mollified integral and the actual integral is bounded by the mass of the Gaussian in a thin shell, which decays like exp(−δ²/σ²)(δ/σ)^d |δ − η|, where η is the radius of V and δ = rad(E). Depending on the diameter
of E we must choose the appropriate σ, and we assume that O(λ/poly(σ)) = 1. Then the following lemma from [106] applies.

Lemma 4. If ∇f ≠ 0 in the compact domain D, and if g vanishes C∞-smoothly on ∂D, then

I(λ) = ∫_D g(x) exp(if(x)/λ) dx = O(λ^N) as λ → 0, for all N ≥ 1. (5–11)

So the integral J(σ, λ) = O(λ^N) for all N. The integral can be bounded by the appropriate Gaussian integral, which contributes at most a polynomial factor in σ^{−1}. Hence the integral decays super-polynomially.
Contributions for the Remaining Portion of Phase-Space.

While it is not proved here, based on the stationary phase approximation it seems that the contributions from terms arising from the boundary with frequency vectors pointing in the normal direction remain O(1) as the ratio O(λ/σ²) = 1 is maintained. Note that the following argument is not rigorous, but provides a sketch of the behavior of points on the boundary with appropriately facing normal vectors. This is the appropriate ratio for producing non-vanishing stationary-phase contributions at points in neighborhoods of zero curvature. In 2-D the ratio is O(λ/σ³) = 1 for non-saddle-type points with nonzero curvature. In 3-D the ratio depends on the point type: the hypersurface of stationary points is 1- or 2-dimensional, and so the approximation of the Gaussian integral has a different contribution. Carrying this analysis through is more difficult, since rather than upper bounds we need to compute lower bounds for the stationary phase coefficients. For a signed distance function (even with the caveats we have added in our Claim above) degenerate stationary points and surfaces of stationary points can both occur; therefore, computing the integrals is a very delicate task. We simply comment that, under the appropriate considerations for using the stationary phase
approximation, the expansion looks like

J(σ, λ) ≈ ∫_{R^d} exp(−(||x − m||² + b_S(x)²)/(2σ²) + i(νᵀ(x − m) − b_S(x))/λ)/(||Ψ||(2πσ²)^{d/2}) dx

≈ ((2πλ)^{d/2}/(2πσ²)^d) Σ_{z ∈ Z} exp(−(||z − m||² + b_S(z)²)/(2σ²) + (i/λ)g(z)) / √|det(Hb_S(z))|

+ ((2πλ)^{d/2}/(2πσ²)^d) Σ_{k=1}^K ∫_{M_k} exp(−(||γ − m||² + b_S(γ)²)/(2σ²)) / √tr(Hb_S(γ)) dσ_{M_k}(γ)

+ O((2πλ)^{(d+1)/2}/(2πσ²)^d), (5–12)

where Z = {z : ∇b_S(z) = ν}, and the exponentially decaying terms must fall on a point z or across a curve (or hypersurface) M_k such that the product of the Gaussian decay factors hits 1. The hypersurface M_k = {x ∈ R^d : ∇b_S(x) = ν}. Note that the curve or hypersurface will be of diameter O(1) relative to the shrinking σ, and so the contribution from the numerator of the integral in the second term of the expansion of Equation (5–12) will be O(1). Since b_S is a signed distance, it also follows that the hypersurface will be hyperplanar within a sufficiently small radius around the point m. Furthermore, under certain conditions we expect √tr(Hb_S(γ)) = √κ(P(γ)), where κ(P(γ)) is the mean curvature at the projection of γ to the near point on the curve ∂S. Note that the non-degeneracy of the denominators of the terms corresponds to a meaningful geometric condition for the surface, which we do not delve into here. We have avoided claiming this as a proof because the management of the degeneracy condition for a signed distance must be done very carefully to ensure correctness of the stationary phase. Integrals of this sort remain an active area of applied mathematical research [4].
CHAPTER 6
FURTHER EXPLORATIONS AND APPLICATIONS
In this Chapter we explore how the representation can be used for shape statistics.
The approximate linearity should allow approximate shape averaging, principal component
analysis, and more applications that can aid in shape modeling and analysis pipelines.
We also explore applications to graphics and image processing. In graphics, rendering 3-D models from a sparse set of observations is extremely useful. To this end we provide a fast and accurate algorithm for surface reconstruction from partial observation. We also explore the possibility of using Gabor approximation of an appropriately transformed image for shape extraction from images.
6.1 Curve Extraction from ψ
Since the modular distance function is phase-wrapped, the zero level-set consists of
an infinite collection of connected components: the pattern repeats at coarser and coarser
scales. Therefore, extracting a single zero level-set entails choosing among these scales the one which best represents the shape. Through the analogy with the CWR, we want the level-set that best represents the oriented point-set. Only once we establish a clear mechanism for choosing the best level-set can we say that the CWR bridges a gap between mathematical shape representations and perceptual grouping.

In the setting of the MDF, the correct zero level-set is obvious: the magnitude of the correct zero level-set will be the highest of all of the magnitudes. Since the argument to the Gaussian is the same as the argument to the complex factor, the zero of the complex wave that indexes into the true zero will have the least argument to the Gaussian. Thus, unwrapping the modular distance function is as easy as ordering the zero level-sets and adding an appropriate offset of 2πk for the kth largest value of the magnitude.
In the setting of discrete samples from the normal bundle, the CWR does not
maintain the correspondence between the magnitude and the phase (plus some wrapping
offset). However, provided that the samples are dense enough we can model the discrepancy
of the magnitude of the MDF compared with the CWR by a small perturbation. Then,
we can apply an appropriate estimator across pixels or points that index into the same
level-set of the phase to come up with an approximate value of the magnitude associated
with a given level-set.
We associate each level-set ℓi of the phase of the CWR with some level-set γi of the
MDF as follows. First we extract level-sets of the phase of the CWR and label them. If
we assume Gaussian error of the CWR (relative to the MDF) then the MSE estimate for
the magnitude-squared associated with a given level-set ℓi will be the expected value of
the magnitude squared of the CWR over the level-set. This is a line- or surface-integral
depending on the number of dimensions involved. Note that this procedure can be carried
out in a continuous fashion whenever ℓi is given by a spline-curve and in a discrete fashion
by performing marching squares or cubes. The MSE estimate of the magnitude for ℓi, ρi,
provides a statistic that is extremely useful for ordering the connected curves in the zero
level-sets of the phase of the CWR. We expect the kth largest value of the estimated MDF
magnitudes to correspond to the kth wrapped repetition of the true level-set.
Note that even in the case of the MDF it is possible that the kth level-set has a larger
integral value due to the fact that the length may be larger. Thus, a bound on the length
of the kth level-set is necessary to derive a value of σ that yields a fast enough decay so that the integral estimate provides the correct ranking. The key observation for making this work is to note that the "offset curve (surface)" at distance k to the 0-level-set will have an arc-length (surface area) that is greater than the arc-length (surface area) of the k-level-set. This is because the level-sets are contained in the offset curves (surfaces) [25]. So a bound on the length of the offset curve provides a bound on the length of the corresponding level-set. Let γ be the 0-level-set of the single connected component of the shape. The k-offset curve γ̃ to γ is defined as

γ̃(t) = γ(t) + kN(t),
where N(t) is the normal vector to γ at t. The arc-length element at t is the norm of the t-derivative of the curve, and

∂_t γ̃(t) = T(t) + kκ(t)T(t)

by the definition of curvature. Thus

||∂_t γ̃(t)||² ≤ (1 + kκ(t))² ||∂_t γ(t)||²

by the triangle inequality. This pointwise bound provides the following global bound on the arc-length:

∫_0^1 ||∂_t γ̃(t)|| dt ≤ (1 + kκ_max) ∫_0^1 ||∂_t γ(t)|| dt.

So to ensure that the wrapped level-sets have appropriately ordered integral magnitudes, estimate the magnitudes by the Gaussian evaluated at the distance λ and bound the integral:

∫_0^1 |ψ(γ̃(t))|² dt ≤ L(γ̃) |ψ(γ̃(0))|² ≤ (1 + λκ_max) L_0 |ψ(γ̃(0))|².

Using the previous bound, we get the following bound for admissible values of σ, λ for unwrapping:

λ ≥ σ√(2 log((1 + λκ_max) L_0)). (6–1)
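Since λ appears on both sides of Equation (6–1), the bound is implicit; in practice it can be solved by a simple fixed-point iteration. A minimal sketch (function and argument names are ours; it assumes (1 + λκ_max)L_0 > 1 so the logarithm is positive):

```python
import math

def admissible_lambda(sigma, kappa_max, L0, iters=50):
    # iterate lam <- sigma * sqrt(2 * log((1 + lam * kappa_max) * L0));
    # a fixed point satisfies Equation (6-1) with equality
    lam = sigma  # any positive start
    for _ in range(iters):
        lam = sigma * math.sqrt(2 * math.log((1 + lam * kappa_max) * L0))
    return lam

lam = admissible_lambda(sigma=0.05, kappa_max=2.0, L0=10.0)
# the returned lambda satisfies the unwrapping inequality (6-1)
assert lam >= 0.05 * math.sqrt(2 * math.log((1 + lam * 2.0) * 10.0)) - 1e-9
```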
6.1.1 Mean Shortest-Path Error Evaluation on 2-D Data: MPEG7 Dataset
In this subsection we evaluate the performance of the surface reconstruction using
a surface reconstruction measure based on matching a proposed surface to the true
underlying mesh. The surface is proposed by the algorithm outlined above. Then points
are taken at random from the initial mesh and a shortest path between the points is
Table 6-1. An algorithm for extracting the shape corresponding to a collection of oriented points.

Require: S = {(m_a, ν_a)}_{a=1}^N an oriented point-set; k a number of contours.
   function Reconstruct(S, k)
2:     σ, λ ← admissible values based on Equation (6–1)
       ψ ← Σ_a exp(−||x − m_a||²/(2σ²) + iν_aᵀ(x − m_a)/λ)   ▷ Build the CWR
4:     C ← {ℓ_a : ℓ_a is a connected component of θ_ψ = 0}   ▷ Get these with e.g. marching cubes/squares
       for i ← 1 to |C| do
6:         α_i ← ∫_{ℓ_i} |ψ(x)|² dσ(x)   ▷ Either pixel-wise estimate or by actually computing CWR on ℓ_i
       end for
8:     return {ℓ_{i_m}}_{m=1}^k such that α_{i_p} > α_{i_{p+1}}
   end function
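A numpy-only toy run of the ranking idea in Table 6-1 (our own script; the thesis implementation extracts contours with marching squares or cubes, while here we simply threshold grid pixels) builds the CWR for oriented points on the unit circle and checks that the high-magnitude, near-zero-phase pixels trace the circle rather than one of its wrapped copies:

```python
import numpy as np

# oriented points sampled from the unit circle, with outward normals
N, sigma, lam = 64, 0.1, 0.05
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
m = np.stack([np.cos(t), np.sin(t)], axis=1)
nu = m.copy()

# evaluate the CWR psi on a grid
xs = np.linspace(-1.6, 1.6, 161)
X, Y = np.meshgrid(xs, xs)
P = np.stack([X, Y], axis=-1)                 # (H, W, 2)
diff = P[:, :, None, :] - m                   # (H, W, N, 2)
r2 = (diff**2).sum(-1)
phase = (diff * nu).sum(-1) / lam
psi = (np.exp(-r2 / (2 * sigma**2)) * np.exp(1j * phase)).sum(-1)

# keep near-zero-phase pixels with large magnitude: the principal contour;
# wrapped copies at distance ~2*pi*lam have exponentially smaller |psi|
principal = (np.abs(np.angle(psi)) < 0.25) & (np.abs(psi) > 0.5 * np.abs(psi).max())
radii = np.sqrt(X[principal]**2 + Y[principal]**2)
assert radii.size > 0 and np.all(np.abs(radii - 1.0) < 0.1)
```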
Figure 6-1. Zero level-sets of subject 5 of the FAUST sequence. The range of the implicit function is [−.03, .17] × [.44, .79] × [−.16, .14] and σ = .0003, λ = .0175. This results in 3 wrapped zero level-sets appearing in the frame. They are ordered in increasing order by surface integral, with values ranging from 1 × 10⁻¹⁰ to 1 × 10⁻¹.
computed. Corresponding points on the proposed surface are found by minimizing distance
in the ambient space over points in the proposal, and the path along the proposed surface
is computed. The difference in the length of the paths is reported.
In this subsection we provide a suggestive empirical result on the validity of a reconstruction algorithm that leverages the rapid decay of the magnitude together with the limited growth of the length (or area, in the surface case) of the contours as the phase wraps. The curve reconstruction algorithm is shown in Table 6-1.
The validation methodology is as follows. The dataset used is the MPEG-7 dataset,
with shapes taken from the Bats, Birds, and Chickens subsets. Each consists of 20
silhouettes. A single contour was extracted from the images by choosing the boundary of
the indicator function that corresponds to the nominal curve. The resulting curves have
approximately 1000 points each and represent an outline of the pictured bird, chicken,
or bat. For each such image, 250 random pairs of points are chosen from the bounding
curve and the shortest distance between the two is computed. Then we subsample the curve at the appropriate sampling rate and extract 1 zero level-contour from the CWR. The corresponding points chosen in the previous step are then projected down to the contour and the shortest-distance path on the contour is computed. We report the average error between the two over 150,000 trials spanning 60 curves, sampled at 10 rates, with 250 trials at each rate.
6.1.2 Hausdorff Distance-based Evaluation of 3-D Data: Spheres, Bunny, FAUST Datasets
is generated, and so the arc-length measurement may not be as robust. Therefore, we
also compare the Hausdorff distances (or H-distances) [1, 32, 102] between the CWR
reconstruction and the Poisson surface reconstruction for a sphere, heads in the FAUST
dataset, and the Bunny dataset. We compare the reconstruction from the unwrapped
[Figure 6-2 plot: error on shortest path (% of diameter) versus sampling frequency, for the Birds, Bats, and Chickens subsets.]
Figure 6-2. Average error on shortest path between 250 pairs of points (randomly chosen) in the estimated mesh at 10 sampling rates. σ = 170, λ = 50, and the average diameter is 250. Note that both the bats and chickens have regions of very high curvature (e.g., in the wings of the bats or the feathers of the chickens).
CWR with Poisson surface reconstruction [53] on the basis of Hausdorff distance as a
percentage of diameter.
A standard measure for the performance of a spline is the “circle test”. This refers
to the principle that adding points on a circle to the data should lead to more accurate
estimates of the underlying circle. In other words, the resulting curve should have constant
curvature between points being connected. We verify this performance for the CWR and
compare with the behavior of Poisson surface reconstruction. Since the underlying surface
is the 2−sphere in this case, the shortest distance between a point on the estimated
surface and the underlying surface S2 will be the projection of the estimated point to
the unit norm vector lying in the same direction. So we report the Hausdorff distance
computed analytically in this case. We note that since Poisson surface reconstruction is designed to handle outliers and low sampling rates, it does not perform as well under this standard spline test, but achieves reasonably good reconstruction metrics throughout the lower sample regime.
Sampling Rate  CWR H-distance  Poisson H-distance
.9             0.01            0.081
.74            0.023           0.053
.57            0.031           0.040
.43            0.063           0.040
.26            0.145           0.044
Figure 6-3. Sphere reconstruction over different sampling rates. From top to bottom, the set of 926 oriented points is sampled at a rate of .9, .74, .57, .43, and .26 of the initial point-set. Note that Poisson surface reconstruction handles the formation of edges better in the low sampling-rate case, but tends to estimate a slightly inflated sphere. The table shows the Hausdorff distances as a fraction of radius averaged over 10 trials at each rate.
Sampling Rate  CWR H-distance  Poisson H-distance
.9             0.070           0.073
.74            0.070           0.074
.57            0.070           0.076
.43            0.070           0.077
.26            0.070           0.077
Figure 6-4. Face reconstruction over different sampling rates. From top to bottom, the set of 1470 oriented points is sampled at a rate of .9, .74, .57, .43, and .26 of the initial point-set. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate.
We also compared the performance of the CWR reconstruction to the Poisson
surface reconstruction on the FAUST and Bunny datasets. FAUST and the sphere
above are both relatively low sampling-rate datasets. The bunny is a high-rate dataset,
with 37K points. Since the FAUST dataset considered here is restricted to heads (for
rendering and precision purposes) the boundary conditions are important. We clip the
reconstructed surface just below the chin level for metric computation, since Poisson
surface reconstruction estimates a closed surface by design. We point out here that this
is a weakness of Poisson surface reconstruction, since there is no natural way to extract
multiple connected components from this method using a setting that enforces closed surfaces: there is no "privileged level-set".
We performed 2 sets of experiments with the bunny. First, we experiment with adding
noise to the original mesh points. We added pointwise Gaussian noise with standard
deviation ranging from 1% to 25% of the diameter of the mesh. Then the algorithms are
run on the resulting perturbed oriented point-clouds. The results are shown below, with
the average Hausdorff distance over 10 trials at each noise level. We find that Poisson
reconstruction is much more robust to noise, but at low and moderate levels of noise the
two perform comparably.
Next, we compare the reconstruction performance at different sampling rates, as
detailed above. We find that Poisson and CWR reconstruction perform comparably for
high to moderate sampling rates.
Noise  CWR H-distance  Poisson H-distance
.01    0.043           0.038
.09    0.048           0.039
.17    0.050           0.045
.25    0.282           0.046
Figure 6-5. Bunny (closed surface) reconstruction over different noise levels. From top to bottom, the set of 6200 oriented points has noise added at σ = .01, .09, .17, .25 times the diameter of the mesh. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate. Note the surface artifact on the last shape in the CWR column, which causes a high Hausdorff distance between the reconstruction and the original.
Sample rate  CWR H-distance  Poisson H-distance
.9           0.045           0.040
.57          0.045           0.043
.43          0.049           0.044
.26          0.052           0.045
Figure 6-6. Bunny (closed surface) reconstruction over different sampling rates. From top to bottom, the set of 31,335 oriented points is sampled at a rate of .9, .57, .43, and .26 of the initial point-set. The table shows the Hausdorff distances as a fraction of diameter averaged over 10 trials at each rate.
6.2 ψ for kPCA on Curves
In kernel PCA (kPCA) [88], the goal is to build a linear basis of functions, B = {e_i}_{i=1}^{N−1}, out of the features observed during a training phase that minimizes the reconstruction error of the observations. The difference from PCA is that the basis may not be linear in the underlying space where the observations are points, but rather in the space corresponding to the feature functions. Often, evaluating inner products directly in high-dimensional feature spaces is difficult, due to the size of the dimensionality or numerical precision. In lieu of this brute-force approach to implementing PCA in feature space, kPCA proposes using a reproducing kernel instead of directly evaluating the inner products. While kPCA suffers from some computational setbacks (see below), it offers firm theoretical footing for performing nonlinear dimensionality reduction for moderately sized data sets.
The theory of Reproducing Kernel Hilbert Spaces provides sufficient conditions for posing the problem correctly, specifically Mercer's Theorem, which establishes conditions on the validity of the choice of kernel.
To perform kPCA, first the mean in the ambient (not feature) space is removed from the data set. Since the kernel performs a linear evaluation, this transfers to a linear centering in feature space. The new feature vectors are denoted ψ̃. Next, the Gram matrix K_ij = (1/n)⟨ψ̃(x; m_i, ν_i), ψ̃(x; m_j, ν_j)⟩ is formed. Forming K requires n² evaluations of the kernel. The final step in kernel PCA is eigendecomposition of K. The reason why this corresponds to forming a basis of maximum covariance is as follows. If we write the eigenvalue problem for the covariance matrix as λV = CV, where C is the feature covariance matrix, then note that V lies in the span of the features observed. Therefore, we may write V = Σ_{i=1}^n α_i ψ̃(x; m_i, ν_i). Lastly, we may consider the "weak" version of the equations (by applying the linear evaluation functional at (m_k, ν_k) for all k ∈ {1, ..., n}) and form a system of n equations that reads

K²α = nλKα. (6–2)
If α′ is an eigenvector of K then it also satisfies Equation (6–2).
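This relationship is easy to verify numerically when the features are available explicitly (a sanity-check sketch with random finite-dimensional features standing in for ψ̃; real kPCA would only touch K through kernel evaluations):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 8
Phi = rng.normal(size=(n, d))     # rows: explicit stand-ins for the features
Phi = Phi - Phi.mean(axis=0)      # centering
K = Phi @ Phi.T / n               # Gram matrix K_ij = <phi_i, phi_j> / n
C = Phi.T @ Phi / n               # feature covariance

mu, A = np.linalg.eigh(K)         # eigenpairs of K, ascending order
for m_val, alpha in zip(mu[-3:], A.T[-3:]):   # top three components
    # each eigenpair of K satisfies the weak equation K^2 a = n lam K a,
    # with covariance eigenvalue lam = mu / n
    assert np.allclose(K @ K @ alpha, n * (m_val / n) * K @ alpha)
    # and V = sum_i alpha_i phi_i is the matching covariance eigenvector
    V = Phi.T @ alpha
    assert np.allclose(C @ V, m_val * V)
```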
The result of diagonalizing K produces {α_i}_{i=1}^{N−1} containing the kernel principal components. Then the collection of α_i corresponding to nonzero λ_i are normalized and used as a basis for test patterns.
The first problem we propose to solve with kPCA is to estimate the underlying point
density model for a shape, through the magnitude squared of its expansion on a linear
space of basis functions provided as training data. As an added and surprising benefit, we
show that the expansion itself contains closed curves. The perceptual gains of the CWR
are conserved under kPCA encoding. Moreover, since we are working on $L^2$ rather than
$PL^2$, the CWR is uniquely suited to PCA-based compression, unlike probability
densities, which are positive and integrate to one. In summary, we can start with a set of
oriented point-sets, go to our feature basis and build a linear subspace, and accurately
approximate a closed curve and a probability density corresponding to an unseen, test,
oriented point-set in terms of a few basis coefficients. In Figure 6-7 the closed curve
estimated by the kernel is shown on the left of the figure. We construct this closed curve
as explained in Section 6.1. This is a novel aspect of the CWR directly leveraging the
properties of linearity and superposition to construct a basis representation. We show that
the absolute error of this representation serves as a discriminative measure for classifying
oriented point-sets. An additional novel aspect is a framework that also yields a generative
approximation corresponding to the classification—the wave function that emerges from
the approximation of the data in the kPCA basis.
We also provide anecdotal evidence of the performance of kPCA for reconstructing
closed curves on the Gatorbait dataset. We used 20 oriented point-sets as training data
for kPCA and held back 5 test samples. Each involved about 300 oriented points. The
(a) (b) (c) (d) [figure panels]

(e) Reconstruction error, Class 1:
    pattern 1: err1 = 1.7,  err2 = 2.4
    pattern 2: err1 = 0.73, err2 = 2.0
    pattern 3: err1 = 0.67, err2 = 2.9
    pattern 4: err1 = 0.42, err2 = 2.7
    pattern 5: err1 = 0.54, err2 = 2.7

(f) Reconstruction error, Class 2:
    pattern 1: err1 = 2.4, err2 = 0.72
    pattern 2: err1 = 2.2, err2 = 0.74
    pattern 3: err1 = 2.0, err2 = 1.55
    pattern 4: err1 = 2.1, err2 = 0.34
    pattern 5: err1 = 2.3, err2 = 1.59

Figure 6-7. (a) and (b) Closed curves from linear combinations of closed curves. Linear bases $B_1 = \{e_i^1\}_{i=1}^{25}$ and $B_2 = \{e_i^2\}_{i=1}^{25}$, each consisting of linear combinations of 25 patterns, were used to predict the features for 5 unseen patterns. The first 24 principal components were used as a basis for the point-sets shown. (c) and (d) Density estimation from a learned dictionary through PCA. We simultaneously estimated a point density and a closed curve from the span of 24 training patterns with coefficients derived from ≈ 100 elements of the oriented point-set. (e) and (f) Reconstruction error. The PCA bases perform well as a subspace classifier evaluated on the MPEG-7 dataset [96].
reconstruction of a training sample and several test samples shows recovery of closed
curves from mixtures of CWRs.
(a)
(b)
Figure 6-8. Recovery of closed curves in training and testing samples for the Gatorbait dataset. a) Recovery of a training sample from all of the principal components. The subframes show the progress of the reconstruction as more principal components are added. b) Recovery of a test sample.
6.3 Generalization of the CWR to Embedded Surfaces
Since the majority of the work in this Thesis is of an applied flavor, and since the
majority of the applications for shape registration and statistics occur in ambient space,
the generalizations in this section have been relegated to the end of this document.
These generalizations are important since they provide some key features that most
representations lack. Two key features, which we will discuss briefly below, are that one can
use these generalizations to generate manifolds with co-dimension greater than one, such
as a curve in 3-D, and to perform registration of rigid bodies by registering curves on the
bodies.
Recall the relationship between the Schrodinger equation and the signed distance
featured in Equation (2–2). Here we show how this equation generalizes to the sphere.
Recall that the sphere is a Riemannian manifold with metric

$$g_{S^2} = \begin{pmatrix} 1 & 0 \\ 0 & \sin(\theta)^2 \end{pmatrix},$$

so that arc length is given by

$$ds^2 = (d\theta, d\phi)^T g\,(d\theta, d\phi) = d\theta^2 + \sin(\theta)^2\, d\phi^2.$$
We will need some concepts from Riemannian geometry. A metric is a pointwise
smooth assignment of positive definite matrices to a manifold M , usually denoted g,
where smoothness is understood to be componentwise in the matrix. This metric matrix
at p acts on vectors in the tangent space TpM to allow length measurements in the
tangent space. We also need the definition of the gradient, connection, divergence, and
the Laplace-Beltrami operator. We view the gradient $\nabla$ as a linear map from $C^\infty(M) \to
\Gamma^\infty(M)$, where $\Gamma^\infty(M)$ denotes the set of smooth vector fields on $M$. In local coordinates,

$$\nabla f = g^{ik}\,\frac{\partial f}{\partial x^k}\,\frac{\partial}{\partial x^i},$$

where $g^{ik}$ is the $ik$-th entry of the inverse of $g$. The connection on $(M, g)$,
D is a bilinear map Γ(M) × Γ(M) → Γ(M) such that D is C∞ homogeneous in the first
125
entry and obeys a Leibniz property in the second entry. The divergence operator div acts
on a vector field V by inserting it into the second entry of D and evaluating the trace of
the remaining linear operator. Finally, the Laplace-Beltrami operator is given by
$$\Delta f = -\operatorname{div}\nabla f = -\operatorname{tr} D\nabla f.$$
The negative sign appears to keep the operator positive semi-definite. For more details on
these definitions see [36].
If we consider the Laplace-Beltrami operator on the sphere then we get the equations,
following the development of Chapter 2,

$$-\hbar^2 \Delta_{S^2}\psi = \psi,$$

$$-\hbar^2\left(\frac{1}{\sin(\theta)}\frac{\partial}{\partial\theta}\left(\sin(\theta)\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin(\theta)^2}\frac{\partial^2\psi}{\partial\phi^2}\right) = \psi,$$

$$-\hbar^2\left(O(1) + O\!\left(\frac{1}{\hbar}\right) - \frac{1}{\hbar^2}\left(\frac{1}{\sin(\theta)^2}\left(\frac{\partial S}{\partial\phi}\right)^2 + \left(\frac{\partial S}{\partial\theta}\right)^2\right)\right) R\, e^{iS/\hbar} = R\, e^{iS/\hbar},$$

where $\psi = R\, e^{iS/\hbar}$. From the last line we deduce the asymptotic equality

$$\|\nabla_g S\|^2 \approx 1,$$

which is exactly the eikonal equation on $S^2$ [57].
A general definition of the CWR on a compact, simply connected Riemannian
manifold of genus 0 will be given below. First, the definition for $S^2$ is given to make things
concrete. Let $\gamma$ be a curve lying along a sphere sitting inside $\mathbb{R}^3$. We will model $\gamma \in \mathbb{R}^3$
as $\gamma \in S^2$. Let $\{m_i\}_{i=1}^N \subset S^2$ be locations on the unit sphere $S^2 \subset \mathbb{R}^3$. Let $\nu_i \in T_{m_i}S^2$
be tangent vectors at $m_i$, defining normal vectors for a codimension-2 curve $\gamma$ in $\mathbb{R}^3$ and
a codimension-1 curve $\gamma$ in $S^2$. Note that $m_i$ defines a binormal for $\gamma$ at $m_i$. Call this
collection $\mathcal{A} = \{m_i, \nu_i\}_{i=1}^N$. Then the spherical-CWR for $\mathcal{A}$ on $S^2$ is

$$\psi_{\mathcal{A}}(\theta, \phi; \sigma, \lambda) = \sum_{i=1}^N \exp\left(-\frac{d(m_i, (\theta,\phi))^2}{2\sigma^2} + i\,\frac{(m_i, \nu_i)\cdot(\theta,\phi)}{\lambda}\right)$$
Figure 6-9. A fish drawn on the surface of a sphere using the spherical CWR. On the left we see the magnitude of the CWR, and on the right the phase plot decorated by different lines. The red line indicates the initial data. Blue and gray lines show various level-sets of the resulting phase of the CWR.
where $(m, \nu)$ acts by integration of the inner product along the parallel transport

$$(m, \nu)\cdot(\theta, \phi) = \int_{\gamma_m^{(\theta,\phi)}} D_\gamma \nu \cdot \dot{\gamma}\, d\sigma,$$

with $\gamma_m^{(\theta,\phi)}$ the geodesic arc joining $m$ to $(\theta, \phi)$ and $d$ the distance along this arc:

$$d(m, (\theta, \phi)) = \cos^{-1}\big(\sin(\phi)\sin(m_2) + \cos(\phi)\cos(m_2)\cos(|m_1 - \theta|)\big).$$

In computational settings, $d$ can be computed by going to extrinsic coordinates and using
the definition of the inner product $v \cdot w = |v||w|\cos(\angle vw)$, or by various formulas that
depend only on intrinsic coordinates such as the one shown above. Note that $D_\gamma\nu$ is the
parallel transport of $\nu$ along $\gamma$, which for the sphere acts like rotation around the origin of
the vector $\nu$ affixed at $m$ (or whatever point $D_\gamma$ is evaluated at). Note that this also
corresponds to rotating the sphere so that $m$ is at $(0, 0)$ and then rotating along the
$z$-axis to align $\nu$ and $(1, 1)/\sqrt{2}$ in $T_m S^2$.
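As a sanity check on the two ways of computing the geodesic distance, the following NumPy sketch compares the intrinsic spherical law of cosines against the extrinsic computation via $v \cdot w = |v||w|\cos(\angle vw)$. The longitude/latitude convention and function names here are my own, chosen for clarity rather than to match the $(\theta, \phi)$ convention above:

```python
import numpy as np

def sph_to_vec(lon, lat):
    # Embed (longitude, latitude) as a unit vector in R^3.
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def geodesic_intrinsic(p, q):
    # Spherical law of cosines on intrinsic coordinates p = (lon, lat).
    (l1, t1), (l2, t2) = p, q
    c = np.sin(t1) * np.sin(t2) + np.cos(t1) * np.cos(t2) * np.cos(abs(l1 - l2))
    return np.arccos(np.clip(c, -1.0, 1.0))

def geodesic_extrinsic(p, q):
    # Angle between the embedded unit vectors: v.w = |v||w| cos(angle).
    v, w = sph_to_vec(*p), sph_to_vec(*q)
    return np.arccos(np.clip(np.dot(v, w), -1.0, 1.0))
```

The two agree to machine precision away from antipodal pairs, where the arccosine becomes ill-conditioned.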
The previous definition does not depend on the structure of the sphere beyond two
things:
1. Geodesic completeness,
2. The formula for d.
On a Riemannian manifold $(M, g)$ the Laplace-Beltrami operator is given in coordinates as

$$\Delta f = \frac{1}{\sqrt{|g|}}\,\partial_i\left(\sqrt{|g|}\, g^{ij}\, \partial_j f\right).$$
Thus

$$-\lambda^2\Delta_g\psi = -\frac{\lambda^2}{\sqrt{|g|}}\,\partial_i\left(\sqrt{|g|}\,g^{ij}\left(\partial_j R + iR\,\partial_j S/\lambda\right)e^{iS/\lambda}\right)$$
$$\approx -\frac{1}{\sqrt{|g|}}\left(O(\lambda^2) + \lambda^2\sqrt{|g|}\,g^{ij}\left(\partial_{ij}R + i\,\partial_i R\,\partial_j S/\lambda + iR\,\partial_{ij}S/\lambda - R\,\partial_i S\,\partial_j S/\lambda^2\right)e^{iS/\lambda}\right)$$
$$\approx O(\lambda) + g(\nabla_g S, \nabla_g S)\,R\,e^{iS/\lambda}.$$
If we are willing to sacrifice closed-form access to the construction of d then we can define the
CWR on any mesh by either computing geodesics or computing the heat kernel and using this as
an approximation to the Gaussian factor. We still need to compute the parallel transport, which
is in general ill-conditioned on an arbitrary triangulated surface. However, theoretically we can
now write down the CWR on a geodesically complete, smooth Riemannian manifold $(M, g)$:

$$\psi_M(x;\, \mathcal{C} = \{m_i, \nu_i\}) = \sum_{i=1}^N \exp\left(-\frac{d_M(x, m_i)^2}{2\sigma^2} + i\,\frac{(m_i, \nu_i)\cdot x}{\lambda}\right).$$
One can use Dijkstra’s algorithm for compute dM (x,mi)2 and use the resulting shortest paths
as geodesics γxmifor computing (mi, νi) · x. Note that the result is an approach that computes
the signed distance function on a mesh as opposed to the unsigned distance. Complexity-
wise, this approach is dominated by the need to compute geodesics from m source points,
which requires O(mN log(N)) calculations. One can also use Varadhan’s formula [22, 101] for
computing dM (x,mi)2, but this does not allow one to compute the signed distance. In practice,
we find using the approach of [22] when many source points are needed is faster than Dijkstra’s
algorithm. This method requires a linear solve for the backward Euler equation, which has quick
solutions for sparse matrices [16].
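A minimal sketch of the Dijkstra step over a mesh's edge graph follows. The adjacency format is my own, and this is a stand-in for any graph-based geodesic approximation, not the exact implementation used here:

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths on an edge-weighted graph.

    adj maps vertex -> list of (neighbor, edge_length) pairs, e.g. the
    1-skeleton of a triangle mesh with Euclidean edge lengths. Returns a
    dict of approximate geodesic distances from `source`; running this once
    per source point m_i gives the O(mN log N) cost cited above.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

To recover the geodesic paths themselves, one would also record a predecessor for each vertex when it is relaxed.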
Figure 6-10. A diamond drawn on a mesh from FAUST using 4 oriented points. The red lines indicate zero level sets. The red lines on the arms are due to phase wrapping. The zero level set on the torso can be selected by extracting connected components and ordering them by line integral of the modulus of ψ.
Registration of Spherical Data Using the Spherical CWR.
Here we explore two techniques for registering oriented point-sets on the sphere. The first
approach is essentially RDM restricted to the sphere: the oriented points are viewed as points
and normals on the sphere. This results in a technique based on numerical integration of the
difference of spherical CWRs. The second approach is based on Karcher means of the oriented
points viewed as points on $(S^2)^2$. This results in an approach that is robust to noise but suffers
in the presence of missing points.
First we consider spherical RDM. A naive optimization approach would simply be to use
brute force and explore the whole space of rotations, re-computing the spherical CWR of the
template at each iteration and taking as the minimizing argument the rotation angles minimizing
the distance. However, since the domain in question ($S^2$) is fixed, it is possible to do better than
this. The approach uses the symmetry of the sphere: first, we compute the Frechet mean by
minimizing $\sum_{i=1}^N d^2(m, x_i)$ over $m$. One can do this with a simple recursive procedure

$$M_1 = x_1, \qquad M_n = M_{n-1} \oplus_{1/n} x_n,$$

where $\oplus_{1/n}$ denotes the point $1/n$ of the way along the geodesic joining $M_{n-1}$ to
$x_n$. We implement this with spherical linear interpolation. In certain contexts this algorithm
has been shown to provide good estimates for the Frechet mean [50]. From this point, we
subdivide the interval of the remaining degree of freedom into small portions and re-compute the
template spherical CWR at each setting. The result is an algorithm that aligns spherically-bound
curves represented by a sampling of oriented points. This can be compared to the results in
Subsection 4.2.2, see Figure 4.2.2. Note, however, that this algorithm only applies to spherical
data.
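The recursive mean update $M_n = M_{n-1} \oplus_{1/n} x_n$ can be sketched with unit vectors in $\mathbb{R}^3$ and spherical linear interpolation. This is a sketch of the incremental estimator, with function names of my choosing, rather than the exact implementation used in our experiments:

```python
import numpy as np

def slerp(a, b, t):
    # Point t of the way along the geodesic from unit vector a to b.
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if omega < 1e-12:
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def recursive_frechet_mean(points):
    # M_1 = x_1;  M_n = M_{n-1} (+)_{1/n} x_n.
    m = np.asarray(points[0], dtype=float)
    for n, x in enumerate(points[1:], start=2):
        m = slerp(m, np.asarray(x, dtype=float), 1.0 / n)
        m /= np.linalg.norm(m)  # guard against numerical drift
    return m
```

For two points the update reduces to the geodesic midpoint, as one would expect of a mean.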
The second algorithm is dubbed bi-means. The approach involves computing the means
of the point locations and the means of the normal locations, both on the sphere, and solving
a least squares equation for the full rotation. The problem with applying this algorithm to
unorganized oriented points is that Karcher mean computation can depend on the ordering of the
points.
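The bi-means idea can be sketched as follows, with the rotation solved in the least-squares sense via the SVD-based orthogonal Procrustes (Kabsch) method. As a simplifying assumption, this sketch summarizes each set by its extrinsic (chordal) means rather than the Karcher means used above, and all names are illustrative:

```python
import numpy as np

def fit_rotation(A, B):
    # Least-squares rotation R minimizing ||R A - B||_F (columns are
    # vectors), via the SVD-based orthogonal Procrustes solution.
    U, _, Vt = np.linalg.svd(B @ A.T)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det(R) = +1
    return U @ D @ Vt

def bi_means_rotation(src_pts, src_nrm, dst_pts, dst_nrm):
    # Summarize each oriented point-set by its point mean and normal mean,
    # then solve for the rotation aligning the two summaries.
    A = np.stack([src_pts.mean(axis=0), src_nrm.mean(axis=0)], axis=1)
    B = np.stack([dst_pts.mean(axis=0), dst_nrm.mean(axis=0)], axis=1)
    return fit_rotation(A, B)
```

Two independent vector correspondences determine a 3-D rotation, which is why the point mean and normal mean together suffice; the ordering sensitivity noted above enters through the mean computation on the sphere, not through this least-squares step.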
[Figure 6-11 plots (a)–(d): Relative Error on Transformation versus Stdev of Noise in (a) and (c), and versus Number of Points in Target in (b) and (d), comparing spherical RDM against bi-means.]
Figure 6-11. Registration of spherical oriented point-sets. In a) the robustness to von Mises-Fisher noise is tested and in b) the robustness to missing points is tested. The bi-means algorithm is shown to be sensitive to ordering. When ordering is the same between both point-sets the result is more comparable to spherical RDM, as shown in c) and d).
CHAPTER 7
CONCLUSION AND FUTURE WORK
7.1 Contributions
In this thesis a new implicit function representation for embedded curves and surfaces
was developed. Based on the literature review in Chapter 1, the representation is the
first linearly composable representation to tout a direct proof of connectedness under
appropriate conditions. We emphasized applications for embeddings into Euclidean space.
The representation is modular, in that it contains as zero level-set several copies of the
curve or surface. As an application, an algorithm for extracting curves from this modular
representation was developed. In addition, an approach to shape statistics and a new
algorithm for simultaneous registration and reconstruction of curves and surfaces was
developed.
The Complex Wave Representation (CWR) uses superposition of local linear
approximations to a curve or surface to stitch together a global implicit function. We
showed that the representation corresponds to the non-vanishing phase-space atoms
of the Modular Distance Function (MDF), a function from which one can recover the
signed distance function, as the parameters of the MDF grow asymptotically. This allows
one to write down a simple parametric equation for a local approximation of the signed
distance—a new contribution to implicit function modeling. Furthermore, we showed that,
provided the samples are structured appropriately, one can recover a closed curve
or set of closed curves that approximates the original curve from the representation.
Beyond these theoretical contributions we developed several useful applications of
the CWR. Shape statistics can be done more easily with the CWR than with classical implicit
function representations. Towards this end, we applied the kernel arising from the Gabor
frame for kPCA. This kPCA between oriented point-sets is equivalent to doing PCA on
the corresponding CWRs. We showed that one can build a useful simple classifier from
the first few principal components of a shape set and showed how to reconstruct new
point-sets from the principal components.
The focus of this thesis is Resonant Deformable Matching (RDM). RDM uses the
closed form L2 distance between CWRs for oriented point-sets to do registration. During
the registration process, additional normal variables can be estimated on the target set.
This allows for simultaneous registration and reconstruction—where the registration
model feeds back in the current reconstruction directly. This subtlety is important for any
approach fusing reconstruction and registration and often leads to pipelined approaches,
but here we use a single framework and the reconstruction happens for free using the
CWR. We showed reconstruction results for surfaces and collections of curves—including
challenging sets with abutting curves. We also developed an approach for registration
that uses maximum likelihood on the joint oriented point variables. While only partially
developed here, it is worth noting that the MLE approach to nonrigid registration suffers
from an $O(k^2 n)$ runtime whereas the $L^2$ approach is $O(nk + n^2 + k^2)$. This work, along with other
related work performed during my Ph.D., can be found in papers published in major
conferences (ECCV, ICPR, and QIP) and journals (submitted to PAMI and SIIMS)
[20, 21, 42, 44, 68].
Finally, generalizations of the CWR were developed. By extending beyond linear
modulation factors we can obtain a more expressive frame of shape atoms. In the case of
quadrics, it turns out that the inner products between the resulting atoms are available
in closed form. This can be leveraged for fitting. Finally, the entire framework can be
implemented on geodesically complete embedded Riemannian surfaces. The relationship
between the eikonal and the Schrodinger equation was revisited from this standpoint and
practical uses, like registering shapes on a sphere and representing co-dimension 2 curves,
were developed.
7.2 Future Work
There are many avenues of research that we did not have time to explore during my
thesis. Below are a few directions that we think are worthwhile for future work by others in the
shape analysis community.
The models of deformation used throughout this thesis have been traditional,
spline-based approaches. The deformation of the normal feature arises as a byproduct
of deforming the point-set, but can be implemented in closed form. This is done by using a
first-order model that uses the derivative of the transformation map at the point where the
normal is anchored. While this method ensures compatibility between the two components
of the deformation, it privileges the points over the normals. This is because the knot
points are in the embedding space. One might try to develop a model that deals with
this asymmetry. The simplest approach is to allow the normals to deform free of the
spatial model, or possibly according to the spatial model plus a free-form deformation in
the normal space. The goal of using this type of model is to allow the normal to deform
free of influence of the spatial transformation while maintaining a certain amount of
fidelity to the spatial transformation. A more sophisticated approach would be to lift the
transformation to a deformation on $\mathbb{R}^d \times S^{d-1}$ and add the constraints

$$\|\phi'^{(1)}\big|_m\,\nu\|^2 = 1, \qquad \phi'^{(1)}\big|_m\,\nu = \phi^{(2)}\nu, \qquad (7–1)$$
for each oriented point. Neither approach was explored directly in this thesis, and
exploring and comparing both would be a nice contribution to oriented point-set
registration and spline deformation fitting. We note that previous work in kriging [64],
spline modeling with uncertainty intervals, has explored a similar set of constraints as
Equations (7–1).
Finally, a note on a simple engineering problem and a very low-hanging avenue
of development. As mentioned above, the CWR model is easily parallelizable. Since
there is no inherent ordering of the terms in the summation they can be recorded in
arbitrary order. Indeed, as suggested by the work in Chapter 6, only a narrow band should
be necessary for reconstructing a surface. The narrow band should be drawn from a
neighborhood near the observations. Further, a non-rectangular mesh would probably
provide better resolving ability to a surface reconstruction pipeline. Indeed, a simple
filtering of mesh faces based on the ideas of Chapter 6 would be to drop mesh faces with
low surface integral of the magnitude of the CWR. This would allow one to avoid some
of the artifacts observed in the noisy cases studied above. However, this could result
in surfaces with unexpected boundaries and holes. It may be possible to use the kPCA
approach developed in this work to help patch the holes. Yet another approach is to look
for coherence and reject as outliers, or as excessively noisy, those points which do not exhibit a
certain threshold of coherence. Based on the theoretical work of Chapter 5, connectivity
is a predictable feature of the CWR. This suggests that the coherence feature (how well
the normals of a point line up with nearby normals) should provide a clue of how well
an oriented point will serve as a surface point. However, in regions of high curvature we
expect this measure to break down and so a similar problem to the above arises.
REFERENCES
[1] N. Aspert, D. S. Cruz, and T. Ebrahimi, MESH: measuring errors between surfaces using the Hausdorff distance, in IEEE International Conference on Multimedia and Expo, 2002, pp. 705–708, http://dx.doi.org/10.1109/ICME.2002.1035879.
[2] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations, no. 147 in Applied Mathematical Sciences, Springer, 2006.
[3] A. Basu, I. R. Harris, N. L. Hjort, and M. Jones, Robust and efficient estimation by minimising a density power divergence, Biometrika, 85 (1998), pp. 549–559.
[4] A. Benaissa and C. Roger, Asymptotic expansion of multiple oscillatory integrals with a hypersurface of stationary points of the phase, in Proc. R. Soc. A, vol. 469, The Royal Society, 2013, p. 20130109.
[5] P. J. Besl and N. D. McKay, A method for registration of 3D shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 14 (1992), pp. 239–256.
[6] J. F. Blinn, A generalization of algebraic surface drawing, ACM Transactions on Graphics (TOG), 1 (1982), pp. 235–256.
[7] F. Bogo, J. Romero, M. Loper, and M. J. Black, FAUST: Dataset and evaluation for 3D mesh registration, in Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Piscataway, NJ, USA, June 2014, IEEE.
[8] F. Bookstein, Principal warps: Thin-plate splines and decompositions of deformations, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 11 (1989), pp. 567–585.
[9] K. L. Boyer and S. Sarkar, Perceptual Organization for Artificial Vision Systems, vol. 546, Springer Science & Business Media, 2012.
[10] A. M. Bronstein, M. M. Bronstein, and R. Kimmel, Numerical Geometry of Non-Rigid Shapes, Monographs in Computer Science, Springer, 2009, http://dx.doi.org/10.1007/978-0-387-73301-2.
[11] J. Butterfield, On Hamilton-Jacobi theory as a classical root of quantum theory, in Quo Vadis Quantum Mechanics?, Springer, 2005, pp. 239–273.
[12] V. Camion and L. Younes, Geodesic interpolating splines, in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Springer, 2001, pp. 513–527.
[13] E. Candes, L. Demanet, D. Donoho, and L. Ying, Fast discrete curvelet transforms, Multiscale Modeling & Simulation, 5 (2006), pp. 861–899.
[14] J. C. Carr, R. K. Beatson, J. B. Cherrie, T. J. Mitchell, W. R. Fright, B. C. McCallum, and T. R. Evans, Reconstruction and representation of 3D objects with radial basis functions, in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, ACM, 2001, pp. 67–76.
[15] N. Charon and A. Trouve, The varifold representation of nonoriented shapes for diffeomorphic registration, SIAM Journal on Imaging Sciences, 6 (2013), pp. 2547–2580, http://dx.doi.org/10.1137/130918885.
[16] Y. Chen, T. A. Davis, W. W. Hager, and S. Rajamanickam, Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate, ACM Transactions on Mathematical Software (TOMS), 35 (2008), p. 22.
[17] M. Cho and K. M. Lee, Progressive graph matching: Making a move of graphs via probabilistic voting, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 398–405.
[18] H. Chui and A. Rangarajan, A new point matching algorithm for non-rigid registration, Computer Vision and Image Understanding (CVIU), 89 (2003), pp. 114–141.
[19] W. Commons, Tait-Bryan angles, zyx convention, 2013, https://en.wikipedia.org/wiki/File:Taitbrianzyx.svg.
[20] J. Corring and A. Rangarajan, Shape from phase: An integrated level-set and probability density shape representation, in International Conference on Pattern Recognition (ICPR), IAPR, 2014, pp. 46–51.
[21] J. Corring and A. Rangarajan, Resonant deformable matching: Simultaneous registration and reconstruction, in European Conference on Computer Vision (ECCV), Springer, 2016, pp. 51–68.
[22] K. Crane, C. Weischedel, and M. Wardetzky, Geodesics in heat: A new approach to computing distance based on heat flow, ACM Transactions on Graphics (TOG), 32 (2013), p. 152.
[23] D. Cremers, S. J. Osher, and S. Soatto, Kernel density estimation and intrinsic alignment for shape priors in level set segmentation, International Journal of Computer Vision, 69 (2006), pp. 335–351.
[24] I. Daubechies, The wavelet transform, time-frequency localization and signal analysis, IEEE Transactions on Information Theory (IT), 36 (1990), pp. 961–1005.
[25] M. Delfour and J. Zolesio, Shapes and Geometries, Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA, USA, second ed., 2011, http://epubs.siam.org/doi/abs/10.1137/1.9780898719826.
[26] M. Delfour and J.-P. Zolesio, Shapes and Geometries: Metrics, Analysis, Differential Calculus and Optimization, Advances in Design and Control, Springer, 2011.
[27] A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), (1977), pp. 1–38.
[28] Y. Deng, A. Rangarajan, S. Eisenschenk, and B. C. Vemuri, A Riemannian framework for matching point clouds represented by the Schrodinger distance transform, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2014, pp. 3756–3761.
[29] G. de Rham, Varietes differentiables, formes, courants, formes harmoniques, Act. Sci. Indust., 1222 (1955).
[30] E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik, 1 (1959), pp. 269–271.
[31] H. Edelsbrunner, Shape reconstruction with Delaunay complex, in LATIN'98: Theoretical Informatics, Springer, 1998, pp. 119–132.
[32] V. Estellers, M. Scott, and S. Soatto, Robust surface reconstruction, SIAM Journal on Imaging Sciences, 9 (2016), pp. 2073–2098.
[33] V. Estellers, D. Zosso, R. Lai, S. Osher, J. Thiran, and X. Bresson, An efficient algorithm for level-set method preserving distance function, IEEE Transactions on Image Processing (TIP), 21 (2012), pp. 4722–4734.
[34] G. B. Folland, Real Analysis, Pure and Applied Mathematics (New York), John Wiley & Sons, Inc., New York, second ed., 1999. Modern techniques and their applications, A Wiley-Interscience Publication.
[35] D. Gabor, Theory of communication. Part 1: the analysis of information, Journal of the Institution of Electrical Engineers-Part III: Radio and Communication Engineering, 93 (1946), pp. 429–441.
[36] S. Gallot, D. Hulin, and J. Lafontaine, Riemannian Geometry, Springer-Verlag, Heidelberg, Germany, 3 ed., 2004.
[37] J. Glaunes, A. Trouve, and L. Younes, Diffeomorphic matching of distributions: A new approach for unlabelled point-sets and sub-manifolds matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, IEEE, 2004, pp. 712–718.
[38] S. Gold and A. Rangarajan, A graduated assignment algorithm for graph matching, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 18 (1996), pp. 377–388.
[39] J. Gomes and O. Faugeras, Reconciling distance functions and level-sets, Journal of Visual Communication and Image Representation, 11 (2000), pp. 209–223.
[40] L. Gorelick, M. Galun, E. Sharon, R. Basri, and A. Brandt, Shape representation and classification using the Poisson equation, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 1991–2005.
[41] U. Grenander, A unified approach to pattern analysis, Advances in Computers, 10 (1970), pp. 175–216, http://dx.doi.org/10.1016/S0065-2458(08)60436-2.
[42] B. H. Guan, J. Corring, M. Sethi, S. Ranka, and A. Rangarajan, Image stack surface area minimization for groupwise and multimodal affine registration, in International Conference on Pattern Recognition (ICPR), IAPR, 2016, pp. 250–256.
[43] R. Guler, S. Tari, and G. Unal, Screened Poisson hyperfields for shape coding, SIAM Journal on Imaging Sciences, 7 (2014), pp. 2558–2590.
[44] K. S. Gurumoorthy, A. Rangarajan, and J. Corring, Gradient density estimation in arbitrary finite dimensions using the method of stationary phase, arXiv preprint arXiv:1211.3038, (2012).
[45] G. Guy and G. Medioni, Inferring global perceptual contours from local features, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 1993, pp. 786–787.
[46] E. Hasanbelliu, L. Sanchez Giraldo, and J. C. Principe, Information theoretic shape matching, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 36 (2014), pp. 2436–2451.
[47] R. J. Hathaway, Another interpretation of the EM algorithm for mixture distributions, Statistics & Probability Letters, 4 (1986), pp. 53–56.
[48] C. Heil, J. Ramanathan, and P. Topiwala, Linear independence of time-frequency translates, Proceedings of the American Mathematical Society (AMS), 124 (1996), pp. 2787–2795.
[49] C. E. Heil and D. F. Walnut, Continuous and discrete wavelet transforms, SIAM Review, 31 (1989), pp. 628–666.
[50] J. Ho, G. Cheng, H. Salehian, and B. C. Vemuri, Recursive Karcher expectation estimators and geometric law of large numbers, in Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2013, Scottsdale, AZ, USA, April 29 - May 1, 2013, 2013, pp. 325–332.
[51] X. Huang, N. Paragios, and D. Metaxas, Shape registration in implicit spaces using information theory and freeform deformations, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 1303–1318.
[52] B. Jian and B. C. Vemuri, Robust point set registration using Gaussian mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 33 (2011), pp. 1633–1645.
[53] M. Kazhdan, M. Bolitho, and H. Hoppe, Poisson surface reconstruction, in Proceedings of the Fourth Eurographics Symposium on Geometry Processing, vol. 7, 2006.
[54] M. Kazhdan and H. Hoppe, Screened Poisson surface reconstruction, ACM Transactions on Graphics (TOG), 32 (2013), p. 29.
[55] I. Kezurer, S. Z. Kovalsky, R. Basri, and Y. Lipman, Tight relaxation of quadratic matching, Comput. Graph. Forum, 34 (2015), pp. 115–128, http://dx.doi.org/10.1111/cgf.12701.
[56] B. B. Kimia, A. R. Tannenbaum, and S. W. Zucker, Shapes, shocks, and deformations I: the components of two-dimensional shape and the reaction-diffusion space, International Journal of Computer Vision (IJCV), 15 (1995), pp. 189–224.
[57] R. Kimmel and J. A. Sethian, Computing geodesic paths on manifolds, Proceedings of the National Academy of Sciences, 95 (1998), pp. 8431–8435.
[58] A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, K. Glashoff, and R. Kimmel, Coupled quasi-harmonic bases, Comput. Graph. Forum, 32 (2013), pp. 439–448, http://dx.doi.org/10.1111/cgf.12064.
[59] T. S. Lee, Image representation using 2D Gabor wavelets, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 18 (1996), pp. 959–971.
[60] B. Levy, Laplace-Beltrami eigenfunctions: Towards an algorithm that "understands" geometry, in IEEE International Conference on Shape Modeling and Applications, IEEE, 2006, pp. 13–21.
[61] W. E. Lorensen and H. E. Cline, Marching cubes: A high resolution 3D surface construction algorithm, SIGGRAPH Computer Graphics, 21 (1987), pp. 163–169, http://doi.acm.org/10.1145/37402.37422.
[62] S. G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 11 (1989), pp. 674–693.
[63] J. Manson, G. Petrova, and S. Schaefer, Streaming surface reconstruction using wavelets, Comput. Graph. Forum, 27 (2008), pp. 1411–1420, http://dx.doi.org/10.1111/j.1467-8659.2008.01281.x.
[64] K. V. Mardia, J. Kent, C. Goodall, and J. Little, Kriging and splines with derivative information, Biometrika, (1996), pp. 207–221.
[65] S. Miraku, Volumentric shape description of range data using “blobby model”,vol. 25, july 1991, pp. 227–235.
[66] E. Mjolsness, G. Gindi, and P. Anandan, Optimization in model matching and perceptual organization, Neural Computation, 1 (1989), pp. 218–229.
[67] P. Mordohai and G. Medioni, Tensor voting: a perceptual organization approach to computer vision and machine learning, Synthesis Lectures on Image, Video, and Multimedia Processing, 2 (2006), pp. 1–136.
[68] M. Moyou, J. Corring, A. M. Peter, and A. Rangarajan, A Grassmannian graph approach to affine invariant feature matching, CoRR, abs/1601.07648 (2016), http://arxiv.org/abs/1601.07648.
[69] P. Mullen, F. de Goes, M. Desbrun, D. Cohen-Steiner, and P. Alliez, Signing the unsigned: Robust surface reconstruction from raw pointsets, Computer Graphics Forum, 29 (2010), pp. 1733–1741, http://dx.doi.org/10.1111/j.1467-8659.2010.01782.x.
[70] J. R. Munkres, Elementary differential topology, vol. 54, Princeton University Press, 1966.
[71] R. M. Murray, Z. Li, and S. S. Sastry, A mathematical introduction to robotic manipulation, CRC Press, 1994.
[72] A. Myronenko and X. Song, Point set registration: Coherent point drift, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32 (2010), pp. 2262–2275.
[73] Y. Ohtake, A. Belyaev, M. Alexa, G. Turk, and H.-P. Seidel, Multi-level partition of unity implicits, in ACM SIGGRAPH 2005 Courses, ACM, 2005, p. 173.
[74] S. Osher and R. Fedkiw, Level set methods and dynamic implicit surfaces, vol. 153, Springer Science & Business Media, 2006.
[75] S. Osher and J. A. Sethian, Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations, Journal of Computational Physics, 79 (1988), pp. 12–49.
[76] M. Ovsjanikov, M. Ben-Chen, J. Solomon, A. Butscher, and L. Guibas, Functional maps: a flexible representation of maps between shapes, ACM Transactions on Graphics (TOG), 31 (2012), p. 30.
[77] N. Paragios, M. Rousson, and V. Ramesh, Non-rigid registration using distance functions, Computer Vision and Image Understanding (CVIU), 89 (2003), pp. 142–165.
[78] E. Parzen, On estimation of a probability density function and mode, The Annals of Mathematical Statistics, (1962), pp. 1065–1076.
[79] M. Pauly, M. H. Gross, and L. Kobbelt, Efficient simplification of point-sampled surfaces, in IEEE Visualization, 2002, pp. 163–170, http://dx.doi.org/10.1109/VISUAL.2002.1183771.
[80] A. Peter, A. Rangarajan, and J. Ho, Shape l’Ane Rouge: sliding wavelets for indexing and retrieval, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2008, pp. 1–8.
[81] A. M. Peter and A. Rangarajan, Maximum likelihood wavelet density estimation with applications to image and shape matching, IEEE Transactions on Image Processing, 17 (2008), pp. 458–468.
[82] A. Rangarajan, Revisioning the unification of syntax, semantics and statistics in shape analysis, Pattern Recognition Letters, 43 (2014), pp. 39–46.
[83] C. E. Rasmussen, The infinite Gaussian mixture model, in Advances in Neural Information Processing Systems (NIPS), vol. 12, 1999, pp. 554–560.
[84] K. Reda, A. Febretti, A. Knoll, J. Aurisano, J. Leigh, A. E. Johnson, M. E. Papka, and M. Hereld, Visualizing large, heterogeneous data in hybrid-reality environments, IEEE Computer Graphics and Applications, 33 (2013), pp. 38–48.
[85] S. Rusinkiewicz and M. Levoy, Efficient variants of the ICP algorithm, in Third International Conference on 3-D Digital Imaging and Modeling, IEEE, 2001, pp. 145–152.
[86] S. Sarkar and K. L. Boyer, Perceptual organization in computer vision: A review and a proposal for a classificatory structure, IEEE Transactions on Systems, Man and Cybernetics, 23 (1993), pp. 382–399.
[87] F. R. Schmidt, D. Farin, and D. Cremers, Fast matching of planar shapes in sub-cubic runtime, in IEEE International Conference on Computer Vision (ICCV), IEEE, 2007, pp. 1–6.
[88] B. Schölkopf, A. Smola, and K.-R. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10 (1998), pp. 1299–1319.
[89] M. O. Scully and M. S. Zubairy, Quantum optics, Cambridge University Press, 1997.
[90] M. Sethi, A. Rangarajan, and K. Gurumoorthy, The Schrödinger distance transform (SDT) for point-sets and curves, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 198–205.
[91] J. A. Sethian, A fast marching level set method for monotonically advancing fronts, Proceedings of the National Academy of Sciences, 93 (1996), pp. 1591–1595.
[92] K. Siddiqi, A. Shokoufandeh, S. J. Dickinson, and S. W. Zucker, Shock graphs and shape matching, International Journal of Computer Vision (IJCV), 35 (1999), pp. 13–32.
[93] M. Spivak, A comprehensive introduction to differential geometry, vol. 1-5, Publish or Perish, 3rd ed., 1999.
[94] M. Sussman, P. Smereka, and S. Osher, A level set approach for computing solutions to incompressible two-phase flow, Journal of Computational Physics, 114 (1994), pp. 146–159.
[95] R. Szeliski, Image alignment and stitching: A tutorial, Tech. Report MSR-TR-2004-92, Microsoft Research, September 2004.
[96] N. Thakoor, J. Gao, and S. Jung, Hidden Markov model-based weighted likelihood discriminant for 2D shape classification, IEEE Transactions on Image Processing, 16 (2007), pp. 2707–2719.
[97] D. W. Thompson, On growth and form, Cambridge University Press, 1917, 1945,http://www.biodiversitylibrary.org/item/28884.
[98] Y. Tsin and T. Kanade, A correlation-based approach to robust point set registration, in European Conference on Computer Vision (ECCV), Springer, 2004, pp. 558–569.
[99] J. N. Tsitsiklis, Efficient algorithms for globally optimal trajectories, IEEETransactions on Automatic Control, 40 (1995), pp. 1528–1538.
[100] M. Vaillant and J. Glaunes, Surface matching via currents, in Information Processing in Medical Imaging, Springer, 2005, pp. 381–392.
[101] S. R. S. Varadhan, On the behavior of the fundamental solution of the heat equation with variable coefficients, Communications on Pure and Applied Mathematics, 20 (1967), pp. 431–455.
[102] R. C. Veltkamp, Shape matching: similarity measures and algorithms, in International Conference on Shape Modeling and Applications (SMI), IEEE, 2001, pp. 188–197.
[103] G. Wahba, Spline models for observational data, vol. 59 of Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, Pennsylvania, 1990.
[104] Y. Wang, K. Woods, and M. McClain, Information-theoretic matching of twopoint sets, IEEE Transactions on Image Processing (TIP), 11 (2002), pp. 868–872.
[105] E. Wilczok, New uncertainty principles for the continuous Gabor transform and the continuous wavelet transform, Documenta Mathematica, 5 (2000), pp. 201–226.
[106] R. Wong, Asymptotic approximations of integrals, SIAM, Philadelphia, Pennsylvania, 1st ed., 2001, http://epubs.siam.org/doi/abs/10.1137/1.9780898719260.
[107] L. Younes, Shapes and diffeomorphisms, vol. 171 of Applied Mathematical Sciences, Springer, New York, 2010.
[108] L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, in Advances inNeural Information Processing Systems (NIPS), 2004, pp. 1601–1608.
[109] H. Zhao, A fast sweeping method for eikonal equations, Mathematics of Computation, 74 (2005), pp. 603–627.
[110] H.-K. Zhao, S. Osher, B. Merriman, and M. Kang, Implicit and nonparametric shape reconstruction from unorganized data using a variational level-set method, Computer Vision and Image Understanding (CVIU), 80 (2000), pp. 295–314.
[111] Y. Zheng and D. Doermann, Robust point matching for non-rigid shapes by preserving local neighborhood structures, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28 (2006), pp. 643–649.
[112] F. Zhou and F. de la Torre, Factorized graph matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 127–134, http://dx.doi.org/10.1109/CVPR.2012.6247667.
[113] F. Zhou and F. de la Torre, Deformable graph matching, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2013, pp. 2922–2929.
BIOGRAPHICAL SKETCH
John Corring attended the Mississippi School for Mathematics and Science and was named a National Merit Finalist before earning bachelor's degrees in mathematics and computer science at the University of Southern Mississippi.
He began working on image processing problems as an undergraduate, including medical and geospatial remote sensing applications, which eventually led to the present
work. He has published in the International Conference on Pattern Recognition and the
European Conference on Computer Vision during his Ph.D., and is currently working on
journal publications in IEEE Transactions on Pattern Analysis and Machine Intelligence and
SIAM Journal on Imaging Sciences.