Front End Vision: A Multiscale Geometry Engine

Scale-Space Theory in Computer Vision versus Front-End Biological Vision

Bart ter Haar Romeny¹, PhD & Luc M.J. Florack², PhD

Utrecht University: Image Sciences Institute¹, Dept. of Computer Science²

3584 CX Utrecht, the Netherlands

B.terHaarRomeny@isi.uu.nl & florack@cs.uu.nl

Introduction

The front-end visual system is among the best studied areas of the brain. Scale-space theory, as pioneered by Iijima in Japan [11, 41, 43] and Koenderink [12], has been heavily inspired by the important derivation of the Gaussian kernel and its derivatives as regularized differential operators, and the linear diffusion equation as its generating PDE. The view of the visual system as a 'geometry engine' is the inspiration of the current work, and, simultaneously, the presented examples of applications of (differential) geometric operations may inspire thinking of the visual system as a geometry engine.

Scale-space theory has developed into a serious field [34, 20]. Several comprehensive overview texts have been published in the field [15, 10, 40]. The introduction of a geometry-driven conduction term in the diffusion equation, making it locally adaptive to differential geometric properties (edge strength, curvature, orientation), by Perona and Malik in the early nineties triggered a wealth of nonlinear PDE developments, which attracted the attention of the mathematical community.

So far, this robust mathematical framework has had an impact on the computer vision community, but there is still a gap between that community and the more physiologically, psychologically and psychophysically oriented researchers in the vision community. One reason may be the nontrivial mathematics involved, such as group invariance, differential geometry and tensor analysis.

Over the last couple of years, symbolic computer algebra packages such as Mathematica, Maple and Matlab have developed into very user-friendly, high-level prototyping environments. Mathematica in particular combines the advantages of symbolic manipulation and processing with an advanced front-end text processor. This paper has been written entirely in Mathematica version 4 as a notebook. The advantage is that it can be read as an interactive paper: the high-level code of any function is directly visible and can be executed directly, as well as modified or used as a template. Students can now use the exact code rather than pseudocode. With these high-level programming tools most programs can be expressed in very few lines, which keeps the reader at a highly intuitive but practical level. Mathematica notebooks are portable and run equivalently on any system. Previous speed limitations have now largely been overcome.

The focus of the paper is twofold: to rehearse the derivation of the Gaussian kernel and its derivatives as an essential class of front-end vision aperture functions, and to provide a practical tutorial that enables a broad audience to do geometric reasoning with robust multiscale differential operators on discrete images. This may break ground for the view of the front-end visual system as a geometry engine, or 'inference machine', rather than as a spatial frequency analyzer.

This paper can only focus on a small area: differential geometry and its features. Much research is currently underway. An especially important area is the deep structure of images, where the relations between scales are studied.

Initialization

We first initialize Mathematica with a path to the image directory, load some Graphics packages, set some often-used options for some plot functions, turn off the spelling checker, and optimize for speed and memory.

    $Path = Append[$Path, "d:\\images\\"]; << Graphics`;
    SetOptions[ListDensityPlot, Mesh -> False, PlotRange -> All, Frame -> False, AspectRatio -> Automatic];
    SetOptions[ListContourPlot, PlotRange -> All, Frame -> False];
    SetOptions[ListPlot3D, PlotRange -> All, Axes -> False, Mesh -> False];
    SetOptions[Plot3D, PlotPoints -> 30, PlotRange -> All, BoxRatios -> {1, 1, .6},
      Boxed -> False, Shading -> False, Axes -> False, ViewPoint -> {0.950, -2.985, 1.280}];
    SetOptions[Integrate, GenerateConditions -> False];
    Off[Table::iterb]; Off[StringJoin::string]; Off[General::spell1]; Share[]; $HistoryLength = 10;

Biological inspiration: Receptive field profiles from first principles

In mathematics objects have no scale. Points have zero size, lines have zero width, and differential operators have neighborhoods shrinking to zero, making them strictly local operators. In physics, however, objects live on a range of scales. Humankind can observe about 50 decades of scale [18]. In physics, dimensional units are essential; there is no such thing as a physical 'point'. In front-end vision the apparatus is specifically equipped to extract multiscale information: the threshold modulation depth is constant (approximately 5%) over more than 2 decades of sizes, i.e. the visual system has a large range of sampling apertures.

One branch of biologically motivated multiscale computer vision is known as 'scale-space theory' [12]. Many axiomatic approaches to scale-space theory exist. For an overview see the well-documented paper by Weickert [41]. We start this paper with a treatment of the derivation, from first principles, of the aperture functions in the very first stages of the visual front-end.

We consider the physics of the observation process: any vision system, whether biological or artificial, has to take samples from a scene in the outside world. This is done through a sampling aperture, which has to have a finite size in order to integrate the entity to be measured (light intensity, X-ray radiation, etc.). At this stage we have no clue what the size should be, so we leave it as a free parameter.

We consider the visual system as a concatenation of steps, where the very first step is responsible for the measurement. This stage has to be uncommitted: the original data should be measured as carefully as possible. In fact, at this stage we know nothing, and we have no preference whatsoever for any aspect of the data. We can establish the following first principles:

1. The measurement is done in a linear fashion: we do not allow any nonlinearities at this stage, as they would need to incorporate knowledge of some kind.

2. There is no preference for location in the visual scene: any location should be measured in the same fashion, i.e. with the same aperture function.

3. There is no preference for orientation: structures with a particular orientation, like vertical trees or a horizontal horizon, should have no preference; any orientation is just as likely. This necessitates an aperture function with a circular integration area.

4. There is no preference for size: any size of structure, object, texture etc. is at this stage just as likely. We have no reason to look only through the finest of apertures. The visual world consists of structures at any size, and these should be measured at any size. The biological motivation comes in here: the retina and subsequent processing layers measure with receptive fields at a wide range of scales.

When we want to establish principal relations between a set of physical quantities involved in a system, such as our front-end observation system, we first study their dependence relation through dimensional analysis. Any physical quantity has a dimension, and consequently it is expressed in units of that dimension. Examples are meters, seconds, etc. When we consider such a set of physical quantities, a typically small number of dimensionless combinations can be formed. Their importance lies in the fact that these dimensionless combinations can (and must) be expressed as functions of each other. This gives us a natural, physics-based starting point for the relations to derive. The Pi-theorem states that the number of such dimensionless combinations is equal to the number of variables minus the rank of the matrix m of the variables against their units.

We exemplify this statement for the visual front-end. We do the reasoning in the Fourier domain, as this turns out to be easier and leads to smaller equations. We give the theory for 2D; we will see that expansion to other dimensionalities is straightforward. We use script symbols for variables in the Fourier domain. We consider 'looking through an aperture'. The matrix m becomes:

$$m = \begin{array}{l|cccc} & \sigma & \omega & \mathcal{L}_0 & \mathcal{L} \\ \hline \text{length} & 1 & -1 & 0 & 0 \\ \text{luminance} & 0 & 0 & 1 & 1 \end{array}$$

where $\sigma$ is the size of the aperture, $\omega$ the spatial coordinate (frequency in the Fourier domain), $\mathcal{L}_0$ the luminance of the outside world, and $\mathcal{L}$ the luminance as processed in our system. We have four physical entities ($\sigma$, $\omega$, $\mathcal{L}_0$ and $\mathcal{L}$), and the rank of the matrix is two:

    TensorRank[{{1, -1, 0, 0}, {0, 0, 1, 1}}]
    2

so we may expect two independent dimensionless numbers to be extracted.

The two dimensionless numbers are given by the null space of our matrix, i.e. the list of basis vectors $\vec{x}$ that satisfy the matrix equation $m \cdot \vec{x} = \vec{0}$.

    NullSpace[{{1, -1, 0, 0}, {0, 0, 1, 1}}] // MatrixForm

$$\begin{pmatrix} 0 & 0 & -1 & 1 \\ 1 & 1 & 0 & 0 \end{pmatrix}$$

So from the two rows (one for each dimensionless combination) we find $\mathcal{L}/\mathcal{L}_0$ and $\sigma\omega$ as the two basic dimensionless entities. They can therefore be expressed in terms of each other: $\mathcal{L}/\mathcal{L}_0 = \mathcal{G}(\sigma\omega)$, where $\mathcal{G}$ is the kernel (filter, aperture) function in the Fourier domain to be found. We now plug in our first principles, one by one.
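The counting argument of the Pi-theorem can be checked with a few lines of code. The following is a minimal sketch in Python (not part of the original notebook); the helper `matrix_rank` and the row interpretation (length and luminance exponents) are assumptions made for illustration.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a pivot row at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

# Columns: sigma, omega, L0, L. Rows: exponents of length and of luminance.
m = [[1, -1, 0, 0],
     [0, 0, 1, 1]]

# Basis vectors of the null space as listed in the text.
null_vectors = [[0, 0, -1, 1],   # L / L0
                [1, 1, 0, 0]]    # sigma * omega

rank = matrix_rank(m)
n_dimensionless = len(m[0]) - rank   # Pi-theorem: variables minus rank
residuals = [[sum(a * x for a, x in zip(row, vec)) for row in m]
             for vec in null_vectors]
print(rank, n_dimensionless, residuals)
```

Each null-space vector lists the exponents of one dimensionless product, so a zero residual confirms that the corresponding combination carries no units.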

No preference for location, together with the prerequisite of linearity, leads to the recognition of the process as a convolution. The aperture function is shifted over the whole image domain, with no preference for location: any location is measured ('filtered', 'observed') with the same aperture function (kernel, template, filter, receptive field: all the same thing). This is written as:

$$L(x,y) = L_0(x,y) \otimes G(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} L_0(u,v)\,G(x-u,\,y-v)\,du\,dv$$

where $L(x,y)$ is the luminance distribution obtained, $L_0(x,y)$ is the luminance distribution in the outside world to be measured, and $G(x,y)$ is our aperture function in the spatial domain. In the Fourier domain, a convolution of functions translates to a regular product between the Fourier transforms of the functions:

$$\mathcal{L}(\omega_x,\omega_y) = \mathcal{L}_0(\omega_x,\omega_y)\cdot\mathcal{G}(\omega_x,\omega_y)$$
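The equivalence of convolution and a product in the Fourier domain can be verified numerically for a short discrete, periodic signal. The sketch below is in Python and uses a naive DFT; the sample values, the kernel and the function names `dft` and `cyclic_convolve` are hypothetical and chosen only for illustration.

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(seq[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                    for j in range(n))
        out.append(total / n if inverse else total)
    return out

def cyclic_convolve(signal, kernel):
    """Direct cyclic convolution: out[i] = sum_j signal[j] * kernel[(i - j) mod N]."""
    n = len(signal)
    return [sum(signal[j] * kernel[(i - j) % n] for j in range(n))
            for i in range(n)]

signal = [0.0, 1.0, 4.0, 2.0, 0.0, 3.0, 1.0, 0.0]    # hypothetical luminance samples
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]  # small aperture, origin at index 0

spatial = cyclic_convolve(signal, kernel)
product = [a * b for a, b in zip(dft(signal), dft(kernel))]
via_fourier = [z.real for z in dft(product, inverse=True)]
max_difference = max(abs(a - b) for a, b in zip(spatial, via_fourier))
```

Note that the kernel is stored with its origin at index 0 and wraps around the end of the list; a kernel stored with its centre in the middle of the array has to be rotated first.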

The axiom of isotropy means that we now only have to consider the length of our spatial frequency vector $\vec{\omega} = \{\omega_x, \omega_y\}$: $\omega = \|\vec{\omega}\|$.

The axiom of scale-invariance is the core of the reasoning: when we observe (blur) an observed image again, we get an image which is blurred with the same but wider kernel: $\mathcal{G}(\omega\sigma_1)\,\mathcal{G}(\omega\sigma_2) = \mathcal{G}(\omega\sigma_1 + \omega\sigma_2)$. A general solution of this equation is $\mathcal{G}(\omega\sigma) = \exp\left((\alpha\,\omega\sigma)^p\right)$. We must raise the argument here to the power p because we are dealing with the dimensionless parameter $\omega\sigma$.

The dimensions are independent, thus separable: $\vec{\omega}\sigma = (\omega_1\sigma)\,\vec{e}_1 + (\omega_2\sigma)\,\vec{e}_2 + \ldots$, where the $\vec{e}_i$ are the basis unit coordinate vectors.

The magnitude of $\vec{\omega}\sigma$ is calculated by means of Pythagoras from the projections along the $\vec{e}_i$, so we add the squares, i.e. $p = 2$. We further demand the solution to be real, so $\alpha^2$ is real. We notice that when we open the aperture fully, we blur everything out, so $\lim_{\sigma\omega \to \infty} \mathcal{G}(\omega\sigma) = 0$. This means that $\alpha^2$ must be negative. We choose $\alpha^2 = -\frac{1}{2}$ and finally get the answer: $\mathcal{G}(\vec{\omega}, \sigma) = \exp\left(-\frac{1}{2}\sigma^2\omega^2\right)$, which in the spatial domain is:

$$G(\vec{x}, \sigma) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{\vec{x}\cdot\vec{x}}{2\sigma^2}\right).$$
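With $p = 2$ the product of two Gaussian transfer functions is $\exp\left(-\frac{1}{2}\omega^2(\sigma_1^2 + \sigma_2^2)\right)$, so the widths of cascaded apertures add in quadrature. The following Python sketch checks this in 1D with sampled kernels; the scales 3 and 4 and all function names are assumptions chosen for illustration.

```python
import math

def sampled_gaussian(sigma, radius):
    """Normalized 1D Gaussian sampled at integer offsets -radius..radius."""
    return [math.exp(-k * k / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
            for k in range(-radius, radius + 1)]

def convolve_full(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def variance(kernel):
    """Second central moment of a kernel centred in the middle of its support."""
    centre = (len(kernel) - 1) // 2
    mass = sum(kernel)
    return sum((i - centre) ** 2 * v for i, v in enumerate(kernel)) / mass

g3 = sampled_gaussian(3.0, 30)
g4 = sampled_gaussian(4.0, 40)
cascade = convolve_full(g3, g4)      # blur with sigma = 3, then with sigma = 4
g5 = sampled_gaussian(5.0, 70)       # a single blur with sigma = 5, same support

max_difference = max(abs(a - b) for a, b in zip(cascade, g5))
cascade_sigma = math.sqrt(variance(cascade))
```

The cascade of scales 3 and 4 behaves as one aperture of scale $\sqrt{3^2 + 4^2} = 5$: the kernel keeps its shape and only becomes wider.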

This is the Gaussian kernel, which is the Green's function of the linear, isotropic diffusion equation $\frac{\partial^2 L}{\partial x^2} + \frac{\partial^2 L}{\partial y^2} = L_{xx} + L_{yy} = \frac{\partial L}{\partial s}$, where $s = \frac{1}{2}\sigma^2$ is half the variance. Note that the derivative to scale is here a derivative to $\sigma^2$ (up to a constant factor), which also follows immediately from a consideration of the dimensionality of the equation.
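That the Gaussian kernel satisfies the diffusion equation can be checked by direct differentiation. The short derivation below is a sketch that uses only the 2D kernel given above.

```latex
\begin{aligned}
G(x,y;\sigma) &= \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right),\\[4pt]
\frac{\partial^2 G}{\partial x^2}+\frac{\partial^2 G}{\partial y^2}
  &= \left(\frac{x^2+y^2}{\sigma^4}-\frac{2}{\sigma^2}\right)G,\\[4pt]
\frac{\partial G}{\partial\sigma}
  &= \left(\frac{x^2+y^2}{\sigma^3}-\frac{2}{\sigma}\right)G
   = \sigma\left(\frac{\partial^2 G}{\partial x^2}+\frac{\partial^2 G}{\partial y^2}\right).
\end{aligned}
```

Since $\frac{\partial}{\partial\sigma} = \sigma\,\frac{\partial}{\partial(\sigma^2/2)}$, the last line is the diffusion equation $G_{xx} + G_{yy} = \frac{\partial G}{\partial s}$ with the scale parameter $s = \frac{1}{2}\sigma^2$.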


All partial derivatives of the Gaussian kernel are also solutions of the diffusion equation.

So the first important result is that we have found the Gaussian kernel and all of its partial derivatives as the unique kernel family for a front-end visual system that satisfies linearity and the constraints "no preference for location, scale and orientation". We have found a one-parameter family of kernels, where the scale $\sigma$ is the free parameter. This is a general feature of the biological visual system: the exploitation of ensembles of aperture functions, which are mathematically modeled by families of kernels for a free parameter, e.g. for all scales, derivative orders, orientations, stereo disparities, motion velocities, etc.

The Gaussian kernel is the unique kernel that generates no spurious resolution (e.g. the squares so familiar from zooming in on pixels). It is the physical point operator; the Gaussian derivatives are the physical derivative operators.

Gaussian partial derivative kernels

Here are the receptive field sensitivity structures of some members of the Gaussian derivative family:

    g[x_, y_, σ_] := 1/(2 π σ^2) Exp[-(x^2 + y^2)/(2 σ^2)];

    Block[{$DisplayFunction = Identity},
      p1 = Plot3D[g[x, y, 1], {x, -3.5, 3.5}, {y, -3.5, 3.5}];
      p2 = Plot3D[Evaluate[D[g[x, y, 1], x]], {x, -3.5, 3.5}, {y, -3.5, 3.5}];
      p3 = Plot3D[Evaluate[D[g[x, y, 1], x, y]], {x, -3.5, 3.5}, {y, -3.5, 3.5}];
      laplacean[x_, y_, σ_] := D[g[x, y, σ], {x, 2}] + D[g[x, y, σ], {y, 2}];
      p4 = Plot3D[Evaluate[laplacean[x, y, 1]], {x, -3.5, 3.5}, {y, -3.5, 3.5}];];
    Show[GraphicsArray[{{p1, p2}, {p3, p4}}], ImageSize -> {300, 240}];
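For readers without Mathematica, the four plotted kernels have simple closed forms. The Python sketch below writes them out and cross-checks one of them against a numerical derivative; the function names and the test point are assumptions chosen for illustration.

```python
import math

def g(x, y, sigma):
    """2D Gaussian kernel, same definition as g[x, y, sigma] in the notebook code."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def g_x(x, y, sigma):
    """First-order derivative to x: -(x / sigma^2) g."""
    return -x / sigma ** 2 * g(x, y, sigma)

def g_xy(x, y, sigma):
    """Mixed second-order derivative: (x y / sigma^4) g."""
    return x * y / sigma ** 4 * g(x, y, sigma)

def laplacian(x, y, sigma):
    """g_xx + g_yy = ((x^2 + y^2) / sigma^4 - 2 / sigma^2) g."""
    return ((x * x + y * y) / sigma ** 4 - 2 / sigma ** 2) * g(x, y, sigma)

def central_difference_x(f, x, y, sigma, h=1e-5):
    """Numerical derivative to x, used only to cross-check the closed forms."""
    return (f(x + h, y, sigma) - f(x - h, y, sigma)) / (2 * h)

point = (0.7, -0.4, 1.0)
check_gx = abs(g_x(*point) - central_difference_x(g, *point))
check_gxy = abs(g_xy(0.7, -0.4, 1.0)
                - (g_x(0.7, -0.4 + 1e-5, 1.0) - g_x(0.7, -0.4 - 1e-5, 1.0)) / 2e-5)
centre_value = laplacian(0.0, 0.0, 1.0)          # negative centre, equal to -1/pi
zero_ring = laplacian(math.sqrt(2.0), 0.0, 1.0)  # sign change at r = sqrt(2) * sigma
```

The Laplacian is negative inside the circle $r = \sqrt{2}\,\sigma$ and positive outside it, which is the center-surround profile.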

Upper left: the Gaussian kernel as the zeroth-order operator; upper right: $\frac{\partial G}{\partial x}$; lower left: $\frac{\partial^2 G}{\partial x\,\partial y}$; lower right: the Laplacean $\frac{\partial^2 G}{\partial x^2} + \frac{\partial^2 G}{\partial y^2}$ of the Gaussian kernel. Retinal and LGN center-surround receptive fields are well modeled by the (positive resp. negative) Laplacean.

The receptive fields in the primary visual cortex closely resemble Gaussian derivatives, as was first noticed by Young [Young 1984, 1986] and Koenderink [Koenderink 1984], and they may accomplish a double simultaneous task: observation and differentiation. These receptive fields (RFs) come at a wide range of sizes and at all orientations.

Below, two examples are given of measured receptive field sensitivity profiles: a cortical simple cell (left) and a Lateral Geniculate Nucleus (LGN) center-surround cell (right), as measured by DeAngelis, Ohzawa and Freeman [4], [http://totoro.berkeley.edu/].

Left: cortical simple cell, well modeled by a first order Gaussian derivative kernel. Right: center-surround

LGN cell, well modeled by the Laplacean of a Gaussian. From [4].

Through the center-surround structure at the very first level of measurement on the retina, the Laplacean of the input image can be seen to be taken. The linear diffusion equation states that this Laplacean is equal to the first derivative to scale: $L_{xx} + L_{yy} = \frac{\partial L}{\partial s}$. One conjecture for its presence at this level could be that the visual system actually measures $\frac{\partial L}{\partial s}$, i.e. the slight change in signal $\partial L$ when the aperture is changed by $\partial s$: in homogeneous areas there is no output, in highly textured areas there is much output. Integrating both sides of $\partial L = (L_{xx} + L_{yy})\,\partial s$ over all scales gives the measured intensity in a robust fashion.

Derivatives of sampled, i.e. observed data

The derivative of the observed data $L_0(x,y) \otimes G(x,y;\sigma)$ is given by $\frac{\partial}{\partial x}\{L_0(x,y) \otimes G(x,y;\sigma)\}$, which can be written as $L_0(x,y) \otimes \frac{\partial}{\partial x}G(x,y;\sigma)$. The commutation of the convolution and the derivative operators is possible because of their linearity, which is easily shown in the Fourier domain. From this we can see the following important results:

• Differentiation and observation can be done in a single step: convolution with a Gaussian derivative kernel.
• Differentiation is now done by integration, i.e. by the convolution integral.
• The Gaussian kernel is the physical analog of a mathematical point; the Gaussian derivative kernels are the physical analogs of the mathematical differential operators. Equivalence is reached in the limit when the scale of the Gaussian goes to zero: $\lim_{\sigma \to 0} G(x;\sigma) = \delta(x)$, where $\delta(x)$ is the Dirac delta function, and $\lim_{\sigma \to 0} \otimes \frac{\partial G(x;\sigma)}{\partial x} = \partial_x$.
• Any differentiation blurs the data somewhat, by the amount of the scale of the differential operator. There is no way around this increase of the inner scale; we can only try to minimize the effect.
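The first result in the list can be made concrete in 1D. The Python sketch below compares "blur first, then differentiate" with a single convolution with the Gaussian derivative kernel; the test signal $L_0(x) = x^2$, the scale and the function names are assumptions chosen for illustration.

```python
import math

def gaussian(k, sigma):
    """1D Gaussian evaluated at offset k."""
    return math.exp(-k * k / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def gaussian_dx(k, sigma):
    """First derivative of the 1D Gaussian evaluated at offset k."""
    return -k / sigma ** 2 * gaussian(k, sigma)

def convolve_at(signal, kernel_fn, sigma, index, radius):
    """Value of (signal convolved with kernel) at one integer position."""
    return sum(signal(index - k) * kernel_fn(k, sigma)
               for k in range(-radius, radius + 1))

def signal(n):
    return float(n * n)          # hypothetical test signal L0(x) = x^2

sigma, radius, index = 2.0, 20, 5

# Route 1: observe (blur) first, then differentiate with a central difference.
blurred = [convolve_at(signal, gaussian, sigma, index + d, radius) for d in (-1, 1)]
observe_then_differentiate = (blurred[1] - blurred[0]) / 2.0

# Route 2: one convolution with the Gaussian derivative kernel.
single_step = convolve_at(signal, gaussian_dx, sigma, index, radius)
```

Both routes give the slope $2x = 10$ at $x = 5$; the second route never forms a difference of neighboring samples, which is the sense in which differentiation is done by integration.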

The Gaussian kernel has by definition a strong regularizing effect. It was shown by Schwartz [39] that differentiation of distributions of data (such as sampled data) has to be accomplished by convolution with a smooth test function. It is important to realize that the process of observation is the regularizer. Recently, some interesting papers have shown the complete equivalence of Gaussian scale-space regularization with a number of other methods for regularization [19, 38].

Many formulations have been published that derive the front-end aperture function as the Gaussian kernel and its derivatives. For an overview see Weickert [41] and Lindeberg [17].

Gabor kernels

The derivation given above required first principles that essentially stated "we know nothing" (at this stage of the observation). Of course, we can relax these principles and introduce some knowledge. When we want to derive a set of apertures tuned to a specific spatial frequency $k$ in the image, we add this physical quantity to the matrix of the dimensional analysis:

$$\begin{array}{l|ccccc} & \sigma & \omega & L_0 & L & k \\ \hline \text{length} & 1 & -1 & 0 & 0 & -1 \\ \text{luminance} & 0 & 0 & 1 & 1 & 1 \end{array}$$

Following exactly the same line of reasoning, we end up from this new set of constraints with a new family of kernels, the Gabor family of receptive fields, which are given by a sinusoidal function (at the specified spatial frequency) under a Gaussian window:

    gabor[x_, σ_] := Sin[x] 1/Sqrt[2 π σ^2] Exp[-x^2/(2 σ^2)];

Note the similarity between Gabor and Gaussian derivative kernels. They can be made to look very similar by an appropriate choice of parameters:

    gauss[x_, σ_] := 1/Sqrt[2 π σ^2] Exp[-x^2/(2 σ^2)];
    p1 = Plot[-gabor[x, 1], {x, -4, 4},
      PlotStyle -> Dashing[{0.02, 0.02}], DisplayFunction -> Identity];
    p2 = Plot[Evaluate[D[gauss[x, 1], x]], {x, -4, 4}, DisplayFunction -> Identity];
    Show[{p1, p2}, DisplayFunction -> $DisplayFunction];

[Figure: the negated Gabor kernel (dashed) and the first-order Gaussian derivative (solid) for σ = 1, plotted for x from -4 to 4.]
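The similarity and the difference between the two kernel families can both be quantified. The Python sketch below counts sign changes of the negated Gabor kernel and of the first-order Gaussian derivative on a sampling grid; the grid size, the intervals and the helper `count_sign_changes` are assumptions chosen for illustration.

```python
import math

def gauss(x, sigma):
    """1D Gaussian with the same normalization as gauss[x, sigma] in the notebook."""
    return math.exp(-x * x / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def gabor(x, sigma):
    """Sine under a Gaussian window, as in the notebook definition."""
    return math.sin(x) * gauss(x, sigma)

def gauss_dx(x, sigma):
    """First-order Gaussian derivative: -(x / sigma^2) gauss."""
    return -x / sigma ** 2 * gauss(x, sigma)

def count_sign_changes(f, lo, hi, n=4000):
    """Count sign changes on cell midpoints, so no sample falls exactly on a zero."""
    step = (hi - lo) / n
    values = [f(lo + (i + 0.5) * step) for i in range(n)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

crossings_gauss_dx = count_sign_changes(lambda x: gauss_dx(x, 1.0), -4.0, 4.0)
crossings_gabor_4 = count_sign_changes(lambda x: -gabor(x, 1.0), -4.0, 4.0)
crossings_gabor_20 = count_sign_changes(lambda x: -gabor(x, 1.0), -20.0, 20.0)

# Near the origin sin(x) is close to x, so the two kernels nearly coincide there.
near_origin_gap = abs(-gabor(0.1, 1.0) - gauss_dx(0.1, 1.0))
```

The Gaussian derivative keeps a single zero crossing however wide the interval, while the Gabor kernel gains a new one every half period of the sine.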

The essential difference is that Gabor functions have an infinite number of zero crossings, whereas Gaussian derivatives have as many as the order of differentiation. By relaxing or modifying other constraints, we might find other families of kernels. We conclude this section with the observation that the front-end visual system at the retinal level has the task of being uncommitted: no feedback from higher levels is involved, so the Gaussian kernel seems a good candidate to start exploring with at this level. The extensive feedback loops from the primary visual cortex to the LGN may give rise to 'geometry-driven diffusion' [30], i.e. nonlinear scale-space theory, where the early differential geometric measurements through e.g. the simple cells may modify the kernels at other levels. Nonlinear scale-space theory will be treated extensively in a forthcoming interactive paper.

Differential geometry and invariance

It is essential to work with descriptions that are independent of the choice of coordinates. This was Einstein's impetus in his development of the general theory of relativity. This means that when we apply a transformation to our coordinates, we want our local image properties to be independent of this transformation. E.g. if we rotate our {x, y} coordinate frame, we do not want local measures such as edge strength or curvature to change. Coordinate transformations can be divided into groups. E.g. all coordinate transformations that leave the axes of the coordinates perpendicular (e.g. rotations, translations, mirroring and scaling) form the group of the orthogonal transformations. Another group is the group of the affine transformations, where the new coordinates {x', y'} are acquired through a linear transformation applied to the original coordinates {x, y}, with a, b, c and d constants:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

Affine transformations occur when we view objects obliquely at a relatively large distance. At shorter distances such views are described by perspective transformations.


Entities that do not change under a group of coordinate transformations are called invariants under that particular group. The only geometrical entities that make physical sense are invariants. In the words of Hermann Weyl: "any invariant has a specific meaning", and as such they are widely studied in computer vision theories.
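A simple example of an invariant is the gradient magnitude (edge strength): the individual components of the gradient depend on the orientation of the coordinate frame, but its length does not. The Python sketch below illustrates this for a rotation; the gradient values, the angle and the function name `rotate` are hypothetical.

```python
import math

def rotate(vector, theta):
    """Components of a 2D vector after rotating the coordinate frame by theta."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = vector
    return (c * x + s * y, -s * x + c * y)

gradient = (3.0, 4.0)                 # hypothetical (Lx, Ly) at some pixel
theta = math.radians(30.0)
rotated = rotate(gradient, theta)

magnitude_before = math.hypot(*gradient)
magnitude_after = math.hypot(*rotated)
```

The components change from (3, 4) to roughly (4.6, 2.0), while the magnitude stays 5 in both frames.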

In this paper we only study orthogonal and affine invariants, as they form an important basic group and are

often encountered in computer vision.

Multiscale derivatives: implementations

To get a feeling for the interactive use of Mathematica, we start this section with three implementations of convolution with a Gaussian derivative kernel (in 2D): in the Fourier domain, in the spatial domain with a 2D kernel, and in the spatial domain exploiting the separability property through two 1D kernel convolutions. Blurring, i.e. convolution with the plain Gaussian kernel, is done through convolution with the zero-order Gaussian derivative.

The function gDf[im, nx, ny, σ] implements the convolution of the image with the Gaussian derivative for 2D data in the Fourier domain. This is an exact function, with no approximations other than the finite periodic window in both domains. We explicitly give the code of the functions here, so that the reader can see how they are implemented and make modifications as required. For Mathematica novices: all information on (capitalized) internal functions is available in the Help Browser (highlight + key F1).

Variables:
im = 2D image (as a list structure)
nx, ny = order of differentiation to x resp. y
σ = scale of the kernel, in pixels

    gDf[im_, nx_, ny_, σ_] := Module[{xres, yres, gdkernel},
      {yres, xres} = Dimensions[im];
      gdkernel = N[Table[Evaluate[
          D[1/(2 π σ^2) Exp[-(x^2 + y^2)/(2 σ^2)], {x, nx}, {y, ny}]],
        {y, -(yres - 1)/2, (yres - 1)/2}, {x, -(xres - 1)/2, (xres - 1)/2}]];
      Chop[N[Sqrt[xres yres] InverseFourier[
        Fourier[im] Fourier[RotateLeft[gdkernel, {yres/2, xres/2}]]]]]];

This function is rather slow, but it is exact. Use it for 64x64 and 128x128 images only.

The function gDc[im, nx, ny, σ] implements the same operation in the spatial domain. The parameters are the same as above. This function is much faster, as it exploits the internal function ListConvolve and applies Gaussian derivative kernels with a width truncated to +/- 4 standard deviations, which can of course be changed freely.

    gDc[im_, nx_, ny_, σ_] := Module[{x, y, kernel},
      kernel = N[Table[Evaluate[
          D[1/(2 π σ^2) Exp[-(x^2 + y^2)/(2 σ^2)], {x, nx}, {y, ny}]],
        {y, -4 σ, 4 σ}, {x, -4 σ, 4 σ}]];
      ListConvolve[kernel, im, Ceiling[Dimensions[kernel]/2]]];
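How much is lost by truncating the kernel at 4 standard deviations can be estimated from the area of a 1D Gaussian inside the window. The Python sketch below is a rough indication only: it covers the zero-order kernel, and the function name `mass_within` is hypothetical. Derivative kernels carry relatively more weight in their tails, so a wider window may be preferable for high orders.

```python
import math

def mass_within(n_sigma):
    """Fraction of a 1D Gaussian's area inside +/- n_sigma standard deviations."""
    return math.erf(n_sigma / math.sqrt(2.0))

inside_4 = mass_within(4.0)   # window used by gDc and gD
inside_2 = mass_within(2.0)   # a narrower window, for comparison
```

For the plain Gaussian the window of +/- 4 standard deviations leaves out less than 0.01% of the area per dimension.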

The fastest implementation exploits the separability of the Gaussian kernel, and this implementation is mainly used in the sequel:

    gD[im_, nx_, ny_, σ_] := Module[{x, y, kx, ky, tmp},
      kx = {N[Table[Evaluate[
          D[1/(σ Sqrt[2 π]) Exp[-x^2/(2 σ^2)], {x, nx}]], {x, -4 σ, 4 σ}]]};
      ky = {N[Table[Evaluate[
          D[1/(σ Sqrt[2 π]) Exp[-y^2/(2 σ^2)], {y, ny}]], {y, -4 σ, 4 σ}]]};
      tmp = ListConvolve[kx, im, Ceiling[Dimensions[kx]/2]];
      Transpose[
        ListConvolve[ky, Transpose[tmp], Reverse[Ceiling[Dimensions[ky]/2]]]]];
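The separability argument can be checked on a small array: one pass with a 2D kernel that is the outer product of two 1D kernels gives the same result as a row pass followed by a column pass. The Python sketch below uses cyclic boundaries and small integer kernels; all names and values are hypothetical and chosen for illustration, not taken from the notebook.

```python
def convolve_rows(image, kernel):
    """Cyclic 1D convolution of every row with a centred odd-length kernel."""
    half = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[j + half] * row[(x - j) % width]
                 for j in range(-half, half + 1))
             for x in range(width)] for row in image]

def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]

def convolve_2d(image, kernel2d):
    """Direct cyclic 2D convolution with a centred odd-sized kernel."""
    hy, hx = len(kernel2d) // 2, len(kernel2d[0]) // 2
    height, width = len(image), len(image[0])
    return [[sum(kernel2d[j + hy][i + hx] * image[(y - j) % height][(x - i) % width]
                 for j in range(-hy, hy + 1) for i in range(-hx, hx + 1))
             for x in range(width)] for y in range(height)]

kx = [1.0, 2.0, 1.0]          # hypothetical 1D kernel along x
ky = [-1.0, 0.0, 1.0]         # hypothetical 1D kernel along y
kernel2d = [[a * b for b in kx] for a in ky]   # outer product: ky by rows, kx by columns

image = [[float((3 * x + 5 * y * y) % 7) for x in range(6)] for y in range(5)]

# Row pass with kx, then column pass with ky (via transposition).
two_passes = transpose(convolve_rows(transpose(convolve_rows(image, kx)), ky))
one_pass = convolve_2d(image, kernel2d)
max_difference = max(abs(a - b)
                     for r1, r2 in zip(two_passes, one_pass)
                     for a, b in zip(r1, r2))
```

For a kernel of n by n samples the direct pass costs n² multiplications per pixel, the two 1D passes only 2n, which is why the separable version is the fastest of the three.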

Some examples:

Convolving an image containing a single point (a delta function) with the Gaussian derivative kernels gives the kernels themselves, i.e. the point spread function. E.g. the second order to x, first order to y (at σ = 10 pixels) and the eighth order to x:

    spike = Table[0., {128}, {128}]; spike[[64, 64]] = 1.;
    Block[{$DisplayFunction = Identity},
      p1 = ListDensityPlot[gD[spike, 2, 1, 10.]];
      p2 = ListDensityPlot[gD[spike, 8, 0, 10.]];];
    Show[GraphicsArray[{p1, p2}]];

The construction with $DisplayFunction is necessary to calculate but not display the plots. We read an image with Import and only use the first element [[1, 1]] of the returned structure, as this contains the pixel data.

    im = Import["bomb256.gif"][[1, 1]];

We start with just blurring at a scale of σ = 4 pixels and show the result as an image and as a height plot:

    Block[{$DisplayFunction = Identity},
      p1 = ListDensityPlot[gD[im, 0, 0, 4.]];
      p2 = ListPlot3D[gD[255 - im, 0, 0, 4]];];
    Show[GraphicsArray[{p1, p2}], ImageSize -> {440, 220}];

A movie of a linear intensity scale-space is made with the Table function, with σ = e^τ running exponentially from 1 to e^2.5 pixels, i.e. with τ running from 0 to 2.5 in steps of 0.25. Double-clicking one of the resulting images starts the animation. Controls are on the bottom window bar.

    im = Import["mr128.gif"][[1, 1]];
    ss = Table[
      ListDensityPlot[gD[im, 0, 0, Exp[τ]], ImageSize -> Dimensions[im]],
      {τ, 0, 2.5, .25}];
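The exponential sampling of scale means that consecutive frames differ by a constant factor in σ rather than by a constant number of pixels. The Python sketch below lists the scales used in the movie; the variable names are assumptions chosen for illustration.

```python
import math

# tau runs from 0 to 2.5 in steps of 0.25; the scale is sigma = exp(tau).
taus = [0.25 * i for i in range(11)]
sigmas = [math.exp(tau) for tau in taus]

# Consecutive scales differ by a constant factor, not by a constant offset.
ratios = [b / a for a, b in zip(sigmas, sigmas[1:])]
```

The movie therefore consists of 11 frames, each blurred with a scale about 28% larger than the previous one.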

The sequence can be saved as an animated GIF movie (e.g. for use in webpages) with:


Export["d:\\tmp\\scalespace.gif", ss, "GIF"];

The gradient magnitude $\sqrt{L_x^2 + L_y^2}$ at a scale σ = 1 pixel:

grad = ListDensityPlot[Sqrt[gD[im, 1, 0, 1]^2 + gD[im, 0, 1, 1]^2], ImageSize -> {128, 128}];

To change the window/level (contrast/brightness) settings one must change the displayed range of intensity

values:

Show[grad, PlotRange -> {0, 30}];

Accuracy of differential operators

When we decrease the size of the kernel in the spatial domain, it becomes increasingly difficult to fit the Gaussian derivative kernel with its zero-crossings. For a given order of differentiation there is a limiting scale below which the results are no longer exact. E.g. when we study the derivative of a ramp with slope 1, we expect the outcome to be 1. Let us look at the observed derivative at the center of the image for a range of scales (0.4 < σ < 1.2 in steps of 0.1):


im = Table[x, {y, 64}, {x, 1, 64}];
b = Table[{σ, gDf[im, 1, 0, σ][[32, 32]]}, {σ, .4, 1.2, .1}];
ListPlot[b, PlotJoined -> True, PlotStyle -> Thickness[0.01],
  PlotRange -> {All, {0.8, 1.4}}, AxesLabel -> {"σ", "∂xL"}, AxesOrigin -> {1, .8}];

[Figure: observed derivative ∂xL at the center of the ramp image, plotted against the scale σ from 0.4 to 1.2.]

The value of the derivative starts to deviate for scales smaller than, say, σ = 0.6.

There is a fundamental relation between the order of differentiation, the scale of the operator and the accuracy required. We now derive this relation.

The Fourier transform of a Gaussian kernel is again a Gaussian:

gauss[x_, σ_] := 1/Sqrt[2 π σ^2] Exp[-x^2/(2 σ^2)];

fftgauss[ω_, σ_] = FourierTransform[gauss[x, σ], x, ω]

$\frac{e^{-\sigma^2 \omega^2 / 2}}{\sqrt{2\pi}}$

The Fourier transform of the n-th derivative of a function is $(-i\omega)^n$ times the Fourier transform of the function:

fftgaussD[ω_, σ_] = FourierTransform[D[gauss[x, σ], x], x, ω]

$-\frac{i\,\omega\,e^{-\sigma^2 \omega^2 / 2}}{\sqrt{2\pi}}$

A smaller kernel in the spatial domain gives rise to a wider kernel in the Fourier domain, as shown below for

a range of widths of first order derivative Gaussian kernels (in 1D):


Plot3D[fftgauss[ω, σ], {ω, -π, π}, {σ, .4, 2}, PlotPoints -> 30,
  AxesLabel -> {"ω", "σ", "fft"}, Axes -> True, Boxed -> True];

[Figure: 3D plot of the Fourier spectrum (axis "fft", 0 to 0.4) against ω and σ, for σ from 0.4 to 2.]

We plot the Fourier spectrum of a kernel that shows aliasing:

FilledPlot[{If[-π < ω < π, fftgauss[ω, .5], 0], fftgauss[ω, .5]},
  {ω, -2 π, 2 Pi}, Fills -> {{{1, Axis}, GrayLevel[.5]}},
  Ticks -> {{-π, π}, Automatic}, AxesLabel -> {"ω", "g(ω, σ=.4)"}];

[Figure: Fourier spectrum g(ω, σ = .4) for -2π < ω < 2π, with ticks at -π and π and the band between them filled in gray; the tails outside this band are the aliased part.]

The error is defined as the amount of the energy (the square) of the kernel that is ’leaking’ relative to the total

area under the curve (note the integration ranges):

error[n_, σ_] = 100 Integrate[(I ω)^(2 n) fftgauss[ω, σ]^2, {ω, π, ∞}] /
    Integrate[(I ω)^(2 n) fftgauss[ω, σ]^2, {ω, 0, ∞}]

The result is a ratio of Gamma functions, which for integer n reduces to

$\mathrm{error}(n, \sigma) = 100\,\frac{\Gamma\!\left(n + \tfrac{1}{2},\ \pi^2 \sigma^2\right)}{\Gamma\!\left(n + \tfrac{1}{2}\right)}$
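The leakage formula can be checked numerically outside the notebook. The following is a minimal sketch in Python (not the notebook's Mathematica), assuming the closed form 100 Γ(n + 1/2, π²σ²) / Γ(n + 1/2); the function names `leak_error_percent` and `smallest_scale` are hypothetical and chosen for illustration only.

```python
import math


def leak_error_percent(n, sigma):
    """Percentage of the spectral energy of the n-th order Gaussian
    derivative kernel that lies beyond |omega| = pi.

    Evaluates 100 * Gamma(n + 1/2, x) / Gamma(n + 1/2) with x = (pi sigma)^2,
    using Q(1/2, x) = erfc(sqrt(x)) and the recurrence
    Q(a + 1, x) = Q(a, x) + x^a exp(-x) / Gamma(a + 1).
    """
    x = (math.pi * sigma) ** 2
    q = math.erfc(math.sqrt(x))
    for k in range(n):
        a = k + 0.5
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
    return 100.0 * q


def smallest_scale(n, max_error=5.0, lo=0.05, hi=5.0):
    """Bisect for the scale at which the leakage drops to max_error percent.

    Assumes the leakage decreases monotonically with sigma on [lo, hi].
    """
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if leak_error_percent(n, mid) > max_error:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for order in range(1, 5):
        print(order, round(smallest_scale(order), 3))
```

The bisection gives, per order of differentiation, the scale on the 5% contour; higher orders need a larger minimal scale, which is the trend the plot of the error function shows.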


We plot this Gamma function expression for scales σ between 0.2 and 2 and orders of differentiation from 1 to 10, and we insert the 5% error line in it (we have to lower the plot somewhat to make the line visible):

Block[{$DisplayFunction = Identity},
  p1 = Plot3D[error[n, σ] - 6, {σ, .2, 2}, {n, 1, 10}, PlotRange -> All,
    AxesLabel -> {"σ", "n", "error %"}, Boxed -> True, Axes -> True];
  p2 = ContourPlot[error[n, σ], {σ, .2, 2}, {n, 1, 10},
    ContourShading -> False, Contours -> {5}];
  c3d = Graphics3D[Graphics[p2][[1]] /. Line[pts_] :> {Thickness[.01],
      val = Apply[error, First[pts]]; Line[Map[Append[#, val] &, pts]]}];];
Show[p1, c3d];

[Figure: error % plotted against σ (0.2 to 2) and order n (1 to 10), with the 5% error line drawn in.]

The lesson from this section is that we should never make the scale of the operator, the Gaussian kernel, too

small. The lower limit is indicated in the graph above. A similar reasoning can be set up for the outer scale,

when the aliasing occurs in the spatial domain.

Natural coordinates

The intensity of images and invariant features decreases fast at larger scales. This is due to the non-scale-invariant use of the differential operators. If we consider the transformation $x/\sigma \to \tilde{x}$, then $\tilde{x}$ is dimensionless. At every scale, distances are now measured with a yardstick that is scaled with the scale itself, i.e. the measurement is scale-invariant. The dimensionless coordinate is termed the natural coordinate. This implies that the derivative operator in natural coordinates has a scaling factor: $\frac{\partial^n}{\partial \tilde{x}^n} = \sigma^n \frac{\partial^n}{\partial x^n}$.

Here we generate a scale-space of the intensity gradient. To study the absolute intensities, we plot every

image with the same intensity plotrange of {0,40}:


im = Import["mr128.gif"][[1, 1]];
p1 = Table[grad = Sqrt[gD[im, 1, 0, σ]^2 + gD[im, 0, 1, σ]^2];
    {ListDensityPlot[grad, PlotRange -> {0, 40}, DisplayFunction -> Identity],
     ListDensityPlot[σ grad, PlotRange -> {0, 40}, DisplayFunction -> Identity]},
    {σ, 1, 5}];
Show[GraphicsArray[Transpose[p1]], ImageSize -> {440, 171}];

Clearly the gradient expressed in the natural coordinates keeps its average output range. For a Laplacean scale-space stack in natural coordinates we need to multiply the Laplacean with σ²: $\frac{\partial^2}{\partial \tilde{x}^2} + \frac{\partial^2}{\partial \tilde{y}^2} = \sigma^2 \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right)$, and so on for higher order derivative operators in natural coordinates.
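The effect of the factor σ can be illustrated on a one-dimensional step edge. The sketch below is in Python rather than the notebook's Mathematica and uses a plain sampled Gaussian derivative kernel, not the gD function of this paper; the names `gaussian_derivative_kernel` and `edge_gradient` are hypothetical. For an ideal unit step the continuous theory gives a peak gradient of 1/(σ√(2π)), so σ times the gradient should stay near 1/√(2π) at every scale.

```python
import math


def gaussian_derivative_kernel(sigma, radius):
    """Sampled first-order Gaussian derivative for offsets -radius..radius."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [-(u / sigma ** 2) * norm * math.exp(-u * u / (2.0 * sigma ** 2))
            for u in range(-radius, radius + 1)]


def edge_gradient(sigma, size=257):
    """Largest first-derivative response to a unit step edge at scale sigma."""
    radius = int(6 * sigma) + 1
    kernel = gaussian_derivative_kernel(sigma, radius)
    step = [0.0 if i < size // 2 else 1.0 for i in range(size)]
    best = 0.0
    for x in range(radius, size - radius):
        # discrete convolution: sum over u of kernel(u) * step(x - u)
        response = sum(kernel[u + radius] * step[x - u]
                       for u in range(-radius, radius + 1))
        best = max(best, abs(response))
    return best


if __name__ == "__main__":
    for sigma in (2, 3, 4, 5):
        raw = edge_gradient(sigma)
        # raw gradient drops with scale, sigma * raw stays roughly constant
        print(sigma, raw, sigma * raw)
```

The raw response falls off with scale while the product with σ stays nearly constant. The small remaining deviation at the finest scale comes from sampling the kernel, the accuracy issue discussed in the previous section.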

Discrete Gaussian Kernels

Lindeberg [Lindeberg 1990] derived the optimal kernel for the case when the Gaussian kernel is discretized and came up with the "modified Bessel function of the first kind". In Mathematica this function is available as BesselI.

The modified Bessel function of the first kind, BesselI, is almost equal to the Gaussian kernel for σ > 1, as we see below. Note that the Bessel function has to be normalized by its value at x = 0. For larger σ the kernels rapidly become very similar.


σ = 2;

Plot[{1/Sqrt[2 π σ^2] Exp[-x^2/(2 σ^2)],
   1/Sqrt[2 π σ^2] BesselI[x, σ^2]/BesselI[0, σ^2]},
  {x, 0, 8}, PlotStyle -> {RGBColor[1, 0, 0], RGBColor[0, 0, 0]},
  PlotLegend -> {"Gauss", "Bessel"}, LegendPosition -> {1, 0},
  LegendLabel -> "σ=2", PlotRange -> All, ImageSize -> {330, 166}];

[Figure: the Gaussian kernel and the normalized BesselI kernel for σ = 2, plotted for x from 0 to 8.]
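For integer pixel positions the comparison can also be made numerically. The following Python sketch (not the notebook's code) assumes the normalization commonly attributed to Lindeberg, T(n) = exp(-t) I_n(t) with t = σ², which differs from the normalization by the value at x = 0 used in the plot above; `bessel_i`, `discrete_gaussian` and `sampled_gaussian` are hypothetical names, and the power series is only intended for moderate scales (σ up to about 5).

```python
import math


def bessel_i(n, t, terms=40):
    """Modified Bessel function of the first kind I_n(t), integer n >= 0,
    summed from its power series."""
    return sum((t / 2.0) ** (2 * m + n)
               / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))


def discrete_gaussian(sigma, radius):
    """Discrete analogue of the Gaussian: T(n) = exp(-t) I_|n|(t), t = sigma^2."""
    t = sigma * sigma
    return [math.exp(-t) * bessel_i(abs(n), t)
            for n in range(-radius, radius + 1)]


def sampled_gaussian(sigma, radius):
    """Continuous Gaussian sampled at the integer positions -radius..radius."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-n * n / (2.0 * sigma ** 2))
            for n in range(-radius, radius + 1)]


if __name__ == "__main__":
    sigma, radius = 2.0, 8
    for n, (d, g) in enumerate(zip(discrete_gaussian(sigma, radius),
                                   sampled_gaussian(sigma, radius)), -radius):
        print(n, d, g)
```

With this normalization the discrete kernel sums to one and has variance σ² on the pixel grid, and at σ = 2 it already lies close to the sampled Gaussian.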

Gauge coordinates

Gauge coordinates are a very useful tool in computer vision, and in understanding the possible geometric functionality of the front-end visual system. Gauge coordinates are connected with the isophotes, lines of equal brightness in 2D images (surfaces in 3D). Here are 10 equidistant isophotes of an image in 10 different colors (/@ stands for Map):

im = Import["mr256.gif"][[1, 1]]; max = Max[im];
ListContourPlot[im/max, ContourShading -> False, Contours -> 10,
  ContourStyle -> List /@ Hue /@ (Range[10]/10), ImageSize -> {128, 128}];

In order to establish differential geometric properties it is easiest to exploit intrinsic geometry. This means that we define a new coordinate frame for our geometric explorations which is related to the local isophote structure, so it is different at every point. A straightforward definition of a new local coordinate frame in 2D is one where we cancel the degree of freedom of rotation by defining gauge coordinates: we locally 'fix the gauge'. The 2D unit vector frame of gauge coordinates {v, w} is defined as follows: w is the unit vector in the gradient direction, i.e. the direction in which the intensity changes fastest; v is defined perpendicular to w, i.e. tangential to the intensity isophote. The new frame vectors are drawn below:

f[x_, y_] := x^2 + y^2;

p1 = ContourPlot[f[x, y], {y, 2, 4.5}, {x, 2, 4.5}, ContourShading -> False,
   Contours -> Range[2, 100, 4], DisplayFunction -> Identity];
frame = Graphics[{PointSize[.02], Point[{3, 3}],
    Arrow[{3, 3}, {3 + .5 Sqrt[2], 3 - .5 Sqrt[2]}],
    Arrow[{3, 3}, {3 + .5 Sqrt[2], 3 + .5 Sqrt[2]}],
    Text["v", {3.8, 2.2}], Text["w", {3.8, 3.8}]}];

Show[{p1, frame}, DisplayFunction -> $DisplayFunction,
  Frame -> False, ImageSize -> {150, 150}];

[Figure: isophotes of f(x, y) = x² + y² around the point (3, 3), with the frame vectors v (tangential to the isophote) and w (along the gradient) drawn as arrows.]

The derivatives to v and w are by definition features that are invariant under orthogonal transformations, i.e. rotation and translation. To apply these gauge derivative operators on images, we have to convert to the Cartesian {x, y} domain. The derivatives to v and w are defined as:

$\partial_v = \frac{L_y\,\partial_x - L_x\,\partial_y}{\sqrt{L_x^2 + L_y^2}} = \frac{L_i\,\varepsilon_{ij}\,\partial_j}{\sqrt{L_i L_i}}, \qquad \partial_w = \frac{L_x\,\partial_x + L_y\,\partial_y}{\sqrt{L_x^2 + L_y^2}} = \frac{L_i\,\delta_{ij}\,\partial_j}{\sqrt{L_i L_i}}.$

We can alternatively see the derivatives to {v, w} as rotated over an angle φ, with rotation matrix $\begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} = \frac{1}{\sqrt{L_x^2 + L_y^2}} \begin{pmatrix} L_x & L_y \\ -L_y & L_x \end{pmatrix}$ (the orientation of v is a matter of convention). The second formulation uses tensor notation, where the index i or j stands for the range of dimensions. So $L_i = \{L_x, L_y\}$ in 2D and $L_i = \{L_x, L_y, L_z\}$ in 3D. Likewise $\partial_j$ is the nabla operator $\{\partial/\partial x,\ \partial/\partial y\}$. The constant tensors $\delta_{ij}$ and $\varepsilon_{ij}$ are the symmetric Kronecker tensor $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and the antisymmetric Levi-Civita tensor $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ respectively (in 2D). With this notation, we see that the derivative operator $\partial_w$ is defined as the derivative operator $\partial_j$ rotated in the direction (through $\delta_{ij}$) of the unit length gradient vector $L_i / \sqrt{L_i L_i}$, and the derivative operator $\partial_v$ is defined as the derivative operator $\partial_j$ rotated in the perpendicular direction (through $\varepsilon_{ij}$) of the unit length gradient vector $L_i / \sqrt{L_i L_i}$. So we encounter 3 types of notation for our geometrical exercises: Cartesian coordinate notation {x, y}, tensor notation $L_i L_i$, and gauge coordinate notation {v, w}. They are essentially equivalent. In this paper we will elaborate mainly on gauge coordinates.

The definitions above are easily accomplished in Mathematica:

δ2 = IdentityMatrix[2]
{{1, 0}, {0, 1}}

ε2 = Table[Signature[{i, j}], {j, 2}, {i, 2}]
{{0, -1}, {1, 0}}

jacobean = Sqrt[Lx^2 + Ly^2];

dv = 1/jacobean {Lx, Ly} . ε2 . {D[#, x], D[#, y]} &
({Lx, Ly} . ε2 . {∂x #1, ∂y #1}) / jacobean &

dw = 1/jacobean {Lx, Ly} . δ2 . {D[#, x], D[#, y]} &
({Lx, Ly} . δ2 . {∂x #1, ∂y #1}) / jacobean &

The notation (...#) & is a 'pure function' on the argument #, e.g. (#^2 + #^5) & gives the sum of the second and fifth power of some argument, and D[#, x] & takes the derivative to x. Such a function can be applied to an argument by the familiar square brackets: (#^2 + #^5) &[zz]. Look in the Help browser under the function Function for examples.

Now we can calculate any derivative to v or w by applying the operator dw or dv repeatedly. Note that Lx and Ly are treated as constant terms here; in fact the combination $L_i \varepsilon_{ij}$ is precisely the rotation matrix that rotates the local coordinate frame to the definitions of v and w. So after the application of dv and dw we need to substitute Lx and Ly with the image derivatives, using the command /. (the substitute command). Mathematica needs to know the differentiation variables, so this function works when we explicitly state a function of x and y, such as L[x, y] (L for luminance).

Lw = dw[L[x, y]] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} // Simplify

Sqrt[L^(0,1)[x, y]^2 + L^(1,0)[x, y]^2]

Lww = dw[dw[L[x, y]]] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} // Simplify

(L^(0,1)[x, y]^2 L^(0,2)[x, y] + 2 L^(0,1)[x, y] L^(1,0)[x, y] L^(1,1)[x, y] + L^(1,0)[x, y]^2 L^(2,0)[x, y]) / (L^(0,1)[x, y]^2 + L^(1,0)[x, y]^2)

Lvv = dv[dv[L[x, y]]] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} // Simplify

(L^(0,2)[x, y] L^(1,0)[x, y]^2 - 2 L^(0,1)[x, y] L^(1,0)[x, y] L^(1,1)[x, y] + L^(0,1)[x, y]^2 L^(2,0)[x, y]) / (L^(0,1)[x, y]^2 + L^(1,0)[x, y]^2)

Due to the fixing of the gauge by removing the degree of freedom for rotation (that is why Lv ≡ 0), we have an important result: every derivative to v and w is an orthogonal invariant, i.e. an invariant property for which translation or rotation of the coordinate frame is irrelevant. This also means that polynomial combinations of these gauge derivative terms are invariant. We now have the toolkit to make gauge derivatives to any order.
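The rotation invariance can be checked pointwise with plain numbers. The sketch below is in Python (not the notebook's Mathematica) and assumes the Cartesian expressions for Lw, Lvv and Lww given above; `gauge_derivatives` and `rotate_derivatives` are hypothetical helper names. It rotates the coordinate frame, transforms the gradient as a vector and the Hessian as R H Rᵀ, and recomputes the gauge derivatives.

```python
import math


def gauge_derivatives(Lx, Ly, Lxx, Lxy, Lyy):
    """Return (Lw, Lvv, Lww) from Cartesian derivatives at a single point."""
    g2 = Lx * Lx + Ly * Ly
    if g2 == 0.0:
        raise ValueError("gauge frame is undefined where the gradient vanishes")
    Lw = math.sqrt(g2)
    Lvv = (Lx * Lx * Lyy - 2.0 * Lx * Ly * Lxy + Ly * Ly * Lxx) / g2
    Lww = (Lx * Lx * Lxx + 2.0 * Lx * Ly * Lxy + Ly * Ly * Lyy) / g2
    return Lw, Lvv, Lww


def rotate_derivatives(Lx, Ly, Lxx, Lxy, Lyy, angle):
    """The same derivatives expressed in a frame rotated by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    # gradient transforms as R g, Hessian as R H R^T with R = [[c, s], [-s, c]]
    return (c * Lx + s * Ly,
            -s * Lx + c * Ly,
            c * c * Lxx + 2.0 * c * s * Lxy + s * s * Lyy,
            -c * s * Lxx + (c * c - s * s) * Lxy + c * s * Lyy,
            s * s * Lxx - 2.0 * c * s * Lxy + c * c * Lyy)


if __name__ == "__main__":
    point = (1.5, -0.7, 0.3, 1.1, -2.0)   # arbitrary illustrative values
    print(gauge_derivatives(*point))
    print(gauge_derivatives(*rotate_derivatives(*point, 0.8)))
```

Both printed triples agree up to rounding, and Lvv + Lww equals the Laplacean Lxx + Lyy in either frame.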

Applications of multiscale invariants on discrete images: 'looking through the simple cells'

In the definitions of the gauge differential operators ∂v and ∂w, the regular differential operators need to be replaced by Gaussian derivative operators. To just show the textual formula, we do not yet evaluate the derivative, by temporarily using HoldForm:

gauge2D[im_, nv_, nw_, σ_] :=
  Nest[dw, Nest[dv, L[x, y], nv], nw] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} /.
    Derivative[n_, m_][L][x, y] -> HoldForm[gD[im, n, m, σ]] // Simplify

So here is the Cartesian (in {x,y}) expression for Lvv:

Clear[im, σ]; gauge2D[im, 2, 0, 2]

(gD[im, 0, 2, 2] gD[im, 1, 0, 2]^2 - 2 gD[im, 0, 1, 2] gD[im, 1, 0, 2] gD[im, 1, 1, 2] + gD[im, 0, 1, 2]^2 gD[im, 2, 0, 2]) / (gD[im, 0, 1, 2]^2 + gD[im, 1, 0, 2]^2)

Every Gaussian derivative is evaluated as a separate image, and the invariant is the polynomial combination of these images.


Ridge detection

Lvv is a ridge detector. Let us test this on an X-ray image of fingers and calculate Lvv at a scale σ = 3:

im = Import["hands.gif"][[1, 1]];

With the function ReleaseHold we release the hold imposed by HoldForm, so now gD is not just displayed by name, but actually called and calculated:

Lvv = gauge2D[im, 2, 0, 3] // ReleaseHold;

Block[{$DisplayFunction = Identity},
  p1 = ListDensityPlot[im]; p2 = ListDensityPlot[Lvv];];

Show[GraphicsArray[{p1, p2}], ImageSize -> {439, 138}];

Noise has structure too. Here are the ridges of uniform noise:

im = Table[Random[], {128}, {256}];
noiseridges = gauge2D[im, 2, 0, 3] // ReleaseHold;

ListDensityPlot[noiseridges, ImageSize -> {256, 128}];

We also recognize Lvv in the 'fundamental' equation of Alvarez et al. [2], a nonlinear geometry-driven diffusion equation: $\frac{\partial L}{\partial s} = L_{vv}$.


Isophote curvature in gauge coordinates

Isophote curvature κ is defined as the change w'' of the tangent vector w' in the gauge coordinate system. When we differentiate the definition of the isophote (L = Constant) to v, we get:

D[L[v, w[v]] == Constant, v]

w'[v] L^(0,1)[v, w[v]] + L^(1,0)[v, w[v]] == 0

We know that Lv ≡ 0 by definition of the gauge coordinates, so w' = 0, and the curvature κ = w'' is found by differentiating the isophote equation again:

κ = w''[v] /. Solve[D[L[v, w[v]] == Constant, {v, 2}] /. w'[v] -> 0, w''[v]]

{-L^(2,0)[v, w[v]] / L^(0,1)[v, w[v]]}

So $\kappa = -\frac{L_{vv}}{L_w}$. QED.

In Cartesian coordinates we recognize the well-known formula from classical textbooks:

κ = -dv[dv[L[x, y]]]/dw[L[x, y]] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} // Simplify

(-L^(0,2)[x, y] L^(1,0)[x, y]^2 + 2 L^(0,1)[x, y] L^(1,0)[x, y] L^(1,1)[x, y] - L^(0,1)[x, y]^2 L^(2,0)[x, y]) / (L^(0,1)[x, y]^2 + L^(1,0)[x, y]^2)^(3/2)

To see this more clearly we write the previous expression of partial Cartesian derivatives in a more readable notation by a pattern matching operation (/.) where the Derivative function is replaced by L with a string of x's and y's:

κ /. Derivative[n_, m_][L][x, y] -> "L" <> Table["x", {n}] <> Table["y", {m}]

$\frac{2 L_x L_{xy} L_y - L_{xx} L_y^2 - L_x^2 L_{yy}}{(L_x^2 + L_y^2)^{3/2}}$

Here is an example of the isophote curvature at a range of scales for a sagittal MR image:

im = Import["mr128.gif"][[1, 1]];
κ = κ /. Derivative[n_, m_][L][x, y] -> HoldForm[gD[im, n, m, σ]];


Block[{$DisplayFunction = Identity},
  p1 = ListDensityPlot[im];
  p2 = Table[ListDensityPlot[κ // ReleaseHold, PlotRange -> {-5, 5}], {σ, 1, 3}];];

Show[GraphicsArray[Partition[Prepend[p2, p1], 2]], ImageSize -> {400, 400}];

The extreme low and high values are due to the singularities that occur at intensity extrema, where the gradient Lw = 0.

In a similar fashion we can derive and study the curvature of flowlines, which are defined as the curves everywhere perpendicular to isophotes. The flowline curvature μ is given by $\mu = -\frac{L_{vw}}{L_w}$.

Zero-crossings of the Laplacean have historically received much attention, due to the work of Marr and Hildreth. The zero-crossings are however displaced on curved edges. Note that with the compact expression for isophote curvature $\kappa = -\frac{L_{vv}}{L_w}$ we can establish a relation between the Laplacean and the proper second order derivative to study for zero-crossings: Lww. From the expression of the Laplacean in gauge coordinates, $\Delta L = L_{ww} + L_{vv} = L_{ww} - \kappa L_w$, we see that there is a deviation which is directly proportional to the curvature κ.
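The relation between the Laplacean and Lww can be verified with a few numbers. This Python sketch (not the notebook's Mathematica) assumes the Cartesian form of κ = -Lvv/Lw and the identity ΔL = Lww - κ Lw; the function names are hypothetical. For the paraboloid L = x² + y² the isophotes are circles of radius r, so the magnitude of κ should be 1/r.

```python
import math


def isophote_curvature(Lx, Ly, Lxx, Lxy, Lyy):
    """kappa = -Lvv / Lw, written out in Cartesian derivatives."""
    g2 = Lx * Lx + Ly * Ly
    if g2 == 0.0:
        raise ValueError("curvature is singular where the gradient vanishes")
    return (2.0 * Lx * Ly * Lxy - Lx * Lx * Lyy - Ly * Ly * Lxx) / g2 ** 1.5


def lww_from_laplacian(Lx, Ly, Lxx, Lxy, Lyy):
    """Second derivative along the gradient, from Lww = Laplacean + kappa * Lw."""
    Lw = math.sqrt(Lx * Lx + Ly * Ly)
    kappa = isophote_curvature(Lx, Ly, Lxx, Lxy, Lyy)
    return (Lxx + Lyy) + kappa * Lw


if __name__ == "__main__":
    # L = x^2 + y^2 evaluated at (3, 4): Lx = 6, Ly = 8, Lxx = Lyy = 2, Lxy = 0
    print(isophote_curvature(6.0, 8.0, 2.0, 0.0, 2.0))
    print(lww_from_laplacian(6.0, 8.0, 2.0, 0.0, 2.0))
```

The correction term κ Lw is what separates the Laplacean from Lww; it vanishes only on straight isophotes, which is why Laplacean zero-crossings are displaced on curved edges.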


Affine invariant corner detection

Corners can be defined as locations with high isophote curvature and high intensity gradient. A family of corner detectors was proposed by Blom [3]: $\Theta^{[n]} = -\frac{L_{vv}}{L_w} L_w^{\,n} = \kappa\, L_w^{\,n}$. An obvious advantage is invariance under as large a group as possible. Blom calculated n for invariance under the affine transformation $\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$. The derivatives transform as $\begin{pmatrix} \partial/\partial x \\ \partial/\partial y \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} \partial/\partial x' \\ \partial/\partial y' \end{pmatrix}$.

The corner detectors $\Theta^{[n]}$ transform as

$\Theta^{[n]} = (ad - bc)^2 \left\{ (a L_{x'} + c L_{y'})^2 + (b L_{x'} + d L_{y'})^2 \right\}^{\frac{n-3}{2}} \left\{ 2 L_{x'} L_{y'} L_{x'y'} - L_{y'}^2 L_{x'x'} - L_{x'}^2 L_{y'y'} \right\}$

This is a relative affine invariant of order 2 if n = 3, with the determinant D = (ad - bc) of the affine transformation as order parameter. We consider here special affine transformations (D = 1). So a good corner detector is $\Theta = -\frac{L_{vv}}{L_w} L_w^3 = -L_{vv} L_w^2$. This feature has the nice property that it is not singular at locations where the gradient vanishes, and through its affine invariance it detects corners at all 'opening angles'.
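The transformation behaviour for n = 3 can be checked numerically. The sketch below is in Python (not the notebook's Mathematica) and assumes the Cartesian form -Lvv Lw² = 2 Lx Ly Lxy - Lx² Lyy - Ly² Lxx together with the chain rule stated above; `corner_measure` and `to_original_frame` are hypothetical names and the derivative values are arbitrary.

```python
def corner_measure(Lx, Ly, Lxx, Lxy, Lyy):
    """Theta = -Lvv Lw^2, free of any division by the gradient."""
    return 2.0 * Lx * Ly * Lxy - Lx * Lx * Lyy - Ly * Ly * Lxx


def to_original_frame(a, b, c, d, Lxp, Lyp, Lxxp, Lxyp, Lyyp):
    """Chain rule for x' = a x + b y, y' = c x + d y.

    Takes derivatives with respect to (x', y') and returns those with
    respect to (x, y), using d/dx = a d/dx' + c d/dy', d/dy = b d/dx' + d d/dy'.
    """
    Lx = a * Lxp + c * Lyp
    Ly = b * Lxp + d * Lyp
    Lxx = a * a * Lxxp + 2.0 * a * c * Lxyp + c * c * Lyyp
    Lxy = a * b * Lxxp + (a * d + b * c) * Lxyp + c * d * Lyyp
    Lyy = b * b * Lxxp + 2.0 * b * d * Lxyp + d * d * Lyyp
    return Lx, Ly, Lxx, Lxy, Lyy


if __name__ == "__main__":
    primed = (1.5, -0.7, 0.3, 1.1, -2.0)      # arbitrary illustrative values
    shear = (1.0, 0.7, 0.0, 1.0)              # determinant 1
    print(corner_measure(*primed))
    print(corner_measure(*to_original_frame(*shear, *primed)))
```

For a transformation with determinant 1, such as the shear used here, both values coincide; for a general matrix they differ by the factor (ad - bc)².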

Note the positive (convex) and negative (concave) corners. We show corner detection at two scales:

im = N[Import["utrecht256.gif"][[1, 1]]];
corner1 = (-gauge2D[im, 2, 0, 1] gauge2D[im, 0, 1, 1]^2) // ReleaseHold;
corner2 = (-gauge2D[im, 2, 0, 3] gauge2D[im, 0, 1, 3]^2) // ReleaseHold;

Block[{$DisplayFunction = Identity},
  p1 = ListDensityPlot[im];
  p2 = ListDensityPlot[corner1];
  p3 = ListDensityPlot[corner2];];

Show[GraphicsArray[{p1, p2, p3}], ImageSize -> {436, 136}];


T-junction detection

An example of third order geometric reasoning in images is the detection of T-junctions [24]. T-junctions in

the intensity landscape of natural images occur typically at occlusion points. In the figure below the circles

indicate a few particular T-junctions:

blocks = Import["blocks.gif"][[1, 1]];
Block[{$DisplayFunction = Identity},
  p1 = ListDensityPlot[blocks, ImageSize -> {317, 204}];
  p2 = Graphics[{Circle[{221, 178}, 13], Circle[{157, 169}, 13], Circle[{90, 155}, 13],
     Circle[{148, 56}, 13], Circle[{194, 77}, 13], Circle[{253, 84}, 13]}];];

Show[{p1, p2}, AspectRatio -> Automatic];

When we zoom in on a T-junction of an observed image and inspect the local isophote structure, we see that at a T-junction the derivative of the isophote curvature κ in the direction perpendicular to the isophotes is high. In the figure below the isophote landscape of a blurred T-junction illustrates the direction of maximum change of κ:


im = Table[If[y < 64, 0, 1] + If[y < x && y > 63, 2, 1], {y, 128}, {x, 128}];
Block[{$DisplayFunction = Identity},
  p1 = ListDensityPlot[im];
  p2 = ListContourPlot[gD[im, 0, 0, 7], Contours -> 15, PlotRange -> {0.3, 2.8}];];

Show[GraphicsArray[{p1, p2}]];

When we study the curvature of the isophotes in the middle of the image, at the location of the T-junction, we see the isophotes 'sweep' from highly curved to almost straight for decreasing intensity. So the geometric reasoning is: "the isophote curvature changes a lot when we traverse the image in the w direction". It seems to make sense to study $\frac{\partial \kappa}{\partial w}$. We saw before that the isophote curvature κ is defined as $\kappa = -\frac{L_{vv}}{L_w}$. So the Cartesian expression for the T-junction detector becomes

κ = -dv[dv[L[x, y]]]/dw[L[x, y]] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]} // Simplify;

tjunction = Simplify[dw[κ] /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]}];
% /. Derivative[n_, m_][L][x, y] -> "L" <> Table["x", {n}] <> Table["y", {m}]

1/(Lx^2 + Ly^2)^3 (-Lxxy Ly^5 + Lx^4 (2 Lxy^2 - Lx Lxyy + Lxx Lyy) +
   Ly^4 (2 Lxy^2 + Lx (-Lxxx + 2 Lxyy) + Lxx Lyy) +
   Lx^2 Ly^2 (3 Lxx^2 - Lx Lxxx - 8 Lxy^2 + Lx Lxyy - 4 Lxx Lyy + 3 Lyy^2) +
   Lx Ly^3 (Lx Lxxy + 6 Lxx Lxy - 6 Lxy Lyy - Lx Lyyy) +
   Lx^3 Ly (2 Lx Lxxy - 6 Lxx Lxy + 6 Lxy Lyy - Lx Lyyy))

To avoid singularities at vanishing gradients through the division by $(L_x^2 + L_y^2)^3 = L_w^6$ we use as our T-junction detector $\tau = \frac{\partial \kappa}{\partial w} L_w^6$, the derivative of the curvature in the direction perpendicular to the isophotes, multiplied by $L_w^6$ (an affine invariant?):

τ = Simplify[dw[κ] dw[L[x, y]]^6 /. {Lx -> D[L[x, y], x], Ly -> D[L[x, y], y]}];
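As a cross-check of the Cartesian expression, the numerator of ∂κ/∂w (which equals the detector ∂κ/∂w · Lw⁶) can be evaluated for functions where the answer is known in closed form. The Python sketch below is not the notebook's code; `tjunction_measure` is a hypothetical name, and the expression was re-derived from κ = (2 Lx Ly Lxy - Lx² Lyy - Ly² Lxx)/(Lx² + Ly²)^(3/2) under the assumption that ∂w is the derivative along the unit gradient.

```python
def tjunction_measure(Lx, Ly, Lxx, Lxy, Lyy, Lxxx, Lxxy, Lxyy, Lyyy):
    """d(kappa)/dw * Lw^6 as a polynomial in Cartesian derivatives up to order 3."""
    return (-Lxxy * Ly ** 5
            + Lx ** 4 * (2 * Lxy ** 2 - Lx * Lxyy + Lxx * Lyy)
            + Ly ** 4 * (2 * Lxy ** 2 + Lx * (2 * Lxyy - Lxxx) + Lxx * Lyy)
            + Lx ** 2 * Ly ** 2 * (3 * Lxx ** 2 - Lx * Lxxx - 8 * Lxy ** 2
                                   + Lx * Lxyy - 4 * Lxx * Lyy + 3 * Lyy ** 2)
            + Lx * Ly ** 3 * (Lx * Lxxy + 6 * Lxx * Lxy - 6 * Lxy * Lyy - Lx * Lyyy)
            + Lx ** 3 * Ly * (2 * Lx * Lxxy - 6 * Lxx * Lxy + 6 * Lxy * Lyy - Lx * Lyyy))


if __name__ == "__main__":
    # L = x^2 + y^2 at (3, 4): kappa = -1/r, so d(kappa)/dw = 1/r^2 and Lw = 2 r
    x, y = 3.0, 4.0
    value = tjunction_measure(2 * x, 2 * y, 2.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0)
    r2 = x * x + y * y
    print(value, (4.0 * r2) ** 3 / r2)
```

Both printed numbers agree: for circular isophotes the curvature -1/r changes at the rate 1/r² along the gradient, and multiplying by Lw⁶ = (4r²)³ removes the denominator.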

Finally, we apply the T-junction detector on our blocks at a rather fine scale of σ = 2:

τ = τ /. Derivative[n_, m_][L][x, y] -> HoldForm[gD[blocks, n, m, σ]];
σ = 2; ListDensityPlot[τ // ReleaseHold, ImageSize -> {317, 204}];

Compare the detected points with the circles in the input image.

Fourth order junction detection:

As a final example, we consider a detection problem at a high order of differentiation, using a result from algebraic theory. Even at orders of differentiation as high as 4, invariant features can be constructed and calculated for discrete images through the biologically inspired scaled derivative operators. Our example is to find in a checkerboard the crossings where 4 edges meet. We take an algebraic approach, which is taken from Salden et al. [37].

When we study the fourth order local image structure, we consider the fourth order polynomial terms from

the local Taylor expansion:

pol4 = 1/4! (Lxxxx x^4 + 4 Lxxxy x^3 y + 6 Lxxyy x^2 y^2 + 4 Lxyyy x y^3 + Lyyyy y^4);

The fundamental theorem of algebra states that a polynomial is fully described by its roots and its leading coefficient: e.g. $a x^2 + b x + c = a (x - x_1)(x - x_2)$. It was shown by Hilbert that the 'coincidenceness' of the roots, i.e. how well all roots coincide, is a particular invariant condition. From algebraic theory it is known that this 'coincidenceness' is given by the discriminant, defined below (see also [1]):

Discriminant[p_, x_] := With[{m = Exponent[p, x]},
   Cancel[((-1)^(1/2 m (m - 1)) Resultant[p, D[p, x], x]) / Coefficient[p, x, m]]]


The resultant of two polynomials a and b, both with leading coefficient one, is the product of all the differences $a_i - b_j$ between roots of the polynomials. The resultant is always a number or a polynomial. The discriminant of a polynomial is the product of the squares of all the differences of the roots taken in pairs. We can express our function in two variables {x, y} as a function in a single variable x/y by the substitution y → 1.

Some examples:

Discriminant[1/2! (Lxx x^2 + 2 Lxy x y + Lyy y^2), x] /. {y -> 1}

Lxy^2 - Lxx Lyy

The discriminant of second order image structure is, up to sign, just the determinant of the Hessian matrix, i.e. the Gaussian curvature. Here is our fourth order discriminant:

Discriminant[pol4, x] /. {y -> 1}

1/746496 (36 Lxxxy^2 Lxxyy^2 Lxyyy^2 - 54 Lxxxx Lxxyy^3 Lxyyy^2 - 64 Lxxxy^3 Lxyyy^3 +
   108 Lxxxx Lxxxy Lxxyy Lxyyy^3 - 27 Lxxxx^2 Lxyyy^4 - 54 Lxxxy^2 Lxxyy^3 Lyyyy +
   81 Lxxxx Lxxyy^4 Lyyyy + 108 Lxxxy^3 Lxxyy Lxyyy Lyyyy -
   180 Lxxxx Lxxxy Lxxyy^2 Lxyyy Lyyyy - 6 Lxxxx Lxxxy^2 Lxyyy^2 Lyyyy +
   54 Lxxxx^2 Lxxyy Lxyyy^2 Lyyyy - 27 Lxxxy^4 Lyyyy^2 + 54 Lxxxx Lxxxy^2 Lxxyy Lyyyy^2 -
   18 Lxxxx^2 Lxxyy^2 Lyyyy^2 - 12 Lxxxx^2 Lxxxy Lxyyy Lyyyy^2 + Lxxxx^3 Lyyyy^3)
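The discriminant definition can be reproduced for numeric coefficients without a computer algebra system. The Python sketch below (not the notebook's Mathematica) assumes the same definition, (-1)^(m(m-1)/2) Resultant(p, p') divided by the leading coefficient, and computes the resultant as the determinant of the Sylvester matrix in exact rational arithmetic; all function names are hypothetical.

```python
from fractions import Fraction


def derivative(coeffs):
    """Coefficients (highest power first) of the derivative polynomial."""
    m = len(coeffs) - 1
    return [c * (m - i) for i, c in enumerate(coeffs[:-1])]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det


def resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    m, n = len(p) - 1, len(q) - 1
    rows = [[0] * i + list(p) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(q) + [0] * (m - 1 - i) for i in range(m)]
    return determinant(rows)


def discriminant(p):
    """Discriminant of a polynomial given by coefficients, highest power first."""
    m = len(p) - 1
    sign = -1 if (m * (m - 1) // 2) % 2 else 1
    return sign * resultant(p, derivative(p)) / Fraction(p[0])


if __name__ == "__main__":
    # second order structure with Lxx = 3, Lxy = 2, Lyy = 5 and y -> 1
    print(discriminant([Fraction(3, 2), 2, Fraction(5, 2)]))
    # a quartic with a double root has a vanishing discriminant
    print(discriminant([1, -5, 9, -7, 2]))
```

The second call uses (x - 1)³(x - 2); coinciding roots drive the discriminant to zero, the 'coincidenceness' property that the checkerboard detector exploits.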

It looks like an impossibly complicated polynomial in fourth order derivative images, and it is. Through the use of Gaussian derivative kernels each separate term can easily be calculated. We change all coefficients into scaled Gaussian derivatives:

discr4[im_, σ_] := Discriminant[pol4, x] /. {y -> 1,
    Lxxxx -> gD[im, 4, 0, σ], Lxxxy -> gD[im, 3, 1, σ], Lxxyy -> gD[im, 2, 2, σ],
    Lxyyy -> gD[im, 1, 3, σ], Lyyyy -> gD[im, 0, 4, σ]}

Let us apply this high order function on an image of a checkerboard, and we add noise with twice the maxi-

mum image intensity to show its robustness, despite the high order derivatives:

t1 = Table[If[(x > 50 && y > 50) || (x <= 50 && y <= 50), 0, 100] + 200 Random[], {x, 1, 100}, {y, 1, 100}];

t2 = Table[If[(x + y - 100 > 0 && y - x < 0) || (x + y - 100 < 0 && y - x > 0), 0, 100] + 200 Random[], {x, 1, 100}, {y, 1, 100}];

noisycheck = Transpose[Join[t1, t2]];

ListDensityPlot[noisycheck, ImageSize -> {200, 100}];

The detection clearly is rotation invariant, robust to noise, and there is no detection at corners:

ListDensityPlot[discr4[noisycheck, 5], ImageSize -> {200, 100}];

Conclusion

Biologically motivated Gaussian derivative kernels provide a solid framework for differential geometric analysis in computer vision. In this paper a practical overview is given of some results up to fourth order of differentiation. This multiscale analysis is applicable to all fields of computer vision, including nonlinear geometry-driven diffusion [2, 30, 42], optic flow, stereo disparity analysis, orientation analysis, scale-time [13], deep structure of images etc. Space constraints do not allow us to elaborate on these issues in this paper. This paper is an excerpt from a forthcoming book [35], where many of the issues above are treated in an interactive way.

This paper has been written as a notebook in Mathematica 4.0, giving the reader the possibility to experiment with every treated subject. The high level of the functions, the speed of the code and hardware now available (this full notebook runs in 12 minutes on a Pentium II PC, 266 MHz, 128K memory, Win95) and the easy interactive visualization possibilities make the combination of textbook text and code a highly tutorial toolkit on the desktop.

Acknowledgement

The authors thank the members of the Image Sciences Institute, particularly the members of the TGV (’Tools of Geometry in Vision’) team, for their contributions and discussions. Special thanks to Max Viergever, director of ISI.

References

[1] P. Abbott (ed.), "Tricks of the Trade", The Mathematica Journal, Wolfram Media Inc., vol. 7, no. 2, 105-127, 1998.

[2] L. Alvarez, F. Guichard, P. L. Lions, and J. M. Morel, "Axiomes et equations fondamentales du traitement d’images", C. R. Acad. Sci. Paris, 315:135-138, 1992.

[3] J. Blom, "Affine Invariant Corner Detection", in: PhD Thesis, Utrecht University, NL-Utrecht, 1991.

[4] Gregory C. DeAngelis, Izumi Ohzawa, and Ralph D. Freeman, "Receptive-field dynamics in the central visual pathways", Trends Neurosci. 18: 451-458, 1995.

[5] L. M. J. Florack, B. M. ter Haar Romeny, J. J. Koenderink, and M. A. Viergever, "Scale and the differential structure of images," Image and Vision Computing, vol. 10, pp. 376-388, July/August 1992.

[6] L. M. J. Florack, B. M. ter Haar Romeny, J. J. Koenderink, and M. A. Viergever, "Cartesian differential invariants in scale-space," Journal of Mathematical Imaging and Vision, vol. 3, pp. 327-348, November 1993.

[7] L. M. J. Florack, B. M. ter Haar Romeny, J. J. Koenderink, and M. A. Viergever, "Images: Regular tempered distributions," in Proceedings of the NATO Advanced Research Workshop Shape in Picture - Mathematical description of shape in greylevel images (Y.-L. O, A. Toet, H. J. A. M. Heijmans, D. H. Foster, and P. Meer, eds.), vol. 126 of NATO ASI Series F, pp. 651-660, Springer Verlag, Berlin, 1994.

[8] L. M. J. Florack, B. M. ter Haar Romeny, J. J. Koenderink, and M. A. Viergever, "Linear scale-space," Journal of Mathematical Imaging and Vision, vol. 4, no. 4, pp. 325-351, 1994.

[9] L. M. J. Florack, B. M. ter Haar Romeny, J. J. Koenderink, and M. A. Viergever, "The Gaussian scale-space paradigm and the multiscale local jet," International Journal of Computer Vision, vol. 18, pp. 61-75, April 1996.

[10] L. M. J. Florack, "Image Structure", Kluwer Academic Publishers, Dordrecht, the Netherlands, 1997.

[11] T. Iijima: "Basic Theory on Normalization of a Pattern (in Case of Typical 1D Pattern)", Bulletin of Electrical Laboratory, vol. 26, pp. 368-388, 1962 (in Japanese).

[12] J.J. Koenderink: "The Structure of Images", Biological Cybernetics, vol. 50, pp. 363-370, 1984.

[13] J.J. Koenderink: "Scale-Time", Biological Cybernetics, vol. 58, pp. 159-162, 1988.

[14] T. Lindeberg: "Scale-space for discrete signals", IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(3), 234-254, 1990.

[15] T. Lindeberg: "Scale-Space Theory in Computer Vision", Kluwer Academic Publishers, Dordrecht, Netherlands, 1994.

[16] T. Lindeberg: "Scale-space theory: A basic tool for analysing structures at different scales", J. of Applied Statistics, 21(2), Supplement on Advances in Applied Statistics: Statistics and Images: 2, pp. 224-270, 1994.

[17] T. Lindeberg, "On the axiomatic foundations of linear scale-space". In J. Sporring, M. Nielsen, L. Florack, and P. Johansen (eds.) Gaussian Scale-Space Theory, 75-97, Kluwer Academic Publishers, 1997.

[18] P. Morrison, "Powers of Ten: About the Relative Size of Things in the Universe", W. H. Freeman and Company, 1985.

[19] M. Nielsen, "Scale-Space Generators and Functionals". In J. Sporring, M. Nielsen, L. Florack, and P. Johansen (eds.) Gaussian Scale-Space Theory, pp. 99-114, Kluwer Academic Publishers, 1997.

[20] M. Nielsen, P. Johansen, O.F. Olsen, J. Weickert (Eds.), Scale-space theories in computer vision, Lecture Notes in Computer Science, Vol. 1682, Springer, Berlin, 1999.

[21] W. J. Niessen, B. M. ter Haar Romeny, L. M. J. Florack, and M. A. Viergever, "A general framework for geometry-driven evolution equations," International Journal of Computer Vision, vol. 21, no. 3, pp. 187-205, 1997.

[22] W. J. Niessen, B. M. ter Haar Romeny, L. M. J. Florack, A. H. Salden, and M. A. Viergever, "Nonlinear diffusion of scalar images using well-posed differential operators," in Proc. of Computer Vision and Pattern Recognition, (Seattle, WA), pp. 92-97, IEEE Computer Society Press, 1994.

[23] E. Radmoser, O. Scherzer, J. Weickert, "Scale-space properties of nonstationary iterative regularization methods", to appear in Journal of Visual Communication and Image Representation (Special Issue on Scale-Space Theories in Computer Vision, invited paper), 2000.

[24] B. M. ter Haar Romeny, L. M. J. Florack, J. J. Koenderink, and M. A. Viergever, "Invariant third order properties of isophotes: T-junction detection," in Proc. 7th Scand. Conf. on Image Analysis (P. Johansen and S. Olsen, eds.), Aalborg DK, pp. 346-353, August 1991.

[25] B. M. ter Haar Romeny, L. M. J. Florack, J. J. Koenderink, and M. A. Viergever, "Scale-space: Its natural operators and differential invariants," in Information Processing in Medical Imaging (A. C. F. Colchester and D. J. Hawkes, eds.), vol. 511 of Lecture Notes in Computer Science, pp. 239-255, Springer-Verlag, Berlin, July 1991.

[26] B. M. ter Haar Romeny and L. M. J. Florack, "A multiscale geometric model of human vision," in Perception of Visual Information (W. R. Hendee and P. N. T. Wells, eds.), ch. 4, pp. 73-114, Berlin: Springer-Verlag, 1993. Second edition 1996.

[27] B. M. ter Haar Romeny, L. M. J. Florack, A. H. Salden, and M. A. Viergever, "Higher order geometrical image structure," in Proc. Information Processing in Medical Imaging ’93, Flagstaff AZ (H. Barrett, ed.), (Berlin), pp. 77-93, Springer-Verlag, 1993.

[28] B. M. ter Haar Romeny, L. M. J. Florack, A. H. Salden, and M. A. Viergever, "Higher order differential structure of images," Image and Vision Computing, vol. 12, pp. 317-325, July/August 1994.

[29] B. M. ter Haar Romeny, W. J. Niessen, J. Wilting, and L. M. J. Florack, "Differential structure of images: Accuracy of representation," in Proc. First IEEE Internat. Conf. on Image Processing, (Austin, TX), pp. 21-25, IEEE, November, 13-16 1994.

[30] B. M. ter Haar Romeny (Ed.), "Geometry Driven Diffusion in Computer Vision". Series on Computational Imaging and Vision, Kluwer Academic Publishers, Dordrecht, the Netherlands, 1994.

[31] T. Lindeberg and B. M. ter Haar Romeny, "Linear scale-space: I. Basic theory, II. Early visual operations". In: Geometry-Driven Diffusion in Computer Vision (B. M. ter Haar Romeny, ed.), Computational Imaging and Vision, pp. 1-38, 39-72, Dordrecht, the Netherlands: Kluwer Academic Publishers, 1994.

[32] B. M. ter Haar Romeny, "Scale-space research at Utrecht University," in Proc. 12th International Conference on Analysis and Optimization of Systems: Images, Wavelets and PDE’s (M.-O. Berger, R. Deriche, I. Herlin, J. Jaffré, and J.-M. Morel, eds.), Lecture Notes in Control and Information Sciences, vol. 219, (CEREMADE / INRIA, Paris, France), pp. 15-30, Springer, London, June 26-28 1996.

[33] B. M. ter Haar Romeny, "Applications of scale-space theory," in Gaussian Scale-Space Theory (J. Sporring, M. Nielsen, L. Florack, and P. Johansen, eds.), Computational Imaging and Vision, pp. 3-19, Dordrecht: Kluwer Academic Publishers, 1997.

[34] B. M. ter Haar Romeny, L. M. J. Florack, J. J. Koenderink, and M. A. Viergever, eds., "Scale-Space '97: Proc. First Internat. Conf. on Scale-Space Theory in Computer Vision", vol. 1252 of Lecture Notes in Computer Science. Berlin: Springer Verlag, 1997.

[35] B. M. ter Haar Romeny: "Front-End Vision and Multiscale Image Analysis. An Interactive Tutorial". Kluwer Academic Publishers, Dordrecht, the Netherlands, 2000.

[36] A. H. Salden, B. M. ter Haar Romeny, L. M. J. Florack, J. J. Koenderink, and M. A. Viergever, "A complete and irreducible set of local orthogonally invariant features of 2-dimensional images," in Proceedings 11th IAPR Internat. Conf. on Pattern Recognition (I. T. Young, ed.), (The Hague, the Netherlands), pp. 180-184, IEEE Computer Society Press, Los Alamitos, August 30-September 3 1992.

[37] A. H. Salden, B. M. ter Haar Romeny, and M. A. Viergever, "Linear scale space theory from physical properties," J. of Mathematical Imaging and Vision, vol. 9, no. 2, pp. 103-140, 1998.

[38] O. Scherzer, J. Weickert, "Relations between regularization and diffusion filtering", J. Math. Imaging and Vision, in press, 2000.

[39] L. Schwartz, "Théorie des Distributions", Hermann, Paris, 1951, 2nd edition 1966.

[40] J. Sporring, M. Nielsen, L. Florack: "Gaussian Scale-Space Theory", Kluwer Academic Publishers, Dordrecht, the Netherlands, 1997.

[41] J. Weickert, S. Ishikawa, A. Imiya, "On the history of Gaussian scale-space axiomatics", in J. Sporring, M. Nielsen, L. Florack, P. Johansen (Eds.), Gaussian scale-space theory, Kluwer, Dordrecht, 45-59, 1997.

[42] J. Weickert, "Anisotropic diffusion in image processing," ECMI Series, Teubner, Stuttgart, 1998.

[43] J. Weickert, S. Ishikawa, A. Imiya, "Linear scale-space has first been proposed in Japan", J. Math. Imag. Vision, Vol. 10, 237-252, 1999.
