
THE DETECTION OF 2D IMAGE FEATURES USING LOCAL ENERGY

This thesis is

presented to the

Department of Computer Science

for the degree of

Doctor of Philosophy

of the

University of Western Australia

By

Benjamin John Robbins

August ��

c� Copyright ��

by

Benjamin John Robbins


Abstract

Accurate detection and localization of two-dimensional (2D) image features (or "key-points") is important for vision tasks such as structure from motion, stereo matching and line labeling. 2D image features are ideal for these vision tasks because they are high in information and yet occur sparsely in typical images.

Several methods for the detection of 2D image features have already been developed. However, it is difficult to assess the performance of these methods because no one has produced an adequate definition of corners that encompasses all types of 2D luminance variations that make up 2D image features. The fact that there does not exist a consensus on the definition of 2D image features is not surprising given the confusion surrounding the definition of 1D image features.

The general perception of 1D image features has been that they correspond to "edges" in an image and so are points where the intensity gradient in some direction is a local maximum. The Sobel [?], Canny [?] and Marr-Hildreth [?] operators all use this model of 1D features, either implicitly or explicitly. However, other profiles in an image also make up valid 1D features, such as spike and roof profiles, as well as combinations of all these feature types. Spikes and roof profiles can also be found by looking for points where the rate of change of the intensity gradient is locally maximal, as Canny did in defining a "roof-detector" in much the same manner that he developed his "edge-detector". While this allows the detection of a wider variety of 1D feature profiles, it comes no closer to the goal of unifying these different feature types under an encompassing definition of 1D features.

The introduction of the local energy model of image features by Morrone and Owens [?] in 1987 provided a unified definition of 1D image features for the first time. They postulated that image features correspond to points in an image where there is maximal phase congruency in the frequency domain representation of the image. That is, image features correspond to points of maximal order in the phase domain of the image signal. These points of maximal phase congruency correspond to step-edge, roof and ramp intensity profiles, and combinations thereof. They also correspond to the Mach bands perceived by humans in trapezoidal feature profiles.

This thesis extends the notion of phase congruency to 2D image features. As 1D image features correspond to points of maximal 1D order in the phase domain of the image signal, this thesis contends that 2D image features correspond to maximal 2D order in this domain. These points of maximal 2D phase congruency include all the different types of 2D image features, including grey-level corners, line terminations, blobs and a variety of junctions.

Early attempts at 2D feature detection were simple "corner detectors" based on a model of a grey-level corner, in much the same way that early 1D feature detectors were based on a model of step-edges. Some recent attempts have included more complex models of 2D features, although this is basically a more complex a priori judgement of the types of luminance profiles that are to be labeled as 2D features.

This thesis develops the 2D local energy feature detector, based on a new unified definition of 2D image features that marks points of locally maximum 2D order in the phase domain representation of the image as 2D image features. The performance of an implementation of 2D local energy is assessed and compared to several existing methods of 2D feature detection. This thesis also shows that, in contrast to most other methods of 2D feature detection, 2D local energy is an idempotent operator.

The extension of phase congruency to 2D image features also unifies the detection of image features. 1D and 2D image features correspond to 1D and 2D order in the phase domain representation of the image respectively. This definition imposes a hierarchy of image features, with 2D image features being a subset of 1D image features. This ordering of image features has been implied ever since 1D features were used as candidate points for 2D feature detection by Kitchen [?] and others. Local energy enables the extraction of both 1D and 2D image features in a consistent manner: 2D image features are extracted from the 1D image features using the same operations that are used to extract 1D image features from the input image.

The consistent approach to the detection of image features presented in this thesis allows the hierarchy of primitive image features to be naturally extended to higher order image features. These higher order image features can then also be extracted from higher order image data using the same hierarchical approach. This thesis shows how local energy can be naturally extended to the detection of 1D ("surface") and higher order image features in 3D data sets. Results are presented for the detection of 1D image features in 3D confocal microscope images, showing superior performance to the 3D extension of the Sobel operator [?].


Preface

Some of the work in this thesis has already been published. Most of the work in Chapters [?] to [?] appears in a technical report in the Department of Computer Science [?], and a more concise version of this work is to appear in Image and Vision Computing [?]. I am the principal contributing author for both these papers.

With the exception of the [?]D surface detector described in Section 7.1, all of the work presented in this thesis, including algorithms and implementations, is my own. The surface detection work has been principally performed by Chris Pudney, with my contribution being to the general methodology of implementation and the optimizations with regard to performing the FFTs and applying the energy filters. Various forms of this work have been published in the Proceedings of ANZIIS [?] and the Proceedings of the International Computer Science Conference [?], with another paper on this work to appear in the Journal of Assisted Confocal Microscopy.


Acknowledgements

First of all I would like to thank my supervisor, Robyn Owens, for her constant support and guidance throughout my candidature. She ensured that I got off on the right foot and always knew where I was and where I was going, even when I didn't, particularly at the beginning of my studies. It was always reassuring to know that she would quickly understand any problems I was grappling with and offer helpful suggestions, although the speed with which she did this, and its apparent ease to her, were a little disconcerting. Robyn also strongly encouraged me to apply to travel to Switzerland to study at ETH Zürich, which was both very beneficial to my studies and a fantastic time.

I would like to thank UWA and ETH Zürich for supporting my stay in Switzerland. Thanks go to everyone in the Image Sciences (BIWI) group at ETH for making my stay educational and enjoyable. Olaf, Vreni, Martin, Markus and the lunch crew (Gaudenz, Friedrich, Wolfram, Tuomo, Olof and Marjan) made for an entertaining time.

Extra special thanks go to the Küblers for making me feel like part of the family during my stay in Switzerland: Olaf and Guni for showing me the beautiful Swiss countryside while unsuccessfully trying to give me wanderschaden; Dani for risking the meat with an Aussie at his birthday party BBQ; and Flo for being such a great mate, for losing the Kiwi accent ("eh?"), and for finally getting his revenge for numerous dumpings at Triggs Beach by taking me snowboarding on a hard-as-rock glacier. I would also like to thank the non-resident member of the Kübler household, Gaudie, for his friendship and for introducing me to the outdoor cinema, and Roman for all the basketball.

I have benefited from many conversations with Mike Robbins, Peter Kovesi and Chris Pudney regarding local energy. Bruce Backman was a good sounding board for ideas and always provided a different perspective. Friedrich Heitger was very helpful with e-mail messages when I was trying to implement his work and, while I was at ETH, taught me much of what I know about filter design. Olof Henricsson and I also had many interesting discussions regarding the use of low-level features after our weekly hit of tennis.

Lachlan Partington and Chris Pudney provided valuable feedback through their careful reading of a late draft of this thesis. I also thank Chris for supplying both the output for Figure [?] and numerous tips on LaTeX.

I would like to thank the Department of Computer Science at The University of Western Australia and the Department of Education, Employment and Training for their support during this candidature.

Thanks go to my family for their support of my studies, especially my parents, who put me through school and then put up with me at home for most of my tertiary studies. I am forever indebted to my wife Karenza for her love and support throughout my studies, and to her I give special thanks.


Contents

Abstract
Preface
Acknowledgements
1 Introduction
2 Review of 2D Feature Detection
    2.1 What are 2D Features?
    2.2 Goals of 2D Feature Detection
    2.3 Review of 2D Feature Detection
        2.3.1 Edge-based Schemes
        2.3.2 Grey-level-based Schemes
    2.4 2D Feature Classification
    2.5 Discussion
3 Local Energy
    3.1 Examining Features at a Given Orientation
        3.1.1 Steerable Filters
        3.1.2 Oriented Energy
    3.2 Idempotency
    3.3 Summary
4 2D Feature Detection via Local Energy
    4.1 Definitions and Terminology
    4.2 Method
    4.3 Idempotency of 2D Local Energy
    4.4 A Unified Approach to Feature Extraction
    4.5 Summary
5 Implementation of 2D Local Energy
    5.1 Issues in Filter Design
    5.2 Orientation Selective S-Gabor Filters
    5.3 Obtaining a Uniform Filter Coverage of All Orientations
        5.3.1 Setting the Orientation Selectivity of the Energy Filters
        5.3.2 Determining the Number of Orientations at which to Apply the Energy Filters
    5.4 Inappropriateness of Steerable Filters for 2D Local Energy
    5.5 Sharpening the 2D Local Energy Response
        5.5.1 Varying the Energy Filters for the 1st and 2nd Pass of Oriented Energy
        5.5.2 Post-processing the Energy Maps to Reduce Blurring
    5.6 Summary
6 Experimental Results
    6.1 Localization Accuracy of 2D Image Features
    6.2 Detection Accuracy of 2D Image Features
        6.2.1 Synthetic Data
        6.2.2 Real Data
    6.3 2D Feature Stability
        6.3.1 Temporal Stability
        6.3.2 Viewpoint Stability
    6.4 Comparisons and Observations
7 Future Directions
    7.1 Surface Feature Detection
        7.1.1 Implementation
        7.1.2 Results
        7.1.3 Discussion
    7.2 High-order Feature Detection
        7.2.1 Discussion
    7.3 Summary
8 Conclusions
Bibliography
A Optimizing 2D Local Energy

List of Tables

1 Oriented energy residuals
2 Complexity of 2D local energy

List of Figures

1 A line drawing of a simple polyhedral scene
2 The vertex classes enumerated by Guzman
3 An illustration of a grey-level corner via Nagel's definition
4 Corruption of edge information near 2D features using a Laplacian of a Gaussian operator
5 Examples of Rohr's corner model
6 Attenuation via [?]D filters
7 Results of Heitger's key-point scheme
8 Freeman's junction classification
9 Points of maximal phase congruency correspond to image features
10 1D feature detection via local energy using [?]-point and [?]-point filters
11 A demonstration of the inability of 1D filters to exhibit orientation selectivity
12 Detection of 1D image features via oriented energy
13 A demonstration of first order idempotency using oriented energy operators
14 2D feature detection via 2D local energy
15 Avoidance of false positive responses to 1D image features by using 2D oriented filters
16 A demonstration of the second order idempotency of 2D local energy
17 Demonstration of the blurring effect of increasing the orientation selectivity of the oriented energy filters
18 Comparison of Heitger's "Stretched-Gabor" function to the standard Gabor function
19 [?]D plots of the power of a cosine function
20 [?]D S-Gabor filters
21 Demonstration of how blurring in the oriented energy maps can lead to missed 2D features
22 A "negative" 2D image feature
23 An illustration of the difference in blurring between Heitger's key-point scheme and 2D local energy
24 A demonstration of the problems caused by "sharpening" the 2D local energy map
25 Demonstration of the reduction of false responses using 2DLE[?]
26 Enhancing the resolution of 2D local energy
27 Exact 2D feature location for an ideal [?] L-junction
28 Three ideal 2D image features
29 2D feature localization via 2D local energy
30 2D feature localization via 2D local energy in the presence of noise
31 Performance of 2D local energy on a synthetic test image
32 Comparison of 2D feature detectors on a noisy synthetic test image
33 Comparison of 2D feature detectors on sculpture image
34 Comparison of 2D feature detectors on robot image
35 Comparison of 2D feature detectors on boat image
36 Demonstration of the temporal stability of 2D feature detection via local energy
37 2D feature detection on a stereo image pair
38 Comparison of local energy to 3D Sobel operator on synthetic 3D image data
39 Comparison of local energy to 3D Sobel operator on a confocal microscope image of a flea
40 Demonstration of the compression achieved by storing filters in the frequency domain after multiplying the odd-symmetric filter by i

Chapter 1

Introduction

Of primary importance in the process of image segmentation is the reduction of data without significant ensuing loss of the information in the original image. The extraction of image points that contain significant variation is the most common approach to this end. Although feature detection has for a long time been closely linked to the extraction of step-profiled features from an image, in reality image features may consist of a variety of profiles, and so the term "significant variation" is used here in a broad sense.

Furthermore, although research into low-level image segmentation has mainly focused on the extraction of image points that vary significantly in at least one orientation, image points that vary significantly in two or more orientations are also of importance. The latter type of feature typically corresponds to some significant higher order structure in the scene being imaged, such as object vertices and occlusion of other objects. These features are relatively scarce in the image; furthermore, they constrain the detected image point to a ray through the imaged scene, making them useful for applications that require matching points in different images, such as stereo matching [?] and structure from motion [?].

In this thesis the term "1D image feature" refers to image points that have a significant locally maximal variation in at least one orientation. This variation may be a step, roof or spike discontinuity, or a combination of these events. Note that this definition includes higher order features, which are a subset of 1D image features. "2D image features" are image points that have significant locally maximal variation in two orthogonal orientations. Note that the term "1D image feature" is interchangeable with the term "generalized edge", and the term "2D image feature" is interchangeable with the increasingly common notion of a "key-point" and includes grey-level corners and junctions. In this thesis the terms 1D feature and 2D feature are used to stress the underlying form of these features.

Many early attempts at 2D feature detection have been unable to reliably detect and accurately localize 2D image features. While some of these shortcomings have been addressed by modification of these early attempts, such as Deriche's modifications [?] to Beaudet's DET operator [?], and several new approaches offer increased performance and robustness [?], the development of methods of 2D feature detection has been hampered by the lack of an adequate model that encompasses the enormous variety of these features.

By definition, 2D image features vary in more than a single orientation. This makes characterizing these features in the spatial domain extremely difficult, due to the large number of possible spatial variations that may make up a 2D feature.

This thesis extends local energy to the detection of 2D image features. The benefit of this approach is that features are defined as points of locally maximal order in the frequency domain, therefore avoiding the problem of characterizing 2D image features in the spatial domain. Free of the constraints of a spatial characterization of image features, this thesis develops a general model of 2D image features that includes grey-level corners, line terminations, spots and a variety of junctions.

This thesis extends local energy to the detection of 2D image features via an original unified approach to the definition of image features. 1D and 2D image features are treated in a consistent manner, thus imposing a hierarchy on image features, with higher order features being extracted from lower order image features. Therefore higher order features form a subset of lower order features. This consistent treatment of 1D and 2D image features in 2D images extends naturally to 1D, 2D and 3D features in 3D image data, underlining the flexibility of a unified approach to the definition of image features.


The layout of the rest of this thesis is as follows. Chapter 2 introduces the motivation, goals and common problems of 2D feature detection, and reviews previous attempts at 2D feature detection.

Chapter 3 reviews the local energy model of feature detection introduced by Morrone and Owens in 1987. Illustrations are provided to support their contention that image features correspond to points of locally maximal phase congruency in the phase domain of the image signal. The use of oriented pairs of energy filters to calculate the energy of an image at a given orientation is also reviewed. This technique is fundamental to the development in this thesis of a 2D feature detector based on local energy.
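As a rough illustration of the local energy model reviewed in Chapter 3, the following sketch computes the local energy of a 1D signal as the magnitude of its analytic signal, that is, the square root of the sum of squares of the mean-removed signal and its Hilbert transform. This is a common formulation of local energy and only a minimal sketch: the function names (`dft`, `idft`, `local_energy`), the naive discrete Fourier transform and the square-wave test signal are assumptions made for illustration, and the oriented energy filters used in this thesis are not reproduced.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def local_energy(signal):
    """Return sqrt(F^2 + H^2) at every sample, where F is the signal with its
    mean removed and H is the Hilbert transform of F."""
    n = len(signal)
    spectrum = dft(signal)
    spectrum[0] = 0.0  # remove the DC component
    for k in range(1, n):
        if 2 * k < n:
            spectrum[k] *= 2.0  # keep positive frequencies, doubled
        elif 2 * k > n:
            spectrum[k] = 0.0  # discard negative frequencies
        # the Nyquist term (2 * k == n) is left unchanged
    analytic = idft(spectrum)  # real part is F, imaginary part is H
    return [abs(z) for z in analytic]


# Hypothetical test signal: one period of a square wave with step edges
# between samples 31 and 32 and, by periodicity, between samples 63 and 0.
square_wave = [0.0] * 32 + [1.0] * 32
energy = local_energy(square_wave)
peak = max(energy)
peak_positions = [t for t, e in enumerate(energy) if e > peak - 1e-6]
print(peak_positions)  # prints [0, 31, 32, 63]
```

The energy peaks fall on the samples adjacent to the two step edges, where the Fourier components of the square wave are closest to being in phase.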

Chapter 4 extends local energy to develop a new method of 2D feature detection based on an original definition of 2D image features. The 2D local energy model presented in this chapter is conceptually simple, yet it provides the first inclusive definition of all types of 2D image features. The idempotent nature of 2D feature detection via 2D local energy is also illustrated.

Whilst the definition of 2D local energy is simple, there are a number of issues to contend with in implementing the model, particularly with respect to the design of the 2D orientation selective filters used to extract features from the original image. Chapter 5 examines these issues and shows that the set of filters used in the implementation of 2D local energy in this thesis provides a uniform coverage of features at any orientation. This chapter also examines the problem of blurring caused by the successive application of 2D orientation selective filters, and the relationship between filter resolution and characteristic false responses to 1D image features.

The performance of the 2D local energy model is examined in Chapter 6. Firstly, the localization accuracy of 2D local energy is examined using synthetic image data. Then the performance of 2D local energy is compared to several new 2D feature detectors using both synthetic and real image data. The results show that the 2D local energy feature detection scheme performs on par with the key-point detection scheme of Heitger et al. [?] and provides clearly superior performance to the best of the other methods tested.

Chapter 7 shows how the unified approach to image features developed in this thesis permits the extension of local energy to the detection of 1D, 2D and 3D features in 3D image data. Results are given for the implementation of 1D ("surface") feature detection in 3D image data using local energy, while an outline is provided of how the extension of this model to the detection of higher order features may be implemented.

Chapter 2

Review of 2D Feature Detection

The extraction of salient features from an image is the most common approach to low-level image processing, where the goal is to eliminate redundant data so that a smaller data set preserves a significant portion of the information in the original image. 2D image features allow this data reduction while constraining the detected image point to a ray through the viewed scene, making them useful for applications such as stereo matching [?], structure from motion [?] and line labeling [?].

2.1 What are 2D Features?

Early attempts at 2D feature detection were often methods for the detection of grey-level corners (or L-junctions) in images [?]. These image points typically arise from the projection of points that are object vertices or edges with high local curvature in the scene. While these L-junctions are important, and perhaps the most common type of 2D feature, it is important to detect many other types of 2D intensity variations.

Line labeling is one of the early endeavours in computer vision that met with considerable success [?]. Line labeling is the classification of each curve in a line drawing as corresponding to either a depth or orientation discontinuity in the scene, with further sub-classification of each kind of discontinuity, and is a major component of the interpretation of line drawings. An example of such a line drawing, with the junctions classified according to their type, is given in Figure 1.

Figure 1: A line drawing of a simple polyhedral scene with the vertices labeled.

While there are several different techniques for line labeling, all rely on classifying the vertices in the image that are formed where the lines terminate. Guzman [?] was one of the first to classify junctions formed by straight line segments for this purpose. He enumerated nine classes of vertices that arise from straight line segments, and corresponding links between regions surrounding each vertex that belong to the same body. These vertex classes and links are shown in Figure 2. The junction type can give important relational clues between regions in the image. For example, the T-junctions in Figure [?] indicate occlusion of two surfaces, or one surface and the background, by another surface, meaning that the occluded surfaces must be further from the viewpoint than the occluding surface.¹
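The way a junction type is read off the local line geometry can be illustrated with a small sketch that labels two- and three-line junctions from the directions of the lines meeting at the vertex. It assumes the usual geometric reading of four of the vertex classes discussed above (an L has two lines, a T has three lines of which two are collinear, an ARROW has one angle greater than 180 degrees, and a FORK has all angles less than 180 degrees). The function name, the angular tolerance and the input format are hypothetical, and the remaining vertex classes are not handled.

```python
def classify_junction(directions_deg, tolerance=5.0):
    """Label a vertex from the directions (in degrees) of the lines leaving it.

    Returns "L", "T", "ARROW" or "FORK", or None for any other configuration.
    """
    angles = sorted(d % 360.0 for d in directions_deg)
    n = len(angles)
    if n not in (2, 3):
        return None
    # Angular gaps between consecutive lines, going once around the vertex.
    gaps = [(angles[(i + 1) % n] - angles[i]) % 360.0 for i in range(n)]
    collinear = any(abs(gap - 180.0) <= tolerance for gap in gaps)
    if n == 2:
        # Two collinear lines form a straight line, not a junction.
        return None if collinear else "L"
    if collinear:
        return "T"
    if max(gaps) > 180.0:
        return "ARROW"
    return "FORK"


print(classify_junction([0, 90]))         # prints L
print(classify_junction([0, 180, 270]))   # prints T
print(classify_junction([90, 210, 330]))  # prints FORK
print(classify_junction([45, 90, 135]))   # prints ARROW
```

A vertex labeled T by such a rule is the case that signals occlusion in the discussion above: the collinear pair belongs to the occluding boundary and the third line terminates against it.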

It is clear from Figure 2 that 2D feature detection must be able to accurately detect a wide variety of 2D intensity variations that are not represented by a simple model of an L-junction as used by early attempts at 2D feature detection. Although some subsequent attempts to model 2D image features are more general in their definition of 2D features [?], it is difficult to enumerate all possible types of 2D features with a spatial model of 2D intensity variations.

¹ Note that in real images, the occluding surface could be on either side of the continuing line, depending on whether the terminating line corresponds to a surface boundary or a texture edge in the scene. For more details, see Freeman's PhD thesis [?].

Figure 2: The nine classes of vertices enumerated by Guzman, with the links between regions belonging to the same body: (A) "L", (B) FORK or "Y", (C) ARROW, (D) "T", (E) MATCHED T, (F) PEAK, (G) "X", (H) MULTI, (I) "K".

Later, this thesis develops a robust definition of 2D image features that includes, but is not limited to, all the types of features given in Figure 2. However, for the purpose of defining the goals and reviewing the current state of the art of 2D feature detection, 2D image features are defined as "points in an image with significant local variation in more than a single dominant orientation".

2.2 Goals of 2D Feature Detection

2D feature detection is a low-level, data-driven image processing task whose output is typically used as input for some intermediate-level vision processing task. Bearing this in mind, the goals of 2D feature detection depend to an extent upon the task for which the output is being used. However, accurate detection and localization of 2D image features can be considered primary goals that are desirable no matter what application they are used for.

While the definition of 2D image features as "points in an image with significant local variation in more than a single dominant orientation" is general enough to deal with a wide variety of 2D image features, it is vague in the sense that it does not define what constitutes a significant local variation for accurate detection, nor does it clarify exactly where this variation occurs for accurate feature localization.

In contrast to the general definition of 2D image features given above, several previous attempts at 2D feature detection have been either implicitly or explicitly based on a simple model of a grey-level corner (or L-junction). Nagel defined a grey-level corner as "the location of the maximum planar curvature in the locus line of steepest gray level" [?]. A grey-level corner marked with the location of the feature by Nagel's definition is shown in Figure 3. The basis for Nagel's grey-level corner definition was that at points defined by the model, both components of optical flow could be determined. Although this model gives a precise definition of the location of grey-level corners, and so allows accurate assessment of the performance of various grey-level corner detectors, it clearly does not, and was not intended to, model other types of 2D image features such as complex junctions and spot features, and so is not useful for the detection of these features.

Figure 3: A grey-level corner marked with a cross at the feature location given by Nagel's definition of a grey-level corner as "the location of maximum planar curvature in the locus line of steepest grey level".

Although robust feature localization gives an accurate representation of the location of the 2D features in an image, this does not provide any indication of the local structure in the image near the feature point, and so gives no insight into the structure in the scene that caused the image feature. Identification of the local image structure at a 2D image feature gives the type of 2D image feature. Applications such as line labeling require this information, while others, such as stereo correspondence, may benefit from this extra information to reduce the number of possible correspondences [?]. Typically 2D features are classified as one of the types shown in Figure 2, but depending on the application may also include others such as spot features, which arise from small objects, noise, or specularities due to lighting conditions.

Some methods of 2D feature detection classify the feature type as part of the feature extraction process by matching with templates of different feature types [?]. Most often, however, classification of the type of 2D feature occurs independently of the extraction of the feature. In that case, classification of the feature type can be considered separate to the problem of feature detection, and so need only be performed if this information is required. As this is often considered a separate issue, previous approaches to 2D feature classification are reviewed after a discussion of 2D feature detection.

For applications that use 2D features from a sequence of images, another goal of 2D feature detection is temporal stability of the detected image features. There are two components of temporal stability that must be satisfied in order to accurately track 2D features in a sequence of images. Firstly, stability in the detection of 2D image features is required so that 2D feature points will not be lost over time due to small changes in lighting, position or viewpoint. Secondly, stability in the detected location of 2D image features is required in order to accurately measure motion of 2D image features in a sequence of images.

�� Review of 2D Feature Detection

Attempts to detect 2D features in an image can be broadly categorized into two groups: those that operate directly on the grey-scale image, and those that first extract edges from the image and operate on these edge points. This section examines each of these approaches in turn.

�� Edge-based Schemes

Edge-based schemes rely on the assumption that edge points are �D features. For this reason they can mislocalize or fail to detect 2D image features if the edges are mislocalized or not extracted by the edge-detection scheme used to pre-process the image. Therefore, methods that are based on prior segmentation of the image often perform poorly, particularly at sharp 2D image features.


Berzins ��, Asada and Brady ��, and Medioni and Yasumoto �� developed edge-based approaches in the mid-eighties. These approaches extract edges as zero crossings in an edge map generated from the rotationally symmetric Laplacian of a Gaussian edge operator. Corner points are then extracted from the chain of edge points by finding points of locally maximal curvature using appropriate thresholds.

Both of these edge operators ensure that the extracted edge points form a connected set of points, which is a desirable property when creating a curve-based representation of an image. However, this property means that junctions will be represented as closed curves even when the local image geometry is clearly formed by the intersection of curves. Therefore, these operators produce an inaccurate representation of the image at junctions. This corruption of the local image geometry at complex junctions can be seen clearly in Figure � of Medioni's paper �� and is demonstrated below in Figure �.

Figure �: (A) An image of a sculpture containing mainly T-junctions and L-junctions. (B) The edge points extracted using a Laplacian of a Gaussian operator. Note the smoothing effect near some of the L-junctions and the corruption of the T- and Y-junctions, due to the fact that the closed curves that this operator produces are unable to represent these structures. σ of Gaussian = � pixels. Input image is ��x�� pixels; output image is ��x�� pixels. Note that the output image is twice the size of the input image in each dimension due to the interpolation used in finding the zero crossings.

This means that neither of these methods can provide classification of the junction type, due to the inaccuracy of the edge representation that they use. Furthermore, these operators are sensitive to noise and produce a rounding effect at corners, so that the zero crossing occurs "inside" the corner location, with this effect more pronounced at acute corners.

Clearly, the use of rotationally symmetric edge operators distorts both the feature position and the local geometric structure, and is also sensitive to image noise. This amounts to a violation of all the goals of 2D feature detection: poor localization, inability to determine feature type, and, due to the poor performance in the presence of noise, poor feature detection. These methods are therefore inappropriate for accurate extraction of 2D image features.

While the two attempts described above perform poorly, extraction of corner features from an edge map is not without merit, provided that an accurate edge representation can be obtained. Henricsson developed a robust methodology for contour aggregation that utilizes 2D features detected from curvatures in the edge map as well as key-points extracted directly from the grey-level image ��. This approach has the advantage of detecting high curvature points that may otherwise be missed by grey-level-based 2D feature detection, without sacrificing accurate detection of sharp 2D image features and complex junctions.

�� Grey-level-based Schemes

Grey-level-based feature detection schemes attempt to extract 2D features directly from the input image. In general these methods perform better than edge-based schemes because their input has not been pre-processed, avoiding possible loss of information vital to the accurate detection of 2D image features.

The most common approach to 2D feature detection is to treat the image as an intensity surface and use derivatives and curvature measurements of the surface to detect 2D image features. Beaudet suggested saddle points on the intensity surface as candidates for corner features ��. He derived the convolution masks required to calculate the determinant of the Hessian matrix using a second order Taylor's expansion of the intensity surface.* The determinant of the Hessian matrix is given in Equation �. Following a detailed investigation of the distribution of grey levels about corner features, Dreschler and Nagel �� also used Beaudet's masks to determine "Gaussian curvature", with pixels lying between the extrema of the Gaussian curvature being treated as candidate corner features. Nagel later more clearly defined L-junction corner features as "the location of the maximum planar curvature in the locus line of steepest grey level", shown in Figure �. The determinant of the Hessian matrix is given mathematically as

\[ \mathrm{DET} = I_{xx} I_{yy} - I_{xy}^2 \]

where $I_{xx}$, $I_{yy}$, and $I_{xy}$ are the second directional derivatives of the image $I$ with respect to $x$, $y$, and both $x$ and $y$, respectively.

* In fact, Beaudet intended to calculate the numerator of Gaussian curvature to detect corners via differential geometry. As Noble �� points out, the determinant of the Hessian matrix and Gaussian curvature are only equivalent at critical points.

Following the use of the "Gaussian curvature" to detect corner features, several methods were presented that attempted to measure the "cornerness" of a point as a product of the gradient magnitude (the "edgeness" or strength of the feature) and the rate of change of the gradient direction along the edge contour ��. These methods rely on the edge orientation at each image pixel in order to measure the rate of change of this direction along the edge contour. This is a major drawback of this type of feature detector, because this information is unreliable near 2D image features, especially features more complex than L-junctions, and this adversely affects the accuracy of these measures. Kitchen and Rosenfeld �� used this measure to detect corner features, using a quadratic polynomial to approximate the grey level image. Their "cornerity" measure is:

\[ \mathrm{cornerity} = \frac{I_{xx} I_y^2 + I_{yy} I_x^2 - 2 I_{xy} I_x I_y}{I_x^2 + I_y^2} \]

Zuniga and Haralick �� proposed a corner detector that is very similar to Kitchen and Rosenfeld's, with a corner being defined as a significant change in curvature along an edge curve. Rather than the quadratic polynomial used by Kitchen and Rosenfeld, they used Haralick's facet model �� to fit a least squares bi-cubic polynomial surface to each pixel. In fact, Shah and Jain �� showed that if Zuniga and Haralick's bi-cubic polynomial is used to approximate the surface, then the only difference between Kitchen and Rosenfeld's corner measure and Zuniga and Haralick's is an additional gradient term in the former. This is compensated for by Zuniga and Haralick by only considering edge points as candidate corner features; that is, all corner features are also edge features. Interestingly, Nagel showed that if non-maximum suppression is applied to the gradient magnitude before calculating the "cornerity" measure, then the Kitchen and Rosenfeld detector is equivalent to Dreschler and Nagel's algorithm. In other words, despite the differing ideas motivating them, all of the derivative-based methods mentioned thus far in this section are essentially equivalent.
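The two derivative measures discussed above, Beaudet's DET and the Kitchen and Rosenfeld cornerity, can be illustrated with a minimal sketch. It assumes central finite differences (via `numpy.gradient`) in place of Beaudet's convolution masks and of the polynomial surface fits used by the original authors, and applies no smoothing; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def derivatives(image):
    """First and second derivatives by finite differences (an assumption;
    the original methods use convolution masks or polynomial fits)."""
    I = np.asarray(image, dtype=float)
    Iy, Ix = np.gradient(I)       # axis 0 is y (rows), axis 1 is x (columns)
    Ixy, Ixx = np.gradient(Ix)    # d(Ix)/dy and d(Ix)/dx
    Iyy, _ = np.gradient(Iy)      # d(Iy)/dy
    return Ix, Iy, Ixx, Iyy, Ixy

def beaudet_det(image):
    """Determinant of the Hessian: DET = Ixx * Iyy - Ixy^2."""
    _, _, Ixx, Iyy, Ixy = derivatives(image)
    return Ixx * Iyy - Ixy ** 2

def kitchen_rosenfeld(image, eps=1e-12):
    """Cornerity = (Ixx*Iy^2 + Iyy*Ix^2 - 2*Ixy*Ix*Iy) / (Ix^2 + Iy^2).
    Pixels with (near) zero gradient are assigned a cornerity of zero."""
    Ix, Iy, Ixx, Iyy, Ixy = derivatives(image)
    num = Ixx * Iy ** 2 + Iyy * Ix ** 2 - 2.0 * Ixy * Ix * Iy
    den = Ix ** 2 + Iy ** 2
    return np.where(den > eps, num / np.maximum(den, eps), 0.0)
```

On the saddle surface I(x, y) = xy this sketch gives DET = -1 everywhere, consistent with Beaudet's use of saddle points as corner candidates. On real images both measures would normally be preceded by smoothing, which is the source of the mislocalization discussed in this section.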

A fundamental problem with the above methods is that they are based on corner models of L-junction features, and therefore the detection, and especially localization, of other types of 2D image features is unreliable. These methods rely on detection of edge features to provide the gradient magnitude. However, the gradient information near L-junctions suffers from characteristic mislocalization that increases with sharpening of the angle of the L-junction, resulting in imprecise feature localization. Furthermore, these methods are highly sensitive to noise and require smoothing of the image prior to feature extraction. Although the smoothing reduces the effect of noise, it causes further mislocalization of corner features. The result of this is that the detected feature location may be inside the acute angle formed by the pair of lines that mark the edge of the feature.

Deriche and Giraudon conducted an analytical study of the behaviour of L-junction features in scale space ��. They found that the displacement of the feature location inside the corner is a function of the standard deviation σ of the blurring filter, and gave exact measurements of the displacement in terms of σ for varying corner angles, with acute angles suffering from increased displacement. They calculate the determinant of the Hessian matrix (Beaudet's measure) at two scales and then search for the zero-crossing in the Laplacian of the image along the line joining the positions of the local maxima found at the two scales. They later extended the model to deal with trihedral vertices (T-, Y-, and ARROW-junctions, etc.) ��, which have two elliptic maxima provided that the contrast of the feature is sufficient.

Deriche and Giraudon's work appears to go a long way towards solving the problem of mislocalized features using these gradient-based approaches. The results they show on a real image are good, although unfortunately they provide the same single real image in both their papers, and this seems to consist primarily of high contrast L-junctions. They do provide detail from the image in their paper ��, showing the accurate localization of several trihedral vertices. It appears that line terminations and wire crossings are not so well detected, although this cannot be confirmed without further evidence. A further problem is defining the search window for the elliptic maxima when the displacement is large, which occurs for acute angles (Deriche and Giraudon mention in particular that angles less than �� may be problematic) and when σ is large.

Förstner developed a method for 2D feature detection based on analysis of the local gradient field at an image point ��, as part of his framework for low level feature extraction. His operator works on a window centred on an estimated feature location and refines it to sub-pixel accuracy. Unlike the methods described above, Förstner's feature models are constructed in object-space rather than in the image domain, and the detection of image features is based on the projection of the object feature models into the image domain. Förstner uses nonlinear filters for feature extraction, which ensures that feature localization by this method is invariant with respect to the aperture of the L-junction, as shown by Blaszka and Deriche ��.

Recently, a new approach based on the construction of general 2D image features from elementary model functions has emerged. Rohr defined a general analytical model based on the superposition of elementary wedge-shaped model functions for his 2D feature detector ��. Potential 2D image features are matched to the parametric models by minimizing a least squares fit directly to the grey level image. An advantage of using this approach is that fitting a model to the image explicitly defines the type and orientation of the 2D feature detected.

Rohr is able to construct an arbitrary number of feature types, provided that they can be represented by the superposition of wedge-shaped primitives. Two examples of 2D image features constructed using this model are given in Figure �. The use of these wedge-shaped model primitives does, however, place several restrictions upon the types of features that one is able to detect: features must be large in spatial extent, the edge segments leading to a feature must be straight, and bounded regions must be of relatively constant intensity. It is also not possible to detect 2D features such as line terminations and spot features, since it is impossible to construct these using the model primitives.

Figure �: Two examples of Rohr's corner model. (A) The surface mesh of an ideal �� grey-level L-junction. (B) Rohr's corner model of the same feature. (C) The surface mesh of a typical T-junction from a real image. (D) The representation of (C) formed by the superposition of two �� wedge shapes via Rohr's model.

A further problem with Rohr's method is that it is extremely computationally expensive. Blaszka and Deriche �� have developed a method based on similar primitive features, but by modeling the image blurring using an �D exponential filter rather than a Gaussian, and by developing a new approach to the initialization of the minimization process, they have substantially reduced the computation required. They investigated the localization accuracy of their approach using noisy synthetic data, compared to some of the traditional methods described above, and showed that while their algorithm is more computationally expensive than those methods, it yields far better results. The localization accuracy of their model is verified using real stereoscopic images, with impressive results. Generalization of the model to deal with curves rather than straight edges, and with non-constant intensity regions, is left as an open problem for future work.

Smith and Brady developed a corner detector based on what they call the "SUSAN principle", for Smallest Univalue Segment Assimilating Nucleus ��. This is a region-based approach where the size of the region, which is called a USAN, is given by the number of pixels within a fixed radius of the central pixel (known as the nucleus) that have an intensity value within a certain range of that of the nucleus. After a process to eliminate false positive responses, locally minimal USANs (using a �x� window) are labeled as corner points.
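The USAN size described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Smith and Brady's implementation: it uses a hard intensity threshold `t` and a circular mask of integer `radius`, both hypothetical parameter names with arbitrary default values, and it omits the false positive elimination and the search for local minima.

```python
import numpy as np

def usan_area(image, radius=3, t=25.0):
    """Count, for every pixel (the nucleus), the pixels inside a circular
    mask whose intensity is within t of the nucleus intensity."""
    I = np.asarray(image, dtype=float)
    h, w = I.shape
    area = np.zeros((h, w), dtype=int)
    # Replicate the border so that the mask is defined at the image edges.
    padded = np.pad(I, radius, mode="edge")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx > radius * radius:
                continue  # offset lies outside the circular mask
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            area += (np.abs(shifted - I) <= t)
    return area
```

For a bright square on a dark background, the count at a corner pixel of the square is smaller than at the middle of one of its sides, which in turn is smaller than in the interior. This is the property that lets small USANs indicate corners.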

False positive responses are eliminated using two strategies. The first rejects isolated points or broken lines by finding the centre of gravity of the USAN and eliminating candidate points where the distance between the centre of gravity and the nucleus is below a threshold. The second strategy attempts to ensure that contiguity is maintained within the USAN. A pixel may not be labeled a corner unless all of the pixels within the mask lying on the ray from the nucleus through the centre of gravity are part of the USAN.

Smith's corner detector performs well on noisy synthetic images with relatively small amounts of blurring. This is not surprising, given that the SUSAN corner detector appears to be based on analysis of the properties of features in synthetic images. The idea of labeling all pixels in the mask that are within a certain range of grey levels of the nucleus effectively eliminates image noise, provided that the grey value deviation due to noise is less than the range required for inclusion in the USAN and the signal to noise ratio is high enough.

Smith effectively treats complex 2D image features as structures composed of L-junctions, with the position of the sharpest region (as this will have the smallest USAN) determining the location of the feature. The results presented in Chapter � show that the performance of the SUSAN detector on real image data is not as good as it is for synthetic images, particularly if the image contains a large amount of blurring. In these circumstances localization tends to be poor, with even simple L-junctions being labeled several pixels from their apparent location, and the number of false negative responses can become large. This method is, however, very fast.

Mehrotra et al. developed a corner detector based on half-edge detection algorithms ��. Half-edges are detected as either gradient maxima or zero crossings of the second derivative, using directional masks that are zero in one half. Mehrotra argues that these algorithms perform better at corner points because assumptions made by standard edge detectors (such as the image intensity being an analytic function) are violated at edge points. However, this model assumes that corners are formed at �� and is therefore biased towards right-angled corners. Results are only presented for very simple images, and this method does not include a model of more complex junctions.

Moravec developed a method of 2D feature detection that has no explicit model of these features ��. Instead, Moravec looked for "points of interest" where there was a small area of large intensity variation either vertically, horizontally, or in the two diagonal directions. His "interest" measure is calculated using an un-normalized local autocorrelation in four directions and then taking the lowest value as the result. This is followed by thresholding and non-maximal suppression.
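Moravec's interest measure as described above can be sketched in a few lines. This sketch assumes that the un-normalized autocorrelation is computed as a sum of squared differences between a square window and the same window shifted by one pixel; the window size, the one-pixel shift, and the function name are assumptions made for the illustration, and thresholding and non-maximal suppression are left out.

```python
import numpy as np

def moravec_interest(image, half_window=1):
    """Minimum over four directions of the sum of squared differences
    between a window and a copy of it shifted by one pixel."""
    I = np.asarray(image, dtype=float)
    h, w = I.shape
    k = half_window
    # Horizontal, vertical, and the two diagonal shift directions.
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]
    interest = np.zeros((h, w))
    # Skip a border wide enough for the shifted window to stay in bounds.
    for y in range(k + 1, h - k - 1):
        for x in range(k + 1, w - k - 1):
            win = I[y - k:y + k + 1, x - k:x + k + 1]
            ssd = []
            for dy, dx in shifts:
                moved = I[y - k + dy:y + k + 1 + dy,
                          x - k + dx:x + k + 1 + dx]
                ssd.append(np.sum((moved - win) ** 2))
            interest[y, x] = min(ssd)
    return interest
```

Taking the minimum means that a straight edge aligned with one of the four shift directions scores zero, because shifting the window along the edge changes nothing, while a corner scores above zero in every direction.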

There are several problems with the formulation of Moravec's model. The response is anisotropic, because only four orientations are used for the autocorrelation, and noisy, because a square window is used. It is also sensitive to strong edges, since the interest measure is taken as the minimum of the autocorrelation measurements instead of the variation in them.

Harris and Stephens made a number of changes to Moravec's "interest" operator in defining what became known as the Plessey operator ��. Firstly, the autocorrelation is estimated from first derivatives of the image using a Gaussian convolution kernel, which also fixes the problem of using a square window as Moravec does. Secondly, the response is isotropic, since variation in the autocorrelation over different orientations is measured using the principal curvatures of the local autocorrelation. Despite the improvements made in developing the Plessey operator, it provides poor localization of image features, particularly at complex junctions.
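A minimal sketch of a Plessey-style response follows. The text above does not state the response formula, so the sketch assumes the commonly quoted form R = det(M) - k trace(M)^2, where M is the Gaussian-weighted matrix of products of first derivatives; the value `k=0.04`, the default `sigma`, and all function names are assumptions made for this illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at three standard deviations."""
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(a, sigma):
    """Separable Gaussian smoothing with zero padding at the borders.
    Each image dimension must be at least as long as the kernel."""
    k = gaussian_kernel(sigma)
    a = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, a)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, a)

def plessey_response(image, sigma=1.5, k=0.04):
    """Assumed response: det(M) - k * trace(M)^2, where
    M = [[A, C], [C, B]] holds smoothed products of first derivatives."""
    I = np.asarray(image, dtype=float)
    Iy, Ix = np.gradient(I)
    A = smooth(Ix * Ix, sigma)
    B = smooth(Iy * Iy, sigma)
    C = smooth(Ix * Iy, sigma)
    return A * B - C ** 2 - k * (A + B) ** 2
```

With this form the response is positive where the gradient varies in two directions (a corner), negative where it varies in only one (an edge), and zero in flat regions.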

Cooper et al. also take the approach of not explicitly modeling 2D image features in the image ��. Cooper used a dissimilarity test between small windows on an image along the contour direction of edge features to detect 2D image features. This method relies upon an edge map for strength and gradient direction of potential 2D features, so it suffers from some of the same problems as Kitchen and Rosenfeld's method, most notably poor feature localization at junctions. It is also sensitive to image noise due to the dissimilarity test. In the same paper they present another, more accurate, real-time scheme based on the direct estimation of Kitchen's "cornerity" measure. Robbins and Owens have shown Cooper's DEK (Direct Estimation of Kornerity) method to be more robust than both Kitchen and Rosenfeld's original implementation and Beaudet's method ��.

Heitger et al. propose a new approach to 2D feature detection based on simulating cortical simple, complex, and end-stopped cells in biological visual systems ��. Odd- and even-symmetrical orientation selective filters are combined to give a measure of local energy to detect 1D image features, while pseudo differentiation of the oriented energy maps along their respective orientations is used to detect 2D image features. This is in accordance with neurophysiological evidence �� that 1D variations are taken care of by simple and complex cells, while 2D features such as corners, line-ends, blobs, and highly curved segments are detected by end-stopped cells.

Heitger's method is interesting for several reasons. Most importantly, Heitger does not attempt to model 2D image features in either the image or the object domain. 2D image features are instead treated as terminations in the response maps of the oriented energy operators that are due to actual termination of a 1D feature (or "general edge", using the terminology of ��) at a junction, or due to an apparent termination of a general edge due to curvature of the feature.

Another novel aspect of Heitger's work is that it is perhaps the first 2D feature detector to accurately detect these features in complex images using orientation selective filters. The output of these filters provides the basis for the analysis of 1D feature termination by the end-stopped operators. This is accomplished by finding local maxima in the pseudo first and second directional derivatives parallel to the orientation of the filter pair. Line-endings, corners, and junctions are indicated by local extrema in the first directional derivative, while blobs and strong curvature correspond to local extrema in the second directional derivative.

Unfortunately, the 1D directional derivatives used by Heitger also give a non-zero response for all orientations that do not coincide with the 1D feature. These characteristic false positive responses are attenuated using an inhibition scheme that responds only to 1D features that are not also 2D image features. Although inhibition is a feature of biological visual systems, the inhibition scheme presented by Heitger is complicated and is only required because the 1D directional derivatives exhibit no orientation selectivity. The problem is not bad design of the 1D filters themselves, but rather that 1D filters are incapable of exhibiting orientation selectivity. 2D filters must be used in order to achieve orientation selectivity, as is illustrated in Figure �. This is this thesis' key criticism of Heitger's method: the features at several orientations are accurately extracted, however the subsequent processing is dominated by efforts to reduce the effect of using directional derivatives to extract 2D features from the oriented energy maps.

Despite the use of a 1D process to extract features from the oriented energy maps, Heitger's inhibition scheme works well and, based purely on the set of features extracted, his method extracts more features with fewer false responses and more accurate localisation than any other 2D feature detector that the author has been able to investigate. An example of the quality of the feature detection and localization provided by his scheme is shown in Figure �.

Freeman also presented an approach to 2D feature detection based on oriented energy filters ��. Like Heitger, he also used derivatives of energy at different orientations to extract 2D image features. The goal of his work was "to develop a system to form an initial interpretation of the physical origin of observed image structures". The emphasis of his work, therefore, was on the classification of image features rather than their accurate localization, so this work is discussed in the following section.


Figure �: An illustration of how a 2D filter provides an attenuated response to a feature not exactly aligned with the filter, while a 1D filter produces a maximal response. Contour maps of vertical odd-symmetrical 1D (A) and 2D (B) filters are overlayed on a vertical bar (dotted lines indicate negative values). Both of these filters produce zero response to the feature at this orientation. (C) and (D) are identical to (A) and (B) respectively, except the bar has been slightly rotated so that it is no longer exactly aligned with the filters. The response of the 1D filter (C) is maximal, while the response of the 2D filter to this feature is attenuated. The false response of the 1D filter occurs regardless of the profile of the filter, because the 1D filter is incapable of exhibiting orientation selectivity.

Figure �: (A) A test image of a building at MIT that contains a variety of 2D feature types. (B) The 2D key-points detected by Heitger's scheme, overlayed on a reduced contrast version of the original image. Eight filter orientations were used with default filter parameters and using the default feature threshold of �� of the global maximum value. Image size is ��x�� pixels.


�� 2D Feature Classification

Classification of the type of 2D feature is the process of labeling observed 2D features according to their physical origin in the scene. Classification of 2D image structures usually relies on matching 2D feature models to the image data. Most often these 2D feature models are constructed as grey-level models in the image domain �� or projections of an object model into the image domain ��, although they can also be constructed in terms of edge terminations in feature maps ��.

The parametric model-based approaches to 2D feature detection of Rohr �� and Blaszka ��, discussed in section ��, classify features as part of the detection process. Therefore, the problems previously mentioned with the detection of 2D features via this approach also apply to their classification. The greatest of these problems is that the models are quite restrictive. Wedge-shaped parametric models are simply unable to represent "blob" or line-termination type 2D features, and may be inaccurate when modeling highly curved or textured features where the straight line and homogeneous area assumptions do not hold. Therefore, this type of approach is only able to classify a subset of all 2D features, and may be unreliable when the local image information at a 2D feature cannot be accurately represented by the model.

Freeman �� proposed three operators for the detection of L-, T-, and X-junctions. These operators all assume that there are only two dominant orientations at a 2D feature and, in analogy to "end-stopping" operators in biological visual systems, classification is based on the termination or non-termination of 1D contours at the 2D feature. Classification is determined by the number of end-stopped contours at a 2D feature, according to Figure �.

Freeman's method of feature classification is even more restrictive than the parametric model-based approaches. Freeman only considers junctions formed by the intersection of two edge contours, restricting classification to only L-, T-, and X-junctions. Although Freeman's scheme is only able to represent three different types of 2D features, it has the advantage that it should be able to deal with quite general local intensity profiles, because it operates on edge contours rather than the constant grey-level wedge shaped model used by the parametric model-based schemes.


Figure �: Junction classification via Freeman's scheme, drawn as a decision tree that asks in turn whether orientation A and orientation B are end-stopped. If both contours exhibit end-stopping then the feature is classified as an L-junction. If exactly one contour exhibits end-stopping then the feature is classified as a T-junction. If neither contour is end-stopped then the feature is classified as an X-junction.
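The decision rule in the figure above reduces to counting end-stopped contours. The sketch below assumes the two end-stopping tests have already been evaluated and are supplied as booleans; the function name and return labels are hypothetical.

```python
def classify_junction(a_end_stopped: bool, b_end_stopped: bool) -> str:
    """Label a two-orientation junction by its number of end-stopped contours."""
    count = int(a_end_stopped) + int(b_end_stopped)
    # Two end-stopped contours: L; exactly one: T; none: X.
    return {2: "L", 1: "T", 0: "X"}[count]
```

Because only two orientations are considered, the rule cannot produce any label other than L, T, or X, which is the restriction noted in the text.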

Perona �� introduced scalable oriented filters for junction analysis that operate directly on the grey-level image. He also designed filters to measure "end-stoppedness", and 1-sided filters that may be used to analyse edge features arriving at a junction from only one side of the junction feature. However, Perona did not provide a method for junction classification; he simply provided filters that may be used as tools for an approach similar to that described below.

Michaelis and Sommer �� present a model for the classification of 2D image features using steerable, scalable filters similar to those proposed by Perona. They use a model of junctions that "consists of straight edges and lines that intersect in one point", and classify junctions by the 1D features that constitute them.

Both double- and one-sided functions are used to analyse local 1D feature orientation, with highly orientation selective filters required to resolve features that have a small difference in orientation. For the results presented in their paper, the authors used �� basis functions in orientation and four in scale. The computational complexity of this approach (and the complexity of 2D feature classification in general) means that it must be used within the framework of a vision system that has some kind of attentional mechanism, so that classification is only attempted after the 2D features have been extracted from the image.

An advantage of this approach over the parametric wedge model methods is that it is based on 1D features and so is independent of region contrast. However, it is based on the intersection of straight 1D features, so may be unable to reliably classify junctions constructed from highly curved features, although to what extent this is the case is unclear from the results.

The primary difference between the detection and classification of 2D image features is that the former must be as general as possible so that all possible features are detected, while the latter requires a specific model of each type of 2D feature that must be classified in order to distinguish between different types of features. This thesis focuses on the development of a general model for the detection of 2D image features. Although information about the type of 2D feature can sometimes be useful in later stages of processing, this thesis does not present a method for 2D feature classification, since classification requires specific models of 2D image features.

�� Discussion

A general model of 2D features is essential to the detection of these features, due to the large number of local 2D intensity variations that are 2D features. All previous approaches to 2D feature detection have been shown to model only a subset of these 2D features � �� or to have no model of the features at all ��. In Chapter � a general definition of 2D image features is presented that encompasses all types of 2D features, with an implementation of a method for the detection of these features based on this model described in detail in Chapter �.

In contrast to the detection of 2D image features, the classification of these features relies upon matching specific models of different 2D image features either directly to the image or to information extracted from the image data. Therefore, several different models of 2D features are required for the classification of these features. For this reason, a unified approach to feature detection and classification is not appropriate. However, while attempts to both detect and classify 2D image features at the same time cannot achieve the generality required to detect all classes of 2D image features, this is not to say that these two operations must be completely separate. Detected 2D image features can be used to direct more complicated and computationally expensive methods of 2D feature classification. The classification then requires far less computation than if it were applied at every point in the input image.

Chapter �

Local Energy

Much of the work on feature detection in the past has focused on defining appropriate grey-level image structures to be detected and then developing methods to extract these features � ��. In contrast to this methodology, Morrone and Owens �� postulate that image features occur at points of high congruency in the phase domain of the image signal. That is, feature information is encoded at points where the phase angle deviation of the components of the frequency representation is small.

Phase congruency is derived from the frequency representation of the image, which is formed by the Fourier Transform. The Fourier series expansion of a 1D function $F$, windowed over the interval �� and extended periodically beyond this window, is

\[ F(x) = \sum_{n} A_n \sin(nx + \phi_n) \]

where $\phi_n$ is the phase offset of the $n$th component of the expansion and $A_n$ is its amplitude.

At points of maximal phase congruency there is order in the image data, and so such points are high in information. Although this model of image features makes no a priori assumption about the shape of image features, points of maximal phase congruency correspond to a wide range of feature profiles. 1D step, roof, and spike profiles, as well as combinations thereof, all correspond to local maxima in phase congruency. Furthermore, Morrone et al. showed that phase congruency can also

predict the phenomenon of Mach bands that humans perceive in trapezoidal image profiles ��. The points where Mach bands are perceived also correspond to points of maximal phase congruency.

Figure � illustrates that points of maximal phase congruency correspond to image features. In this figure, points of maximal phase congruency for three simple 1D signals are shown to occur where features in these profiles are perceived by humans. In Figure �(A) and (B) the Fourier components are clearly all in phase at the point of the image features. The ends of the plateaux in Figure �(C) are also points of locally maximal phase congruency, although it is harder to see the in-phase components in this image because of their small amplitudes and because not all Fourier components of the signal are exactly in phase at these points.

The phase congruency function, as defined by Venkatesh and Owens [��], is

\[ PC(x) = \max_{\theta} \frac{\sum_n A_n \cos(2\pi n x + \phi_n - \theta)}{\sum_n A_n}. \]

Using the Taylor series expansion $\cos(x) \approx 1 - x^2/2$ for small $x$, the phase congruency function can be rewritten as

\[ PC(x) \approx \max_{\theta} \left[ 1 - \frac{\sum_n A_n (2\pi n x + \phi_n - \theta)^2}{2 \sum_n A_n} \right]. \]

This expression is maximized when $\theta$ is equal to the weighted mean phase angle of all the terms in the Fourier series expansion of $F$. That is, $PC(x)$ is maximized when

\[ \theta = \bar{\phi} = \frac{\sum_n A_n (2\pi n x + \phi_n)}{\sum_n A_n}, \]

where $\bar{\phi}$ is the weighted mean phase. Therefore phase congruency can be expressed as

\[ PC(x) \approx 1 - \frac{\sum_n A_n (2\pi n x + \phi_n - \bar{\phi})^2}{2 \sum_n A_n}, \]

where $A_n (2\pi n x + \phi_n - \bar{\phi})^2$ can be regarded as a weighted variance term, in which the deviation is measured from the average phase $\bar{\phi}$ and weighted by the amplitude $A_n$.
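As a rough illustration of the phase congruency function, the following sketch evaluates it for the truncated Fourier series of a square wave. It is not taken from the thesis: the function name, the unit-period convention (component phase 2*pi*n*x + phi_n), and the choice of test signal are all assumptions made for the example. Rather than the small-angle approximation, it uses the exact identity that the maximum over theta of the amplitude-weighted cosine sum equals the magnitude of the complex sum of the components.

```python
import numpy as np

def phase_congruency(x, amplitudes, phases):
    """Exact PC(x) for F(x) = sum_n A_n sin(2*pi*n*x + phi_n), n = 1..len(amplitudes).

    Hypothetical helper for illustration; assumes a unit-period signal.
    """
    n = np.arange(1, len(amplitudes) + 1)
    # Local phase of every component at every position: shape (len(x), len(n)).
    local_phase = 2 * np.pi * np.outer(x, n) + phases
    # max over theta of sum_n A_n cos(local_phase_n - theta) is the magnitude of
    # the complex resultant; the maximizing theta is its argument.
    resultant = np.abs((amplitudes * np.exp(1j * local_phase)).sum(axis=1))
    return resultant / amplitudes.sum()

# Truncated series of a unit-period square wave: odd harmonics, A_n = 4 / (pi * n).
harmonics = np.arange(1, 20)
amplitudes = np.where(harmonics % 2 == 1, 4 / (np.pi * harmonics), 0.0)
phases = np.zeros_like(amplitudes)

# The steps of this square wave lie at x = 0 and x = 0.5; x = 0.25 is mid-plateau.
x = np.array([0.0, 0.25, 0.5])
pc = phase_congruency(x, amplitudes, phases)
```

Under these assumptions the components are exactly in phase at the two step locations, so phase congruency reaches its upper bound of one there, while it is markedly lower in the middle of the plateau.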

Figure �� A demonstration that image features correspond to points of maximal phase congruency in the frequency representation of the signal. 1D profiles (thick lines) of a step (A), triangle (B), and trapezoid (C) are shown overlaid with the first several components of the Fourier representation of these features (broken lines), and the sum of these Fourier components (thin unbroken lines). In each case, maximal phase congruency occurs at the location of the step (A) and the roof (B), as well as at the ends of the trapezoidal profile (C) where Mach bands are perceived.

Although it is possible to calculate phase congruency directly [��], it is a rather awkward quantity to calculate. However, Venkatesh and Owens [��] showed that phase congruency is proportional to local energy, and therefore local maxima in phase congruency correspond to local maxima in local energy. The local energy of a 1D signal can be defined as the square root of the sum of squares of the signal convolved with a quadrature pair of filters (Equation �). A quadrature pair of filters consists of an even- and an odd-symmetric filter that have zero mean, identical $L_2$ norm, and are orthogonal. For any point $x$ in the signal, Morrone and Owens [��] defined the local energy $E(x)$ as

\[ E(x) = \sqrt{O_{even}^2(x) + O_{odd}^2(x)}, \]

where $O_{even}(x)$ is the signal convolved with the even-symmetric filter and $O_{odd}(x)$ is the signal convolved with the odd-symmetric filter at the point $x$.
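The definition of 1D local energy can be sketched in a few lines. The example below is illustrative only: the Gabor-style filter pair, its width, wavelength and envelope, and the edge-replicating boundary handling are assumptions chosen for the sketch, not the filters used in the thesis. The pair is built to satisfy the stated quadrature conditions (zero mean, identical L2 norm, orthogonality).

```python
import numpy as np

def quadrature_pair(half_width=8, wavelength=8.0, sigma=3.0):
    """Hypothetical even/odd Gabor-style pair with zero mean and unit L2 norm."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    envelope = np.exp(-x**2 / (2 * sigma**2))
    even = envelope * np.cos(2 * np.pi * x / wavelength)
    odd = envelope * np.sin(2 * np.pi * x / wavelength)
    even -= even.mean()            # the odd filter is zero mean by antisymmetry
    even /= np.linalg.norm(even)   # give both filters the same (unit) L2 norm
    odd /= np.linalg.norm(odd)
    return even, odd

def local_energy_1d(signal, even, odd):
    """E(x) = sqrt(O_even(x)^2 + O_odd(x)^2), with edge-replicated borders."""
    pad = len(even) // 2
    padded = np.pad(signal, pad, mode="edge")
    o_even = np.convolve(padded, even, mode="valid")
    o_odd = np.convolve(padded, odd, mode="valid")
    return np.sqrt(o_even**2 + o_odd**2)

# A unit step between samples 99 and 100.
signal = np.zeros(200)
signal[100:] = 1.0
even, odd = quadrature_pair()
energy = local_energy_1d(signal, even, odd)
peak = int(np.argmax(energy))
```

For this step signal the energy is zero in the flat regions and peaks at the two samples adjacent to the step, where the odd-symmetric response is extremal and the even-symmetric response changes sign.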

Morrone and Owens [��] extended the model for use with images by applying it separately in orthogonal orientations. They showed that two orthogonal orientations are sufficient to give a global measure of local energy along with a feature orientation at each point in the image. The local energy $E$ at any point $(x, y)$ in an image is then

\[ E(x, y) = \sqrt{E_x^2(x, y) + E_y^2(x, y)}, \]

and the orientation of the feature is

\[ \mathrm{orientation}(x, y) = \arctan\left( E_y(x, y) / E_x(x, y) \right), \]

where $E_x(x, y)$ and $E_y(x, y)$ are the local energy at the point $(x, y)$ with respect to $x$ and $y$ respectively.
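The combination of the two orthogonal energy maps can be written directly. This is a minimal sketch with a hypothetical function name; it uses `np.arctan2` instead of a plain arctangent of the ratio, an implementation choice (not from the thesis) that gives the same angle for non-negative energies and avoids division by zero where the energy with respect to x vanishes.

```python
import numpy as np

def combine_orthogonal_energy(energy_x, energy_y):
    """Combine energy maps computed along x and y into a magnitude and an orientation."""
    energy = np.hypot(energy_x, energy_y)            # sqrt(Ex^2 + Ey^2)
    orientation = np.arctan2(energy_y, energy_x)     # arctan(Ey / Ex), safe when Ex == 0
    return energy, orientation

# Two sample points: one with both components present, one with Ex == 0.
energy_x = np.array([[3.0, 0.0]])
energy_y = np.array([[4.0, 2.0]])
energy, orientation = combine_orthogonal_energy(energy_x, energy_y)
```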

Venkatesh and Owens [��] showed that accurate detection of 1D image features is possible via local energy even with simple �-point filters at only two orientations, with a more detailed discussion of filter design given by Venkatesh in her PhD thesis [��]. Despite the limited support �-point and �-point filters provide for quadrature function pairs such as the even- and odd-symmetric Gabor functions, these filters perform well in detecting features of small spatial extent. This demonstrates the robustness of local energy with respect to the filters used, even if the filters distort the phase information, a point which is made by Morrone and Burr [��]. Figure �� demonstrates the performance of local energy using �-point and �-point filters on a synthetic test image.

It is clear from Figure �� that local energy produces a locally maximal response at 1D image features for both step-profiled and roof-profiled features. For some vision tasks it is useful to be able to distinguish between these types of features once they have been extracted from the image.

Venkatesh and Owens show how analysis of the odd- and even-symmetric filter responses at image features can be used to determine the type of the 1D image feature [��]. For each 1D image feature corresponding to a local maximum in the 1D local energy map, the even- and odd-symmetric filter responses may each be either a local minimum, a local maximum or a zero crossing. The eight possible combinations of these pairs (given that a pair of zero crossings is impossible by the definition of energy) correspond to the profiles of positive and negative going step edges, roof edges and trapezoidal edges.

For example, a step edge corresponds to a zero crossing in the even-symmetric filter response and either a local maximum response for the odd-symmetric filter (for positive going step profiles) or a local minimum response (for negative going step profiles). A further advantage of using the odd- and even-symmetric filter responses for feature classification is that these values are required for feature detection, so no further transformations are required for classification. However, 1D feature classification in real images is more difficult than described above, as image features may be a combination of 1D feature profiles and therefore not rigidly fit the groupings provided by Venkatesh and Owens.

�� Examining Features at a Given Orientation

Freeman noted that it is often desirable to analyse an image at several orientations for image processing tasks such as texture analysis [��], image compression [��], motion analysis [� ��], and image enhancement [��]. This requires filters that are tuned to a particular orientation and that attenuate the response to features not at the orientation of the filter.


(A) (B)

(C) (D)

Figure �� D features detection via local energy using ��point and ��point �lters� A� A synthetictest image� B� The energy map generated using a pair of ��point �lters over only two orientations�C� The features extracted from B� using non�maximal suppression in the horizontal and verticaldirections� D� The image features extracted from A� via local energy using ��point �lters� Note thatlines in the input image are extracted as a pair of features using the ��point �lters C�� as these �ltersmark the step on either side of the line as a feature� ��point �lters� odd �

p��

p�� even

�p��

p��

p�� point �lters� odd �� even ��

Images are ��x�� pixels�


Attenuation of responses to features not aligned to the filter orientation cannot be achieved using 1D filters, as they are unable to finely discriminate feature orientation. The reason for this is demonstrated in Figure �. 2D filters are able to provide an attenuated response to features not exactly aligned with the filter orientation, and so 2D filters are required to achieve orientation selectivity. Figure �� demonstrates the effect of the inability of 1D filters to respond to features only at a given orientation, compared to 2D filters that are able to do this.

Figure �� A demonstration of the inability of 1D filters to exhibit orientation selectivity. (A) The result of convolving a vertically oriented 1D energy filter pair with the synthetic test image in Figure �(A). (B) The result of convolving Figure �(A) with an orientation selective energy filter pair at the same orientation. Note the attenuation in the response of the 2D filter to the boundaries of the discs (B) as the orientation of the edge moves further from the orientation tuning of the filter. In contrast to this, the response of the 1D filter (A) is binary: either a maximal response for feature orientations within �� of the filter, or zero otherwise. Image size is ��x�� pixels.

2D orientation selective filters are required to extract features selectively from an image on the basis of their orientation. The next step is to efficiently apply these 2D filters over many orientations. An elegant solution to this problem, proposed by Freeman [��], is discussed in the next section.


�� Steerable Filters

A simple approach to finding the response of a filter at many orientations is to apply rotated versions of the filter over a few orientations and then interpolate between the responses. Freeman showed that there exists a class of filters such that the filter at an arbitrary orientation can be synthesized as a linear combination of a set of basis filters [��]. He called these filters steerable filters.

Perhaps the simplest steerable filter set, and the one used as an example in Freeman's thesis, is derived from the first derivative of a symmetric Gaussian function. Using Freeman's terminology, the first derivative of a 2D Gaussian is written as $G_1^{0^\circ}$ and that function rotated by $90^\circ$ as $G_1^{90^\circ}$. It can easily be shown that a filter at an arbitrary orientation $\theta$ can be synthesized as a linear combination of $G_1^{0^\circ}$ and $G_1^{90^\circ}$:

\[ G_1^{\theta} = \cos(\theta)\, G_1^{0^\circ} + \sin(\theta)\, G_1^{90^\circ}. \]

A further computational advantage comes from this approach because convolution is a linear operation. Therefore the result of convolution at an arbitrary orientation can be synthesized from the convolution results of the basis set. Letting $R_1^{0^\circ}$ represent the result of convolution of an image $I$ with $G_1^{0^\circ}$, and similarly for $R_1^{90^\circ}$, the result of convolution of $I$ with $G_1^{\theta}$ is

\[ R_1^{\theta} = \cos(\theta)\, R_1^{0^\circ} + \sin(\theta)\, R_1^{90^\circ}. \]

The filters derived from Equation �� exhibit very little orientation selectivity and respond to features up to �� from the orientation tuning of the filters. The advantage of such a small degree of orientation selectivity is that only two filters are required to form a basis set to steer the filters. However, as the orientation selectivity of a filter increases, so does the number of filter orientations required to form a basis set. Freeman provides the minimum number of filter orientations required for steering filters of a given orientation selectivity, provided that the filters are constructed in a certain manner.
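The two-filter basis for the first derivative of a Gaussian is easy to check numerically. The sketch below is an illustration under assumed parameters (kernel size, sigma, the steering angle, and a random test image are all arbitrary choices): it steers the basis filters to an arbitrary angle, compares the result with the analytically rotated filter, and then confirms that the same linear combination applied to the basis responses matches convolution with the steered filter.

```python
import numpy as np

def gaussian_derivative_basis(size=15, sigma=2.0):
    """Return coordinate grids, the Gaussian, and its derivatives along x (0 deg) and y (90 deg)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    g1_0 = -x / sigma**2 * g     # derivative of the Gaussian along x
    g1_90 = -y / sigma**2 * g    # the same function rotated by 90 degrees
    return x, y, g, g1_0, g1_90

def steer(basis_0, basis_90, theta):
    """Linear combination that steers a first-derivative basis to angle theta."""
    return np.cos(theta) * basis_0 + np.sin(theta) * basis_90

def circular_convolve(image, kernel):
    """FFT-based circular convolution (kernel anchored at the origin, so the
    output is shifted by a constant offset; irrelevant for comparing results)."""
    padded = np.zeros_like(image)
    padded[:kernel.shape[0], :kernel.shape[1]] = kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

sigma = 2.0
theta = np.deg2rad(30.0)
x, y, g, g1_0, g1_90 = gaussian_derivative_basis(sigma=sigma)

# Filter steered from the basis versus the directly constructed rotated filter.
steered_filter = steer(g1_0, g1_90, theta)
direct_filter = -(x * np.cos(theta) + y * np.sin(theta)) / sigma**2 * g

# Steering the basis responses instead of the filters.
image = np.random.default_rng(0).random((32, 32))
r_0 = circular_convolve(image, g1_0)
r_90 = circular_convolve(image, g1_90)
steered_response = steer(r_0, r_90, theta)
direct_response = circular_convolve(image, direct_filter)
```

Only two convolutions are needed regardless of how many orientations are subsequently requested, which is the computational advantage described above.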


The advantages of using steerable filters notwithstanding, there are certain situations when there is no advantage in their use. Primarily they are useful when one wishes to apply orientation selective filters to a single image over many (or a continuum of) orientations. There may be no advantage in using steerable filters for an application where the result of convolution at a single orientation is required, if the required filter(s) can be easily constructed at the desired orientation.

�� Oriented Energy

The term oriented energy was introduced by Rosenthaler et al. [��] to describe the energy of an image at a given orientation, calculated via orientation selective filters. The local energy of an image is then formed as a combination of the oriented energy over all orientations.

Figure �� demonstrates that 1D filters are incapable of exhibiting orientation selectivity. Therefore 2D orientation selective filters are required to calculate oriented energy. Rosenthaler used modified 2D Gabor filters defined in the frequency domain to measure oriented energy. These filters are examined in Section ��. These orientation selective filters are larger, and therefore more computationally expensive to apply even in a single orientation, than simple 1D filters.

Since orientation selective filters respond only to features near their tuned orientation, they must be applied at enough orientations to ensure that the combined response of all filters is uniform for features at all orientations.* Provided that this condition is met, the energy of an image can be expressed simply as the sum of the energy of the component orientations:

\[ \text{total energy}(x) = \sum_{i=1}^{n} \sqrt{O_{i,odd}^2(x) + O_{i,even}^2(x)}, \]

where $O_{i,odd}(x)$ and $O_{i,even}(x)$ are the responses of the image at the point $x$ to the $i$th orientation of the odd- and even-symmetric filters respectively.

* The term "enough orientations" is defined, and this number is calculated for a given orientation selectivity, in Section ��.


The use of oriented energy filters for the detection of 1D image features is illustrated in Figure ��. Note that for a given contrast profile, local energy is higher at 2D image features than 1D features, as illustrated at junctions formed by straight lines in this image. This occurs because 2D image features produce a significant energy response at a number of orientations, so the energy as a sum of the component orientations will naturally be greater at 2D features than 1D features.

Figure �� An illustration of the detection of 1D image features via oriented energy. (A) A synthetic test image. (B) The 1D local energy response to (A). Note the increased response magnitude at 2D image features. Images are ��x�� pixels.

The fact that the energy response at 2D image features is greater than at 1D image features is central to this thesis. With the development of the theory of local energy, Morrone and Owens hypothesized that 1D features correspond to points of locally maximal order (phase congruency) in an image. This thesis contends that 2D image features occur at points of locally maximal 2D order in an image. This definition of 2D features makes no assumption about the local spatial variation of 2D features, yet, as will be shown in the following chapter, includes grey-level corners, line-terminations, blobs and a variety of junctions.

While it is clear that for a given contrast profile local energy is greater at 2D image features than their 1D counterparts, it is not obvious how this knowledge could be used to extract 2D image features from the 1D local energy map of real images, where noise and edge contrast variation along a 1D feature may produce the same effect. The ability to measure the local energy of an image at different orientations allows the detection of 2D image features via local energy by detecting 1D features at a specific orientation. How this can be achieved is shown in the following chapter.

�� Idempotency

The extraction of salient features from an image is usually considered as part of the process of forming a primal sketch [��]. While the exact form of the primal sketch is debatable, it can be considered to be a form of data compression, which implies that the perceptual content of the original image can be reconstructed from the compressed data. Furthermore, the notion that the primal sketch can be represented as a real image means that the formation of the primal sketch must be an idempotent operation, since otherwise one is led to an infinite regression in the image's interpretation.

As the extraction of image features is an important element in the formation of the primal sketch, it is essential that feature detection be idempotent; that is, there should be no further change when it is applied to its own output. Owens et al. [��] showed that the 1D local energy feature detector satisfies this property. They also examined various odd- and even-symmetric functions used in the calculation of local energy and showed, in particular, that those based on odd- and even-symmetric Gabor functions are idempotent. Therefore successive feature detection via 1D oriented energy yields the same 1D image features.

Note that it is not the filtering via local energy that is idempotent, but rather the entire process of feature detection via local energy. That is, successive application of the local energy filters will produce further change to the input, as features spread due to the smoothing introduced by the filters. However, the process of feature detection via local energy, which includes the extraction of features from the local energy map using non-maximal suppression or some other process, is idempotent, as the same set of features is extracted from successive application of feature detection via local energy. The first order idempotency of local energy is demonstrated in Figure ��, where the same 1D image features of the test image in Figure �(A) are detected after one pass (A) and three passes (B) of local energy for the horizontal orientation channel.

Figure �� A demonstration of the first order idempotency of the oriented energy operator. The 1D image features detected after one pass (A), and three passes (B), of the oriented energy operator for the horizontal orientation channel. Image size is ��x�� pixels.

Most 1D feature detectors are not idempotent operators. This can be easily seen for the case of derivative step edge detectors such as the Sobel operator [��] and Canny's detector [��]. These operators are sensitive to step changes and so mark features with a pair of responses flanking the feature location when applied to their own output. Venkatesh [��] proved that derivative feature detectors and the Marr-Hildreth (Laplacian of a Gaussian) operator [��] are not idempotent.

However, local energy is not the only idempotent feature detector. Several morphological operations are idempotent, the simplest being the opening and closing operators [��].
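The idempotency of morphological opening can be demonstrated with a small 1D example. This sketch is not from the thesis: the flat structuring element, its radius, and the boundary convention (samples outside the signal are ignored by padding with plus or minus infinity) are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode(signal, radius):
    """Grey-scale erosion with a flat window; out-of-range samples are ignored."""
    padded = np.pad(np.asarray(signal, dtype=float), radius,
                    mode="constant", constant_values=np.inf)
    return sliding_window_view(padded, 2 * radius + 1).min(axis=1)

def dilate(signal, radius):
    """Grey-scale dilation with a flat window; out-of-range samples are ignored."""
    padded = np.pad(np.asarray(signal, dtype=float), radius,
                    mode="constant", constant_values=-np.inf)
    return sliding_window_view(padded, 2 * radius + 1).max(axis=1)

def opening(signal, radius=1):
    """Opening = erosion followed by dilation with the same window."""
    return dilate(erode(signal, radius), radius)

# A one-sample spike (removed by the opening) and a five-sample plateau (kept).
signal = np.array([0, 0, 5, 0, 0, 3, 3, 3, 3, 3, 0, 0], dtype=float)
opened_once = opening(signal)
opened_twice = opening(opened_once)
```

Applying the opening a second time leaves the output unchanged, which is the sense of idempotency discussed above; the closing operator (dilation followed by erosion) behaves in the same way.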

�� Summary

This chapter illustrated Morrone and Owens' hypothesis that image features occur at points of high congruency in the phase domain of the image signal. Several simple features in 1D image signals were demonstrated in this chapter to correspond to local maxima of congruency in the phase domain representation of the signal. Local energy provides a simple means of finding points of maximum phase congruency in real signals by convolution of the signal with a quadrature pair of filters.

For many image processing tasks it is often desirable to analyse an image at several orientations. This task requires orientation selective filters, because 1D filters are unable to selectively respond to features at a given orientation. The use of steerable filters to efficiently find the response of a filter steered (rotated) to an arbitrary angle was also reviewed in this chapter.

Quadrature pairs of orientation selective filters can be applied to an image to produce oriented energy. Oriented energy is simply the local energy of an image at a given orientation. The sum of oriented energy over all orientations gives the local energy of an image. The following chapter details how oriented energy is exploited in order to detect 2D image features using local energy.

The property of idempotency is important for feature detection since it implies that the set of extracted features is a unique interpretation of the image. The idempotent nature of the local energy feature detector via oriented energy was illustrated. In contrast to local energy, most feature detectors, notably those based on the detection of step features, are not idempotent operators.

Chapter �

�D Feature Detection via Local

Energy

The previous chapter illustrated that 1D and 2D features occur at points of maximal 1D and 2D order in the phase domain of the image signal respectively. While it is simple to extract points of maximal 1D phase congruency from an image using local energy, extraction of points of maximal 2D phase congruency in an image has not previously been achieved.

This thesis proposes the 2D local energy model as a solution to the problem of extracting 2D features from an image within the framework of local energy. The central idea behind the model is that points of maximal 2D phase congruency in an image (corresponding to 2D features) occur where there is maximal 1D phase congruency at more than a single dominant orientation. Points of maximal 1D phase congruency at a given orientation correspond to local maxima in the oriented energy at that orientation. The degree of 2D phase congruency at this orientation can then be found by re-applying oriented energy in the orthogonal orientation to this output.

Put simply, 1D features at a given orientation correspond to local maxima in the oriented energy of an image at that orientation. 2D features at that orientation correspond to the 1D image features of the oriented energy in the orthogonal orientation. That is, the same approach is used to detect 2D image features from 1D image feature data as is used to extract the 1D features from the original image.

Before this method is discussed in detail, a few definitions are provided and some new terminology is introduced.

�� De�nitions and Terminology

For the remainder of this thesis the terms "oriented energy" and "local energy" are used to refer to the energy response at a single orientation and the sum over all orientations respectively. Similarly, the term "2D oriented energy" refers to the application of the orthogonal oriented energy operator upon the 1D oriented energy map at a single orientation. The "2D local energy" is the sum of the 2D oriented energy maps over all orientations. "1D image features" and "2D image features" refer to local maxima in the 1D and 2D local energy maps respectively.

This can be put mathematically as follows:

\[ \text{2D oriented energy}_i = E_{i^\perp}(E_i(I)) \]

and

\[ \text{2D local energy} = \sum_{i=1}^{N} \text{2D oriented energy}_i, \]

where $I$ is the input image, $N$ is the number of orientation channels, $E_i$ is the 1D oriented energy at orientation channel $i$, $i^\perp$ denotes the orientation channel orthogonal to $i$, $\text{2D oriented energy}_i$ is the 2D oriented energy map for the orientation channel $i$, and $\text{2D local energy}$ is the 2D local energy map from which the 2D features are extracted as local maxima.

Referring to Equation �, recall that energy is defined as the square root of the sum of the squared responses to an even- and odd-symmetric filter pair. Since the combined response of a filter pair is always used to calculate energy, for the remainder of this thesis a quadrature oriented filter pair is referred to as an oriented energy filter. With regard to the orientation of oriented energy filters, a filter whose maximal radial variation occurs in the vertical direction is said to be vertically oriented but tuned to horizontal image features. That is, a filter is tuned to features perpendicular to the orientation of the filter.


Evaluation of 2D feature detectors requires examination of their output on real and synthetic images. In this thesis the notation of Cooper et al. [��] is adopted for this process, with each instance of a feature in the input image that is not detected being labeled a false negative response. Similarly, a point marked as a 2D image feature that does not correspond to a 2D feature in the input image is labeled as a false positive response.

�� Method

This thesis contends that 2D image features occur at points where there is maximal 2D order in the phase domain of the image signal, and that these points occur where 2D local energy is a maximum. Therefore, in order to detect 2D image features, first the 1D oriented energy is calculated at several orientations. Then the oriented energy operator in the orthogonal orientation is re-applied to the 1D oriented energy maps to produce 2D oriented energy maps. The 2D oriented energy for each orientation is then summed to produce a 2D local energy map, the maxima of which correspond to the 2D image features.

The calculation of 2D local energy involves three simple steps:

1. for each orientation, calculate a 1D oriented energy map;

2. for each orientation, calculate the 2D oriented energy map by applying the orthogonal oriented energy operator to the 1D oriented energy map at this orientation; and finally

3. locate 2D image features as the local maxima in the 2D local energy map (see Equation �).
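The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the oriented energy filter here is a one-sided log-Gabor style filter defined in the frequency domain (its centre frequency, bandwidth, angular spread and the number of orientations are hypothetical values), chosen only because the magnitude of its complex response gives the combined even- and odd-symmetric energy in one step.

```python
import numpy as np

def oriented_energy(image, theta, f0=0.15, sigma_on_f=0.55, sigma_theta=0.4):
    """Energy of `image` at orientation `theta` via a one-sided frequency-domain filter.

    The real and imaginary parts of the filtered image are the even- and
    odd-symmetric responses; their magnitude is the oriented energy.
    """
    rows, cols = image.shape
    v, u = np.meshgrid(np.fft.fftfreq(rows), np.fft.fftfreq(cols), indexing="ij")
    radius = np.hypot(u, v)
    radius[0, 0] = 1.0                      # avoid log(0); the DC gain is zeroed below
    radial = np.exp(-np.log(radius / f0) ** 2 / (2 * np.log(sigma_on_f) ** 2))
    radial[0, 0] = 0.0                      # zero-mean filters: no response to constants
    angle = np.arctan2(v, u)
    dtheta = np.abs(np.arctan2(np.sin(angle - theta), np.cos(angle - theta)))
    angular = np.exp(-dtheta ** 2 / (2 * sigma_theta ** 2))
    response = np.fft.ifft2(np.fft.fft2(image) * radial * angular)
    return np.abs(response)

def local_energy_2d(image, n_orientations=4):
    """Sum over orientations of the orthogonal operator applied to each 1D oriented energy map."""
    total = np.zeros(image.shape)
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        energy_1d = oriented_energy(image, theta)                  # step 1
        total += oriented_energy(energy_1d, theta + np.pi / 2)     # step 2, then summed
    return total

# Synthetic test image: a bright square on a dark background.
image = np.zeros((96, 96))
image[24:72, 24:72] = 1.0
energy_2d = local_energy_2d(image)

# Compare the response near a corner with the response near the middle of an edge.
corner_response = energy_2d[20:29, 20:29].max()
edge_response = energy_2d[44:53, 20:29].max()
```

With these assumed filters the response near a corner of the square exceeds the response along the middle of an edge, since the first-pass energy ridge is constant along the edge and terminates at the corner. Step 3, locating the local maxima of the resulting map, is a separate extraction stage.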

This definition of 2D image features unifies the detection of 1D and 2D image features. 1D and 2D image features correspond to maximal 1D and 2D order in the phase representation of an image respectively. Therefore, as has been noted by Freeman [��], Kitchen [��] and others, all 2D image features are also 1D features. That is, 2D image features are a subset of 1D image features and can be extracted from 1D feature data in much the same manner that the 1D features are extracted from the raw image data.

The method presented in this thesis uses orientation tuned 2D filters exclusively for the extraction of both 1D and 2D image features. This eliminates problems of false responses to 1D image features at offset orientations, although care must be taken when designing these 2D filters to avoid excessive blurring while still providing adequate orientation selectivity.* For efficiency reasons the orientation selective filters are developed in the frequency domain, and all convolutions are performed as complex multiplication in this domain. Energy calculations are performed in the spatial domain after back transformation of the filter application in the frequency domain.

A demonstration of this method is shown in Figure ��. Figure �(A) shows a synthetic test image along with the 1D oriented energy response for the vertical orientation (B) and the 1D local energy (C). The 2D oriented energy for the vertical orientation (D) is provided next to the 2D local energy (E) and the 2D image features extracted as local maxima from the 2D local energy image, overlaid on the original image (F).

Note the accurate detection and localization of all the various types of 2D image features in Figure �(F). Although it is clear from Figure �(D) that, as one would expect, 2D oriented energy does not respond evenly to features at all orientations, the 2D local energy formed as the sum of all the 2D oriented energy maps provides an even response to features at all orientations (E). The smoothness of the 2D local energy map makes extraction of local maxima straightforward. Furthermore, the absence of post-processing of this map means that the smoothness property of local energy is maintained.
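Extraction of local maxima from a smooth energy map can be sketched as below. The window size and the threshold (expressed as a fraction of the global maximum) are hypothetical parameters for illustration; the function name is likewise an assumption and is not taken from the thesis.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_local_maxima(energy, window=5, relative_threshold=0.1):
    """Return (row, col) positions that are maximal within a window x window
    neighbourhood and exceed a fraction of the global maximum."""
    energy = np.asarray(energy, dtype=float)
    half = window // 2
    # Pad with -inf so that border pixels are compared only with real neighbours.
    padded = np.pad(energy, half, mode="constant", constant_values=-np.inf)
    neighbourhood_max = sliding_window_view(padded, (window, window)).max(axis=(2, 3))
    is_peak = (energy == neighbourhood_max) & (energy >= relative_threshold * energy.max())
    return np.argwhere(is_peak)

# Toy energy map: two genuine peaks and one weak value next to a stronger one.
energy = np.zeros((9, 9))
energy[2, 3] = 1.0
energy[6, 6] = 0.5
energy[6, 7] = 0.05
peaks = extract_local_maxima(energy)
```

In this toy map the weak value at (6, 7) is suppressed because a larger value lies within its window, and the flat background is rejected by the threshold.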

* These issues are discussed in detail in Section ��.

It is important to understand why the second pass of the oriented energy filters is made in the orthogonal orientation to the first application of the filters for each orientation. This is performed at the orthogonal orientation to ensure that the features extracted on the second pass of oriented energy correspond to the 2D features at that orientation. It does not mean that 2D local energy is modeled upon 90° L-junctions, or that this adversely affects the detection of 3-way and higher order junctions. In fact, non-90° L-junctions, spot features and other higher order junctions are represented in this model, as they produce 2D oriented energy responses at a number of orientations and these 2D oriented energy responses sum to produce a local maximum in the 2D local energy, as is demonstrated for these types of features in Figure �(D), (E) and (F).

Figure �� An illustration of the detection of 2D image features via 2D local energy. (A) A synthetic test image. (B) 1D oriented energy for the vertical orientation. (C) 1D local energy as the sum of 1D oriented energy over all �� orientations. (D) 2D oriented energy for the vertical orientation, calculated by applying the same oriented energy filter as was used to extract (B), only rotated by 90° to the orthogonal orientation. (E) The 2D local energy formed as the sum of all �� 2D oriented energy maps. (F) The �� 2D features extracted from (E) using a �x� window to find 2D maxima and a threshold of �� of the global maximum 2D local energy value. Image size is ��x��.

The steps used to calculate 2D local energy are similar to those used by Heitger in his key-point detection scheme, with the main difference being the method for extracting features from the 1D oriented energy map. 2D local energy detects 2D features within an orientation channel by re-applying the oriented energy operator, as opposed to Heitger's method of using 1D filters to estimate first and second derivatives of the 1D oriented energy maps and then inhibiting characteristic false responses that are caused by 1D filters [��]. This is an important difference because it means that the 2D local energy model does not use any inhibition whatsoever.

Figure �� demonstrates the difference between the key-point scheme and 2D local energy. Figure �(A) shows the output of 2D oriented energy on the test image in Figure �(A), and the combined end-stopped response before the inhibition stage using the key-point scheme (B). Note the false responses to 1D features caused by using 1D filters, compared to the 2D oriented energy response, which does not exhibit this problem and so avoids the necessity of an inhibition step.

A further advantage of the 2D local energy model is that it is conceptually far simpler than Heitger's scheme because it exploits the model of 2D image features described at the start of this section: 2D features are points where locally there is 2D order in the phase domain of the image signal. The benefits of looking for points of 2D order in the phase domain of the image signal are twofold. Firstly, detection is simplified by looking for order in the phase domain of the image as opposed to grey-level variation. Secondly, by using a model that is not based on the spatial grey-level data, all types of 2D grey-level variations are able to be detected as 2D image features.


Figure �� (A) The 2D oriented energy of Figure �(A) for the orientation �� from the horizontal. (B) The combined end-stopped response at the same orientation using Heitger's key-point detection scheme before inhibition is performed. Image size is ��x��.

�� Idempotency of �D Local Energy

The 2D local energy model involves the successive application of orthogonal pairs of oriented energy operators in order to extract 2D image features. As discussed in Section ��, Owens et al. [��] showed that the operators used to extract 1D features in orthogonal orientations are idempotent. Therefore 2D local energy, which is calculated by successive application of these operators, is idempotent. That is, successive application of 2D local energy will produce the same 2D image features. Most 2D feature detectors that are based on grey-level corners do not have this property because they do not model spot features, and so they are unable to detect these features when applied to their own output. Morphological transformations are again the exception to this guideline, as the opening-closing residue corner detector of Noble [��] is capable of detecting spot features in its own output.

The second order idempotency of the 2D local energy operation is demonstrated in Figure ��, where, with the exception of one feature near the bottom of the triangle in the top-left corner, identical 2D image features are detected after one pass (A) and three passes (B) of 2D local energy. After successive applications of 2D local energy this point is not classified as a 2D image feature, because the response of the strong feature just below interferes with the response at this point. This is due to filter resolution limitations discussed in Section ��.

(A) (B)

Figure �� A demonstration of the second order idempotency of �D local energy� The �D imagefeatures detected after one pass A�� and three passes B�� of �D local energy for the horizontalorientation channel� � �� pixels� and the image size is ��x�� pixels�

One should note that the type of feature is not necessarily preserved by an idempotent feature detector. This is demonstrated by Figure ��, where the original image is composed of both step and roof features (Figure �(A)), while the features detected after three passes (Figure �(B)) are all roof edges. The result for 2D image features is similar: on successive application of the 2D local energy model, all features are spot features.

�� A Uni�ed Approach to Feature Extraction

An important aspect of the 2D local energy model is that it provides a unified approach to feature extraction. While it has long been recognized that all 2D image features are also 1D image features, until now there has been no effort to extract 2D image features from 1D feature data in the same way that the 1D features are originally extracted from the input image. In doing so, a hierarchy of image features is clearly established. Working at the pixel level, 1D image features are clearly a subset of the set of all points in an image, and 2D image features are a subset of the 1D image features.

This notion of a hierarchy of image features can be extended to 3D and higher image data. In the case of spatial 3D data, 1D features correspond to the surfaces in the image data, the edges of these surfaces and line-like structures correspond to 2D image features, and the junction of three or more edges or lines, the end-points of lines, or localized points in the image data correspond to 3D image features. For a sequence of 2D images, 3D image features are simply 2D spatial features that have some significant temporal variation. By this definition these 3D features are a subset of the static 2D image features.

The local energy model is also able to naturally extend to the detection of 1D, 2D and 3D image features in 3D data using oriented energy filters. This has already been implemented for the detection of 1D image features in spatial 3D image data [��] using a single pass of oriented energy over several orientations, and is discussed in detail in Section ��. Detection of higher order features in 3D data is based on exactly the same concept as detection of 2D features in 2D image data, with one, two and three passes required for the detection of 1D, 2D and 3D features in 3D image data respectively.

�� Summary

In this chapter a method for the detection of 2D features was developed, based on a model of 2D features as points of maximal 2D phase congruency in the frequency domain of the image signal. This model treats 2D image features simply as 1D features in a 1D oriented energy map, and so unifies the detection of 1D and 2D image features. This model is also significant because it provides a definition of 2D features that, for the first time, is able to represent all types of 2D intensity variations that make up these features.


This chapter illustrated the idempotent nature of feature detection via 2D local energy. While this is an important property for feature detection, it can be simply shown that most other 2D feature detectors are not idempotent operators since they are unable to detect spot features.

The model for the detection of 2D image features developed in this chapter unifies the detection of image features. This model also introduces the concept of a hierarchy of image features and explains how the model can be simply extended to the detection of 1D, 2D and 3D features in 3D image data.

In the following chapter the implementation of 2D local energy is discussed and its performance examined in terms of detection and localization of 2D image features.

Chapter �

Implementation of 2D Local Energy

The concept of 2D local energy is straightforward. For each orientation, the 1D oriented energy filters are applied to the image, firstly at that orientation and then in the orthogonal orientation to the 1D oriented energy output. The 2D oriented energy responses are then summed to produce 2D local energy. However, implementation of 2D local energy raises several issues that must be addressed, particularly with regard to the filters. Care must be taken in the design of the filters, and in the choice of the number of orientations at which to apply them, to produce the best results.

For efficiency, the 2D local energy model is implemented using computations both in the frequency and the spatial domain. Specifically, convolutions are performed in the frequency domain (as complex multiplications) and the results are transformed back into the spatial domain for the oriented energy calculation. Therefore an initial fast Fourier transform (FFT) is required for the input image and then, for each orientation channel, four inverse FFTs and one forward FFT are required; this is two inverse FFTs after each application of the quadrature filters and one forward FFT of the 1D oriented energy map into the frequency domain. Therefore, for N orientations, the number of FFTs required to calculate 2D local energy is $1 + 5N$.
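The two-pass procedure and its FFT count can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the thesis implementation: the radial profile is a plain Gaussian band-pass standing in for the S-Gabor, and the names `quadrature_pair`, `oriented_energy`, `local_energy_2d`, `f0` and `bw` are hypothetical. The counting wrappers are there only to confirm that the sketch uses $1 + 5N$ transforms.

```python
import numpy as np

# Counters so the number of transforms can be checked against 1 + 5N.
fft_calls = {"forward": 0, "inverse": 0}

def fft2(a):
    fft_calls["forward"] += 1
    return np.fft.fft2(a)

def ifft2(a):
    fft_calls["inverse"] += 1
    return np.fft.ifft2(a)

def quadrature_pair(shape, theta, m=3, f0=0.15, bw=0.07):
    """Hypothetical polar-separable even/odd filter pair in the frequency domain."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    r = np.hypot(fx, fy)
    c = np.cos(np.arctan2(fy, fx) - theta)
    # Assumption: Gaussian band-pass radial profile used in place of the S-Gabor.
    radial = np.exp(-((r - f0) ** 2) / (2 * bw ** 2))
    radial[0, 0] = 0.0                    # force a zero DC component
    even = radial * c ** (2 * m)          # real, symmetric spectrum
    odd = -1j * np.sign(c) * even         # imaginary, antisymmetric spectrum
    return even, odd

def oriented_energy(spectrum, theta, **kw):
    """One pass: two inverse FFTs, then the modulus of the quadrature outputs."""
    even, odd = quadrature_pair(spectrum.shape, theta, **kw)
    e = ifft2(spectrum * even).real
    o = ifft2(spectrum * odd).real
    return np.sqrt(e ** 2 + o ** 2)

def local_energy_2d(image, n_orient=6, **kw):
    spectrum = fft2(image)                                   # 1 forward FFT
    total = np.zeros(image.shape)
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        first = oriented_energy(spectrum, theta, **kw)       # 2 inverse FFTs
        second = oriented_energy(fft2(first),                # 1 forward FFT
                                 theta + np.pi / 2, **kw)    # 2 inverse FFTs
        total += second
    return total
```

Calling `local_energy_2d(image, n_orient=N)` on a 2D float array returns a map of the same shape, and the two counters together then hold $1 + 5N$.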

All these transforms between the spatial and frequency domain are expensive in terms of the processing power that they require. It would be possible to convert the filters into the spatial domain and then do all of the processing there. However, the difficulty in accurately representing the filters in the spatial domain and the high cost of convolution with the large filters that are required negate the savings that are made by eliminating the FFTs. Furthermore, recent optimization of the method used to apply and store the filters has reduced the computational effort considerably. This optimization is described in detail in Appendix A. For a ��x�� pixel image with �� grey levels, 2D feature detection over ten orientation channels requires under �� seconds of CPU time on a DEC Alpha ��!��S. Over ��% of this time is spent performing FFTs. This is implemented using the Visual Image Processing (VIP) [��] library of the Department of Computer Science at The University of Western Australia.

The next few sections discuss some of the issues in the implementation of 2D local energy, paying particular attention to the design and application of the 2D filters.

�� Issues in Filter Design

The fact that 2D features are extracted directly from the 2D local energy map means that the design of the oriented energy filters is critical to the performance of this method. The two most important criteria are the ability of the filters to extract features from as broad a range of scales as possible and, secondly, to respond only to features closely aligned with the orientation of the filter in order to avoid false positive responses to 1D image features. This seems to indicate that broad bandwidth, highly orientation selective filters are required.

However, there are several other considerations in the selection of oriented energy filters. The first consideration is computational complexity. As the orientation selectivity of the filters increases, so does the number of orientations required to produce an even filter coverage of all orientations. Furthermore, as the orientation selectivity of the energy filters increases, so does the width of the filters in the spatial domain. This heightens the blurring effect of the filters on the input image and therefore decreases the effective resolution of the energy filters.

The increased blurring of the oriented energy output caused as the orientation selectivity of the oriented energy filters increases is demonstrated in Figure ��. Note also that while the false responses to 1D features not aligned with the orientation of the energy filters decrease as the orientation selectivity increases, the spatial size of the filter response perpendicular to the orientation of the filter increases.

(A) (B)

Figure ��: A demonstration of the blurring effect of increasing the orientation selectivity of the oriented energy filters. The 1D oriented energy responses to Figure ��(A) for the orientation channel �� from the vertical using highly orientation selective filters (A) and less orientation selective filters (B). The highly orientation selective filters respond less to �D image features, although producing a larger spatial response to �D image features than the less orientation selective filters. Images are ��x�� pixels.

It is clear then that highly orientation selective filters are able to discriminate features tuned to the orientation of the oriented energy filters more precisely than less orientation selective filters. Therefore highly orientation selective filters are able to eliminate all false positive responses to 1D image features through successive attenuation of the response to these features on both the first and second passes of oriented energy. However, this ability comes at the cost of requiring more orientations and, more importantly, increased blurring of the oriented energy maps, particularly as a consequence of the filters being re-applied to the oriented energy images for the second pass in order to find 2D image features. Obviously some tradeoff must be made between the degree of response to 1D image features and the amount of blurring caused by the filters. The issue of blurring of the energy maps and its relationship with the bandwidth and orientation selectivity of the energy filters is discussed in the following subsections.

�� Orientation Selective S-Gabor Filters

The filters used in the implementation of 2D local energy presented in this thesis are the 2D Stretched-Gabor (or S-Gabor) filters developed by Heitger et al. [��]. The S-Gabor functions have similar spatial filtering properties to the Gabor functions but have no DC component. These filters are constructed in frequency space and are 2D polar separable, with the radial variation provided by an S-Gabor and the angular variation provided by the power of a cosine function. The S-Gabor determines the response characteristics of the energy filters, while the power of a cosine function determines the orientation selectivity of the oriented energy filters. 1D plots of the S-Gabor function are given in Figure ��, while the power of a cosine function is illustrated in Figure ��.

For bandwidths greater than about one octave, the even-symmetric Gabor function has a non-zero DC component (i.e. non-zero mean in the spatial domain), which makes it sensitive to changes in the absolute intensity of an image. A zero DC component for both filters in an energy pair is a requirement for them to be in quadrature, so this makes the standard Gabor function useless if a high bandwidth is required. The S-Gabor function as defined by Heitger et al. is a variation of the Gabor function that has had the DC component of the even-symmetric function removed, and so allows the effective use of filters with bandwidths greater than one octave. For small bandwidths (about one octave or less) the S-Gabor functions approach Gabor functions.
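The DC problem of the standard even-symmetric Gabor can be checked numerically. The sketch below assumes the usual 1D form $e^{-x^2/(2\sigma^2)}\cos(2\pi v_0 x)$, whose integral has the closed form $\sigma\sqrt{2\pi}\,e^{-2\pi^2\sigma^2 v_0^2}$; the function names and the example parameter values are illustrative and are not taken from the thesis. A small product $\sigma v_0$ corresponds to a broad bandwidth, which is where the DC term stops being negligible.

```python
import numpy as np

def even_gabor_dc(sigma, v0, half_width=12.0, step=0.01):
    """Riemann-sum estimate of the integral of exp(-x^2 / (2 sigma^2)) * cos(2 pi v0 x)."""
    x = np.arange(-half_width * sigma, half_width * sigma + step, step)
    g = np.exp(-x ** 2 / (2 * sigma ** 2)) * np.cos(2 * np.pi * v0 * x)
    return float(np.sum(g) * step)

def even_gabor_dc_closed_form(sigma, v0):
    """Closed form of the same integral (Fourier transform of a Gaussian)."""
    return sigma * np.sqrt(2 * np.pi) * np.exp(-2 * np.pi ** 2 * sigma ** 2 * v0 ** 2)

# Illustrative values only: same envelope, two different centre frequencies.
broad = even_gabor_dc(sigma=4.0, v0=0.05)    # few cycles under the envelope
narrow = even_gabor_dc(sigma=4.0, v0=0.25)   # many cycles under the envelope
print(f"broad-band DC  = {broad:.4f}")
print(f"narrow-band DC = {narrow:.2e}")
```

The broad-band case leaves a DC term of the same order as the envelope area, while the narrow-band case is effectively zero, which is the behaviour the S-Gabor is designed to correct.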

Heitger generates functions of zero mean by introducing a gradual decrease in the frequency of the sine and cosine components of the Gabor functions as the distance from the origin increases. This gives the following equations for the S-Gabors:

$$R(r) = G_{odd}(r) = e^{-r^2 / (2\sigma^2)} \sin\left(2\pi v_0\, r\, \xi(r)\right)$$

for the odd-symmetrical function and

$$R(r) = G_{even}(r) = e^{-r^2 / (2\sigma^2)} \cos\left(2\pi v_0\, r\, \xi(r)\right)$$


(A) (B)

(C) (D)

Figure ��: S-Gabor functions. A comparison of both even- and odd-symmetric S-Gabor functions to the standard Gabor functions for a bandwidth of approximately �� octaves. (A) Odd-symmetric S-Gabor (solid lines) and Gabor (dashed lines) functions. (B) Even-symmetric S-Gabor and Gabor functions. (C) Fourier spectra (imaginary part) of the odd-symmetric functions. (D) Fourier spectra (real part) of the even-symmetric functions. Note the non-zero DC component of the even-symmetric Gabor function in (D).


[Plots (A), (B) and (C): vertical axis from 0 to 1, horizontal axis from -1.5 to 1.5]

Figure ��: The power of a cosine function $\cos^{2m}(x)$ for (A) m = �, (B) m = �, and (C) m = �.


for the even-symmetrical function, where $\sigma$ determines the width of the Gaussian envelope in pixels, $v_0$ is the frequency at the origin and $\xi(r)$ specifies the frequency sweep, given by the function

$$\xi(x) = k\, e^{-\lambda x^2 / \sigma^2} + (1 - k)$$

where � is chosen numerically so that the even-symmetric S-Gabor integrates to zero. Expanding the filters into 2D is achieved using a power of a cosine function to express the angular variation, given by

$$F(r, \theta) = R(r) \cdot \cos^{2m}(\theta)$$

where $\theta$ determines the orientation of the filter, the positive integer $m$ determines the orientation selectivity of the filter, $F(r, \theta)$ is the filter in polar coordinates and $R(r)$ is defined above for the even- and odd-symmetrical functions.

The 2D S-Gabor filters given by Equation �� are shown in Figure ��. The Fourier spectra of the even- and odd-symmetric filters are shown above the spatial representation of these filters. For filtering images, the width of the Gaussian envelope σ should be chosen so that the filter responds to features at a broad range of scales. As always there is a tradeoff here, with both the undesired blurring due to the filters and the usually desired response to large scale features increasing with σ. Furthermore, as the filters are applied in the frequency domain, σ must be chosen so that the filters can be accurately represented in this domain. Specifically, one must ensure that the filter goes to zero before the Nyquist frequency (the edge of the Fourier spectra images in Figure ��) to avoid "ringing" when the filter is transformed to the spatial domain.

Polar separability of the Fourier spectrum of the oriented energy filters ensures that the spatial frequency tuning of the filters is independent of the orientation of the 1D variation being measured. This ensures that the filters combine to produce a local energy output for features regardless of their orientation, although naturally this energy response is attenuated for features not aligned with the energy filters. A further advantage of using polar separable filters is that the orientation selectivity of a filter is independent of its radial profile. Therefore the radial variation of a filter (and thus its shape) can be altered without affecting the number of orientations required to smoothly cover all orientations. This thesis makes use of this property of the oriented energy filters in the following section.

(A) (B)

(C) (D)

Figure ��: 2D S-Gabor filters. Fourier spectra of (A) the even-symmetric, and (B) the odd-symmetric S-Gabor filters. Note that the real part of the even-symmetric filter and the imaginary part of the odd-symmetric filter are shown. (C) The even-symmetric 2D S-Gabor filter, and (D) the odd-symmetric S-Gabor filter. The background grey levels represent zero, with larger negative and positive values corresponding to darker and lighter shades respectively. Note that the Fourier spectra tend to zero before the Nyquist frequency at the border of images (A) and (B). The frequency representations, (A) and (B), are ��x�� pixels. The spatial representations, (C) and (D), are ��x�� pixels.

�� Obtaining a Uniform Filter Coverage of All Orientations

It is desirable that the strength of the 2D local energy response be independent of filter orientation. Therefore, for filters of a given orientation selectivity, it is important that they are applied at enough orientations to provide a response with no anisotropies. Furthermore, the degree of blurring introduced by the filters increases with their orientation selectivity, and so the orientation selectivity of the filters must be chosen carefully to minimize blurring.

Therefore this thesis approaches the problem of determining the orientation selectivity and the number of orientations required by firstly choosing filters with an appropriate degree of orientation selectivity, and then determining the minimum number of orientations at which these filters must be applied to obtain an even coverage of all orientations.

�� Setting the Orientation Selectivity of the Energy Filters

Given the conflicting goals of minimizing the blurring introduced by the energy filters and making the filters highly orientation selective so as to reduce the response to features at offset orientations, it is important that the orientation selectivity of the filters is set such that the maximal residual response to a feature not aligned to the orientation of the filter is small. Choosing a level of orientation selectivity that achieves this requires empirical observation of real images combined with a statistical analysis of the filter responses. As the filters are polar separable, the orientation selectivity of the 2D S-Gabor oriented energy filters given in Equation �� is determined entirely by the power of a cosine function, since this gives the angular variation.

As the oriented energy filters are successively applied at orthogonal orientations, the residual 2D oriented energy response to a 1D image feature is given by

$$\cos^{2m}(\theta)\, \cos^{2m}(\theta - 90^\circ)$$

where $\theta$ is the angle between the orientation tuning of the filters and the normal to the edge. This simplifies to

$$\cos^{2m}(\theta)\, \sin^{2m}(\theta)$$

The maxima of this function occur at $\theta = \pm 45^\circ$ in the range $-90^\circ$ to $90^\circ$; therefore the maximal residual 2D oriented energy response to a 1D image feature will occur when the orientation of the feature is 45° from that of the oriented energy filters. The 1D oriented energy residual response to an ideal 1D step feature at 45° is given by the power of a cosine function (since cosx�� x�) and is shown for various values of orientation selectivity (m in Equation ��) in Table �.

The 2D oriented energy residual is given by Equation ��, since the filters are applied at orthogonal orientations. However, for the case $\theta = 45^\circ$, Equation �� reduces to $\cos^{4m}(\theta)$ to give the residuals shown in Table �. However, these residuals assume that a 1D feature can be constructed that gives a maximum response to both the first and second pass of oriented energy, which is clearly impossible for 2D filters since this would essentially require an ideal feature of equivalent spatial width as the filters in orthogonal orientations. Therefore the bottom row of Table � contains 2D oriented energy residuals obtained from real images. These measure the response to an ideal 1D feature 45° from the orientation of the oriented energy filters as a percentage of an ideal 90° L-junction of identical contrast, also aligned 45° from the orientation of the filters.
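The theoretical residuals follow directly from the angular term. As a rough check, the sketch below evaluates them numerically, assuming the angular profile $\cos^{2m}(\theta)$ and the identity $\cos^{2m}\theta\,\sin^{2m}\theta = (\tfrac{1}{2}\sin 2\theta)^{2m}$, which peaks at $\theta = \pm 45^\circ$ with value $4^{-m}$. The range of m shown and the function names are illustrative assumptions, not values read from the table.

```python
import numpy as np

def residual_1d(m, theta):
    """Single-pass response of a cos^(2m) angular profile to an edge offset by theta."""
    return np.cos(theta) ** (2 * m)

def residual_2d(m, theta):
    """Two passes at orthogonal orientations: cos^(2m)(theta) * sin^(2m)(theta)."""
    return np.cos(theta) ** (2 * m) * np.sin(theta) ** (2 * m)

theta = np.linspace(-np.pi / 2, np.pi / 2, 1801)   # 0.1 degree steps
for m in range(1, 6):                              # illustrative range of m
    worst = abs(theta[np.argmax(residual_2d(m, theta))])
    print(f"m={m}: worst offset = {np.degrees(worst):.1f} deg, "
          f"1D residual = {100 * residual_1d(m, worst):.2f}%, "
          f"2D residual = {100 * residual_2d(m, worst):.3f}%")
```

Under this assumption the 1D residual at 45° is $2^{-m}$ and the 2D residual is $4^{-m}$, so each unit increase in m halves the former and quarters the latter.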

Note that for successively increasing values of m, the theoretical 2D residual decreases to 25% of its previous value while the measured values halve. Although the measured residual values decrease to approximately 25% of their value at each iteration, the maximum measured value halves due to the broader spatial width of the oriented energy filters. Therefore, as these tables show, the residual expressed as a percentage of the maximum value does not drop as quickly in real images as the theoretical ideal.


Orientation selectivity      m = �   m = �   m = �   m = �   m = �
1D residual (theoretical)    ��%     ��%     ��%     ��%     ��%
1D residual (measured)       ��%     ��%     ��%     ��%     ��%
2D residual (theoretical)    ��%     ��%     ��%     ��%     ��%
2D residual (measured)       ��%     ��%     ��%     ��%     ��%

Table �: Oriented energy residual responses to an ideal step edge for various levels of orientation selectivity. The edge is aligned 45° from the tuning of the oriented energy filters.

Observation of the application of these filters to many images indicates that an orientation selectivity of m = � produces the largest acceptable level of residual response, which in turn results in reduced blurring without introducing false responses. The residual 2D oriented energy response to 1D image features for this degree of orientation selectivity, shown in Table �, is a little over �%. Empirical evidence suggests that this is sufficient to eliminate characteristic false responses to 1D image features.

�� Determining the Number of Orientations at which to

Apply the Energy Filters

Having determined a satisfactory orientation selectivity of the oriented energy filters, the next step is to determine the number of orientations required to provide an even filter coverage of all orientations. For an ideal 1D step edge, the contribution of each filter is simply given by the angular variation term (that is, the power of a cosine), using the angle between the normal to the edge and the filter orientation as the argument. The local energy response for a given number of filter orientations over all feature orientations is obtained by simply summing the contribution of each filter. For the case m = 1 only two (orthogonal) orientations are required, since the local energy response is given by $\cos^2(\theta) + \cos^2(\theta + 90^\circ)$, which simplifies to $\cos^2(\theta) + \sin^2(\theta) = 1$. So two orientations provide a perfectly isotropic response. For an orientation selectivity of m = � this analysis shows that only six orientations are required, since this provides a constant response over all orientations. That is,

cos�� cos�� cos��

cos�� cos��

The response remains isotropic if the number of filter orientations is increased beyond the minimum required number. Therefore an even response over all orientations is provided for both eight and ten filter orientations if the orientation selectivity is set to m = �.
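The isotropy argument can be tested numerically for any combination of selectivity and number of orientations. The sketch below assumes each filter contributes $\cos^{2m}$ of the angle between the edge normal and the filter orientation, with the N orientations spaced evenly over 180°; the helper names are hypothetical. Expanding $\cos^{2m}$ into harmonics $\cos(2p\theta)$ with $p \le m$ shows that the sum is constant exactly when $N > m$, and the constant is $N\binom{2m}{m}/4^m$.

```python
import numpy as np
from math import comb

def summed_response(m, n_orient, theta):
    """Sum of cos^(2m) contributions from n_orient evenly spaced filters."""
    centres = np.arange(n_orient) * np.pi / n_orient
    return np.sum(np.cos(theta[:, None] - centres[None, :]) ** (2 * m), axis=1)

def is_isotropic(m, n_orient, tol=1e-9):
    """True if the summed response is constant over all edge orientations."""
    theta = np.linspace(0.0, np.pi, 721)
    s = summed_response(m, n_orient, theta)
    return bool(s.max() - s.min() < tol)

def isotropic_level(m, n_orient):
    """Constant value of the summed response when n_orient > m."""
    return n_orient * comb(2 * m, m) / 4 ** m

for m in (1, 3):                       # illustrative selectivities
    for n in range(2, 7):
        print(f"m={m}, N={n}: isotropic={is_isotropic(m, n)}")
```

For m = 1 this reproduces the two-orientation case above, with a constant response of 1; adding orientations beyond the minimum keeps the sum constant and only changes its level.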

Empirical experimentation has shown that �� orientations produce a more satisfactory response for energy filters with this degree of orientation selectivity. The reason for this is that the response from any single orientation channel makes up a smaller proportion of the summed responses, which results in smoother 1D and 2D local energy maps, particularly along aliased edges in synthetic images. Therefore filters with an orientation selectivity of m = � over �� orientations are used as the default in all of the results in this thesis.

Note that the residual responses for 1D and 2D oriented energy given in Table � are worst-case responses for a single orientation. In practice the residual response to 1D image features in the 2D local energy map is smaller, since at most orientations the residual response is far less than the 45° worst-case orientation shown in Table �. Using �� orientations and an orientation selectivity of m = �, the maximum residual 2D local energy response to 1D image features is ��% of the response to a 90° L-junction of the same contrast.

�� Inappropriateness of Steerable Filters for 2D Local Energy

The use of steerable filters has the potential to reduce the amount of computation required for many applications that require filters at many orientations. Unfortunately, calculation of 2D local energy is not such an application. This is primarily because of the manner in which the second pass of oriented energy is performed.


Firstly, if the filters are applied at N orientations for the first pass of oriented energy, then there are N separate input images for the second pass. One of the primary advantages of using steerable filters is that once the basis filters have been applied to an image, the result of applying a rotated version of the filter at any orientation can be synthesized from the outputs of the basis set. For 2D local energy, if N basis filters are required then the second pass will require applying these N basis filters to N 1D oriented energy maps. This means that $N^2$ convolutions would be required using steerable filters, compared to the N convolutions required by the current implementation.

Secondly, for the second pass of oriented energy, exactly one oriented energy pair of filters is applied at a known orientation to each of the N 1D oriented energy maps, and this is clearly done most efficiently by calculating this filter directly and then applying it to the 1D oriented energy map. In essence, steerable filters are not required, as the filters do not need to be steered to any given orientation because 2D image features have no "preferred" orientation. Therefore no benefit is derived from using steerable filters to calculate 2D local energy.

The implementation of 2D local energy presented in this thesis requires the application of 1D oriented energy filters at a given number of orientations without any interpolation of results for other orientations, and with the same filters often being reused for different images, possibly even for both the first and the second pass of oriented energy for a given image. Therefore it is far more efficient to pre-calculate the filters off-line and load the necessary filters when required than to generate the filters from scratch. This approach reduces the overhead for calculating and applying the filters.

�� Sharpening the 2D Local Energy Response

Blurring introduced by orientation selective filters may lead to weak features being "swamped" by nearby stronger features and therefore not being detected. This is illustrated in Figure ��, where a clear but weak feature is not detected due to the presence of a very strong feature nearby. Examination of the 2D local energy map shows that there is a significant response to the weaker feature, but because the ramp down from the strong 2D feature is greater at the feature than the 2D local energy response to the feature itself, it is not a local maximum and so is not classified as a 2D image feature.
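The swamping effect can be reproduced with a one-dimensional toy model. The sketch below is not the thesis's detector: it simply models the energy response to each feature as a Gaussian bump whose width stands in for the amount of blurring, and all positions, amplitudes and widths are made-up illustrative values. With a narrow response the weak feature is its own local maximum; with a wide response it sits on the falling tail of the strong feature and disappears from the list of maxima.

```python
import numpy as np

def local_maxima(signal):
    """Indices of samples strictly greater than both neighbours."""
    inner = signal[1:-1]
    return np.flatnonzero((inner > signal[:-2]) & (inner > signal[2:])) + 1

def blurred_response(x, width, strong=(0.0, 10.0), weak=(10.0, 1.0)):
    """Toy energy profile: one Gaussian bump per (position, amplitude) feature."""
    total = np.zeros_like(x, dtype=float)
    for position, amplitude in (strong, weak):
        total += amplitude * np.exp(-((x - position) ** 2) / (2 * width ** 2))
    return total

x = np.arange(-30, 41, dtype=float)
narrow = x[local_maxima(blurred_response(x, width=2.0))]
wide = x[local_maxima(blurred_response(x, width=6.0))]
print("maxima with little blurring:", narrow)   # both features found
print("maxima with heavy blurring: ", wide)     # weak feature is swamped
```

In the heavily blurred case the weak feature still contributes to the profile, but the slope of the strong feature's tail exceeds the slope of the weak bump everywhere, so no local maximum forms at its location.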

(A) (B) (C)

Figure ��: A demonstration of how blurring in the oriented energy maps can lead to missed 2D features. (A) A synthetic test image. (B) The 2D local energy response to the image. (C) The extracted 2D image features overlayed on a reduced contrast version of the original image. Note that one corner of the grey square near the centre of the image is not detected due to the response to the higher contrast L-junction nearby. This is illustrated in the 2D local energy map (B), where the tail of the response to the stronger L-junction causes the response to the weaker feature not to be a local maximum. Energy maps are generated using filter parameters of σ = �� pixels, and m = � over �� orientations. Images are ��x�� pixels.

Another problem that excessive blurring can cause is the introduction of a false positive response to "negative 2D image features". These features can occur with large amounts of blurring if the combined response of a pair of nearby features is greater than their individual responses. This results in a false response to a "negative 2D image feature" between the two real features.

Such false responses are called "negative 2D image features" due to the fact that they are detected as a response to a local minimum in the oriented energy map on the second pass of oriented energy. How this occurs is demonstrated in Figure ��. In this figure, the two closely spaced features in the input image (the vertices of the dark triangles) are extracted as local maxima in the 1D oriented energy map tuned to horizontal image features. However, when oriented energy is applied for the second time in order to detect 2D image features in this map, the horizontal inverted roof profile created by the proximity of the responses to the 2D image features causes a stronger response than the step features on either side of it at the image feature locations, and so a 2D feature is detected between the locations of the 2D image features. Although it is natural to expect that the energy response to a feature is independent of a �� phase shift to the feature in an input image, the same does not hold true when the input image is an energy image. A 1D oriented energy map indicates the strength of 1D features at a given location and orientation in an image. Therefore, ideally we do not wish to label local minima in this map as 2D image features.

(A) (B) (C) (D)

Figure ��: A demonstration of how "negative" 2D image features occur. The input image (A) contains two 2D features of similar strength and minimal separation, formed at the vertices of the dark triangles. The horizontal 1D oriented energy map (B) shows the features resolved to two distinct features. (C) The 2D oriented energy map obtained by applying vertical oriented energy filters to (B) shows that the two features are labeled as a single maximum, due to the large absolute response of the even-symmetric filter to the inverted roof feature that occurs between the maxima in the 1D oriented energy image. The false response to this "negative" 2D image feature is shown overlayed upon the original image in (D).

It is clear that both the false positive and false negative responses described in this section are caused by blurring of the energy maps by the oriented energy filters themselves. In this section we attempt to reduce these problems by examining several methods to reduce the amount of blurring in the 2D local energy map before extracting the 2D image features from this map. Two main strategies have been employed in attempting to reduce this blurring: the first is to vary the oriented energy filters for the first and second pass; the second is to post-process the oriented energy and local energy maps in order to sharpen the response to image features.


�� Varying the Energy Filters for the 1st and 2nd Pass of Oriented Energy

So far this thesis has assumed that the most appropriate filters to implement 2D local energy are broad-bandwidth, with appropriate orientation selectivity in order to minimize responses to 1D image features. These requirements are well met by the 2D S-Gabor filters outlined in Section ��. While this certainly seems to be true for detecting features in the raw input image, it is by no means clear that these filters are the best for extracting 2D features from the 1D oriented energy maps.

The primary difference between an input image and a 1D oriented energy map is that the former typically contains features over a wide range of scales, while the latter has already been blurred by applying the oriented energy filters and so does not exhibit such a difference. Therefore large bandwidth becomes less important than reducing blurring on the second pass of oriented energy. In other words, broad bandwidth filters are important in order to detect features at a range of scales in the first pass of oriented energy, while filters that have a small spatial extent, and consequently reduce blurring, are desirable for the second pass of oriented energy.

The blurring in the 2D oriented energy maps can be reduced by using different filters for the second pass of oriented energy than for the first pass. By choosing filters for the second pass of oriented energy that have a smaller spatial extent, the blurring is reduced and consequently the ability to resolve the local energy response to the correct image features is enhanced. However, it is impossible to construct arbitrarily small (in spatial extent) filters in the frequency domain, as the high frequencies required to construct such filters are cut off at the Nyquist frequency, effectively limiting the size of the filters that may be constructed in the frequency domain without introducing "ringing".

To produce oriented energy filters of small spatial extent in the frequency domain without introducing ringing, the filters should be as broad as possible in the frequency domain while smoothly tailing off to zero at the Nyquist frequency. Truncated Gaussian filters are used for this purpose, as the smoothness of the Gaussian function produces the broadest possible frequency range (which translates to a narrow spatial range) without introducing ringing. The radial variation functions in Equations �� and �� are replaced with Gaussian functions centred at half the Nyquist frequency and shifted downwards so that the function is zero at the Nyquist frequency. The Gaussian centred at half the Nyquist frequency is given by

$$R(r) = e^{-\left(r - \mathrm{Nyquist}/2\right)^2 / (2\sigma^2)}$$

where $\sigma$ specifies the size of the Gaussian and Nyquist is the maximum discrete frequency. The truncated Gaussian function is obtained by subtracting the value of R at the Nyquist frequency from R to ensure no ringing. This gives the equation for the truncated Gaussian:

$$R'(r) = R(r) - R(\mathrm{Nyquist})$$
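A minimal sketch of this radial profile is given below. It assumes frequencies are expressed in cycles per pixel, so that the Nyquist frequency is 0.5 (the convention used by `numpy.fft.fftfreq`), and the value of `sigma` is purely illustrative; the function name is hypothetical. Because the Gaussian is centred at half the Nyquist frequency, it takes the same value at zero frequency as at the Nyquist frequency, so the subtraction also leaves the profile with no DC component.

```python
import numpy as np

def truncated_gaussian(r, sigma, nyquist=0.5):
    """Gaussian centred at nyquist / 2, shifted down to reach zero at nyquist."""
    def gaussian(v):
        return np.exp(-((v - nyquist / 2) ** 2) / (2 * sigma ** 2))
    return gaussian(np.asarray(r, dtype=float)) - gaussian(nyquist)

r = np.linspace(0.0, 0.5, 101)               # radial frequencies from 0 to Nyquist
profile = truncated_gaussian(r, sigma=0.1)   # illustrative width only

print("value at DC:      ", profile[0])
print("value at Nyquist: ", profile[-1])
print("peak frequency:   ", r[np.argmax(profile)])
```

In a 2D frequency grid the corner frequencies exceed the Nyquist radius and would make this profile slightly negative, so a practical version would presumably clip those values to zero; that step is an assumption and is not described in the text.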

Although these filters provide a compact response, they only marginally reduce the amount of blurring in the 2D oriented energy images, since most of the blurring occurs on the first pass of oriented energy. Consequently, even the use of very small filters for the second pass of oriented energy has only a minor impact on the 2D local energy response in real images. Therefore this approach is abandoned in favour of performing some post-processing operations on the energy maps in order to reduce blurring.

�� Post-processing the Energy Maps to Reduce Blurring

All of the problems with the implementation of 2D local energy discussed so far can be attributed to insufficient spatial resolution of the oriented energy filters. The blurring introduced by the use of these filters is compounded for 2D local energy because the filters are applied successively. Some of the various methods of post-processing the energy maps in an attempt to "sharpen" the response to features in the 2D local energy map, and so increase the resolution of this implementation of 2D local energy, are outlined in this section.

Heitger �� reduces the size of the spatial response to features in the 1D oriented energy ("C-operator") maps, in order to sharpen the key-point response to image features, by using orthogonal derivatives of this map to inhibit the response to surrounding features. Using this approach Heitger is able to substantially reduce the width of the spatial response to 2D image features, as illustrated in Figure ��. In Figure ��(A) the spatial extent of the response to 2D image features in Heitger's key-point map is clearly less than that of the 2D local energy map using the same filter set, shown in Figure ��(B).

(A) (B)

Figure ��: The difference in the size of the spatial response to 2D image features between Heitger's key-point scheme and 2D local energy. (A) Heitger's key-point map in response to the image in Figure ��(A). (B) The 2D local energy map for the same input image using the same oriented filters. Note the much larger spatial response to 2D image features in the 2D local energy map (B) compared to the key-point map (A). This difference is partly due to the fact that the oriented energy filters are applied twice for each orientation in 2D local energy, which increases the spatial size of the response in (B), and also because the spatial size of the response in Heitger's scheme is thinned using an inhibition scheme based on orthogonal derivatives.

There are several different methods for sharpening the 2D local energy map using inhibition schemes similar to that of Heitger. A simple means of sharpening the 2D local energy map is to generate an inhibition map by taking orthogonal first derivatives of the 1D oriented energy maps and subtracting this weighted inhibition map from the 2D local energy map. A modification to this method is to apply the inhibition map to the 1D oriented energy responses and then calculate the 2D oriented energy from the "thinned" 1D oriented energy map.


Although both of these methods successfully reduce the spatial response to features in the 1D oriented energy map, the response becomes jagged along the edge of features in the map, producing spurious local maxima in the 2D local energy map. This is especially evident along aliased edges in synthetic test images using the second method, where several false responses are produced on either side of these features. The amount of shifting from the feature depends upon the separation used for the derivative calculation, and therefore upon the amount of thinning provided. These problems are illustrated in Figure ��.

These methods perform poorly because they adversely affect the smoothness of the 2D local energy map. As 1D filters are effectively being used to modify the local energy maps, they introduce the same false response problems that are faced by Heitger's key-point scheme. Although using 2D filters to approximate the first derivatives applied to the 1D oriented energy maps reduces the characteristic responses to aliased 1D image features compared to simple differences, the spurious false positive responses at new local maxima in the 2D local energy map remain.

The problems that these approaches face emphasize the importance of smoothness of the 2D local energy map. It is the smoothness of this map that allows the direct extraction of 2D image features. It is therefore vital that any attempt to sharpen the 2D local energy map preserves the smoothness of the map.

A simple smoothness-preserving method of sharpening the 2D local energy response is to multiply the 2D local energy response by the 1D local energy response. This forms a new 2D local energy map that is the sum of the original map and a weighting factor times the product of 1D and 2D local energy in the spatial domain. That is,

$$\mathrm{2DLE}' = \mathrm{2DLE} + \alpha\,(\mathrm{1DLE} \times \mathrm{2DLE})$$

where 1DLE and 2DLE are the 1D and 2D local energy maps respectively and α is a weighting constant (� is used in this thesis).

This measure thins the 2D local energy map well and, because it is the product of two smooth maps, 2DLE' is also smooth. However, it produces a strong sharpening effect on the energy map at the expense of increased response to 1D image features


(A) (B)

(C) (D)

(E) (F)

Figure ��: A demonstration of the problems caused by "sharpening" the 2D local energy map. (A) 1D oriented energy of Figure �� in the vertical orientation channel. (B) The "sharpened" 1D oriented energy at this orientation. (C) The 2D local energy map. (D) The 2D local energy map produced from the "sharpened" 1D oriented energy maps. Note that the spatial area of the response to 2D image features is not significantly reduced. However, the response to the features that exhibit aliasing is not smooth and is displaced on either side of the features. (E) The �� 2D image features extracted from (C) using the default feature strength threshold of ��. (F) The �� 2D image features extracted from (D) using the default feature strength threshold of ��. Note the false responses in (F) due to loss of smoothness in the 2D local energy map. Image size is ��x�� pixels.


relative to the response to the 2D image features. The reason for this increased response to 1D features is that the product of 1D and 2D local energy at a point is often much greater than the 2D local energy response (since this is the product of two values greater than one), making it difficult to choose a value for α that provides a sensible balance between sharpening and response to 1D image features.

Despite the problems with using 2DLE' outlined above, this measure can significantly reduce the effect of blurring on the accuracy of feature detection, as illustrated in Figure ��. This figure demonstrates how both the false positive and false negative responses in Figures �� and �� are eliminated using this new measure.

(A) (B) (C)

(D) (E) (F)

Figure ��: A demonstration of the reduction of false responses using 2DLE'. (A) The 2D local energy map of Figure ��(A). (B) The 2DLE' map of the same input using the same oriented energy filters ��. Note the sharper response to 2D image features in (B). (C) The 2D features extracted from (B). Note that in contrast to Figure ��(C) all 2D image features have been detected. (D) The 2D local energy map of Figure ��(A). (E) The 2DLE' map of the same input using the same oriented energy filters ��. Again note the sharper response to the 2D image features that allows the two 2D features to be successfully extracted (F).


To bring the values being summed in Equation �� to equivalent magnitudes, a square root is applied to the product in the equation. This gives the new form

$$\mathrm{2DLE}'' = \mathrm{2DLE} + \alpha\sqrt{\mathrm{1DLE} \times \mathrm{2DLE}}$$

While Equation �� makes choosing the weighting constant α less difficult and more significantly reduces the relative response to 1D features, it does so at the cost of a reduced amount of sharpening compared to 2DLE'.
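The two sharpened measures discussed in this section can be sketched as follows. This is an illustrative sketch only: the function names, the representation of the energy maps as nested lists, and the toy values are assumptions made for the example, and the weighting constant is passed in as `alpha` rather than fixed.

```python
import math

def sharpen_product(le1d, le2d, alpha):
    """2D local energy plus alpha times its pointwise product with 1D local energy."""
    return [[e2 + alpha * e1 * e2 for e1, e2 in zip(row1, row2)]
            for row1, row2 in zip(le1d, le2d)]

def sharpen_sqrt(le1d, le2d, alpha):
    """As above, but with a square root applied to the product term."""
    return [[e2 + alpha * math.sqrt(e1 * e2) for e1, e2 in zip(row1, row2)]
            for row1, row2 in zip(le1d, le2d)]

# Toy 2x2 energy maps (hypothetical values, not taken from any figure).
le1d = [[4.0, 9.0], [0.0, 16.0]]
le2d = [[1.0, 4.0], [2.0, 0.0]]
product_map = sharpen_product(le1d, le2d, alpha=0.5)
sqrt_map = sharpen_sqrt(le1d, le2d, alpha=0.5)
```

In this toy example the product term dominates wherever both energies exceed one (the entry 4.0 grows to 22.0 in `product_map` but only to 7.0 in `sqrt_map`), which is the imbalance that the square root is intended to reduce.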

The costs and benefits of 2DLE' and 2DLE'' are best illustrated by example. Figure �� compares the output of these measures to 2D local energy on a noisy real image. 2DLE' (Figure ��(C)) clearly produces the most spatially compact response to 2D image features, but it is the most contrast sensitive, as illustrated by the false negative responses to the low contrast corners of the windows underneath the balcony in the left hand side of the image. Both 2DLE' and 2DLE'' clearly respond to 1D image features, with several false positive responses to 1D image features being extracted. These measures are sensitive to noise along high contrast 1D features and characteristically produce false responses to these features, as demonstrated by the number of false responses to the high contrast step edges in this image. However, the new measures are able to resolve the closely spaced features near the top of the stairs at the bottom right hand side of the image. These features cannot be extracted from the 2D local energy map because the large spatial response to them results in a single blurred response.

Despite the differences in the feature maps in Figure ��, the sets of extracted features are quite similar. As with the design of the filters, the conflict is between compact response to features (less blurring) and minimal response to 1D image features. The alternate measures of feature strength examined in this section in order to reduce blurring are unnecessary provided that the responses to individual features do not significantly overlap such that local maxima are lost. In fact any filter-based approach faces these problems at some scale, and in practice the filter set used for this implementation only suffers these problems at very fine scales. "Negative" 2D image features are simply the result of detecting features in an image at a scale smaller than


(A) (B)

(C) (D)

(E) (F)

Figure ��: A demonstration of the reduced blurring in the 2D local energy map after multiplication by the 1D local energy map. (A) The 2D local energy map of Figure ��(A). (B) The �� 2D image features extracted from (A) using a feature threshold of ��. (C) The 2DLE' response to the same input image ��. (D) The �� 2D image features extracted from (C) using a feature threshold of ��. (E) The 2DLE'' response to the same input image ��. (F) The �� 2D image features extracted from (E) using a feature threshold of ��. All thresholds are expressed as a percentage of the global maximum feature response.


that which the filters can actually discriminate.

While these methods are provided to illustrate how blurring in the 2D local energy map can be substantially reduced, empirical evidence suggests that these measures are not as robust with respect to false responses as straightforward 2D local energy. This behaviour is consistent with the design of the 2D filters discussed in Section ��. That is, increasing the energy filters' orientation selectivity increases the amount of blurring caused by the filters, and conversely reduced blurring occurs at the expense of orientation selectivity, which leads to increased false positive responses to 1D image features. Furthermore, the simplicity of the 2D local energy map provides a more elegant measure of 2D feature strength, which is obscured by complicated post-processing procedures to sharpen the energy map. Therefore the original straightforward method of extracting 2D features via local energy is retained for the results presented in the following chapter.

�� Summary

Although the idea behind 2D local energy is simple, there are several issues that must be addressed in its implementation. Most of the problems that must be solved centre on the design of the 2D orientation selective filters used to calculate 2D local energy. Heitger's 2D S-Gabor filters �� are used in this thesis in order to calculate 2D local energy, although any orientation selective quadrature pair of filters may be used to measure 2D local energy.

This chapter examined the conflict between increased orientation selectivity and increased blurring, and showed how to select a level of orientation selectivity that is an acceptable tradeoff between the goals of minimal blurring and minimal 2D local energy response to 1D image features. This chapter also showed how the number of orientations required for uniform coverage of all orientations can be found once the orientation selectivity is decided. This provides an even response to image features regardless of their orientation.

This chapter demonstrated that steerable filters are not suited to the calculation of 2D local energy because the orientations at which the image is examined are known in advance and the second pass of oriented energy uses a different input image at each orientation.

The level of orientation selectivity required to minimize the response of 2D local energy to 1D image features causes blurring in the 2D local energy map that can lead to both false positive and false negative responses to 2D image features. This chapter investigated several methods to reduce the amount of blurring before the final extraction of 2D image features from the 2D local energy map. Several of these methods introduced new problems that stress the importance of maintaining the smoothness of the map from which the features are extracted. Multiplying the 2D local energy map by the 1D local energy map before extraction of image features substantially reduced the amount of blurring at the expense of increased response to 1D image features. In post-processing the 2D local energy map, as in filter design, there is a conflict between reducing blurring and decreasing the response to 1D image features.

Another feature of 2D local energy is that once the filters have been created there are no parameters to tune or thresholds to set, apart from the final threshold on feature strength used to extract the 2D image features from the 2D local energy map. This increases the robustness of the implementation, as 2D local energy can be applied to a wide variety of images using the same filters and without requiring any knowledge of how the operator works.

Chapter �

Experimental Results

This chapter investigates the detection and localization accuracy of this implementation of 2D local energy and compares its performance with two other 2D feature detectors: Smith and Brady's SUSAN "corner" detector �� (which actually responds to 2D contrast variations rather than modeling grey-level corners) and Heitger's key-point detection scheme ��. An earlier paper by Robbins and Owens �� showed that Cooper's DEK feature detector �� is superior to the DET operator �� and Kitchen's "cornerity" measure ��. Despite the DEK algorithm's superior performance to these early corner detectors, the results in this chapter show that it performs poorly in noisy synthetic images compared to the implementation of 2D local energy developed in this thesis.

The results presented in this chapter are generated using filter parameters of σ = �� pixels and orientation selectivity m = � for both passes of oriented energy over �� orientations. Local maxima of the 2D local energy map are extracted and labeled as 2D image features using a window size of �x� and a feature threshold expressed as a percentage of the global maximum 2D local energy value. Thresholds for Heitger's key-point scheme are expressed as a percentage of the global maximum key-point strength, while the SUSAN corner detector uses a brightness threshold for feature selection. All thresholds are hand tuned to give minimal false responses with maximal detected image features. With the exception of the SUSAN feature detector, features are marked by a black cross overlaid upon a reduced contrast version of the original image. The SUSAN corner detector used for these results is "SUSAN Version ��", which is available for research purposes from the author and has not been modified. Features detected by the SUSAN detector are therefore marked with boxes overlaid on the original image.
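The extraction step described above (local maxima within a window, thresholded as a percentage of the global maximum) can be sketched as follows. This is a minimal sketch rather than the code used to produce the results in this chapter: the function name `extract_features`, the default window size of 3, the strict-maximum rule for ties, and the toy energy map are all assumptions made for illustration.

```python
def extract_features(energy, window=3, threshold_pct=10.0):
    """Return (row, col) positions of strict local maxima above the threshold."""
    half = window // 2
    rows, cols = len(energy), len(energy[0])
    # The threshold is expressed as a percentage of the global maximum response.
    cutoff = max(max(row) for row in energy) * threshold_pct / 100.0
    features = []
    for y in range(rows):
        for x in range(cols):
            value = energy[y][x]
            if value < cutoff:
                continue
            # Compare against every other pixel inside the window (clipped at borders).
            neighbours = [energy[ny][nx]
                          for ny in range(max(0, y - half), min(rows, y + half + 1))
                          for nx in range(max(0, x - half), min(cols, x + half + 1))
                          if (ny, nx) != (y, x)]
            if all(value > n for n in neighbours):
                features.append((y, x))
    return features

# Toy energy map (hypothetical values) with two clear peaks and one weak response.
energy = [[0.0, 0.0, 0.0, 0.0, 0.0],
          [0.0, 5.0, 1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 9.0, 0.0],
          [0.0, 0.0, 0.0, 0.0, 0.5]]
weak_threshold = extract_features(energy, window=3, threshold_pct=10.0)
strong_threshold = extract_features(energy, window=3, threshold_pct=60.0)
```

Raising the threshold from 10% to 60% of the global maximum discards the weaker of the two peaks, which mirrors the way the thresholds in this chapter trade detected features against false responses.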

A further note should be made here about the thresholds used in this chapter. With the exception of Kovesi's phase congruency feature detector ��, all feature detectors produce a dimensional measure of feature strength, and so threshold levels must be adjusted for different images. Kovesi's phase congruency 1D feature detector provides a normalised dimensionless measure of 1D feature strength, and so image-independent feature thresholds can be set in advance. Currently no 2D feature detectors produce such a normalised measure of feature strength, so feature thresholds must be set on an image by image basis. The setting of thresholds is further complicated in real images because of the lack of a general model of 2D image features discussed at the beginning of this thesis. Therefore the thresholds for each 2D feature detection scheme have been hand-set for each image to subjectively produce the best feature set, with the emphasis being on reducing the number of false-positive responses.

�� Localization Accuracy of �D Image Features

Correct localization of 2D image features is an important requirement for nearly all higher-level applications that use 2D image features. However, just as it is difficult to decide exactly what constitutes a 2D image feature, it is also difficult to determine the precise location of a 2D image feature in real image data. The problem can be simplified by concentrating on synthetic image data, where the precise feature location is known, and then comparing the computed results with the model. For this analysis a corner model similar to that of Deriche and Giraudon �� is used. That is, the 2D feature is constructed from wedge shapes providing ideal step edges and then smoothed with a rotationally symmetric 2D Gaussian. This model produces a precise sub-pixel 2D feature location (Figure ��) that is at the midpoint of the four pixels (using a standard rectangular array of image data) surrounding the intersection of the ideal step edges. Ideally one would like a 2D feature detector to correctly localize the exact position of 2D features to sub-pixel accuracy, or at least be able to provide sub-pixel precision if required.

Figure ��: The exact feature location for an ideal �� L-junction. Image pixels are represented by the squares outlined in black. The 2D feature location at the intersection of four pixels about the junction is marked with a white cross.

One result of the exact corner location lying between pixels is that when the corner position is localized to pixel precision there is always a single pixel ambiguity in the exact location about the extracted 2D feature location. As illustrated in Figure ��, any of the four pixels surrounding the white cross could be said to be the location of the corner within single pixel precision. To eliminate this localization ambiguity, this section uses the output of the raw 2D local energy map for 2D feature location and not the pixel location extracted as the local maximum. Since 2D local energy produces a smooth response to 2D image features, sub-pixel precision localization of 2D features can be achieved within the framework of 2D local energy by fitting a surface to the 2D local energy map and extracting the maxima of this surface.
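One simple way to realise such a surface fit is sketched below. The text does not specify which surface is fitted, so this example assumes a separable quadratic: a parabola is fitted through the peak pixel and its two neighbours along each axis, and the vertex of each parabola gives the sub-pixel offset. The function names and the toy energy map are hypothetical.

```python
def parabola_offset(before, centre, after):
    """Offset of the vertex of the parabola through three equally spaced samples."""
    curvature = before - 2.0 * centre + after
    if curvature == 0.0:
        return 0.0  # Flat neighbourhood: keep the integer position.
    return 0.5 * (before - after) / curvature

def refine_peak(energy, y, x):
    """Refine an interior integer peak (y, x) to sub-pixel precision."""
    dy = parabola_offset(energy[y - 1][x], energy[y][x], energy[y + 1][x])
    dx = parabola_offset(energy[y][x - 1], energy[y][x], energy[y][x + 1])
    return (y + dy, x + dx)

# Toy smooth energy map whose true maximum lies between pixels,
# at row 1.6 and column 2.3; its largest sample is at pixel (2, 2).
energy = [[10.0 - (x - 2.3) ** 2 - (y - 1.6) ** 2 for x in range(5)]
          for y in range(5)]
sub_y, sub_x = refine_peak(energy, 2, 2)
```

For this exactly quadratic toy map the fit recovers the true maximum; on a real energy map the result is only as good as the quadratic approximation near the peak.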

For demonstration of the localization accuracy of the 2D local energy model, three synthetic 2D features have been generated and are shown in Figure ��. The first is a �� L-junction, the second a �� L-junction and the third a T-junction. The images are generated with ideal step edges that have been smoothed with a Gaussian (�� pixels). The exact 2D image feature location in each case is at the centre of the image.
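A synthetic L-junction of this kind can be generated with a short sketch such as the following. It is not the generator used for the figures in this thesis: the function names, the grey-levels of 50 and 200, the 32x32 image size, the kernel radius of three standard deviations, and the border handling by edge replication are all assumptions made for the example.

```python
import math

def wedge_image(size, angle_deg, dark=50.0, bright=200.0):
    """Ideal step-edge wedge whose apex lies between the four central pixels.

    size should be even so that the apex falls on a pixel intersection.
    """
    centre = (size - 1) / 2.0
    return [[bright
             if 0.0 <= math.degrees(math.atan2(y - centre, x - centre)) <= angle_deg
             else dark
             for x in range(size)]
            for y in range(size)]

def gaussian_blur(image, sigma):
    """Separable, rotationally symmetric Gaussian smoothing with replicated borders."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    kernel = [w / total for w in weights]
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass followed by vertical pass.
    horiz = [[sum(kernel[k + radius] * row[clamp(x + k, cols)]
                  for k in range(-radius, radius + 1))
              for x in range(cols)] for row in image]
    return [[sum(kernel[k + radius] * horiz[clamp(y + k, rows)][x]
                 for k in range(-radius, radius + 1))
             for x in range(cols)] for y in range(rows)]

def l_junction(size, angle_deg, sigma):
    """Wedge with the given opening angle, smoothed by a Gaussian of width sigma."""
    return gaussian_blur(wedge_image(size, angle_deg), sigma)

# Hypothetical example: a right-angled junction in a 32x32 image.
corner = l_junction(32, 90.0, 1.5)
```

With an even image size the apex of the wedge falls on the intersection of the four central pixels, so the true feature location is known exactly and can be compared against the detected location.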

Figure �� shows the results of applying the 2D local energy model to the three


(A) (B) (C)

Figure ��: Three ideal 2D image features. (A) A �� L-junction, (B) a �� L-junction, and (C) a T-junction. All images have been smoothed by a Gaussian (σ = �� pixels) and are ��x�� pixels in size.

images in Figure ��. The results are quite remarkable. For the �� L-junction the response is nearly perfectly symmetrical about the sub-pixel location of the 2D feature, thus locating its precise position. For the other two 2D features, as expected, the response is not perfectly symmetrical, yet it is centred on the location of the 2D feature with greater than pixel precision.

The displacement of feature location that occurs in the �D local energy map and

the �D local energy map is far less than the �� pixels that occurs with the DET

operator for a �� angle �� For �D local energy it is actually less than �� pixels

as this is equivalent to a half pixel displacement with the �D Gaussian of � � ��

pixels used for these images and the displacement in Figure �� is less than half a

pixel. Therefore the 2D local energy model provides accurate sub-pixel detection and localization of 2D image feature points without requiring that a range of scales be used. This supports the hypothesis that 2D image features correspond to points in an image where the 2D local energy is locally maximal.

Furthermore, although the 2D feature localization will obviously be influenced by noise (since it affects the shape of the 2D image feature), the 2D filters and the use of multiple orientations reduce the effect of single pixel intensity fluctuations characterized by Gaussian noise. This makes 2D feature detection via the 2D local energy model robust in terms of detection and localization even with a large degree of


(A) (B) (C)

Figure ��: Closeups of the 2D local energy maps produced at the image centres in response to the three ideal 2D image features in Figure ��. 2D filter parameters are σ = �� pixels and m = �. These are ��x�� cutouts of the centre of the 2D local energy maps.

noise in the image, as is demonstrated in Figure ��. Here the features are still accurately localized even with Gaussian noise added to give a minimum signal to noise ratio of �� for (A) and (B) and �� for (C) (a noise level several times greater than could be expected in a real image). In each case the extracted 2D feature location is well within a single pixel of its true position.
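The noise experiment can be sketched as follows, with the minimum signal to noise ratio taken as the ratio of the minimum feature contrast to the standard deviation of the added Gaussian noise. The function names, the toy step-edge image, the contrast values and the noise level are hypothetical and are not the values used for the figures.

```python
import random

def add_gaussian_noise(image, noise_sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation noise_sigma (grey-levels)."""
    rng = random.Random(seed)  # Seeded so that the example is repeatable.
    return [[value + rng.gauss(0.0, noise_sigma) for value in row] for row in image]

def minimum_snr(feature_contrasts, noise_sigma):
    """Minimum feature contrast divided by the noise standard deviation."""
    return min(feature_contrasts) / noise_sigma

# Toy 8x8 step-edge image (hypothetical grey-levels).
clean = [[50.0] * 4 + [200.0] * 4 for _ in range(8)]
noisy = add_gaussian_noise(clean, noise_sigma=20.0, seed=1)
# Contrasts of the features assumed to be present, in grey-levels.
snr = minimum_snr([150.0, 60.0], noise_sigma=20.0)
```

Here the weakest feature has a contrast of 60 grey-levels against noise of 20 grey-levels, giving a minimum signal to noise ratio of 3.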

(A) (B) (C)

Figure ��: Closeups of the 2D local energy maps produced at the image centres in response to the three ideal 2D image features in Figure ��, but corrupted with Gaussian noise of σ = �� grey-levels. 2D filter parameters are σ = �� pixels and m = �. These are ��x�� cutouts of the centre of the 2D local energy maps.

�The minimum signal to noise ratio is taken as the ratio of the minimum feature contrast to the standard deviation of the added Gaussian noise.


�� Detection Accuracy of �D Image Features

A critical requirement of any 2D image feature detection scheme is that it be able to accurately detect the existence and correct location of the 2D features in an image with as few false responses as possible. It must be able to extract all manner of 2D image features and so requires a model that is not restricted to a specific class of features, or even several classes of features. The same holds true for the detection of 1D image features. Many earlier attempts to extract 1D image features (for example Canny �� and Sobel ��) were restricted to models that were unable to detect roof-profiled features and suffered defects such as producing a pair of responses to these features.

This thesis contends that points of high phase congruency of the frequency representation of an image correspond to all types of image features. This implies that local energy maxima, corresponding to points of maximal phase congruency, allow uniform representation of all 1D and 2D image features. In the same way that local energy unifies the detection of step, roof and line 1D feature profiles (and combinations thereof) into a single model of 1D image features, it also unifies into a single model the variety of image variations that make up 2D features.

It is essential that a 2D feature detector respond to all types of 2D image features without producing false positive responses due to the nature or implementation of the model. However, if false positive responses do occur then it is preferable that they fall on 1D image features rather than completely insignificant image structures. The reason for this is that a 1D feature is at least a significant image structure, whereas responses due solely to the feature detector do not correspond to any significant local structure in the image and consequently are totally undesirable. An example of this undesired behaviour is the "phantom edges" that occur at inflections in the intensity function using Marr and Hildreth's edge operator ��.

It is usually considered desirable that the response to a 2D image feature be proportional to the strength of the feature�. That is, the 2D feature detector provides

�Kovesi argues for a local measure of feature strength based on phase congruency that is independent of feature contrast ��. However, this measure still attaches a strength to the feature based on the degree of phase congruency. In this respect, then, the feature detector still produces a value


a measure of strength that is related to human perception of the strength of the local 2D image variation. This is different from the global importance of an image feature, which is dependent upon the structure of the scene and therefore requires the use of higher level knowledge. This level of processing is concerned with a purely data driven ("bottom up") approach, and therefore an assessment of the global importance of an image feature cannot (and should not) be made. However, providing a measure of the strength of a feature is important, as this allows selection of only the most important features, with low strength signals that may be due to noise being discarded.

The following subsections investigate the success of the 2D local energy model with respect to these criteria using both synthetic and real image data.

�� Synthetic Data

This section investigates the behaviour of the 2D local energy model on a synthetic test image that contains a variety of 2D image features. It also examines how the model responds to the addition of a large amount of Gaussian noise and compares the results to those of three other 2D feature detectors.

Figure �� shows the original image (A) and the �� 2D features detected via 2D local energy overlaid on a reduced contrast version of the original image (B). For the default threshold value of ��% all 2D image features are detected. However, a few false positive markings are made due to the aliasing effects on the edges of the smaller discs. Note the accurate localization of image features by this method, remembering that due to discrete image effects a single pixel leeway must be given for the exact feature position.

Figure ��(A) is the test image in Figure ��(A) corrupted by Gaussian noise of �� grey-levels, giving a signal to noise ratio of �� for the weakest image feature. The 2D image features detected using 2D local energy on this image are shown in (B). While a few of the lower contrast 2D features are pushed below the threshold and therefore are no longer detected, no additional responses are made due to the noise in the image. In fact the presence of the image noise (and the higher threshold) reduces

proportional to the strength of a feature.


(A) (B)

Figure ��: (A) A synthetic test image containing a variety of 2D image features. (B) The �� 2D features detected via 2D local energy. All 2D features are detected, although there are a few responses to the aliased edges of the smaller discs. Image size is ��x�� pixels.

the effect of aliasing on the edges of the small discs so that there are fewer responses to these features. Most of the salient 2D image features of all types are still detected, as those features that are no longer detected are low contrast T-junctions and the low contrast L-junction of the light square at the top left of the image. Furthermore, the detected features of varying types are accurately localized despite the large amount of noise present in the input image.

For comparison, the 2D features detected using Heitger's scheme (Figure ��(C)), the SUSAN corner detector (D) and Cooper's DEK operator (E) are also given. Heitger's scheme also performs well on this image, with only three low contrast T-junctions at the top-left of the image undetected. There is a false positive response on the straight line feature of the polygon; however, this line exhibits the effects of aliasing. Although there are more responses to the aliased edges of the small discs by this method than by 2D local energy, the threshold for this method was set lower and these highly aliased edges do not occur in real images.

The SUSAN corner detector also performs well on this test image. Apart from the four shallow angle L-junctions on the irregular shaped polygons, all 2D image


(A)

(B) (C)

(D) (E)

Figure ��: (A) A synthetic test image with added Gaussian noise (σ = �� grey-levels). (B) The �� 2D features detected via 2D local energy using a threshold of �� of the global maximum. (C) The �� key-points located using Heitger's method on the same image, with a threshold of �� of the global maximum response. (D) The �� 2D features detected using the SUSAN corner detector with a brightness threshold of ��. (E) The �� 2D features detected using the DEK scheme after first blurring the image with a Gaussian (σ = � pixels), using a threshold of ��. Image size is ��x�� pixels.


features are extracted with good localization. Two features on non-aliased 1D edges are falsely detected due to the high level of noise in the image. However, all of the low contrast T-junctions on the left hand side of the image are detected.

The performance of the DEK feature detector in Figure �� is very poor by comparison to the output of the other methods. While it is true that, like most other derivative-based schemes, the DEK feature detector is sensitive to noise, even after blurring the results are poor with respect to both localization and detection.

Although the SUSAN detector produces very good results for the noisy synthetic test image in Figure ��(A), this detector is in effect optimized for this type of image because of the scheme used to group pixels within a local region that are of similar brightness to the central pixel. For real images that may contain blurred features this method does not fare so well, as the results in the following subsection illustrate.

�� Real Data

This subsection investigates the performance of the 2D local energy model using real image data. First the performance of this method is examined on a simple real image where the desired 2D image features appear quite obvious, and then on more complex images containing several different types of image features. Comparison is again made with other 2D feature detection schemes.

Figure ��(A) contains a simple real image of a geometric sculpture in front of a textured background. The desired 2D image features are quite obvious and are all well defined by the intersection of straight 1D image features. The image contains several different types of 2D image features.

The response of the 2D local energy model to this image is shown in Figure ��(B). A threshold of ��% of the global maximum 2D local energy value produced �� 2D image features. Four significant junctions in the scene are not detected by the 2D local energy method at this threshold, and two markings are made at small spot features in the original image that do not lie on the edge of any object in the scene. However, the majority of the salient 2D features in the image are detected and well localized.

The response of the key-point scheme is shown in Figure ��(C). A total of �� 2D


(A) (B)

(C) (D)

Figure ��: (A) An image of a geometric sculpture in front of a textured background. (B) The �� 2D image features detected using 2D local energy with a threshold of �� of the global maximum 2D local energy. (C) The �� 2D image features detected using the key-point scheme (filter parameters σ = �� pixels, m = � and eight orientation channels) with a threshold of �� of the global key-point maximum. (D) The �� 2D image features detected using the SUSAN corner detector with the default feature strength threshold of ��. Image size is ��x�� pixels.


image features are detected using a threshold of ��%. Localization of the detected features is excellent, with only two low-strength features unmarked. In contrast, the �� features detected by the SUSAN corner detector do not include six significant 2D features in this image, and localization of some of the detected features, particularly the L-junctions formed at the boundary of the sculpture, is poor. There are also several 2D features marked along the edge of 1D image features that do not appear in the output of 2D local energy or the key-point operator.

Figure �� compares the results of these feature detectors on a more complex real image that is slightly out of focus and so contains several blurred 2D features. Both the 2D local energy model and the key-point scheme detect a similar set of 2D image features for this image, although the 2D local energy model fails to detect the two low contrast L-junctions formed by the workspace surface and the background, as well as the low curvature L-junctions on the left hand side of the robot arm. The superior resolution of the key-point scheme is also obvious on and near the end-effector of the robot, where it detects a few 2D image features that are not detected by 2D local energy due to the response of nearby features.

The performance of the SUSAN corner detector on this image is inferior to the

other two methods Several L�junctions of varying contrast are either undetected

or poorly localized This �gure clearly demonstrates the limitations of this scheme

for the detection of �D image features characterized by smoothly changing intensity

gradients

Figure ��(A) is a more complex image that contains a wide variety of 2D image features and includes those that arise due to the intersection of "roof" or "ramp" profiled 1D image features. The �� features detected in this image using 2D local energy are shown in (B). Most of the strong 2D features in Figure ��(A) are detected via 2D local energy, as well as many low contrast and irregular features such as line terminations and X-junctions caused by overlapping wires, while spurious responses are low.

The set of 2D image features extracted from Figure ��(A) by the key-point scheme is similar to that detected via 2D local energy. However, the key-point scheme is unable to detect several of the X-junctions caused by the overlapping wires in the rigging that are detected by the 2D local energy model. Also, the 2D features caused by the occlusion of the wires by the rigging on the left hand side of the image are not detected by the key-point scheme, although they appear in the 2D local energy output. The reason for this is that the inhibition scheme used by Heitger to remove characteristic false responses to 1D image features also inhibits these types of features, and so they are not detected. This clearly shows how the inhibition scheme can fail.

Figure ��: A comparison of 2D feature detectors on a real image of a robot arm. (A) ��x�� pixel input image. (B) The �� 2D image features extracted from (A) using 2D local energy for �� orientations with m �� and a feature threshold of ��. (C) The �� 2D image features extracted by Heitger's key-point detector for � orientations with m �� and a feature threshold of ��. (D) The �� 2D image features extracted using the SUSAN detector with a brightness threshold of ��. The SUSAN corner detector clearly performs poorly on this image, with both large detection and localization errors.

Figure ��: A comparison of 2D feature detectors on a real image of a fishing boat cabin and rigging. (A) Input image (��x�� pixels). (B) The �� 2D image features detected via 2D local energy using the default threshold level of ��. (C) The �� 2D image features detected by the key-point scheme with a threshold of ��. (D) The �� features detected by the SUSAN corner detector using a brightness threshold of ��.

The performance of the SUSAN corner detector in Figure ��(D) is again inferior to the other two methods. Most of the X-junctions caused by the wires and several other junctions around the cabin are not detected. Localization is also inferior to the other methods.

Figure ��(A) contains so much detail that it is difficult to know what the "expected" set of 2D image features should be, as is the case with most real complex images. However, focusing on one part of the original image and then looking at the response of these methods in that area often shows that the detected features correspond to the structure of the objects in the scene. An example of this is the response to the cabin and its windows in the lower left part of the image, and the occlusions of the cabin caused by the masts: the marked features correspond to the structures in the scene. The detection by 2D local energy of the weak occlusion features caused by the masts and rigging further illustrates the versatility of this model.

�� 2D Feature Stability

A major application of 2D feature detection is the registration of these features in different images. This usually means matching 2D image features in a pair of stereo images of a scene, or tracking 2D image features in a sequence of images.

Temporal stability of feature detection and localization, and feature stability from different viewpoints, are important for these major applications of 2D image features. The following two sub-sections investigate both the temporal and viewpoint stability of the 2D image features extracted via the 2D local energy model.


�� Temporal Stability

Temporal stability of detected 2D image features is important for applications that track these features in a sequence of images. It is important that features do not drop out or appear from frame to frame, and that the detected feature location corresponds to the same point on an object in the scene from frame to frame. This simplifies tracking of features in a temporal sequence of images and enhances the accuracy of displacement and other measurements made on the feature data.

Another aspect of feature stability in image sequences is that the parameters (such as filter parameters and feature thresholds) used in the feature extraction process should be as stable as possible. That is, the parameters used remain constant or have smooth temporal variation.

Figure �� demonstrates the temporal stability of 2D local energy on a sequence of images of a highly textured cube moving from left to right. Despite the intensity variations on the surface of the cube due to its texture and the lighting conditions, only the 2D features corresponding to the corners of the cube are extracted in all images from the sequence. This is achieved using a constant feature threshold of ��% of the global maximum 2D feature strength and larger energy filters ��.

Note the strong responses to the real image features in Figure ��(B), and that the extracted feature locations appear to match the same points on the object in the scene despite the moderate size filters used to extract the features.

Figure �� demonstrates that 2D local energy performs well even in the presence of large amounts of surface texture, and that the temporal stability of 2D local energy in terms of feature location and detection is high.
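One simple way to put a number on this kind of temporal stability is the fraction of features in one frame that reappear, after the known motion of the object, in the next frame. The sketch below is purely illustrative: the function `repeatability`, the known `shift` and the matching tolerance `tol` are assumptions made for this example, and no such measure was computed for the results above.

```python
def repeatability(features_a, features_b, shift, tol=1.5):
    """Fraction of features in frame A that reappear in frame B.

    features_a, features_b : lists of (row, col) feature positions
    shift                  : (d_row, d_col), assumed known motion from A to B
    tol                    : assumed matching radius in pixels
    """
    if not features_a:
        return 0.0
    matched = 0
    for (r, c) in features_a:
        # Predicted position of this feature in frame B.
        pr, pc = r + shift[0], c + shift[1]
        if any((pr - br) ** 2 + (pc - bc) ** 2 <= tol ** 2
               for (br, bc) in features_b):
            matched += 1
    return matched / len(features_a)
```

A value near 1 over every consecutive pair of frames would indicate that features neither drop out nor drift relative to the object.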

�� Viewpoint Stability

The property of viewpoint stability implies that, for different viewpoints, a feature detector extracts a similar set of features with similar localization using constant feature detection parameters and thresholds. It is important not only that an accurate set of 2D features is extracted from an image, but also that the extracted features are similar for small changes in the viewpoint.

Figure ��: Demonstration of the temporal stability of 2D feature detection via local energy. (A) The first of a sequence of real images of a textured block moving from left to right. (B) The 2D local energy map extracted from (A). Notice the strong response to the features which correspond to points of interest in the scene compared to the relatively weak response to texture on the cube's surface. (C)-(F) The 2D features extracted from images in the sequence overlaid on reduced contrast versions of the input images. Filter parameters are � � pixels, over �� orientations, with a feature threshold of �� for all images. Images are ��x�� pixels.


The extraction of similar sets of 2D image features from a stereo pair of images simplifies the correspondence task between the two feature sets. Consistent feature localization is also important for the matching task, as the projected 3D location of the feature is dependent upon the location of the feature in each of the stereo images.

Figure �� shows the result of applying 2D local energy to a stereo pair of an oblique view of a wall with a whiteboard and posters. The same set of filter parameters and feature thresholds was used to extract features from the left and right images of the stereo pair. The sets of features in the left and right images are similar in parts of the scene that are contained in both images, particularly with respect to the response near the power outlet on the wall and the features extracted on the two more distant posters that are at different scales in each image.

One problem with the output of Figure �� is that the bottom corners of the two more distant posters have been extracted as a single feature in the right hand image, as their separation in this image is smaller than in the left image. The same problem has occurred at the termination of the vertical lines below the bottom right corner of the whiteboard, with a pair of features extracted from the right image and a single feature from the left. Despite these minor inconsistencies, in most cases it appears that the same point in the 3D scene has been extracted from both viewpoints using the same set of parameters for the feature extraction.

A worthwhile future extension to the 2D local energy model would be the addition of sub-pixel precision to the localization of image features. The obvious way of achieving this within the current framework is to fit a surface in the neighbourhood of the pixel-precision local maxima in the 2D local energy map and extract local maxima from the fitted surface to sub-pixel precision by analytical means.
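A minimal sketch of one such surface fit is given below. It assumes a quadratic surface whose coefficients are taken from finite differences over the 3x3 neighbourhood of a pixel-precision maximum, and solves for the stationary point of that surface analytically. The function name `refine_peak` and the choice of a quadratic model are assumptions made for illustration. This is one possible realization of the proposed extension, not something implemented in this thesis.

```python
def refine_peak(energy, r, c):
    """Sub-pixel (row, col) of the peak of a quadratic surface fitted to the
    3x3 neighbourhood of the pixel-precision maximum at (r, c)."""
    e = energy
    # First derivatives (central differences) along columns and rows.
    g_c = (e[r][c + 1] - e[r][c - 1]) / 2.0
    g_r = (e[r + 1][c] - e[r - 1][c]) / 2.0
    # Second derivatives, forming the Hessian of the fitted quadratic.
    h_cc = e[r][c + 1] - 2.0 * e[r][c] + e[r][c - 1]
    h_rr = e[r + 1][c] - 2.0 * e[r][c] + e[r - 1][c]
    h_rc = (e[r + 1][c + 1] - e[r + 1][c - 1]
            - e[r - 1][c + 1] + e[r - 1][c - 1]) / 4.0
    det = h_cc * h_rr - h_rc * h_rc
    # The fitted surface has a strict maximum only if the Hessian is
    # negative definite; otherwise keep the pixel-precision location.
    if det <= 0.0 or h_cc >= 0.0:
        return float(r), float(c)
    # Stationary point: offset = -H^{-1} g.
    d_c = -(h_rr * g_c - h_rc * g_r) / det
    d_r = -(h_cc * g_r - h_rc * g_c) / det
    return r + d_r, c + d_c
```

For a local energy map that is exactly quadratic near the peak, the finite differences are exact and the true peak is recovered. For real data the accuracy depends on how well a quadratic approximates the energy surface over the 3x3 window.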

One benefit of extracting 2D features to sub-pixel precision is that a more thorough quantitative investigation of the localization performance of 2D local energy could be undertaken, using stereo views of a known scene and then measuring the error in the reconstructed scene. Such analysis was performed by Blaszka and Deriche to illustrate the localization performance of their model-based approach ��. The extension of 2D local energy to sub-pixel precision feature localization, and subsequent analysis and reconstruction of a known 3D scene, is a useful direction that could be taken for further work on this model.

Figure ��: Feature detection via 2D local energy on a stereo image pair. (A) The left and (B) right images of a stereo pair of a whiteboard and poster-adorned wall. The features extracted using 2D local energy from (C) the left and (D) the right image of the stereo pair. (C) and (D) contain �� and �� 2D image features respectively. Default filter parameters and a feature strength threshold of �� were used for feature detection in both input images. Images are ��x�� pixels.

�� Comparisons and Observations

In this chapter the accuracy of detection and localization of the 2D local energy model has been investigated and compared with several existing methods of 2D feature detection. The results show that the performance of 2D local energy is on par with that of the key-point scheme of Heitger, which to my knowledge provides the best detection and localization of 2D image features of any algorithm currently available.

Although the SUSAN corner detector performs well on synthetic images, it is somewhat less impressive when applied to real images, particularly with regard to detection and localization of L-junction features. However, this operator is approximately an order of magnitude faster than both the 2D local energy and key-point implementations. The DEK feature detector is clearly inferior to all of the above methods and is particularly sensitive to image noise, although it too is much faster than the 2D local energy and key-point schemes.

The results achieved on real image data using the 2D local energy model and the key-point scheme demonstrate that, by using broad bandwidth filters, it is possible to achieve accurate localization of image features from a single scale of filters. Furthermore, the accurate sub-pixel localization of image features that is possible using the 2D local energy model upon synthetic data also shows that only a single scale is needed to accurately identify and localize 2D image features.

This chapter also showed the output of 2D local energy for a real stereo pair of images. Testing on several stereo image pairs indicates that the response of 2D local energy is stable with respect to the set of features extracted and their localization. Extraction of 2D features from the 2D local energy map to sub-pixel precision would facilitate a more thorough investigation of the localization accuracy and stability of this implementation of 2D local energy, and would be a beneficial extension to the work presented in this thesis.

The performance of this implementation of 2D local energy illustrated in the results in this chapter is encouraging. The results show that a straightforward application of the model robustly detects and localizes features without any tuning or tweaking of the process. These results support the hypothesis of this thesis that 2D image features correspond to local maxima in the 2D local energy of an image.

Chapter �

Future Directions

In this thesis the local energy model has been extended to the detection of 2D image features, with the implementation of this model producing good results. This thesis opens up further avenues for extending local energy in at least three ways. Firstly, the addition of sub-pixel precision to the extraction of features from the 2D local energy map would be an end in itself, but would also allow more thorough quantitative investigation of the localization performance of the implementation presented in this thesis. Secondly, the theory for the detection of 2D image features via local energy presented in this thesis is based on the assertion that 2D image features correspond to points of maximal 2D phase congruency. Previously, phase congruency has been a troublesome measure to calculate efficiently and without strong influence by noise. Kovesi has proposed novel solutions to these problems �� and, along with this thesis, opened the way for the direct extraction of 2D image features via phase congruency. There is clearly room to merge these two ideas in the future. Thirdly, the extension of local energy to the detection of higher order features in 3D image data, as outlined in Chapter �, would certainly be worthwhile. Features extracted by such a scheme would be ideally suited to registration of 3D image data since the points are isolated in 3D space. The computational complexity of such an approach would surely necessitate addressing several implementation issues.

The previous chapters have shown how local energy may be extended to the detection of 2D image features in 2D image data. This chapter illustrates how the consistent treatment of image features developed in this thesis allows the detection of 1D, 2D and 3D features in 3D image data, and provides some preliminary results on this work.

�� Surface Feature Detection

The detection of features in 3D image data is important for interpretation of these images. To date, most feature detectors for 3D image data have been extensions of gradient-based �D feature detectors ��. Consequently, these feature detectors are typically sensitive to noise, but more importantly they are unable to accurately extract some types of features common in these images.

Roof-profiled features such as membranes and thin wire-like structures are quite common in 3D data such as confocal microscope images of biological organisms. These features are systematically mislocalized by step-edge detectors that mark points with a sharp intensity gradient as features.

Local energy is ideally suited to the detection of features in these images due to its ability to accurately extract features that have line- or roof-profiles, or a combination of both of these feature types.

The detection of surface features in 3D image data is analogous to the problem of 1D feature detection in 2D images. That is, features correspond to points that have significant variation in any direction.

To detect 1D image features in 3D image data, the 3D orientation selective oriented energy filters are applied at several orientations to produce 1D oriented energy maps. As for the 2D case, these oriented energy maps are summed to give the 1D local energy map by Equation ��. The local maxima in the 1D local energy map correspond to the 1D image features in the 3D image data.

�� Implementation

The methodology for extracting 1D features from 3D image data is similar to the process for 2D images. To reduce the computational requirements of convolution, the filters are applied in the frequency domain via multiplication with the Fourier Transform of the image. As with 2D images, symmetries in the filters are exploited to increase the speed of computation.
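The equivalence that makes frequency-domain filtering possible, namely that a pointwise product of Fourier transforms corresponds to a circular convolution, can be checked on a small 1D example. The sketch below uses a naive O(n^2) discrete Fourier transform purely for clarity, where a practical implementation would use an FFT over the 3D volume. The function names are assumptions made for this illustration and the code is not taken from the implementation described here.

```python
import cmath

def dft(signal, inverse=False):
    """Naive discrete Fourier transform of a sequence (O(n^2))."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out

def filter_in_frequency_domain(signal, kernel):
    """Circular convolution of two equal-length real sequences, computed by
    multiplying their transforms and transforming back."""
    spectrum = [s * k for s, k in zip(dft(signal), dft(kernel))]
    # Inputs are real, so the imaginary part of the result is rounding error.
    return [value.real for value in dft(spectrum, inverse=True)]

def circular_convolution(signal, kernel):
    """Direct circular convolution, used here only as a reference."""
    n = len(signal)
    return [sum(signal[m] * kernel[(t - m) % n] for m in range(n))
            for t in range(n)]
```

The saving comes from transforming the image once and reusing that transform for every oriented filter, so that each filter costs one pointwise multiplication and one inverse transform.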

Orientation selective 3D S-Gabor filters can be constructed by extending the coordinate system to 3D polar coordinates. The radial variation is given by the S-Gabors of Equations �� and ��, while the angular variation is given by a power of a cosine:

F �r� � �� R�r� � cos�m�q

��

where � and �� determine the orientation of the filter, the positive integer m determines the orientation selectivity of the filter, F is the filter in 3D polar coordinates, and R(r) is defined for the even- and odd-symmetric functions in Equations �� and ��.

However, for the results presented in this chapter the filters of Pudney et al. �� �� are used. These filters are constructed from banks of oriented 3D Morlet wavelets. Each wavelet is made up of either a �D sine wave (for the odd-symmetric filter) or cosine wave (for the even-symmetric filter), oriented to the 3D directional tuning of the filter, modulated by a 3D Gaussian. The even-symmetric wavelet tuned for frequency f and oriented along the unit vector v is given by the formula:

M evenf��v �x� y� z� �

�p��

e�x��y��z�

�� cos��

where

� � kf�

� � ��f�

� � v �

��

x

y

z

��

and the odd-symmetric wavelet has the cosine replaced with a sine.

The orientation selectivity of the wavelet is determined by the ratio between its width � and its frequency f, which is controlled by k. The �� term in Equation �� is used to ensure that wavelets at all scales have equal amplitude in the Fourier Domain (see Appendix A for details).


Rescalings are obtained by scaling f to build filters that combine the 3D Morlet wavelets at a number of frequency scales (Freq), given by the formula:

$$G^{\mathrm{even}}_{\mathbf{v}}(x, y, z) = \sum_{f \in \mathrm{Freq}} M^{\mathrm{even}}_{f,\mathbf{v}}(x, y, z)$$

for the even-symmetric filter, and

$$G^{\mathrm{odd}}_{\mathbf{v}}(x, y, z) = \sum_{f \in \mathrm{Freq}} M^{\mathrm{odd}}_{f,\mathbf{v}}(x, y, z)$$

for the odd-symmetric filter.
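The construction described above, a cosine or sine wave along the unit vector v modulated by a 3D Gaussian and then summed over a set of frequencies, can be sketched as follows. Several details in this sketch are assumptions rather than the definitions used by Pudney et al. or in this thesis: the Gaussian width is taken as k / f, the exponent uses the conventional factor of 2 in its denominator, and a simple 1 / width amplitude scaling stands in for the normalising term discussed above. The function names and the frequencies used in any call are likewise hypothetical.

```python
import math

def morlet_pair(x, y, z, f, v, k):
    """Even- and odd-symmetric 3D Morlet wavelet values at point (x, y, z).

    f : tuning frequency in cycles per pixel
    v : unit vector (vx, vy, vz) giving the orientation
    k : ratio controlling width relative to wavelength (ASSUMED width = k / f)
    """
    width = k / f                                   # assumption, see lead-in
    phase = 2.0 * math.pi * f * (v[0] * x + v[1] * y + v[2] * z)
    # ASSUMPTION: conventional Gaussian with a 1/width amplitude scaling.
    envelope = math.exp(-(x * x + y * y + z * z) / (2.0 * width ** 2)) / width
    return envelope * math.cos(phase), envelope * math.sin(phase)

def multiscale_pair(x, y, z, freqs, v, k):
    """Sum the even and odd wavelets over all frequencies in `freqs`."""
    even = odd = 0.0
    for f in freqs:
        e, o = morlet_pair(x, y, z, f, v, k)
        even += e
        odd += o
    return even, odd
```

Sampling `multiscale_pair` over a voxel grid would give the spatial form of one even and odd filter pair for orientation v. The symmetry of the pair is visible directly: negating (x, y, z) leaves the even value unchanged and negates the odd value.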

For the results in the following section, the filters are constructed from wavelets tuned at six frequencies, Freq � �� cycles per pixel. That is, the sine and cosine functions have wavelengths of � � ��!� ��!�� ��!��! ��!�� respectively. The orientation selectivity provided by the ratio of width to frequency was k � �.

The filters are applied at 13 orientations, corresponding to half of the directions to a voxel's 26 neighbours. This number of orientations was chosen to simplify the non-maximal suppression scheme described below, since it eliminates the need for interpolation between voxels. It should be noted that these orientations are not evenly spaced. Furthermore, the uniformity of coverage provided by the filters at varying orientations has not yet been investigated, as has been done for the 2D oriented energy filters in Section ��.
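A voxel in a 3x3x3 neighbourhood has 26 neighbours, and taking one direction from each antipodal pair leaves 13 orientations. The enumeration below is a small illustrative sketch. The function name and the rule used to pick one member of each pair are assumptions made for the example.

```python
from itertools import product

def half_neighbour_directions():
    """One integer direction (dz, dy, dx) from each antipodal pair of the
    26 neighbours of a voxel."""
    directions = []
    for d in product((-1, 0, 1), repeat=3):
        if d == (0, 0, 0):
            continue
        # Keep the member of each +/- pair whose first non-zero
        # component is positive.
        first_non_zero = next(component for component in d if component != 0)
        if first_non_zero > 0:
            directions.append(d)
    return directions
```

Because every direction is an integer voxel offset, the neighbours along it can be read directly from the volume without interpolation. The uneven spacing is also easy to see: (0, 0, 1) and (0, 1, 1) are 45 degrees apart, while (0, 0, 1) and (1, 1, 1) are roughly 54.7 degrees apart.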

The local maxima in the 1D local energy map, formed by the sum of the oriented energy maps, correspond to the 1D features in the 3D image data. These local maxima are found using non-maximal suppression. For each voxel in the 1D local energy image, its 1D local energy response is compared with those of its neighbours in the orientation for which the largest 1D oriented energy response is obtained at that point. If the voxel is not greater than its neighbours then it is suppressed.

Although this technique of non-maximal suppression works well when there is a single dominant orientation at each voxel, it often performs poorly at more complicated structures, which are quite common in 3D image data. When multiple features intersect, only the strongest orientation is chosen, which can result in suppression of valid image features. Following non-maximal suppression, a threshold is used to eliminate weak features, with the remaining voxels labeled as surface features.
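The suppression and thresholding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the code used to produce the results: the function name, the nested-list volume layout `energy[z][y][x]`, the representation of the dominant orientation as an integer voxel step, and the decision to ignore neighbours that fall outside the volume are all choices made for the example.

```python
def non_max_suppress(energy, dominant, threshold):
    """Return the (z, y, x) voxels that survive non-maximal suppression.

    energy[z][y][x]   : local energy at each voxel
    dominant[z][y][x] : integer step (dz, dy, dx) along the orientation with
                        the largest oriented energy at that voxel
    threshold         : minimum energy for a surviving voxel
    """
    nz, ny, nx = len(energy), len(energy[0]), len(energy[0][0])
    surface = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                e = energy[z][y][x]
                dz, dy, dx = dominant[z][y][x]
                keep = True
                for s in (1, -1):      # both neighbours along the orientation
                    qz, qy, qx = z + s * dz, y + s * dy, x + s * dx
                    inside = 0 <= qz < nz and 0 <= qy < ny and 0 <= qx < nx
                    # Suppress unless strictly greater than the neighbour.
                    if inside and energy[qz][qy][qx] >= e:
                        keep = False
                # Threshold applied after suppression, as in the text.
                if keep and e >= threshold:
                    surface.append((z, y, x))
    return surface
```

Only one orientation is consulted per voxel, which is exactly why intersecting surfaces can lose valid voxels: a voxel on the weaker surface is compared along the stronger surface's orientation.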

�� Results

In this section the results of surface detection via local energy in 3D image data are compared to the output of a 3D extension of the Sobel operator. The input images used are a synthetic ��x��x�� image and a ��x��x�� confocal microscope image of a flea. Both of the input images are isotropic; that is, the voxels are cubes.

The detected surfaces are rendered using texture mapping. For the detected surfaces, an image volume is constructed where surface voxels are given the value of the corresponding voxel in the original image and all other voxels are set to zero. The opacity of image voxels is provided by a look-up table, with non-surface voxels transparent. The volume is then rendered using ray tracing.

Figure �� shows the synthetic test image (A) and the output of local energy (B) and the 3D Sobel operator (C) on this image. The input image consists of a sphere containing a solid cube intersected by three orthogonal planes. Each object voxel has an 8-bit value of �� or ��, corresponding to the sphere, cube and planes respectively. The background is a diagonal ramp of values from � to ��.

Figure ��: A comparison of the results of local energy to the 3D Sobel operator on a synthetic image. (A) The input image. (B) The output of the local energy surface feature detector. (C) The output of the 3D Sobel operator. The images were produced using a surface voxel rendering technique.


The most obvious difference between the surfaces detected by the 3D Sobel operator (C) and the local energy operator (B) in Figure �� is that the former clearly produces a thick dual response to the intersecting planes, while local energy produces a single accurate response. This is because the Sobel operator responds to step-profiled features, and so responds to the step on either side of the ramp profile present at these surfaces. Although it is difficult to see in Figure ��(C), the 3D Sobel operator also produces a dual response to the surface of the sphere, as well as extracting fewer voxels on the surface of the sphere.

Figure �� shows a confocal microscope image of a flea (A) and the output of local energy (B) and the 3D Sobel operator (C) on this image. This image demonstrates the superior detection of surface features using local energy compared to the 3D Sobel operator. The local energy method clearly detects more surfaces and reflects the surface structure more accurately than the 3D Sobel operator in this image, despite the low threshold set for the Sobel operator. The appendages of the flea clearly illustrate the excellent performance of 1D local energy for this image, particularly when the output is compared to that of the 3D Sobel operator. Note that a far greater proportion of the surface of the flea is detected via 1D local energy than by the 3D Sobel operator in this image. The improved performance of local energy is due partly to its use of more orientations and a broader range of scales for filtering than the 3D Sobel, as well as the fact that the oriented filters used by 1D local energy respond only to features tuned near the orientation of the filters.

�� Discussion

This section has shown how 1D "surface" features may be detected in 3D image data via local energy using 3D oriented energy filters. The results provided illustrate the superior performance of local energy compared to the 3D Sobel operator for surface feature detection in 3D image data.

The following section shows how the local energy scheme for 2D feature detection in 2D images may be extended to the detection of higher order image features in 3D image data.

Figure ��: A comparison of the results of local energy to the 3D Sobel operator on a confocal microscope image of a flea. (A) The input image. (B) The output of the local energy surface feature detector. (C) The output of the 3D Sobel operator. Note the superior detection of surfaces via local energy compared to the 3D Sobel output. The images were produced using a surface voxel rendering technique.


�� High-order Feature Detection

In 3D image data, 1D features correspond to all surfaces and sharp intensity variations; 2D features include the intersection of surfaces, wire-like structures and sharp surface variations; while 3D features correspond to the intersection of three or more surfaces and localized points. More specifically, 1D, 2D and 3D image features correspond to significant variation in an image in at least one, two and three orthogonal orientations respectively, and therefore 3D features are a subset of the 2D features, which are a subset of the 1D image features.

A common problem in 3D image analysis is the registration of two or more of these images. These images may be of the same subject but taken at different times or imaged by different techniques, or they may be otherwise similar images, such as occurs when matching an image to a template. Reliable and accurate extraction of 3D features could significantly reduce this problem, in much the same way that 2D image features are used for matching in stereo and motion analysis, because they constrain a point in 3D space.

Extending 2D local energy to the detection of higher order image features in 3D image data is conceptually simple due to the unified definition of features developed in this thesis. That is, 1D features correspond to features in the original image, 2D features are the features in the 1D feature maps, and for 3D image data, 3D features are the features in the 2D feature maps.

Therefore, as for the detection of 2D features in 2D image data, 2D and 3D feature detection in 3D image data can be achieved via local energy by subsequent application of oriented energy filters at orthogonal orientations. In analogy to the detection of 1D image features in 2D data, the 1D image features in 3D image data correspond to the sum of the N oriented energy responses:


$$\sum_{i=1}^{N} E_i(I),$$

where I is the input image and $E_i$ is the oriented energy operator at orientation i.

The 2D image features correspond to 2D local energy, which is obtained by re-applying the oriented energy filters to the 1D oriented energy maps at all other orientations. This is given by:

$$\sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} E_j(E_i(I)).$$

The N(N - 1) 2D oriented energy responses each form a plane to which there is a unique normal. Applying energy at this normal orientation to the 2D oriented energy map gives the 3D oriented energy. The sum of these N(N - 1) 3D oriented energy maps gives 3D local energy:

$$\sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} E_{\perp}(E_j(E_i(I))).$$

If the N 3D filter orientations are evenly spaced (for example, N � �� orientations for half a dodecahedron), then in total only N oriented energy filter pairs need to be generated, since $E_{\perp}$ will correspond to a unique filter orientation for each pair $E_j(E_i(I))$.
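The nesting of oriented energy operators described above, and the number of operator applications it implies, can be sketched with generic callables. This is an illustration only: `local_energy_levels` and the helper `normal_index(i, j)`, which returns the index of the orientation normal to the plane of orientations i and j, are hypothetical names, and the sketch assumes that this normal is always one of the N filter orientations and that N is at least 2.

```python
from functools import reduce
import operator

def local_energy_levels(image, operators, normal_index):
    """Return the (1D, 2D, 3D) local energy sums for `image`.

    operators    : list of N callables; operators[i](x) applies oriented
                   energy at orientation i
    normal_index : hypothetical helper mapping (i, j) to the index of the
                   orientation normal to the plane of orientations i and j
    """
    n = len(operators)
    add_all = lambda maps: reduce(operator.add, maps)
    first = [op(image) for op in operators]          # N applications
    second, third = [], []
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            e_ji = operators[j](first[i])            # E_j(E_i(I))
            second.append(e_ji)                      # N(N - 1) applications
            third.append(operators[normal_index(i, j)](e_ji))  # N(N - 1) more
    return add_all(first), add_all(second), add_all(third)
```

With three mutually orthogonal orientations, `normal_index` can simply return the remaining index, `3 - i - j`. Counting the calls gives N applications for the first level and N(N - 1) for each of the second and third levels, which is the source of the rapid growth in cost with N.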

Despite the fact that only N filter orientations are required if the 3D oriented energy filters are evenly spaced, the number of filter convolutions required to calculate 3D local energy is far greater than that required to calculate 2D local energy in 2D image data. This also means that the number of Fourier Transforms required to calculate 3D local energy will be greater than for 2D local energy in 2D image data, and so the calculation is more computationally expensive.

The number of filter convolutions (performed as complex multiplications in the Fourier domain) and the number of real-to-complex FFTs (with complex-to-complex FFTs counting double) are shown in Table �. The consequences of this increased computational load are further compounded by the fact that 3D images typically contain far more data than 2D images.

�� Discussion

                                convolutions       �� FFTs
1D local energy (2D image)      N                  �N � �
2D local energy (2D image)      2N                 �N � �
1D local energy (3D image)      N                  �N � �
2D local energy (3D image)      N + N(N - 1)       N� � N � �
3D local energy (3D image)      N + 2N(N - 1)      N� � �N � �

Table �: Processing requirements of local energy in 3D image data compared to 2D image data. N is the number of filter applications, convolutions is the number of applications of the energy filter pairs required, and �� FFTs is the number of real-to-complex Fast Fourier Transforms required to calculate the listed quantity.

Although the unified approach to image features developed in this thesis allows for a straightforward extension of oriented energy to the detection of image features in 3D image data, the issue of filter design with regard to uniform coverage of all orientations is yet to be addressed. Further work in this area is required, especially because, if perfectly even filter spacing is to be maintained, then a maximum of �� orientations (corresponding to half a dodecahedron) may be used. For so few orientations it may be difficult to obtain uniform coverage of all orientations without introducing significant response to 1D image features for 2D and 3D feature detection in 3D image data.

Although there have been several recent extensions of existing methods of 1D feature detection to 3D images ��, it is unclear how any of the methods of 2D feature detection discussed in Chapter � could be extended to the detection of 3D features in 3D image data. The methods based on grey-level models, such as Rohr's method �� and Blaszka's method ��, would be particularly difficult to extend due to the complexity and vast number of 3D feature types.

�� Summary

This chapter listed some of the avenues for future work extending local energy based on the work presented in this thesis. In particular, this chapter illustrated how local energy may be extended to the detection of 1D, 2D and 3D image features in 3D image data. Some preliminary results have already been obtained: 1D feature detection in 3D images via local energy has been implemented, and the results compare favourably with an extension of the Sobel operator to 3D images.

An outline for the extension of local energy to the detection of higher order image features in 3D image data was also presented, and implementation of the model will be the subject of future work. As with 2D local energy, it is anticipated that filter design will be critical with respect to the performance of the model.

The ease with which local energy may be extended to the detection of features in 3D image data illustrates the advantage of local energy and phase congruency, and in particular the unified model of image features developed in this thesis, over traditional methods that attempt to characterize features by spatial representations.

Chapter �

Conclusions

This thesis has discussed the importance of the detection of �D features in images

for problems such as structure from motion stereo matching and line labeling due to

the high information content and relative sparseness of these features in images The

literature shows that there has been a resurgence in interest in �D feature detection in

recent years although none of the methods proposed is based on a model that includes

all types of �D image features This goal of unifying the detection of all types of �D

features is the motivation of this thesis

The original contributions made in this thesis to the study of �D image feature

detection are outlined below�

• This thesis introduces a general model of 2D image features based on local
energy. The fundamental hypothesis is that 2D image features correspond to
points of maximum 2D phase congruency in the phase domain of the image
signal. This model includes all types of 2D image features, including L-, T-, Y-,
X- and ARROW-junctions as well as spot features. This is the first time that
a single simple model has been able to unify all 2D image features.

• This model has been implemented using orientation selective quadrature filters.
This implementation has been shown to be able to detect a wide range of 2D
image features, including all of those outlined above. The implementation has
been optimized to reduce computation requirements by a novel approach to filter
storage, detailed in Appendix A. Importantly, once the oriented energy filters
have been designed, the model is devoid of any special parameters or thresholds
that need tweaking, apart from the usual feature strength threshold at the end
of the process.

• The design of the orientation selective energy filters used to calculate 2D local
energy has been investigated. Through careful design, excellent results are obtained
despite the conflicting goals of minimal blurring and minimal 2D local
energy response to 1D features. The implementation of 2D local energy in this
thesis applies the oriented energy filters at an appropriate number of orientations,
so that uniform coverage of all orientations is obtained. Polar separable filters
are an advantage here, as they allow the number of orientations to be determined
independently of the radial profile of the filters, since the orientation selectivity
is determined solely by the angular variation function. This allows the radial
profile to be varied without affecting the required number of orientations for a
given orientation selectivity.

• The idempotent nature of the implementation of 2D local energy presented in
this thesis using Heitger's S-Gabor filters �� was demonstrated. This is in
contrast to most other 2D feature detection schemes, which are not idempotent
operators because they do not model spot features.

• The problem of blurring in the 2D local energy map was further investigated, as
this can lead to both false positive and false negative 2D image feature responses.
Various methods were examined in order to reduce the amount of blurring in
the 2D local energy map before the extraction of 2D image features by non-maximum
suppression. A new map, �DLE�, was formed by adding the 2D local
energy map to a weighted product of 1D and 2D local energy, substantially
reducing blurring and its effect on the correct extraction of 2D image features.

• Chapter � investigated the detection and localization accuracy of 2D local energy,
showing in particular the excellent localization performance of this model
without needing to resort to multi-scale analysis of the image to correct characteristic
mislocalization, as is required for other 2D feature detection schemes ��
��. The detection accuracy of 2D local energy was shown to be on par with
the key-point detector of Heitger �� and superior to the SUSAN corner detector ��,
particularly for real images.

• Chapter � demonstrated the detection of surfaces in 3D image data via local
energy. Surfaces in 3D are 1D features and so may be detected by applying
oriented energy filters over several orientations and summing the results, in a
similar manner to the extraction of 1D features from 2D images using local
energy. Extending the analogy further, edge- and corner-like features in 3D
correspond to 2D and 3D features respectively. These higher order features may
be extracted from 3D images using the same methodology by which higher order (2D)
features are extracted from 2D images: oriented energy filters are successively
applied at orthogonal orientations to yield 2D and 3D features in 3D image data
after two passes and three passes of the oriented energy filters respectively.

• The extraction of higher order features by the successive application of oriented
energy filters at orthogonal orientations unifies the detection of 1D, 2D and 3D
image features. In other words, higher order image features are extracted from
lower order image feature maps. This establishes a hierarchy of image features:
1D features are a subset of all image points, and 2D image features are a subset
of the 1D image features, with 3D features being a subset of the 2D features.
Although the idea that 2D features are a subset of 1D image features is not new
(Kitchen and Rosenfeld made this explicit in their corner detector in ��),
the extension of local energy to the detection of higher order image features
clearly illustrates why this is the case: features correspond to congruency in the
phase domain of the image signal, with higher order features corresponding to
higher order phase congruency.

The applications of local energy in early vision processing are still expanding. I
trust that the unified model of image features, and the techniques for exploiting this
model developed in this thesis, will assist in the ongoing and future development of
local energy.

Bibliography

�� E. H. Adelson and J. R. Bergen. Spatiotemporal energy models for the perception
of motion. Journal of the Optical Society of America A, ��-��.

�� H. Asada and M. Brady. The curvature primal sketch. IEEE PAMI, ��-��,
��.

�� P. R. Beaudet. Rotationally invariant image operators. In International Joint
Conference on Pattern Recognition, pages ��-��.

�� V. Berzins. Accuracy of Laplacian edge detectors. Computer Vision, Graphics,
and Image Processing, ��-��.

�� T. Blaszka and R. Deriche. Recovering and characterizing image features using
an efficient model based approach. Technical report, INRIA, Number ��.

�� M. Bowmans, K. H. Hohne, U. Tiede and M. Riemer. 3-D segmentation of MR
images of the head for 3-D display. IEEE Transactions on Medical Imaging,
MI��-��, June ��.

�� J. F. Canny. Finding edges and lines in images. Master's thesis, MIT AI Lab,
TR��.

�� M. B. Clowes. On seeing things. Artificial Intelligence �, pages ��-��.

�� J. Cooper, S. Venkatesh and L. Kitchen. Early jump-out corner detectors. In
IEEE Transactions on Pattern Analysis and Machine Intelligence, ��.


�� J. G. Daugman. Complete discrete 2-D Gabor transforms by neural networks for
image analysis and compression. IEEE Transactions on Acoustics, Speech, and
Signal Processing, ��-��, July ��.

�� R. Deriche and G. Giraudon. Accurate corner detection: An analytical study.
In IEEE International Conference on Computer Vision, pages ��-��, Osaka,
Japan, December ��.

�� R. Deriche and G. Giraudon. A computational approach to corner and vertex
detection. International Journal of Computer Vision, ��-��.

�� L. Dreschler and H.-H. Nagel. Volumetric model and 3-D trajectory of a moving
car derived from monocular TV-frame sequence of a street scene. Computer
Vision, Graphics, and Image Processing, ��-��.

�� W. Förstner. A framework for low level feature extraction. In J.-O. Eklundh,
editor, 3rd European Conference in Computer Vision, volume II, pages ��-��,
Stockholm, Sweden, ��.

�� W. T. Freeman. Steerable Filters and Local Analysis of Image Structure. PhD
thesis, MIT Media Lab, TR��, June ��.

�� W. T. Freeman and E. H. Adelson. The design and use of steerable filters for
image analysis, enhancement and wavelet representation. IEEE PAMI, ��-
��.

�� A. Guzman. Computer recognition of three-dimensional objects in a scene.
Technical report, MIT, MAC-TR��.

�� R. M. Haralick. Edge and region analysis for digital image data. Computer
Vision, Graphics, and Image Processing, ��-��.

�� R. M. Haralick. Digital step edges from zero-crossings of second directional
derivatives. IEEE Transactions on Pattern Analysis and Machine Intelligence,
��-��.


�� C. G. Harris and M. Stephens. A combined corner and edge detector. In th
Alvey Vision Conference, pages ��-��.

�� D. J. Heeger. Optical flow using spatiotemporal filters. International Journal of
Computer Vision, ��-��.

�� F. Heitger, L. Rosenthaler, R. von der Heydt, E. Peterhans and O. Kübler.
Simulation of neural contour mechanisms: from simple to end-stopped cells. Vision
Research, ��-��.

�� O. Henricsson and F. Heitger. The role of key-points in contour detection. In 3rd
European Conference on Computer Vision, volume �, pages ��-��, Stockholm,
Sweden, ��.

�� S. L. Hong and T. O. Binford. Stereo correspondence: A hierarchical approach.
In Proc. Image Understanding Workshop, volume �, pages ��-��.

�� D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and
functional architecture in the cat's visual cortex. Journal of Physiology, ��-��,
��.

�� D. H. Hubel and T. N. Wiesel. Receptive fields and functional architecture in
two non-striate visual areas �� and �� of the cat. Journal of Neurophysiology,
��-��.

�� D. A. Huffman. Impossible objects as nonsense sentences. Machine Intelligence
�, pages ��-��.

�� L. Kitchen and A. Rosenfeld. Gray-level corner detection. Pattern Recognition
Letters, pages ��-��.

�� H. Knutsson and G. H. Granlund. Texture analysis using two-dimensional
quadrature filters. In IEEE Computer Society Workshop on Computer Architecture
for Pattern Analysis and Image Database Management, pages ��-��.


�� H. Knutsson, R. Wilson and G. H. Granlund. Anisotropic nonstationary image
estimation and its applications: Part � - restoration of noisy images. IEEE
Transactions on Communications, ��-��.

�� P. Kovesi, C. F. Fisher and D. Huynh. A visual information processing system.
Technical report, Department of Computer Science, The University of Western
Australia, ��.

�� P. D. Kovesi. A dimensionless measure of edge significance. In The Australian
Pattern Recognition Society, Conference on Digital Image Computing: Techniques
and Applications, pages ��-��, December ��, Melbourne.

�� P. D. Kovesi. Image features from phase congruency. Technical Report ��/�,
Department of Computer Science, The University of Western Australia, ��.

�� H. K. Liu. Two- and three-dimensional boundary detection. Computer Graphics
and Image Processing, ��-��.

�� J. Malik. Interpreting line drawings of curved objects. International Journal of
Computer Vision, ��-��.

�� J. Malik and P. Perona. Preattentive texture discrimination with early vision
mechanisms. Journal of the Optical Society of America A, ��-��.

�� D. Marr. Vision. Freeman, San Francisco, ��.

�� G. Medioni and Y. Yasumoto. Corner detection and curve representation using
cubic B-splines. In International Conference on Robotics and Automation, pages
��-��.

�� R. Mehrotra, S. Nichani and N. Ranganathan. Corner detection. Pattern
Recognition, ��-��.

�� M. Michaelis and G. Sommer. Junction classification by multiple orientation
detection. In J.-O. Eklundh, editor, Proceedings, th IEEE International
Conference on Computer Vision, volume �, pages ��-��.


�� O. Monga, R. Deriche, G. Malandain and J. P. Cocquerez. 3D edge detection by
separable recursive filtering and edge closing. In ��th International Conference
on Pattern Recognition, pages ��-��, June ��. IEEE Computer Society
Press.

�� H. P. Moravec. Towards automatic visual obstacle avoidance. In Proceedings
of the International Joint Conference on Artificial Intelligence, pages ��-��,
August ��.

�� H. P. Moravec. Visual mapping by a robot rover. In Proceedings of the �th
International Joint Conference on Artificial Intelligence, pages ��-��.

�� M. C. Morrone and D. C. Burr. Feature detection in human vision: A
phase-dependent energy model. Proc. R. Soc. Lond. B, ��-��.

�� M. C. Morrone and R. A. Owens. Feature detection from local energy. Pattern
Recognition Letters, ��-��.

�� M. C. Morrone, J. R. Ross, D. C. Burr and R. A. Owens. Mach bands are phase
dependent. Nature, ��-��, November ��.

�� H.-H. Nagel. Constraints for the estimation of displacement vector fields from
image sequences. In Proceedings of the International Joint Conference on Artificial
Intelligence, pages ��-��, Karlsruhe, West Germany, August ��.

�� J. A. Noble. Finding corners. Image and Vision Computing, ��-��, May
��.

�� J. A. Noble. Descriptions of image surfaces. D.Phil. thesis, Department of
Engineering Science, University of Oxford, ��.

�� R. A. Owens, S. Venkatesh and J. Ross. Edge detection is a projection. Pattern

�� P. Perona. Steerable-scalable kernels for edge detection and junction analysis. In
G. Sandini, editor, 2nd European Conference on Computer Vision, pages �-��,
Santa Margherita Ligure, Italy, ��.


�� W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery. Numerical
Recipes in C: The Art of Scientific Computing. Cambridge University Press, 2nd
edition, ��.

�� C. Pudney, P. Kovesi and B. Robbins. A 3D local energy surface detector for
confocal microscope images. In Proceedings of ANZIIS��, pages �-��, Perth,
December ��.

�� C. Pudney, P. Kovesi and B. Robbins. Feature detection using oriented local
energy for 3D confocal microscope images. In Proceedings of the International
Computer Science Conference, pages ��-��, Hong Kong, December ��.

�� C. Pudney, M. Robins, B. Robbins and P. Kovesi. Surface detection in 3D
confocal microscope images via local energy and ridge tracing. Journal of Assisted
Confocal Microscopy. To appear.

�� I. D. Reid and D. W. Murray. Tracking foveated corner clusters using affine
structure. In th International Conference on Computer Vision, ��.

�� B. Robbins and R. Owens. 2D feature detection and identification. In Proceedings
of the 3rd Annual Departmental Research Conference, Department of Computer
Science, The University of Western Australia, ��.

�� B. Robbins and R. Owens. The 2D local energy model. Technical Report ��/�,
Department of Computer Science, The University of Western Australia, August
��.

�� B. Robbins and R. Owens. 2D feature detection via local energy. Image and
Vision Computing. To appear.

�� K. Rohr. Modelling and identification of characteristic intensity variations. Image
and Vision Computing, pages ��-��, February ��.

�� K. Rohr. Recognizing corners by fitting parametric models. International
Journal of Computer Vision, ��-��.


�� L. Rosenthaler, F. Heitger, O. Kübler and R. von der Heydt. Detection of general
edges and key-points. In 2nd European Conference in Computer Vision, pages
��-��.

�� J. Serra. Image Analysis and Mathematical Morphology. Academic Press,
London, England, ��.

�� J. Serra. Image Analysis and Mathematical Morphology: Theoretical Advances,
volume �. Academic Press, London, England, ��.

�� M. A. Shah and R. Jain. Detecting time-varying corners. Computer Vision,
Graphics and Image Processing, ��-��.

�� S. M. Smith. Feature Based Image Sequence Understanding. PhD thesis,
Robotics Research Group, Department of Engineering Science, Oxford University,
��.

�� S. M. Smith and J. M. Brady. SUSAN - A new approach to low level image
processing. Technical Report TR��SMS�b, Defence Research Agency, Farnborough,
Hampshire, ��.

�� I. Sobel. Camera models and machine perception. Technical Report AIM��,
Stanford AI Lab, May ��.

�� S. Venkatesh. A Study of Energy Based Models for the Detection and
Classification of Image Features. PhD thesis, The University of Western Australia,
��.

�� S. Venkatesh and R. Owens. On the classification of image features. Pattern

�� S. Venkatesh and R. A. Owens. An energy feature detection scheme. In IEEE
International Conference on Image Processing, pages ��-��, Singapore.

�� M. Verseval, G. A. Orban and L. Lagae. Responses of visual cortical neurons
to curved stimuli and chevrons. Vision Research, ��-��.


�� D. Waltz. Understanding line drawings of scenes with shadows. In P. H.
Winston, editor, The Psychology of Computer Vision, pages ��-��, New York, ��.
McGraw-Hill.

�� S. W. Zucker and R. A. Hummel. A three-dimensional edge operator. IEEE
Transactions on Pattern Analysis and Machine Intelligence, ��-��, May
��.

�� O. A. Zuniga and R. M. Haralick. Corner detection using the facet model. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
pages ��-��, Washington DC, November ��.

Appendix A

Optimizing 2D Local Energy

Calculation of 2D local energy is inherently a computationally expensive task. In
order to calculate 2D local energy, a pair of oriented energy filters must be applied
twice for each orientation at which the image is examined. It is clear that this process
could be sped up by a factor approaching N, where N is the number of orientations
at which the filters are applied, by using a coarse-grained parallel implementation,
since the output at each orientation may be calculated independently by N separate
processors and the results then combined to produce 2D local energy. When viewed
in this manner, the �� seconds of CPU time mentioned in Section � for a ��x��
image at �� orientations works out to under �� seconds per orientation to complete
both passes of oriented energy. While clearly even this result is not fast enough for
real-time applications, it is not drastically short of the mark, especially considering
the improvements that can be made using special hardware such as Digital Signal
Processors.

Ignoring the benefits of specialized hardware and parallel computation, and focusing
on standard serial computer hardware, there are still several ways to reduce the
computational load of calculating the 2D local energy of an image. A feature of the
input and energy images in the spatial domain is that they are all real-valued. There
are standard methods for performing FFTs that halve the computation required when
the information is real-valued in either domain ��, and these real-valued FFTs can
be used for the transforms in calculating 2D local energy.
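The saving for real-valued data comes from conjugate (Hermitian) symmetry of the spectrum, which the following minimal sketch illustrates in 1D. It is an illustration only, not the thesis implementation: the naive `dft` helper stands in for an FFT routine, and the sample values and variable names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform, used here as a stand-in for an FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

signal = [3.0, 1.0, -2.0, 5.0, 0.5, -1.5, 4.0, 2.0]   # real-valued samples (made up)
n = len(signal)
spectrum = dft(signal)

# For real input the spectrum is Hermitian: X[n-k] == conj(X[k]).
max_err = max(abs(spectrum[n - k] - spectrum[k].conjugate()) for k in range(1, n))

# So only bins 0 .. n/2 need to be computed and stored ...
half = spectrum[: n // 2 + 1]

# ... and the remaining bins can be rebuilt from them on demand.
rebuilt = half + [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
```

Roughly half of the spectrum is redundant, which is the property that real-input FFT routines exploit to halve the work.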


A further significant improvement in the computational efficiency of applying the
oriented energy filters can be made by combining the convolutions with the pair
of energy filters into a single multiplication in the frequency domain. In the spatial
domain the oriented energy filters are both real-valued, with one being even-symmetric
and the other odd-symmetric. In the frequency domain the filters are real-valued
and even-symmetric, and imaginary-valued and odd-symmetric, respectively.
Multiplying the odd-symmetric filter by i in the frequency domain gives a real-valued
odd-symmetric filter whose transform back into the spatial domain is now imaginary
and odd-symmetric.

The Fourier Transform is a linear operation; therefore one can add the even-symmetric
filter and the odd-symmetric filter (after its multiplication by i) to get a
single filter, and multiply this by the frequency representation of the image. After
transforming the result back to the spatial domain, the results of applying the oriented
energy filter pair are extracted as simply the real component for the even-symmetric
filter and the imaginary component for the odd-symmetric filter. Therefore the result
of applying both halves of an energy filter pair can be obtained by multiplication by
a single function and a single (complex-valued) FFT.
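The combination described above can be checked numerically with a minimal 1D sketch. The filters, the test signal and all names below are hypothetical choices made for illustration, and a naive DFT stands in for the FFT; the thesis implementation works on 2D images with oriented filters.

```python
import cmath

def dft(x, sign=-1):
    """Naive O(N^2) DFT; sign=+1 gives the unnormalised inverse transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [v / n for v in dft(spectrum, sign=+1)]

n = 16
image = [0.0] * 6 + [1.0] * 6 + [0.0] * 4        # one image row with two step edges

# Real-valued spatial filters on a circular grid of n samples.
even = [0.0] * n
odd = [0.0] * n
even[0], even[1], even[-1] = 2.0, -1.0, -1.0     # even-symmetric: h[-t] == h[t]
odd[1], odd[-1] = 1.0, -1.0                      # odd-symmetric:  h[-t] == -h[t]

F = dft(image)
E = dft(even)     # real-valued and even-symmetric
O = dft(odd)      # imaginary-valued and odd-symmetric

# Reference: apply each filter separately (two multiplications, two inverse transforms).
resp_even = idft([f * e for f, e in zip(F, E)])
resp_odd = idft([f * o for f, o in zip(F, O)])

# Combined: H = E + i*O is real-valued; one multiplication, one inverse transform.
H = [e + 1j * o for e, o in zip(E, O)]
combined = idft([f * h for f, h in zip(F, H)])

# The real part carries the even response and the imaginary part the odd response.
even_from_combined = [c.real for c in combined]
odd_from_combined = [c.imag for c in combined]
```

Because the image and both spatial filters are real-valued, the two separate responses are real, so they can share one complex-valued result without interfering.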

A further advantage of combining the filters in the frequency domain before applying
them is that this drastically reduces the storage requirements of the filters and the
computational cost of applying them. In the frequency domain the sum of the even-symmetric
filter and i times the odd-symmetric filter produces a real-valued function.
Furthermore, as one filter is even-symmetric and the other odd-symmetric, the sum
of these filters partly cancels out, leaving a function that is not only real-valued but
also only half the size of either of the original filters. This means that a pair of energy
filters can be represented in the frequency domain by a function that is a quarter of
their combined size, reducing by 75% the storage costs of the filters and the amount
of computation required to apply them.

The compression achieved by storing filters in this manner is illustrated in
Figure ��. The representations of the even- and odd-symmetric filters in the frequency
domain are shown, along with the compressed representation of the sum of the even-symmetric
filter plus i times the odd-symmetric filter. Note that the positive and
negative parts of the filters cancel out when summing them, to produce a function that
is half the size of either of the component filters.

Figure ��: A demonstration of the compression achieved by storing the energy filters in the frequency domain after multiplying the odd-symmetric filter by i. (A) The real-valued even-symmetric filter in the frequency domain. (B) The imaginary-valued odd-symmetric filter in the frequency domain. (C) The sum of the even-symmetric filter and i times the odd-symmetric filter is real-valued and half the size of either filter. The grey background image intensity corresponds to zero. Darker and lighter shades correspond to negative and positive values respectively.
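The cancellation shown in the figure can be reproduced in 1D with a small sketch. It assumes an idealised quadrature pair in which the odd filter's spectrum is -i sgn(f) times the even filter's spectrum; the Gaussian band-pass shape, the parameters `centre` and `width`, and the sign convention (which decides which half survives) are assumptions made for illustration, not values from the thesis.

```python
import math

n = 16
centre, width = 4.0, 1.0     # hypothetical band-pass parameters, in frequency bins

def signed_freq(k):
    """Map DFT bin k to its signed frequency index in [-n/2, n/2)."""
    return k if k < n // 2 else k - n

def sign(v):
    return (v > 0) - (v < 0)

# Even-symmetric filter spectrum: real-valued, set to zero at DC and at the Nyquist bin.
E = [0.0 if signed_freq(k) in (0, -(n // 2))
     else math.exp(-((abs(signed_freq(k)) - centre) ** 2) / (2 * width ** 2))
     for k in range(n)]

# Odd-symmetric quadrature partner: imaginary-valued, O = -i * sgn(f) * E.
O = [-1j * sign(signed_freq(k)) * E[k] for k in range(n)]

# Combined filter E + i*O: real-valued, and zero on the negative-frequency half.
H = [(E[k] + 1j * O[k]).real for k in range(n)]

stored = H[1 : n // 2]               # the only bins that can be non-zero
separate_values = len(E) + len(O)    # one real or imaginary number per bin per filter
```

Here `stored` holds 7 numbers against 32 for the two separate spectra, in line with the roughly four-fold saving described in the text.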

The benefits of storing the even- and odd-symmetric filters together in the frequency
domain are even more apparent in 3D image data ��. In 3D not
only are the computational demands more severe, but with the much larger image
data sets memory resources are critical. This scheme allows a dramatic reduction in
memory requirements, because the filter data can be cropped so that only the non-zero
components are stored, and this is typically only a small fraction of the size of the 3D
image. The computational demands are also reduced, as multiplication of the image
with the filters need only be performed on this non-zero cropped section of the
filters.

Even with the optimizations discussed above, the computational requirements of
2D feature detection via 2D local energy are still moderately high. Profiling of the
implementation of 2D local energy presented in this thesis shows that over ��% of the
running time of the program is spent performing FFTs. This implementation uses the
standard FFT algorithm (fourn) provided in "Numerical Recipes in C" ��. The use
of a more efficient FFT algorithm will naturally have a dramatic effect on the running
time of 2D local energy, since this currently takes nearly all of the processing time. A
decimation-in-frequency algorithm (which would allow elimination of all bit-reversal
operations) in conjunction with a base-� algorithm, for example, could be expected to
further reduce the running time of the implementation by approximately ��%. General
Prime Factor FFT algorithms are reported to be significantly faster than this. Clearly
there are still moderate gains to be made in the performance of this implementation
of 2D local energy with respect to the implementation of the FFT.

The cost of spatial convolution increases with the size of the filters until, at some
point, spatial convolution is more computationally expensive than performing a forward
FFT, applying the filters in the frequency domain and doing an inverse FFT of the
result. The broad bandwidth orientation selective filters required for the calculation
of 2D local energy are quite large in the spatial domain, and so make performing the
convolutions in the spatial domain impractical. Furthermore, applying the filters in
the frequency domain reduces the quantization effect when rotating the filters, since
their representation is larger and smoother in the frequency domain than in the spatial
domain.
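The trade-off between the two approaches can be illustrated with a rough operation-count model. This is a back-of-the-envelope sketch under stated assumptions: direct convolution is taken to cost one multiply-add per filter tap per pixel, an FFT of N points is taken to cost about N log2 N operations, and constant factors are ignored. The function names are hypothetical, and the crossover it reports is indicative only.

```python
import math

def spatial_cost(n, m):
    """Multiply-adds for direct convolution of an n x n image with an m x m filter."""
    return n * n * m * m

def fft_cost(n):
    """Rough count for forward FFT + pointwise multiply + inverse FFT on n x n data.

    Assumes about N*log2(N) operations per FFT of N = n*n points.
    """
    points = n * n
    return 2 * points * math.log2(points) + points

def crossover_filter_size(n):
    """Smallest odd filter width at which direct convolution costs more than the FFT route."""
    m = 1
    while spatial_cost(n, m) <= fft_cost(n):
        m += 2
    return m

width = crossover_filter_size(256)   # 7 under this simplified model
```

Under these assumptions even quite small filters favour the frequency domain, and when several orientations are applied to one image the forward FFT of the image is shared between them, which shifts the balance further.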

It is not easy to calculate energy in the frequency domain, because the square root
is non-linear, so there is no linear mapping of this operation to the frequency domain.
Even without the square root operation, squaring of the filter responses corresponds to
auto-convolution in the frequency domain, which is far more computationally expensive
than two FFTs. Therefore inverse FFTs are required to translate the filter responses
back into the spatial domain for the energy computation if the filters are applied in
the frequency domain.
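The correspondence between squaring and auto-convolution can be verified on a short 1D sequence. The sketch below is illustrative only: the sample values are made up, a naive DFT stands in for the FFT, and the 1/n factor follows from the unnormalised forward transform used here.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform, used here as a stand-in for an FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

response = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0]   # a filter response (made up)
n = len(response)
X = dft(response)

# Squaring in the spatial domain costs n multiplications.
squared_spectrum = dft([v * v for v in response])

# The same result in the frequency domain needs a circular auto-convolution,
# which costs n*n multiplications when evaluated directly.
autoconv = [sum(X[m] * X[(k - m) % n] for m in range(n)) / n for k in range(n)]

max_diff = max(abs(a - b) for a, b in zip(squared_spectrum, autoconv))
```

The two spectra agree to rounding error, but the auto-convolution grows quadratically with the number of samples, which is why the squaring is done after an inverse transform.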
