Chapter 4
Edge and Texture
Edges describe the spatial differences across an image. These differences form boundaries that allow
the human visual system to distinguish between homogeneous colour regions in an image. Simi-
larly, content-based image retrieval systems use low-level edges in higher level feature extraction
techniques such as contour extraction and texture analysis to differentiate between regions within
an image.
Edges have been used extensively for content-based image retrieval and much research has been
conducted [13, 60, 19, 4, 104]. However, many of the techniques proposed in the literature only use
very simple edge detectors [27, 19, 4]. On their own, simple edge detectors compare well with complex edge detectors; however, they perform poorly when used for higher level feature extraction
such as contour extraction and texture analysis. Specific edge detectors have been designed to
extract features required by texture analysis [60, 105, 39] but few edge detectors have been designed
with the intent of accurate contour following. Contours are important in content-based retrieval
systems as they are one of the high-level structural representations within an image. Since the
performance of contour following is intrinsically dependent on edge detection the primary purpose
of this chapter is to investigate edge detection techniques for contour following and to build upon
these techniques to produce an edge detector tuned for contour following that can also be used for
texture analysis. The resulting edge detector, called the Asymmetry edge detector, is able to provide
the best single pixel responses across multiple orientations compared with existing techniques.
Section 4.1 discusses the limitations of existing edge detection techniques. Section 4.2 identifies
the requirements of an edge detector within the context of this research. Section 4.3 analyses
and compares existing multi-orientation operators. Section 4.4 presents the new Asymmetry edge
detector. Section 4.5 presents a new technique for thinning multi-orientation edge responses. Section
4.6 discusses the Asymmetry detector as a computational model of the visual cortex. Section 4.7
presents a new technique for inhibiting textures in edge detection. Section 4.8 presents conclusions
drawn from the findings of this chapter.
Figure 4.1: Some common edge detectors applied to image (a): (a) original image; (b) simple difference operator; (c) Laplacian; (d) Roberts; (e) Prewitt; (f) Sobel; (g) Frei-Chen; (h) Kirsch; (i) Robinson. Each result image represents the absolute maximum magnitude at each pixel after the individual masks have been applied.
4.1 Edge Detection
Edges form where the pixel intensity changes rapidly over a small area. Edges are detected by centring a window over a pixel and measuring the strength of any edge within the window. The result is stored at the same pixel location. Edge responses produced by a number of common edge detectors
are shown in Figure 4.1.
Edge detection techniques often use a mask that is convolved with the pixels in the window.
A simple difference mask is shown in Figure 4.2 (a). The difference mask is directional. An edge
detector can also be non-directional such as the Laplacian or difference of Gaussians as shown in
Figure 4.2 (b). Since edges are directional and contours consist of oriented edges, we are primarily
interested in directional edge detectors.
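As an illustration, applying a directional difference mask amounts to a convolution. The sketch below (our own NumPy/SciPy example, not code from this chapter) applies the [−1 1] difference mask of Figure 4.2 (a) to a vertical step edge:

```python
import numpy as np
from scipy.ndimage import convolve

def simple_difference(image):
    """Convolve with the directional [-1 1] difference mask."""
    mask = np.array([[-1.0, 1.0]])
    return convolve(image.astype(float), mask, mode="nearest")

# A vertical step edge between two homogeneous regions produces a
# response only at the boundary column; homogeneous regions give zero.
img = np.zeros((5, 6))
img[:, 3:] = 255.0
edges = simple_difference(img)
```

Because the mask is directional, a horizontal edge would produce no response from this filter; a second, rotated mask would be needed to detect it.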
Other simple, but extensively used edge detectors, include the Sobel, Roberts, and Prewitt
Figure 4.2: Some common edge detector masks: (a) simple difference [−1 1]; (b) Laplacian [0 −1 0; −1 4 −1; 0 −1 0]; (c) Roberts [1 0; 0 −1] and [0 1; −1 0]; (d) Prewitt [−1 −1 −1; 0 0 0; 1 1 1] and [1 0 −1; 1 0 −1; 1 0 −1]; (e) Sobel [−1 −2 −1; 0 0 0; 1 2 1] and [1 0 −1; 2 0 −2; 1 0 −1].
operators (Figure 4.2) [69]. Such operators are directional and can be used to detect orientations at
90 ◦ intervals. Other operators such as the Frei-Chen [58], Kirsch [57], and Robinson [59] operators
can also be oriented at 45 ◦ intervals allowing up to 4 orientations to be detected (Figure 4.3 (a)).
In contrast, the human vision system detects 18 different orientations at 10 ◦ intervals [10].
Operators that are specified by a continuous function rather than a fixed mask can be rotated
to any arbitrary orientation. The Gabor filter [60] and the Canny operator [13] (Figure 4.4) can
both be described mathematically and are two of the most advanced edge detectors as they have
a similar receptive field to the edge detectors of the human vision system [12].
Figure 4.1 shows the output of the various edge detectors discussed in this section applied to a
test image. However, it is not possible to determine a good edge detector simply by looking at its
output. Instead we must look at the design and features of an edge detector with respect to the
requirements of contour following.
4.2 Edge Detector Requirements
For each pixel, contour following requires the orientation and strength of each edge. Contour
following also requires highly tuned edge responses. Tuning can occur across orientations and also
across spatial locations. Figure A.8 shows the orientation tuning response curve for a simple cell
in human vision. Likewise, oriented edge detectors produce different edge responses depending on
the orientation of the edge input. The output will peak when the orientation of the edge and the
Figure 4.3: (a) The Kirsch masks [57], built from the coefficients 3, 0, and −5 (for example [3 3 3; 3 0 3; −5 −5 −5]); (b) the masks applied to the image in Figure 4.1(a). The masks detect edges at 0°, −45°, −90°, and −135°.
detector are aligned and will fall off as their orientations change. Since contour following will follow
the orientation with the largest strength it is important that the edge detectors are tuned tightly so
that the contour following algorithm does not inadvertently follow the wrong orientation. However, the tuning cannot be too tight, because the responses of two edge detectors with adjacent orientations are used to determine the exact orientation of an edge that lies between the two orientations.
Position tuning is also important as a contour following algorithm will also consider a neigh-
bourhood of pixels to determine the next pixel to include in the contour. If two adjacent pixels
produce a strong response then the contour following algorithm may unnecessarily create two
contours at that point rather than following the pixel that the edge is truly aligned to.
Adjacent orientation responses are used to determine the exact orientation of an edge. In the
same manner it is possible to use adjacent position responses to determine the exact position of
an edge. This process is called subpixel edge detection [106], however subpixel edge detection is
beyond the scope of this research, primarily because each stage of edge and contour processing
assumes that each edge is aligned with the centre of a pixel.
In summary, the edge detector must satisfy the following requirements:
• Produce multi-orientation output
• Orientation-tuned with only two adjacent responses generated
• Position-tuned with only one adjacent response generated
• Efficient, small window, convolution-style operator
4.3 Multi-orientation Operators
The Gabor and Canny operators are the most suitable operators for multi-orientation edge de-
tection as they are described by a continuous function (and therefore can be used to construct
multi-orientation detectors), resemble edge detectors in the human vision system, and have been
extensively investigated [60, 13, 107]. Other fixed mask operators such as the Laplacian, difference
of Gaussians, Roberts, Prewitt, and Sobel operators are not suitable because they only support 1
to 4 orientations. An additional benefit of the Gabor and Canny operators is that they are scalable
and can be used to identify edges of different resolutions.
In this research we have decided to use the S-Gabor filter proposed by Heitger et al. [12] over the
standard Gabor filter. The standard Gabor filter modulates a sine or cosine wave with a Gaussian
envelope:
G_odd(x) = e^(−x²/2σ²) sin[2πv_0 x]    (4.1)
G_even(x) = e^(−x²/2σ²) cos[2πv_0 x]    (4.2)
where σ is the bandwidth of the Gaussian envelope and v0 is the wavelength of the sine wave.
The odd Gabor filter is used for edge detection whilst the even Gabor filter can be used for line
detection. The Gaussian envelope of the Gabor filter is not able to curtail the periodic nature of the
sine or cosine wave and therefore additional fluctuations of the wave may appear at the extremities
of the filter. Since edges are a local phenomenon there is no need for a periodic wave and the
S-Gabor filter reduces the frequency of the sine wave as x increases so that only one wavelength is
present:
S_odd(x) = e^(−x²/2σ²) sin[2πv_0 x ξ(x)]    (4.3)
S_even(x) = e^(−x²/2σ²) cos[2πv_0 x ξ(x)]    (4.4)
ξ(x) = k e^(−λx²/σ²) + (1 − k)    (4.5)
where k determines the change of wavelength. The Canny operator is simpler as it does not use
periodic functions:
C(x) = (x/σ²) e^(−x²/2σ²)    (4.6)
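The one-dimensional profiles above can be sketched in code as follows (our own NumPy illustration, not code from the thesis; the default parameter values are those chosen later in Section 4.3.1):

```python
import numpy as np

def s_gabor_odd(x, sigma=0.646, v0=0.5, lam=0.3, k=0.5):
    """Odd S-Gabor profile (Eqs. 4.3 and 4.5): a Gaussian-windowed sine
    whose frequency decays away from the centre, so that only one
    wavelength appears under the envelope."""
    xi = k * np.exp(-lam * x ** 2 / sigma ** 2) + (1.0 - k)
    return np.exp(-x ** 2 / (2.0 * sigma ** 2)) * np.sin(2.0 * np.pi * v0 * x * xi)

def canny_1d(x, sigma=0.35):
    """Canny profile (Eq. 4.6): the first derivative of a Gaussian."""
    return x * np.exp(-x ** 2 / (2.0 * sigma ** 2)) / sigma ** 2

# Both profiles are odd functions: equal and opposite lobes either
# side of the edge position, and zero response at the centre.
```

Being odd functions, both profiles give a zero response over a perfectly homogeneous region and a signed response across a step edge.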
When a multi-orientation operator is applied to an image, multiple edge images are generated. Therefore, the greater the number of orientations per edge detector, the greater the amount of memory required to store the result images and the longer it takes to generate them. For the purposes of optimisation it is beneficial for the number of orientations to be as small as possible. We have decided to use 12 orientations at 15° intervals as a compromise between the 18 orientations of human vision and the 1 to 4 orientations offered by the fixed mask operators.
4.3.1 Multi-orientation Experiments
The S-Gabor and Canny operators were chosen because they can be used at any orientation and
resemble the receptive fields of visual cortex simple cells. The odd S-Gabor filter was constructed
in two dimensions using the following formulae:
S_odd(x′, y′) = e^(−(x′² + y′²)/2σ²) sin[2πv_0 y′ ξ(x′, y′)]    (4.7)
ξ(x′, y′) = k e^(−λ(x′² + y′²)/σ²) + (1 − k)    (4.8)
where x′ and y′ are the rotated and scaled pixel co-ordinates defined below in Equations 4.10 and
4.11. The remaining parameters were adjusted to provide a filter that produces only one period of
the sine wave under the Gaussian envelope with a wavelength of 2 pixels, resulting in σ = 0.646,
v0 = 0.5, λ = 0.3, and k = 0.5.
The Canny filter was constructed in two dimensions using the following formula:
C(x′, y′) = −(y′/σ²) e^(−(x′² + y′²)/2σ²)    (4.9)
where σ = 0.35 to also provide a separation of one pixel between lobe peaks.
The filters were rotated and scaled by pre-rotating and scaling the x and y pixel co-ordinates:
x′ = [x cos(−θ) − y sin(−θ)] / s_x    (4.10)
y′ = [x sin(−θ) + y cos(−θ)] / s_y    (4.11)
where θ = nπ/12 for n = 0, …, 11, s_y = 1, and s_x determines the elongated aspect ratio of the filter.
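Putting Equations 4.9 to 4.11 together, a bank of oriented masks can be sampled on a pixel grid. The following NumPy sketch is our own illustration (the 9×9 window size is an assumption made for the example):

```python
import numpy as np

def canny_kernel(size, theta, sigma=0.35, sx=3.0, sy=1.0):
    """Sample the oriented 2-D Canny operator (Eq. 4.9) on a size x size
    grid, pre-rotating and scaling the co-ordinates (Eqs. 4.10-4.11)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = (x * np.cos(-theta) - y * np.sin(-theta)) / sx
    yp = (x * np.sin(-theta) + y * np.cos(-theta)) / sy
    return -yp * np.exp(-(xp ** 2 + yp ** 2) / (2.0 * sigma ** 2)) / sigma ** 2

# Twelve orientations at 15-degree intervals (theta = n*pi/12).
bank = [canny_kernel(9, n * np.pi / 12.0) for n in range(12)]
```

Convolving an image with each mask in the bank yields the twelve orientation response images discussed below.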
The S-Gabor and Canny operators are very similar in shape and the similarity is shown in
their respective tuning response curves. The tuning response curves display the magnitude of the
response of the operator at different lateral positions and orientations to the edge stimulus. A
vertical black and white edge was used as the stimulus and 12 orientations of the operator were
convolved with the stimulus. Position response values were taken from the few pixels either side of
the edge whilst orientation responses were taken from each of the 12 resulting images.
In our analysis we are primarily interested in the highest frequency edges representable by the
image. These edges are formed between two adjacent pixels. Therefore the filters have a width of 2
pixels with each lobe centred on a pixel. The length of the filter must be greater than one pixel and
should be less than 10 pixels so that curves are detectable. A longer filter is desirable to filter out
noisy edges. Because there is no exact restriction on filter length we will first analyse the tuning
response curves at different lengths to determine the best length.
4.3.2 Multi-orientation Results
Figure 4.4 shows the aspect ratios of the S-Gabor and Canny operators tested. Figures 4.5 and 4.6
show the tuning response curves for the S-Gabor and Canny operators respectively at the different
aspect ratios. By comparing the graphs it can be seen that the Canny and S-Gabor operators
show very similar results (although at different aspect ratios). This can be explained by the Canny
operator being shorter than the S-Gabor operator. Since there is no difference in orientation and
position tuning between the two operators either one may be used. We have selected the Canny
operator because it requires fewer parameters.
The tuning response curves show that shorter filters provide very good position tuning but
poor orientation tuning, whilst the longer filters provide good orientation tuning but poor position
tuning at orientations slightly different to that of the edge. These tuning response curves can be
explained by visualising the overlapping of the operator lobes over a test edge (see Figure 4.7 (a)
to (d)). These scenarios indicate that whenever the edge stimulus is asymmetrical over the length of the filter a response should not be generated. What is required is an asymmetry detector whose response is subtracted from that of the edge detector.
Figure 4.4: Filters tested. S-Gabor at aspect ratios 1:1, 1.5:1, 2:1, 3:1, and 4:1; Canny at 1:1, 1.5:1, 2:1, 3:1, 4:1, and 6:1; Canny asymmetry at 1.33:1, 2:1, 2.67:1, and 4:1.
4.4 New Asymmetry Detector
A simple approach to identifying asymmetry of edge response along the length of an edge detector would be to use the same edge detector rotated to a 90° orientation. However, such a filter would give the same tuning responses as those in Figure 4.6, merely shifted by 90°, and would not be sufficient to nullify erroneous responses. What is required is a filter with the same shape as the edge detector but oriented at 90° to it (see Figure 4.7).
The same formula for constructing the Canny edge detector in Equation 4.9 is used for the
asymmetry filter (σ = 0.5) however the rotation and scaling equations are modified to allow for an
orthogonal orientation and aspect ratio:
x′ = 3[x cos(π/2 − θ) − y sin(π/2 − θ)] / 2s_x    (4.12)
y′ = [x sin(π/2 − θ) + y cos(π/2 − θ)] / s_y    (4.13)
The direction of asymmetry is not relevant so the absolute asymmetry response is subtracted
from the Canny edge detector modulated by a tuning factor t:
E_A = |C| − t|A|    (4.14)
where C is the response of the Canny edge detector, A is the response of the asymmetry filter, and E_A is the final edge response.
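Given the two response images, the combination in Equation 4.14 is a per-pixel subtraction. A minimal NumPy sketch follows (our own illustration; clamping negative values to zero is our assumption, on the reading that a negative value simply means the response is fully inhibited):

```python
import numpy as np

def asymmetry_edge_response(canny_resp, asym_resp, t=2.0):
    """Eq. 4.14: E_A = |C| - t|A|, the Canny edge magnitude inhibited
    by the asymmetry magnitude scaled by the tuning factor t."""
    ea = np.abs(canny_resp) - t * np.abs(asym_resp)
    return np.maximum(ea, 0.0)  # assumption: negatives mean no response

# A symmetric edge stimulus survives; a strongly asymmetric one is
# nullified by the inhibition term.
kept = asymmetry_edge_response(np.array([10.0]), np.array([1.0]))
cleared = asymmetry_edge_response(np.array([10.0]), np.array([8.0]))
```

The tuning factor t controls how aggressively asymmetric stimuli are suppressed; t = 2 is the tighter setting examined in Section 4.4.1.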
Figure 4.5: S-Gabor tuning response curves at aspect ratios 1:1, 1.5:1, 2:1, 3:1, and 4:1. Each graph plots response magnitude against orientation (0° to 165°) and lateral position (S1 to S7).
Figure 4.6: Canny tuning response curves at aspect ratios 1:1, 1.5:1, 2:1, 3:1, 4:1, and 6:1. Each graph plots response magnitude against orientation (0° to 165°) and lateral position (S1 to S7).
Figure 4.7: (a) to (d) Edge operator scenarios: (a) alignment of operator with edge; (b) orientation misalignment; (c) orientation and position misalignment; (d) position misalignment. (e) Asymmetry detector overlaid on edge detector.
4.4.1 Asymmetry Detector Results
The tuning curves of asymmetry filters for the 3:1, 4:1, and 6:1 Canny edge detectors are shown in
Figure 4.8. The tuning curves are sufficient to nullify erroneous responses (however, shorter aspect
ratios below 3:1 were not sufficient). The result of the asymmetry edge detector with tuning t = 1
is shown in Figure 4.9 (a). With a tighter tuning parameter of t = 2 the result is a perfectly tuned
edge detector in both orientation and position (Figure 4.9 (b)).
The edge stimulus used is a perfect vertical edge aligned to one of the edge detector orientations.
To test whether the Asymmetry detector performs as well when the edge orientation is not aligned with one of the edge detector orientations, the same vertical edge was tested at a 7.5° offset, halfway between the usual 15° interval between detector orientations. Figure 4.10 shows the results. The Asymmetry edge detector provides two identical responses in the two adjacent orientations, indicating that the orientation of the edge lies exactly halfway between them.
The tuned operator appears to work well for any aspect ratio greater than or equal to 3:1.
However, because the tuned operator is inhibited by asymmetrical stimulus it may have problems
at corners (Figure 4.11). The tuning curves at corners for the three aspect ratios are shown in Figure 4.12, which shows that there is no response for the edge as the edge detector approaches the
corner. Larger aspect ratio operators fall off early whilst the 3:1 aspect ratio operator falls off only
one pixel before the end of the contour. Therefore, the best operator for all scenarios is the 3:1
aspect ratio operator.
Figure 4.8: Asymmetry tuning curves for the Canny asymmetry filters at 1.33:1 (matches 2:1), 2:1 (matches 3:1), 2.67:1 (matches 4:1), and 4:1 (matches 6:1).
Figure 4.9: Combined edge detector and asymmetry inhibitor at 3:1 aspect ratio. (a) t = 1; (b) t = 2.
Figure 4.10: Tuned edge detector at 7.5° orientation offset.
The one-pixel fall-off in edge response before the end of the contour may affect contour extraction, as the contours extracted will not include the last pixel of the corner. However, losing one pixel at the end of a contour appears to be a fair trade-off for the improved orientation and position tuning gained. In addition, contour-end detection and vertex extraction, which are investigated in
the following chapter would be able to identify the corner and higher level processing stages would
be able to link the vertex to the edges.
Figure 4.13 shows how the Asymmetry detector compares with the standard Canny detector
for sample test images. The Asymmetry edge detector results of Figure 4.13 (c) show tighter
positional tuning than the Canny edge detector results of Figure 4.13 (b). The orientation tuning performance is not as easily seen in a single aggregate image; however, the impact of improved orientation tuning can be seen in later stages of processing, as indicated by the thinned Canny and Asymmetry responses of Figure 4.13 (d) and (e) respectively. The thinned Asymmetry edges, produced using the thinning technique discussed in the next section, contain fewer spurious responses than the thinned Canny edges.
4.5 Thinning
Using the Asymmetry edge detector developed in the previous sections, the edge responses should be tightly tuned in both orientation and position. However, an edge may still generate responses over a number of positions if its wavelength is greater than that of the edge detector. Therefore it is still necessary to perform some thinning on the edge responses to reduce contours to the one-pixel thickness required by the contour following algorithm.
We are only interested in thinning along the direction of a contour. Current thinning techniques
such as morphological thinning and skeletonisation ignore the direction of a contour. As a result
thinning will occur in all directions. Figure 4.15 (a)-(d) shows the results of thinning the cube
Figure 4.11: Possible problem when tuned edge detector is placed over a corner.
Figure 4.12: Corner tuning curves for the tuned Canny operator at 2:1, 3:1, 4:1, and 6:1 aspect ratios.
Figure 4.13: Canny and Asymmetry edge responses for the Chapel, Plane, and Claire images. (a) Sample images; (b) results after applying the Canny edge detector; (c) results after applying the Asymmetry edge detector; (d) thinned Canny edges; (e) thinned Asymmetry edges.
Figure 4.14: (a) Cube test image; (b) Asymmetry edge detector responses.
responses of Figure 4.14 using the skeletonisation and morphological thinning algorithms.
Thinning can be applied to either the individual orientation responses or to the aggregate
edge responses. However, as can be seen in Figure 4.15 (a)-(d), skeletonisation and morphological
thinning techniques ignore the direction of the contour responses and any interaction between
different orientations. Therefore a new thinning process has been developed which only thins along
the direction of a contour whilst taking into account adjacent orientations.
Morphological thinning approaches process a small neighbourhood of an image, for example a
3×3 neighbourhood of pixels. In non-directional techniques the goal is to remove any pixel adjacent
to another which lies on an edge. In directional techniques the approach is similar but pixels are
only considered adjacent along the perpendicular to the orientation of the edge response (Figure
4.16). Morphological approaches work well for edge responses that are aligned to the horizontal,
vertical, and diagonal layout of pixels. Thinning occurs by removing a pixel if two pixels are found
to be in adjacent locations. Which pixel is removed depends on the depth of the image. For binary
images there is often an iterative process where the pixels lying on the edge of the region are
removed first and the process stops when no more pixels are removed. For greyscale images, the
magnitude of the edge response can be used to determine which pixel will be removed. Usually the
pixel with the lesser magnitude is removed.
Using a neighbourhood aligned to pixel positions becomes less useful when working with more
than four orientations because the positions of adjacent responses no longer align to the centre of
existing pixels (Figure 4.16 (c) and (d)). In fact this is also true even for 45 ◦ orientations because
the distance between pixel centres is greater than the distance between the centres of horizontally
and vertically aligned pixels. Therefore if more than the horizontal and vertical orientation are to
be used for thinning then a more sophisticated technique is required to determine neighbourhood
Figure 4.15: (a) Skeletonisation of the aggregate edge responses; (b) aggregate of the skeletonisation of individual orientation edge responses; (c) morphological thinning of the aggregate edge responses; (d) aggregate of the morphological thinning of individual orientation edge responses; (e) Gaussian thinning; (f) diagonal removal.
Figure 4.16: Positions of perpendicularly adjacent responses used for thinning: (a) vertical, (b) horizontal, (c) 45°, and (d) 15°.
responses.
4.5.1 New Gaussian Thinning Approach
The problem facing morphological techniques is a sampling problem where the sampling no longer
occurs at pixel centres. To solve the sampling problem we have created three elongated Gaussian
filters that sample at three positions orthogonal to the orientation of the edge (see Figure 4.17).
The distance between each filter remains constant regardless of the orientation, thereby solving
the sampling problem. The outputs from the three filters are then used to thin laterally along the
orientation.
The three Gaussian filters are based on the two-dimensional Gaussian envelope:
G = e^(−(x′² + y′²)/2σ²)    (4.15)
where σ is the bandwidth of the envelope and is set to 0.5, and x′ and y′ are the scaled, translated,
and rotated pixel co-ordinates:
x′ = [x cos(−θ) − y sin(−θ)] / s_x    (4.16)
y′ = [x sin(−θ) + y cos(−θ) + t_y] / s_y    (4.17)
where θ is the orientation of the elongated Gaussian filter, ranging in 15° increments from 0° to 165°; s_x and s_y determine the shape of the filter and are set to s_x = 4 and s_y = 1; and t_y determines the lateral translation of the filter, taking the values (−1, 0, 1) so that each of the three lateral filters is centred one pixel from the centre pixel in the direction orthogonal to the orientation of the edge.
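The three sampling filters of Equations 4.15 to 4.17 can be sampled on a pixel grid as follows (our own NumPy sketch; the 9×9 window size is an assumption made for the example):

```python
import numpy as np

def thinning_gaussian(size, theta, ty, sigma=0.5, sx=4.0, sy=1.0):
    """Elongated Gaussian sampling filter (Eqs. 4.15-4.17), translated
    laterally by ty pixels perpendicular to the orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = (x * np.cos(-theta) - y * np.sin(-theta)) / sx
    yp = (x * np.sin(-theta) + y * np.cos(-theta) + ty) / sy
    return np.exp(-(xp ** 2 + yp ** 2) / (2.0 * sigma ** 2))

# For each orientation the three lateral samples come from
# ty in (-1, 0, 1); here theta = 0 (a vertically oriented filter).
filters = [thinning_gaussian(9, 0.0, ty) for ty in (-1.0, 0.0, 1.0)]
```

Because the translation t_y is applied in the rotated co-ordinate frame, the spacing between the three samples stays at one pixel regardless of θ, which is precisely what solves the sampling problem described above.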
An edge response is cleared if either of the two lateral Gaussian samples in the same orientation
or the four lateral samples in adjacent orientations are greater than the Gaussian sample centred
at the current pixel, that is, if either of the following are true:
G_−1(θ) > G_0(θ)    (4.18)
Figure 4.17: Position of Gaussian filters used for thinning.
Figure 4.18: Potential double pixel lines after Gaussian thinning.
G_1(θ) > G_0(θ)    (4.19)
G_−1(θ + π/12) > G_0(θ)    (4.20)
G_1(θ + π/12) > G_0(θ)    (4.21)
G_−1(θ − π/12) > G_0(θ)    (4.22)
G_1(θ − π/12) > G_0(θ)    (4.23)
where G_−1, G_0, and G_1 are the three lateral Gaussian samples and θ is the orientation of the elongated Gaussian filter. This first criterion thins laterally across orientations but does not perform orientation competition at the centre pixel.
Orientation competition is performed by preserving the largest two adjacent edge responses in
a local neighbourhood along the orientation axis. Two adjacent edge responses are preserved so
that the true orientation of the edge can be interpolated. To be preserved, the current orientation
Gaussian response must be greater than or equal to the two adjacent orientation responses:
G_0(θ) ≥ G_0(θ ± π/12)    (4.24)
or, the current orientation may have a greater adjacent orientation but it must be greater than the
responses adjacent to these two, that is:
G_0(θ) < G_0(θ − π/12)  and  G_0(θ) > G_0(θ + π/12)  and  G_0(θ) > G_0(θ + 2π/12)    (4.25)
or:
G_0(θ) < G_0(θ + π/12)  and  G_0(θ) > G_0(θ − π/12)  and  G_0(θ) > G_0(θ − 2π/12)    (4.26)
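A per-pixel decision combining the lateral criteria (Equations 4.18 to 4.23) with the orientation competition rules (Equations 4.24 to 4.26) might be sketched as follows. This is our own illustration: G is assumed to hold, for one pixel, the three lateral Gaussian samples at each of the 12 orientations.

```python
import numpy as np

def keep_response(G, o, n_orient=12):
    """Decide whether the edge response at orientation index o survives
    Gaussian thinning. G is an (n_orient, 3) array of lateral Gaussian
    samples per orientation; columns are lateral positions (-1, 0, +1)."""
    om, op = (o - 1) % n_orient, (o + 1) % n_orient
    centre = G[o, 1]
    # Lateral competition (Eqs. 4.18-4.23): clear the response if any of
    # the six lateral samples, in this or an adjacent orientation,
    # exceeds the centre sample.
    lateral = [G[o, 0], G[o, 2], G[om, 0], G[om, 2], G[op, 0], G[op, 2]]
    if any(s > centre for s in lateral):
        return False
    # Orientation competition (Eqs. 4.24-4.26): keep a local maximum
    # across orientations, or the second member of an adjacent pair so
    # the true edge orientation can later be interpolated.
    c_m, c_p = G[om, 1], G[op, 1]
    if centre >= c_m and centre >= c_p:
        return True
    om2, op2 = (o - 2) % n_orient, (o + 2) % n_orient
    if centre < c_m and centre > c_p and centre > G[op2, 1]:
        return True
    if centre < c_p and centre > c_m and centre > G[om2, 1]:
        return True
    return False
```

Wrapping the orientation index modulo 12 reflects the fact that the orientations are cyclic over 180°.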
The result of applying the Gaussian thinning technique is shown in Figure 4.15 (e) where it can
be seen that the edges are successfully thinned along the orientation of the edges. There is still one
Figure 4.19: Diagonal removal.
problem with this technique in that it is not able to reduce 45° edge responses to a one-pixel-thick line (see Figure 4.18). This is because the resulting two-pixel line contains very little overlap in the perpendiculars, so the two existing pixels are never compared with each other. The technique for thinning the 45° orientations is shown in Figure 4.19. If both positions of one diagonal of a 2×2 block are occupied then the other two positions are removed. If not, the reverse is checked to see if the first diagonal should be removed. The values in adjacent orientations are also
checked. The result after removing diagonals is shown in Figure 4.15 (f). Compared with Figure
4.15 (a)-(d), Gaussian thinning produces thinner lines and conforms to the original orientations of
the edge responses.
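The 2×2 diagonal-removal step can be sketched as follows (our own illustration on a binary edge map; a full implementation would also consult adjacent orientation responses, as described above):

```python
import numpy as np

def remove_diagonals(edges):
    """If both pixels of one diagonal in a 2x2 block are set, clear the
    other two pixels (step 1); otherwise check the opposite diagonal
    (step 2). Thins a two-pixel-thick 45-degree line to one pixel."""
    e = edges.astype(bool).copy()
    h, w = e.shape
    for yy in range(h - 1):
        for xx in range(w - 1):
            if e[yy, xx] and e[yy + 1, xx + 1]:      # main diagonal set
                e[yy, xx + 1] = e[yy + 1, xx] = False
            elif e[yy, xx + 1] and e[yy + 1, xx]:    # anti-diagonal set
                e[yy, xx] = e[yy + 1, xx + 1] = False
    return e

# A two-pixel-thick diagonal line reduces to a single diagonal line.
double = np.eye(4, dtype=bool) | np.eye(4, k=1, dtype=bool)
thinned = remove_diagonals(double)
```

Scanning the 2×2 blocks in raster order, each block is resolved once, so the double line collapses to the diagonal that was detected first.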
4.5.2 Gaussian Thinning Results
Results of Gaussian thinning are shown in Figure 4.13 (d) and (e) applied to Canny and Asymmetry
edge responses respectively. The edges extracted are successfully thinned along the orientation of
the contours. The figure also demonstrates the benefits of using the Asymmetry detector for higher
level edge processing such as thinning. The thinned Asymmetry responses contain fewer spurious
edges than the thinned Canny responses showing that the multi-orientation Gaussian thinning
technique performs better with tightly tuned orientation and position edge detectors.
4.6 Asymmetry Edge Detector as a Computational Model
of the Visual Cortex
In Section 2.6 computational models of the visual cortex were presented. These models are de-
signed to validate vision processing theories rather than to be efficient edge detectors for use in
CBVR applications. The asymmetry edge detector presented in this chapter is also motivated by
the architecture of the visual cortex but is designed to be used in CBVR and other image pro-
cessing applications. Figure 4.20 shows the asymmetry edge detector and thinner in the context
of vision processing in the visual cortex. Both the Canny edge detector and asymmetry detectors
are represented as simple cells with the output of the asymmetry detector inhibiting the Canny
Figure 4.20: Asymmetry edge detector model of the visual cortex: the RGB image (photoreceptors) feeds two simple-cell stages, the Canny 3:1 edge detector and the Asymmetry 2:1 detector, whose output inhibits the edge detector; Gaussian thinning then performs orientation and spatial competition, and diagonal removal performs spatial competition.
edge detector. The Gaussian thinning stage represents both orientation and spatial competition in
the visual cortex whilst the remove diagonals stage represents spatial competition between simple
cells. The asymmetry edge detector differs from other models such as Marr’s [56] and Grossberg’s
[94] as it does not attempt to model the non-directional ganglion and LGN cells. It is also a purely
feed-forward implementation resulting in a simpler architecture and faster execution. Higher-level
stages of the model such as edge linking and end-stopped detection are discussed in the following
chapter.
4.7 New Texture Inhibition Approach
In this chapter edge detection techniques have been presented that can detect boundaries between
regions of homogeneous colour. Detecting boundaries between regions of heterogeneous colour,
such as texture, is more complex because local edges are also formed within the regions. Consider
Figure 4.21 (a), for example: even though the different textures are easily distinguishable by the
human visual system, there are no contours formed by a consistent change in homogeneous colour, as can be
seen from the lack of edge response along the texture borders in Figure 4.21 (b). Therefore the edge
techniques presented in this chapter alone are not enough to identify boundaries between regions
of texture.
Identifying texture boundaries is crucial for higher-level processing of contours. Since textures
consist of contours, a contour processing stage will process all of the contours within the texture,
which is unnecessary as these contours do not represent boundaries. Therefore it is beneficial to
inhibit texture regions before higher-level processing such as contour extraction occurs. Identifying
texture regions can be difficult as any occurrence of contours could be considered texture. There-
fore rather than simply identifying textures we present a technique that identifies the boundaries
between textures, which would also include non-textural contours. Higher level processes will only
process contours that lie within texture boundaries.
Figure 4.21: (a) A composite of Brodatz textures D9, D38, D92, and D24 histogram equalised [108],
(b) Edge responses of composite texture image, (c) Moving average of maximum edge responses.
4.7.1 Psychological and Perceptual Basis
Through intensive psychological studies Tamura et al. [39] found that humans characterise textures
along three dimensions: coarseness, contrast, and directionality. Coarseness refers to the size of
the repeating pattern, contrast refers to the overall ratio between darkness and lightness in the
texture, and directionality refers to the orientation of the texture. A similar study conducted by
Rao and Lohse [68] found that humans grouped patterns by repetitiveness, directionality, and
complexity. Once again repetitiveness refers to the scale of the pattern and directionality refers to
the orientation of the texture. However, the third dimension, complexity, refers to how
ordered the placement of the texture patterns is. Complexity could also be considered as
noise.
The first challenge is whether the edge responses of the Asymmetry edge detector are sufficient
to represent the three dimensions of texture. Since the primary component of the Asymmetry edge
detector is the Canny operator, which is similar to a Gabor filter, the edge detector is able to filter
spatial frequencies in a similar way to a wavelet. Therefore, the edge detector is able to detect
Tamura’s coarseness [39] or Rao and Lohse’s repetitiveness [68] which is essentially the spatial
frequency of the texture. Since the edge detector is also oriented, elongated, and uses an asymmetry
inhibitor to fine tune the orientation response, the edge detector is quite capable of representing
the orientation of a texture. Tamura’s contrast can also be represented by the amplitude of the
edge detector response since the edge detector responds to spatial changes which also affect the
contrast of the texture. The component that the edge detector does not represent directly is Rao
and Lohse’s complexity. However, the complexity of the texture is implicit in the location of the
edge responses. Therefore further processing of the edge responses is required to determine the
complexity of the texture. However, our goal is not so much to simply extract the features of the
texture but more importantly to define the spatial extent of a texture and the boundaries between
textures.
Figure 4.22: (a) Patch-suppressed cell; (b) Abutting grating stimulus.
There is some basis for the inhibition of edge responses through texture detectors in human
vision research. Sillito et al. [109] found that for a majority of cells (33/36) in V1 the response
was suppressed as the diameter of a circular patch of drifting sinusoidal grating increased. These
cells are known as patch-suppressed cells. They found that a small disk grating or a large disk
grating with an empty centre will evoke a response but not when both are combined. Therefore,
larger areas of dense edge responses will be inhibited. Sillito et al. [109] also performed cross-
correlation experiments on pairs of cells that were cross-oriented (had preferred stimulus that were
approximately 90° to each other). They found a high correlation between cross-oriented simple
cells when the stimulus had inner and outer gratings at 90° to each other (see Figure 4.22 (a)),
suggesting functional connectivity. Larkum et al. [110] found pyramidal neurones in layer 5 which
fired if both distal and proximal dendrites received input but not if either alone were activated.
Therefore, larger areas of dense edge responses are inhibited, but only if they do not border another
area of dense edge responses which ideally have a perpendicular orientation. Grosof et al. [111] have
also found cells in V1 which respond to the illusory contour formed at the end of abutting gratings
which are different to the cells found in V2 by Soriano et al. [112] which respond to more general
types of illusory contours. The abutting grating stimulus (see Figure 4.22 (b)), which is essentially
the boundary of a texture, shows that the edge boundary between textures is detected early on in
the visual pathway.
Some textures do not have clearly defined boundaries and segregation is dependent on higher
level processing. One example is that texture elements with differing numbers of line ends are easier
to segregate than those with the same number of terminations [113]. Psychophysical experiments
performed by Beck et al. [114] found that the strength of segregation depended on the contrast
and size difference of texture elements. The size difference can also be represented as a contrast
difference, hence the perception can be explained solely through contrast. They also found that
hue can have the same effect but only if the texture element and background are of the same lu-
minance. Beck et al. [114] were able to simulate the psychophysical results using bandpass filters.
The results of Beck et al. [114] suggest that the oriented bandpass filters of the primary visual
cortex could perform texture segregation; however, neurophysiological recordings have found that
global segregation does not occur at this stage [115]. It
is possible that texture segregation can occur at a number of levels which provides a basis for the
low-level approach for processing texture boundaries taken in this chapter.
4.7.2 Texture Identification
Areas of texture need to be identified so that they do not interfere with the extraction of re-
gion boundaries. However, the boundary between two textures should also be considered a region
boundary. Therefore, a technique is required that identifies areas of texture but does not consider
the boundaries between textures as texture. An area of image consists of texture if it contains a
repeating pattern of contours. Therefore the first characteristic of a texture is that it consists of
a uniform spatial distribution of contours. The smallest unit of a repeating pattern is the texture
element, also known as a texton [116]. The distribution of contours within the texture element
does not need to be uniform, however, there must be some uniformity in the distribution of tex-
ture elements. Uniformity of distribution can be represented by the moving average of the edge
responses. Changes in the moving average reflect a change in spatial density of edge responses
within a window. The window of the moving average must be equal to or greater than the size of the
texture element. For this research we have chosen a window 32 pixels wide and 32 pixels high.
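The moving-average step described above can be sketched as follows (a minimal illustration using NumPy and SciPy; the function name and the use of `uniform_filter` are assumptions, not the thesis implementation):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def texture_density(edge_responses, window=32):
    """Moving average of the maximum edge response per pixel.

    edge_responses: array of shape (n_orientations, H, W) holding the
    oriented edge magnitudes (12 orientations at 15 degree steps in this
    chapter). The window must be at least as large as the texture element.
    """
    max_response = edge_responses.max(axis=0)          # strongest edge at each pixel
    return uniform_filter(max_response, size=window)   # 32x32 box moving average

# Toy example: a dense "texture" half next to an empty half.
edges = np.zeros((1, 64, 64))
edges[0, :, :32] = 1.0        # left half full of edge responses
density = texture_density(edges)
# density is high on the left, low on the right, blending at the border
```

The window size is the only free parameter; it trades spatial resolution of the density map against sensitivity to individual texture elements.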
Using the composite image formed from the four Brodatz textures of Figure 4.21 (a) the first
step is to extract the edge responses. The edge responses consist of 12 images, one for each
15° orientation. Figure 4.21 (b) shows the maximum response from all orientations for each pixel.
Applying a moving average to the maximum edge responses produces the image in Figure 4.21 (c).
Unfortunately, applying a moving average to the maximum edge responses does not reveal much
change between the textures. This is because the textures of Figure 4.21 (a) have a relatively similar
edge density. However, the shapes of the texture elements are different and should be revealed by
processing edge orientations individually.
Figure 4.23 (a) shows the moving average applied to each orientation individually. The results
are multiplied by a factor of 10 to make the differences more visible. The differences between the
four textures begin to be revealed when the orientations are processed individually. This approach is
similar to the bandpass filters used by Beck et al. [114] to simulate visual cortex texture segregation.
A problem with the moving average approach is that the square window produces rectangular
artefacts in the average responses. This is caused by the moving average function giving every
pixel equal weighting, even those on the border of the window. The rectangular artefacts can be
removed by using a window with a Gaussian envelope where pixel weighting decreases as the radius
increases from the centre of the window. A two dimensional Gaussian filter with a bandwidth (σ)
of 10 pixels was used in place of the moving average function.
f(x, y) = e^(−(x² + y²)/(2σ²))    (4.27)
Since the convolution of the Gaussian filter with the edge responses can be applied in the Fourier
domain, the processing time is considerably less than the moving average approach. The results of
applying the Gaussian filter to the edge responses are shown in Figure 4.23 (b). The rectangular
artefacts are now removed, however the texture borders are less defined. Even so, the Gaussian
moving average of the oriented edge responses is able to detect areas of consistent texture.
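The Gaussian-windowed averaging can be sketched as follows (again an illustrative NumPy/SciPy version; `gaussian_filter` stands in for the Fourier-domain convolution described above, and the array layout is an assumption):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_density(edge_responses, sigma=10):
    """Gaussian-weighted moving average applied to each orientation.

    Weights fall off as exp(-(x^2 + y^2) / (2 sigma^2)), removing the
    rectangular artefacts of the square box window. In practice the same
    convolution can be applied in the Fourier domain for speed.
    """
    return np.stack([gaussian_filter(r, sigma=sigma) for r in edge_responses])

# Toy example: one orientation channel with a regular striped pattern,
# one empty channel.
edges = np.zeros((2, 64, 64))
edges[0, ::4, :] = 1.0   # every fourth row carries an edge response
smooth = gaussian_density(edges, sigma=10)
# channel 0 settles near its mean edge density; channel 1 stays zero
```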
4.7.3 Texture Edges
Even though the Gaussian moving average approach is able to successfully identify texture regions
it does not identify borders between textures. The borders between textures must be identified so
that they are not included in the texture areas that will inhibit higher level contour processing.
With the Gaussian moving average approach textures are represented by areas with similar moving
average values. Since the moving average is applied to each orientation, differences between textures
containing texture elements that vary by shape can also be identified. Textures that exhibit a strong
orientation will distribute most of the edge responses in one orientation, such as the top right
hand texture of Figure 4.21 (a). However, textures with multiple orientations will distribute edge
responses across multiple orientations. Nonetheless, differences in shape between textures can still
be identified in the individual orientation responses, as can be seen in Figure 4.23 (b). Therefore,
a texture border will occur when there is no consistency of oriented texture within a region. The
lack of consistency can be represented by the variance (σ2) of moving average responses within a
window.
σ² = Σ (x − µ)²    (4.28)

where the sum is taken over the moving average values x within the window and µ is their mean.
A window of 32× 32 pixels was used to compute the variance. The individual variance images
for each orientation are then summed to produce the final image which is shown in Figure 4.24 (a).
The final image clearly shows the borders between the top right texture and the other textures but
only partially represents the bottom and left borders. The variance of moving averages of the edge
responses is similar to the patch-suppressed cells of the human visual cortex reported by Sillito et
al. [109] in that large areas of similar edge responses will be inhibited unless there is variance in
the edge responses over the area.
Since the variance computation also uses a square window, like the moving average computation,
we investigated whether a Gaussian mask for the variance computation would improve the results.
The computation of µ remains the same; however, the squared difference (x − µ)² is multiplied by
the corresponding Gaussian mask value before being added to the variance.
The results of the Gaussian mask are shown in Figure 4.24 (b) and do not appear to provide
a significant improvement over the square variance approach. Minor differences between the two
images are mainly due to the Gaussian mask being slightly larger than the square window.
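The border-detection step can be sketched as follows (an illustrative version: it uses the identity var = E[x²] − E[x]² inside a sliding window, which is equivalent up to normalisation to summing (x − µ)² as in Equation 4.28; the clamping of small negative values is an added floating-point safeguard):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def texture_border_strength(avg_responses, window=32):
    """Sum over orientations of the local variance of moving-average values.

    avg_responses: (n_orientations, H, W) moving-average images. High output
    values mark places where the oriented texture statistics change, i.e.
    candidate borders between textures.
    """
    total = np.zeros(avg_responses.shape[1:])
    for a in avg_responses:
        mean = uniform_filter(a, size=window)
        mean_sq = uniform_filter(a * a, size=window)
        total += np.maximum(mean_sq - mean * mean, 0.0)  # clamp FP negatives
    return total

# Toy example: two flat "textures" meeting at a vertical border.
avg = np.zeros((1, 64, 64))
avg[0, :, 32:] = 1.0
border = texture_border_strength(avg)
# variance peaks around column 32 and vanishes deep inside either region
```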
Figure 4.23: (a) Moving average applied to individual orientations, (b) Gaussian filter with band-
width of 15 pixels applied to individual orientations.
Figure 4.24: (a) Variance of moving average, (b) Gaussian variance of moving average.
4.7.4 Texture Noise
The edge responses used to identify texture and texture borders in the last few sections primarily
represent the shape of the texture. The results of the variance computation in Figure 4.24 show
that the shape information alone is not enough to distinguish between textures. The three di-
mensions identified by Rao and Lohse [68] were repetitiveness, directionality, and complexity. The
directionality is represented by the oriented edge responses. However, the edge responses do not
provide a direct indication of the complexity of the texture.
Francos et al. [36] used the Wold decomposition to decompose textures into harmonic and
indeterministic components. The Wold components also relate to the components identified by Rao
and Lohse [68] where the harmonic represents repetitiveness and the indeterministic component
represents complexity. By extending the Wold decomposition into two dimensions Francos et al. [36]
also included a new component called the evanescent component which represents the orientation
of texture. Francos et al. [63] used the auto-regressive moving average (ARMA) model to isolate
the indeterministic component. However, any noise model can be used, such as the moving
average (MA), auto-regressive (AR) [62], simultaneous auto-regressive (SAR) [61], multi-resolution
SAR (MRSAR) [64], Gauss-Markov, and Gibbs [65] models. The SAR model is an instance of
Markov random field (MRF) models [64]. Mao and Jain [64] used SAR and MRSAR models to
perform texture classification and segmentation. In this section we also investigate using the SAR
model for the purpose of identifying boundaries between textures.
SAR Model
The SAR model is as follows [64]:
g(s) = µ + Σ_{r∈D} θ(r) g(s + r) + ε(s)    (4.29)
where g(s) is the grey level value of a pixel at site s = (s1, s2), D is the set of neighbours at
site s which usually consists of the eight adjacent pixels, ε(s) is an independent Gaussian random
variable with zero mean and variance σ², θ(r), r ∈ D are the model parameters characterising the
dependence of a pixel to its neighbours, and µ is the bias which is dependent on the mean grey
value of the image.
Texture representation using the SAR model involves determining the parameters µ, σ, and
θ(r), r ∈ D. For a symmetric model where θ(r) = θ(−r), all model parameters can be estimated us-
ing the least squares error (LSE) technique or the maximum likelihood estimation (MLE) method.
Mao and Jain [64] used the LSE technique because it is less time consuming and yields very similar
results to the MLE method.
SAR Implementation
Since more than one parameter must be determined, multiple regression is required rather than simple
linear regression. The challenge with the SAR model is choosing an appropriate window size. In
this research the window size will be kept consistent at 32× 32 pixels. For each window, multiple
regression is used to determine the relationship between every pixel in the window and its eight
immediate neighbours. Multiple regression is usually solved in matrix form, so Equation 4.29 must be
rewritten as:
Y = Xβ + ε (4.30)
Given that n is the number of pixels within a window and p is the number of neighbours around
each pixel, Y is the n × 1 matrix of grey level values within the window, X is the n × p matrix
of predictors within the window (each row contains the p neighbour values for one pixel, and each
column corresponds to one neighbour offset), β is a p × 1 matrix containing the parameters θ(r),
and ε is an n × 1 matrix of random disturbances
for each pixel. Solving Equation 4.30 for β gives the least squares estimate:

β = (X′X)⁻¹X′Y    (4.31)
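The least squares estimation for a single window can be sketched as follows (illustrative only: `lstsq` replaces an explicit normal-equations solve, the neighbour ordering is an assumption, and border pixels of the window are skipped so that every neighbour lies inside the image):

```python
import numpy as np

def sar_parameters(img, window=32, top=0, left=0):
    """Estimate the SAR parameters theta(r) and bias mu for one window.

    Builds Y (pixel values) and X (the 8 neighbours of each interior pixel
    plus a constant column for the bias mu), then solves the least squares
    problem corresponding to beta = (X'X)^-1 X'Y.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    # Interior pixel coordinates of the window (neighbours stay in bounds).
    ys, xs = np.mgrid[top + 1:top + window - 1, left + 1:left + window - 1]
    Y = img[ys, xs].ravel()
    cols = [img[ys + dy, xs + dx].ravel() for dy, dx in offsets]
    X = np.column_stack(cols + [np.ones_like(Y)])   # last column = bias mu
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta[:-1], beta[-1]                      # theta(r), mu

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
theta, mu = sar_parameters(img)
# theta has one weight per neighbour; for white noise all are near zero
```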
SAR Optimisation
The SAR parameter calculations can be computationally expensive. For a window size of 32 pixels
and a neighbourhood of 8 pixels, 32 × 32 × 9 × 9 = 82,944 operations are performed per pixel. For
an image of 256 × 256 pixels, 5,435,817,984 computations are required, resulting in a processing
time of 14 minutes when implemented in Java on a 400 MHz PC. In statistics, Equation 4.31 is
Figure 4.25: The SAR moving window effect on the X ′ matrix.
often optimised using the QR decomposition. However, we investigated an algorithmic approach
for optimisation.
To improve performance, advantage was taken of the fact that a moving window is used to
compute the SAR values. Each subsequent window along the x axis contains all of the values
of the previous window, minus the leftmost column of values plus a new rightmost column. This
effect can be visualised by looking at X′. For this example, assume that the window size is only
16 × 16 pixels. X′ becomes a matrix with 256 columns and 9 rows. The
256 columns can be divided into groups of 16 columns which represent one column in the original
image window (see Figure 4.25). Since each column in the window is represented by a series of
columns in X′, when the window moves one pixel to the right, the columns in X′ which were used
to represent the far left column can be overwritten with the values from the new right column in
the window.
Replacing a section of values in X and X′ allows an optimisation in the computation of X′X
to take place. X′ is a relatively wide matrix and X is a relatively tall matrix; multiplying the two
together results in a small square matrix. Each element (i, j) in the result matrix is calculated by
summing the products of corresponding elements from row i in X′ and column j in X. When the
window is shifted to the right only the summed product of the old column needs to be subtracted
from the result matrix and the summed product of the new column added in. This results in only
two sets of summed products per pixel rather than the window size, which is 32 in this case.
The number of computations per pixel is reduced to 2 × 32 × 9 × 9 = 5184 and the number of
computations for a 256× 256 image is reduced to 339,738,624. The execution time is reduced from
14 minutes to 2.5 minutes.
The same optimisation can be applied to the X′Y matrix multiplication, which results in a 9 × 1
matrix. Before the optimisation, the computation of X′Y requires 32 × 32 × 9 × 1 = 9216 operations,
which is reduced to 2 × 32 × 9 × 1 = 576 operations after the optimisation.
The optimisation can be taken even further by storing the summed products of the previous
columns rather than recomputing them for every new column that is added. This halves the
number of operations to compute X′X and X′Y, resulting in 32 × 9 × 9 = 2592 and 32 × 9 = 288
operations per pixel respectively.
Finally, the same optimisation can be applied as the window moves down the rows of the source
image: the summed products for a new column can be derived from the same column in the previous
row's window by subtracting the contribution of the top pixel and adding that of the new bottom
pixel. This reduces the number of operations per pixel to 9 × 9 = 81 to compute
X′X and 9 to compute X′Y. For pixels where x ≥ 1 and y ≥ 1 the number of computations is
independent of the window size. The only additional overhead is the additional memory required
to store the summed products of previous pixels and rows.
For a 256 × 256 image and a window size of 32 × 32 pixels, 90 computations are required for each
of 255 × 255 pixels, giving 5,852,250 computations. The first pixel requires 32 × 32 × 9 × 9 = 82,944
computations, the remaining 255 pixels of the first row require 32 × 9 × 9 = 5184 computations
each, and the first pixel of each of the remaining 255 rows also requires 5184 computations.
Therefore the total number of computations has been reduced from 5,435,817,984 to 8,579,034, a
reduction factor of 633.
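The core of the sliding-window optimisation is that X′X is a sum of per-pixel outer products, so it can be updated by subtracting the leaving column's contribution and adding the entering column's. A toy demonstration of that identity (row counts are arbitrary; this is not the full implementation):

```python
import numpy as np

def xtx(X):
    """X'X as used in the normal equations."""
    return X.T @ X

rng = np.random.default_rng(1)
old_rows = rng.standard_normal((16, 9))    # rows for the column leaving the window
kept_rows = rng.standard_normal((240, 9))  # rows shared by both window positions
new_rows = rng.standard_normal((16, 9))    # rows for the column entering the window

# Full recomputation for the previous window position...
xtx_old = xtx(np.vstack([old_rows, kept_rows]))
# ...versus the incremental update: subtract the old column, add the new one.
xtx_incremental = xtx_old - xtx(old_rows) + xtx(new_rows)
# Full recomputation for the new window position, for comparison.
xtx_full = xtx(np.vstack([kept_rows, new_rows]))
# xtx_incremental equals xtx_full to floating-point precision
```

The same decomposition applies to X′Y, and caching the per-column sums gives the further savings described above.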
SAR Application
Using the multiple regression technique presented in the previous section, the eight parameters were
determined for each pixel. The mean (µ) and variance (σ²) were not of interest as these have
already been computed in the previous sections. Applying the SAR model with a window size of
32× 32 pixels to the test image of Figure 4.21 (a) produced the eight parameter images of Figure
4.26. The SAR parameters show the distinction between the four textures. However, due to the
square window of the LSE technique rectangular artefacts are also produced.
The variance technique of Section 4.7.3 was applied to the SAR images resulting in Figure
4.27 (a). The results are similar to Figure 4.24; however, some border responses are slightly
complementary. Adding the deterministic component (oriented edge responses) to the indeterministic
component (SAR model parameters) results in the combined texture edges image of Figure 4.27
(b). The combined result is slightly better than either individual result.
Figure 4.26: The SAR parameters of Figure 4.21 (a).
4.7.5 Texture Inhibition
The purpose of identifying texture regions and texture borders is to inhibit contours. The texture
edges image of Figure 4.27 (b) is subtracted from the edge response image of Figure 4.21 (b) to
produce Figure 4.27 (c). The resulting image shows that contours within texture areas are largely
inhibited whilst contours near texture borders are not inhibited. Unfortunately the current tech-
nique of using the variance of SAR parameters and oriented edge responses is not accurate enough
to inhibit texture edge responses before contour processing. Ideally the inhibitory action would
result in the suppressed contours of Figure 4.27 (d). The technique could be improved by simulating
the illusory contours generated by cells in V1 when presented with an abutting grating stimulus,
as discovered by Grosof et al. [111]. The illusory contour responses would feed into the texture
identification stages of moving-average oriented edge responses and the SAR model, producing more
distinct results at the boundaries between textures.
4.7.6 Comparison with Other Techniques
Unlike other systems such as QBIC [16], ARBIRS [4] detects texture before analysing colour
regions. ARBIRS uses a relatively simple non-directional first-order derivative edge detector for
determining the basic texture features. The image is subdivided into 24× 24 pixel blocks and edge
density and coarseness values are calculated from the first-order derivative edge responses. A block
is only considered a textured region if the edge density is greater than 25% of the block. Blocks
are then grouped into regions if they have similar colour histograms. The major difference between
the texture detection used in ARBIRS and the texture inhibition approach presented in this chapter
Figure 4.27: (a) Variance of SAR parameters, (b) Combined variance of SAR parameters and
oriented edge responses, (c) Contour image inhibited by (b), (d) Ideal inhibition.
is that the ARBIRS system uses large 24×24 pixel blocks which do not allow for arbitrary texture
boundaries to be identified. However, for the purposes of image retrieval (rather than contour
extraction) the ARBIRS texture subsystem performs well.
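The ARBIRS block test described above can be sketched as follows (a hypothetical reconstruction from the description only; ARBIRS's actual edge detector and the colour-histogram grouping step are omitted, and the function and parameter names are assumptions):

```python
import numpy as np

def textured_blocks(edge_map, block=24, density_threshold=0.25):
    """Flag each block x block tile as textured when more than the given
    fraction of its pixels carry an edge response (the 25% rule described
    for ARBIRS). Tiles that do not fit the image are truncated.
    """
    h, w = edge_map.shape
    flags = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            tile = edge_map[i * block:(i + 1) * block,
                            j * block:(j + 1) * block]
            flags[i, j] = (tile > 0).mean() > density_threshold
    return flags

# Toy example: only the top-left block contains edge responses.
edges = np.zeros((48, 48))
edges[:24, :24] = 1.0
flags = textured_blocks(edges)
# flags[0, 0] is True; the other three blocks are False
```

The fixed block grid is exactly why arbitrary texture boundaries cannot be recovered: a border falling inside a block is attributed to the whole block.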
4.8 Conclusion
Edge detection must accurately represent the edges present at each pixel. When used for contour
following, the accuracy and tuning of the edge detector become paramount. In this chapter a
number of existing edge detectors were analysed for suitability for contour following. We found
that a majority of edge detectors that are commonly used such as the Roberts, Prewitt, Sobel, and
Laplacian are not suitable for contour following. Contour following requires multiple arbitrarily
orientated edge detectors. Of the currently used operators, only the Gabor and Canny operators
satisfy these criteria. The S-Gabor and Canny operators were analysed at multiple aspect ratios
to determine their orientation and position tuning performance. We found that neither operator
had a significant advantage over the other. We also found that as the aspect ratio increased there
was a trade-off between orientation and position tuning.
An Asymmetry detector was developed that position tunes elongated orientation filters. By
itself, the elongated orientation filter produces good orientation tuning but poor position tuning.
Inhibiting the elongated orientation filter’s responses with the Asymmetry detector provided both
near-perfect orientation and position tuning. The result is a filter that outperforms any other filter
for the purposes of contour following.
To further comply with the requirements of contour following, thinning was investigated to
remove ambiguous edge responses. Morphological thinning and skeletonisation thinning were in-
vestigated but were unable to provide the correct edge responses as they could only be applied
within the discrete horizontal-vertical pixel layout of images. A new technique was developed that
allows thinning to work in the orientation of the edge response using elongated Gaussian filters
perpendicular to the edge orientation. This thinning approach is further refined by also thinning
across adjacent orientations and finally a removal of diagonals. The result is a multi-orientation
edge image that is representative of the true edges in the original image and is ideal for the sub-
sequent phase of contour following. The Asymmetry edge detector is more suitable for contour
following than the Sobel, Roberts, Prewitt, Kirsch, Robinson, and Laplacian operators and pro-
duces better results than just Gabor or Canny filters on their own whilst providing more accurately
thinned results than skeletonisation and morphological thinning.
A new approach for texture analysis was developed using the Asymmetry edge detector. The
purpose of low-level texture analysis is to inhibit edge responses before the contour following
stage to reduce processing overhead. Texture regions were identified using the Asymmetry edge
detector as well as an optimised SAR implementation. However, rather than simply identifying
texture regions, the approach is also able to distinguish between neighbouring textures so that
boundaries between textures can propagate up to higher-level contour processing stages, where they
can be used to form regions. The boundary detection
phase uses the moving variance to detect changes in textural distribution in Asymmetry edge and
SAR features. Even though the approach is able to identify textures and boundaries between
textures more work is required to achieve reliable texture inhibition before contour processing.
Incorporating contour-end detection may improve the technique’s ability to distinguish boundaries
between textures.