IV B.Tech I Semester ECE
www.asrece.com
ASR INSTITUTE OF TECHNOLOGY, PRATHIPADU, TADEPALLIGUDEM
DEPARTMENT OF ECE
DIGITAL IMAGE PROCESSING

CONTENTS

S.NO.   UNIT     PAGE NO.
1       UNIT-1   2-30
2       UNIT-3   31-56
3       UNIT-4   57-69
4       UNIT-6   70-97
5       UNIT-7   98-111

Download this study material from www.asrece.com
UNIT-I
Image processing involves changing the nature of an image in order to either:
1. Improve its pictorial information for human interpretation; or
2. Render it more suitable for processing, storage, transmission, and representation for
autonomous machine perception.
Examples of condition 1 may include:
• Enhancing the edges of an image to make it appear sharper
• Removing noise from an image
• Removing motion blur from an image
Examples of condition 2 may include:
• Obtaining the edges of an image
• Removing detail from an image
ASPECTS OF IMAGE PROCESSING
Digital image processing is the use of computer algorithms to perform image processing
on digital images. It is convenient to subdivide different image-processing algorithms into broad
subclasses:
Image Enhancement: Processing an image so that the result is more suitable for a particular
application is called image enhancement. Examples include:
• Sharpening or deblurring an out-of-focus image
• Highlighting edges
• Improving image contrast or brightening an image and
• Removing noise
Image Restoration: Image restoration undoes damage done to an image by a known cause, for
example:
• Removing blur caused by linear motion
• Removing optical distortions and
• Removing periodic interference
Image Segmentation: Segmentation involves subdividing an image into constituent parts or
isolating certain aspects of an image, including:
• Finding lines, circles, or particular shapes in an image and
• Identifying cars, trees, buildings, or roads in aerial photographs
DIGITAL IMAGE REPRESENTATION
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the
intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all
finite, discrete quantities, we call the image a digital image. The field of digital image processing
refers to processing digital images by means of a digital computer. A digital image is composed
of a finite number of elements, each of which has a particular location and value. These elements
are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most
widely used to denote the elements of a digital image.
A Digital image can be considered as a matrix whose row and column indices identify a
point in the image and the corresponding matrix element value identifies the gray level at that
point.
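The matrix view can be sketched with NumPy; the values below are arbitrary 8-bit gray levels, not taken from any particular image.

```python
import numpy as np

# A tiny 3x3 digital image as a matrix: row and column indices identify
# a point, and the element value is the gray level at that point.
# (The values are arbitrary 8-bit examples.)
img = np.array([[ 12,  50, 200],
                [ 30, 128, 255],
                [  0,  90,  64]], dtype=np.uint8)

gray_at_1_2 = img[1, 2]   # gray level at row 1, column 2 -> 255
```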
FUNDAMENTAL STEPS IN IMAGE PROCESSING
Let us consider a simple application of image processing techniques: automatically
reading the address on pieces of mail. The overall objective is to produce a result from a
problem domain by means of image processing, through a sequence of stages (image
acquisition, preprocessing, segmentation, representation and description, and recognition
and interpretation), all guided by a knowledge base.
The problem domain in this example consists of pieces of mail, and the objective is to
read the address on each piece. Thus the desired output in this case is a stream of alphanumeric
characters.
Image acquisition
The first step in this process is Image Acquisition- that is, to acquire a digital image. To
do so requires imaging sensor and the capability to digitize the signal produced by the sensor.
The sensor could be a TV camera that produces an entire image of the problem domain every
1/30 sec. The imaging sensor can also be a line-scan camera that produces a single image line at
a time. If the output of the camera or other imaging sensor is not already in digital form, an
analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are
determined by the application.
Preprocessing
After a digital image has been obtained, the next step is preprocessing that image. The
key function of preprocessing is to improve the image in ways that increase the chances for
success of the other processes. Basically, the idea behind enhancement techniques is to bring out
detail that is obscured, or simply to highlight certain features of interest in an image.
In this example, preprocessing typically deals with techniques for enhancing contrast,
removing noise, and isolating regions whose texture indicates a likelihood of
alphanumeric information.
Segmentation
The next stage deals with Segmentation. Segmentation partitions an input image into its
constituent parts or objects. In general, autonomous segmentation is one of the most difficult
tasks in digital image processing. A rugged segmentation procedure brings the process a long
way toward successful solution of imaging problems that require objects to be identified
individually. On the other hand, weak or erratic segmentation algorithms almost always
guarantee eventual failure. In general, the more accurate the segmentation, the more likely
recognition is to succeed.
In our example of character recognition, the key role of segmentation is to extract
individual characters and words from the background.
Representation and Description
The output of the segmentation stage usually is raw pixel data, constituting either the
boundary of a region (i.e., the set of pixels separating one image region from another) or all the
points in the region itself. In either case, converting the data to a form suitable for computer
processing is necessary. The first decision that must be made is whether the data should be
represented as a boundary or as a complete region. Boundary representation is appropriate when
the focus is on external shape characteristics, such as corners and inflections. Regional
representation is appropriate when the focus is on internal properties, such as texture or skeletal
shape. In some applications, these representations complement each other.
Our character recognition example requires algorithms based on boundary shape as well
as skeletons and other internal properties.
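The boundary-versus-region distinction can be sketched on a toy segmented region; the 3x3 square region and the 4-neighbor boundary test below are illustrative assumptions, not part of the text.

```python
import numpy as np

# A segmented region represented two ways: as a complete binary mask
# (regional representation), and as its boundary (region pixels with at
# least one 4-neighbor outside the region).
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True                      # regional representation

padded = np.pad(mask, 1)                   # pad so edge pixels need no special case
interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
            padded[1:-1, :-2] & padded[1:-1, 2:])
boundary = mask & ~interior                # boundary representation
```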
Choosing a representation is only part of the solution for transforming raw data into a
form suitable for subsequent computer processing. A method must also be specified for
describing the data so that features of interest are highlighted. Description, also called feature
selection, deals with extracting attributes that result in some quantitative information of interest
or are basic for differentiating one class of objects from another.
In terms of character recognition, descriptors such as lakes (holes) and bays are
powerful features that help differentiate one part of the alphabet from another.
Recognition and Interpretation
The last stage involves recognition and interpretation. Recognition is the process that
assigns a label to an object based on its descriptors. Interpretation involves assigning meaning to
an ensemble of recognized objects.
In terms of our example, identifying a character as, say, a c requires associating the
descriptors for that character with the label c. Interpretation attempts to assign meaning
to a set of labeled entities. For example, a string of six numbers can be interpreted to be
a ZIP code.
So far we have said nothing about the need for prior knowledge or about the interaction
between the knowledge base and the processing modules. Knowledge about a problem domain is
coded into an image processing system in the form of a knowledge database. In addition to
guiding the operation of each processing module, the knowledge base also controls the
interaction between modules.
ELEMENTS OF DIGITAL IMAGE PROCESSING SYSTEM
IMAGE ACQUISITION
Two elements are required to acquire digital images. The first is a physical device that is
sensitive to a band in the electromagnetic energy spectrum (such as x-ray, ultraviolet, visible or
infrared bands) and that produces an electrical signal output proportional to the level of energy
sensed. The second, called a Digitizer, is a device for converting the electrical output of the
physical sensing device into digital form.
The types of images in which we are interested are generated by the combination of an
“illumination” source and the reflection or absorption of energy from that source by the elements
of the “scene” being imaged. We enclose illumination and scene in quotes to emphasize the fact
that they are considerably more general than the familiar situation in which a visible light source
illuminates a common everyday 3-D (three-dimensional) scene. For example, the illumination
may originate from a source of electromagnetic energy such as radar, infrared, or X-ray energy.
Figure shows the three principal sensor arrangements used to transform illumination
energy into digital images- single imaging sensor, Line sensor and Array sensor.
The idea is simple: incoming energy is transformed into a voltage by the combination of
input electrical power and sensor material that is responsive to the particular type of energy
being detected. The output voltage waveform is the response of the sensor(s), and a digital
quantity is obtained from each sensor by digitizing its response.
Image Acquisition Using a Single Sensor
The most familiar sensor of this type is the photodiode, which is constructed of silicon
materials and whose output voltage waveform is proportional to light. The use of a filter in front
of a sensor improves selectivity. In order to generate a 2-D image using a single sensor, there has
to be relative displacements in both the x- and y-directions between the sensor and the area to be
imaged.
Figure shows an arrangement used in high-precision scanning, where a film negative is
mounted onto a drum whose mechanical rotation provides displacement in one dimension. The
single sensor is mounted on a lead screw that provides motion in the perpendicular direction.
Microdensitometers
In microdensitometers the transparency or photograph is mounted on a flat bed or
wrapped around a drum. Scanning is accomplished by focusing a beam of light (which could be
a laser) on the image and translating the bed or rotating the drum in relation to the beam. In the
case of transparencies, the beam passes through the film; in photographs the beam is reflected
from the surface of the image. In both cases, the beam is focused on a photodetector and the gray
level at any point in the image is obtained by allowing only discrete values of intensity and
position in the output.
Image Acquisition Using Sensor Strips
A geometry that is used much more frequently than single sensors consists of an in-line
arrangement of sensors in the form of a sensor strip.
The strip provides imaging elements in one direction. Motion perpendicular to the strip
provides imaging in the other direction as above figure shows. This is the type of arrangement
used in most flatbed scanners. Solid-state arrays are composed of discrete silicon imaging
elements, called photosites that have a voltage output proportional to the intensity of the incident
light. The figure below shows a typical line scan sensor containing a row of photosites, two
transfer gates used to clock the contents of the imaging elements into transport registers, and an
output gate used to clock the contents of the transport registers into an amplifier. The amplifier
outputs a voltage signal proportional to the contents of the row of photosites.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging
to obtain cross-sectional (“slice”) images of 3-D objects, as the figure shows. A rotating X-ray
source provides illumination, and the portion of the sensors opposite the source collects the X-ray
energy that passes through the object (the sensors obviously have to be sensitive to X-ray energy).
A 3-D digital volume consisting of stacked images is generated as the object is moved in a
direction perpendicular to the sensor ring.
Image Acquisition Using Sensor Arrays
Numerous electromagnetic and some ultrasonic sensing devices frequently are arranged
in an array format. This is also the predominant arrangement found in digital cameras. A typical
sensor for these cameras is a CCD array. Charge-coupled device (CCD) arrays are similar to line-scan
sensors, except that the photosites are arranged in a matrix form and a gate/transport-register
combination separates columns of photosites.
The principal manner in which array sensors are used is shown in figure. This figure
shows the energy from an illumination source being reflected from a scene element, but, as
mentioned at the beginning of this section, the energy also could be transmitted through the
scene elements. The first function performed by the imaging system shown in figure (c) is to
collect the incoming energy and focus it onto an image plane. If the illumination is light, the
front end of the imaging system is a lens, which projects the viewed scene onto the lens focal
plane, as figure (d) shows. The sensor array, which is coincident with the focal plane, produces
outputs proportional to the integral of the light received at each sensor. Digital and analog
circuitry sweeps these outputs and converts them to a video signal, which is then digitized by
another section of the imaging system.
STORAGE
Digital storage for image processing applications falls into three principal categories:
i) Short-term storage for use during processing
ii) On-line storage for relatively fast recall and
iii) Archival storage, characterized by infrequent access
• One method for providing short-term storage is computer memory. Another is
specialized boards, called frame buffers, that store one or more images and can be
accessed rapidly.
• On-line storage generally takes the form of magnetic disks. Jukeboxes that hold 30-100
optical disks provide an effective solution for large-scale, on-line storage applications that
require read-write capability.
• Archival storage is characterized by massive storage requirements but infrequent need
for access. Magnetic tapes and optical disks are the usual media for archival applications.
PROCESSING
Processing of digital images involves procedures that are usually expressed in
algorithmic form. Most image processing functions are implemented in software. The only reason
for specialized image processing hardware is the need for speed in some applications or to
overcome some fundamental computer limitations. Image processing is characterized by specific
solutions; hence techniques that work well in one area can be totally inadequate in another.
COMMUNICATION
Communication in digital image processing primarily involves local communication
between image processing systems and remote communication from one point to another.
Hardware and software for local communication are readily available for most computers.
Communication of images across vast distances presents a more serious challenge. A voice-
grade telephone line is cheaper to use but slower. Wireless links using intermediate stations, such
as satellites, are much faster, but they also cost considerably more.
DISPLAY
Monochrome and color monitors are the principal display devices used in modern image
processing systems. Monitors are driven by the outputs of an image display module in the
backplane of the host computer. The signals of the display module can also be fed into an image
recording device that produces a hard copy of the image being viewed in the monitor screen.
Other display media include random access cathode ray tubes (CRTs) and printing devices.
A SIMPLE IMAGE FORMATION MODEL
The term image refers to a two dimensional light-intensity function, denoted by f(x,y),
where the value or amplitude of f at spatial coordinates (x,y) gives the intensity (brightness) of
the image at that point. As light is a form of energy, f(x,y) must be nonzero and finite, that is,
0 < f(x,y) < ∞
The images people perceive in everyday visual activities normally consist of light
reflected from objects. The function f(x, y) may be characterized by two components: (1) the
amount of source illumination incident on the scene being viewed, and (2) the amount of
illumination reflected by the objects in the scene. Appropriately, these are called the illumination
and reflectance components and are denoted by i(x, y) and r(x, y), respectively. The two
functions combine as a product to form f(x, y):
f(x, y) = i(x, y) r(x, y)
where
0 < i(x, y) < ∞
and
0 < r(x, y) < 1
The above equation indicates that reflectance is bounded by 0 (total absorption) and 1 (total
reflectance). The nature of i(x, y) is determined by the illumination source, and r(x, y) is
determined by the characteristics of the imaged objects.
The values given in the above equations are theoretical bounds. The following average
numerical figures illustrate some typical ranges of i(x, y) for visible light. On a clear day, the sun
may produce in excess of 90,000 foot-candles of illumination on the surface of the Earth. This
figure decreases to less than 10,000 foot-candles on a cloudy day. On a clear evening, a full
moon yields about 0.1 foot-candles of illumination. The typical illumination level in a
commercial office is about 1000 foot-candles. Similarly, the following are some typical values of
r(x, y): 0.01 for black velvet, 0.65 for stainless steel, 0.80 for flat-white wall paint, 0.90 for
silver-plated metal, and 0.93 for snow.
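Plugging the figures above into the formation model f(x, y) = i(x, y) r(x, y) gives a feel for the numbers involved; pairing office illumination with these two reflectances is just an illustration.

```python
# Image formation model: f = i * r, using typical values quoted above.
i_office = 1000.0                  # office illumination, foot-candles

f_velvet = i_office * 0.01         # black velvet reflectance -> 10.0
f_snow   = i_office * 0.93         # snow reflectance         -> 930.0

# Both values satisfy the theoretical bounds 0 < f < infinity.
ok = 0.0 < f_velvet < float("inf") and 0.0 < f_snow < float("inf")
```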
The intensity of a monochrome image at any coordinates (x, y) is called the gray level (l) of the
image at that point. That is,
l = f(x, y)
From the equations of illumination and reflectance, it is evident that l lies in the range
Lmin ≤ l ≤ Lmax
In theory, the only requirement on Lmin is that it be positive, and on Lmax that it be finite. In
practice, Lmin = imin rmin and Lmax = imax rmax. Using the preceding average office illumination and
range of reflectance values as guidelines, we may expect Lmin ≈ 10 and Lmax ≈ 1000 to be typical
limits for indoor values in the absence of additional illumination. The interval [Lmin, Lmax] is called
the gray scale. Common practice is to shift this interval numerically to the interval [0, L-1], where
l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values
are shades of gray varying from black to white.
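Shifting [Lmin, Lmax] to [0, L-1] can be sketched as a linear mapping; L = 256 and the indoor limits Lmin ≈ 10 and Lmax ≈ 1000 are taken from the discussion above, and the linear form of the mapping is an assumption.

```python
L = 256
Lmin, Lmax = 10.0, 1000.0          # typical indoor limits from the text

def to_gray_level(l):
    """Linearly map an intensity l in [Lmin, Lmax] to an integer in [0, L-1]."""
    return int(round((l - Lmin) / (Lmax - Lmin) * (L - 1)))

black = to_gray_level(Lmin)        # -> 0   (black)
white = to_gray_level(Lmax)        # -> 255 (white)
```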
UNIFORM SAMPLING & QUANTIZATION
There are numerous ways to acquire images, but our objective in all is the same: to
generate digital images from sensed data. The output of most sensors is a continuous voltage
waveform whose amplitude and spatial behavior are related to the physical phenomenon being
sensed. To create a digital image, we need to convert the continuous sensed data into digital
form. This involves two processes: sampling and quantization.
Basic Concepts in Sampling and Quantization
The basic idea behind sampling and quantization is illustrated in following figure. Figure
(a) shows a continuous image, f(x, y), that we want to convert to digital form. An image may be
continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to
digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the
coordinate values is called sampling. Digitizing the amplitude values is called quantization.
The one-dimensional function shown in Fig. (b) is a plot of amplitude (gray level) values
of the continuous image along the line segment AB in Fig.(a).The random variations are due to
image noise. To sample this function, we take equally spaced samples along line AB, as shown
in Fig. (c).The location of each sample is given by a vertical tick mark in the bottom part of the
figure. The samples are shown as small white squares superimposed on the function. The set of
these discrete locations gives the sampled function. However, the values of the samples still span
(vertically) a continuous range of gray-level values. In order to form a digital function, the gray-
level values also must be converted (quantized) into discrete quantities. The right side of Fig. (c)
shows the gray-level scale divided into eight discrete levels, ranging from black to white. The
vertical tick marks indicate the specific value assigned to each of the eight gray levels. The
continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to
each sample. The assignment is made depending on the vertical proximity of a sample to a
vertical tick mark. The digital samples resulting from both sampling and quantization are shown
in Fig.(d). Starting at the top of the image and carrying out this procedure line by line produces a
two-dimensional digital image. Quantization of the sensor outputs completes the process of
generating a digital image.
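The two steps can be sketched on a 1-D signal standing in for the gray-level profile along line AB; the sinusoidal signal, the 16 sample locations, and the 8 quantization levels are illustrative assumptions.

```python
import numpy as np

def profile(x):
    # A continuous amplitude profile in (0, 1), standing in for the
    # gray levels along line AB.
    return 0.5 + 0.4 * np.sin(2 * np.pi * x)

# Sampling: digitize the coordinate values (equally spaced locations).
xs = np.linspace(0.0, 1.0, 16)
samples = profile(xs)              # amplitudes are still continuous

# Quantization: digitize the amplitudes by assigning each sample to the
# nearest of 8 discrete levels (integers 0..7).
levels = 8
quantized = np.round(samples * (levels - 1)).astype(int)
```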
Representing Digital Images
To be suitable for computer processing, an image function f(x,y) must be digitized both
spatially and in amplitude. Digitization of the spatial coordinates (x,y) is called image sampling,
and amplitude digitization is called gray-level quantization.
The result of sampling and quantization is a matrix of real numbers. Suppose that a
continuous image f(x, y) is approximated by equally spaced samples arranged in the form of an
N x M matrix, where each element of the array is a discrete quantity:

f(x, y) ≈ [ f(0, 0)     f(0, 1)     ...  f(0, M-1)
            f(1, 0)     f(1, 1)     ...  f(1, M-1)
            ...
            f(N-1, 0)   f(N-1, 1)   ...  f(N-1, M-1) ]

The right side of this equation is by definition a digital image. Each element of this matrix array
is called an image element, picture element, pixel, or pel. The terms image and pixel will be used
throughout the rest of our discussions to denote a digital image and its elements.
Expressing sampling and quantization in more formal mathematical terms can be useful
at times. Let Z and R denote the set of integers and the set of real numbers, respectively. The
sampling process may be viewed as partitioning the xy plane into a grid, with the coordinates of
the center of each grid cell being a pair of elements from the Cartesian product Z2 (Z x Z), which is
the set of all ordered pairs of elements (zi, zj), with zi and zj being integers from Z. Hence, f(x, y) is a
digital image if (x, y) are integers from Z2 and f is a function that assigns a gray-level value (that
is, a real number from the set of real numbers, R) to each distinct pair of coordinates (x, y). This
functional assignment obviously is the quantization process described earlier. If the gray levels
also are integers (as usually is the case in this and subsequent chapters), Z replaces R, and a
digital image then becomes a 2-D function whose coordinates and amplitude values are integers.
This digitization process requires decisions about values for M, N, and for the number, L,
of discrete gray levels allowed for each pixel. There are no requirements on M and N, other than
that they have to be positive integers. However, due to processing, storage, and sampling
hardware considerations, the number of gray levels typically is an integer power of 2:
L = 2^k
We assume that the discrete levels are equally spaced and that they are integers in the interval [0,
L-1]. Sometimes the range of values spanned by the gray scale is called the dynamic range of an
image, and we refer to images whose gray levels span a significant portion of the gray scale as
having a high dynamic range. When an appreciable number of pixels exhibit this property, the
image will have high contrast. Conversely, an image with low dynamic range tends to have a
dull, washed out gray look.
The number, b, of bits required to store a digitized image is
b = M x N x k
When M = N, this equation becomes
b = N^2 x k
Table shows the number of bits required to store square images with various values of N
and k. The number of gray levels corresponding to each value of k is shown in parentheses:
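The entries of such a table follow directly from b = N^2 x k, so they can be regenerated with a few lines of code (the particular N and k values chosen below are illustrative):

```python
# Bits required to store a square N x N image with k bits per pixel,
# from b = N^2 * k (each k allows L = 2^k gray levels).
def bits_required(N, k):
    return N * N * k

for N in (32, 64, 128, 256, 512, 1024):
    row = {k: bits_required(N, k) for k in (1, 2, 4, 8)}
    print(N, row)
```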
Spatial and Gray-Level Resolution
The resolution of an image strongly depends on two parameters: no. of samples and gray
levels. Sampling is the principal factor determining the spatial resolution of an image. Basically,
spatial resolution is the smallest discernible detail in an image. Gray-level resolution similarly
refers to the smallest discernible change in gray level. Due to hardware considerations, the
number of gray levels is usually an integer power of 2.
Now, let us consider the effect that variations in N and k have on the image quality
Effect of Reducing the Spatial Resolution:
Figure 2.19 shows an image of size 1024x1024 pixels whose gray levels are represented
by 8 bits. The other images shown in Fig. 2.19 are the results of sub-sampling the 1024x1024
image. The sub-sampling was accomplished by deleting the appropriate number of rows and
columns from the original image. For example, the 512x512 image was obtained by deleting
every other row and column from the 1024x1024 image. The 256x256 image was generated by
deleting every other row and column in the 512x512 image, and so on. The number of allowed
gray levels was kept at 256.
The simplest way to compare these effects is to bring all the sub-sampled images up to
size 1024x1024 by row and column pixel replication. In the 512x512 image, the level of detail
lost is simply too fine to be seen on the printed page at the scale in which these images are
shown. Next, the 256x256 image in Fig. 2.20(c) shows a very slight fine checkerboard pattern in
the borders between flower petals and the black background. A slightly more pronounced
graininess throughout the image also is beginning to appear. These effects are much more visible
in the 128x128 image in Fig. 2.20(d), and they become pronounced in the 64x64 and 32x32
images in Figs. 2.20(e) and (f), respectively.
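The procedure described above (delete every other row and column, then compare after pixel replication) can be sketched directly with array slicing; the small 16x16 array stands in for the 1024x1024 image.

```python
import numpy as np

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # stand-in image

half = img[::2, ::2]           # delete every other row and column
quarter = half[::2, ::2]       # repeat to halve the resolution again

# Pixel replication brings a sub-sampled image back to the original
# size for side-by-side comparison.
replicated = half.repeat(2, axis=0).repeat(2, axis=1)
```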
Effect of Reducing Gray Level Resolution (GRAY TO BINARY CONVERSION)
In this example, we keep the number of samples constant and reduce the number of gray
levels from 256 to 2, in integer powers of 2. Figure 2.21(a) is a 452x374 image, displayed with
k = 8 (256 gray levels).
Figures 2.21(b) through (h) were obtained by reducing the number of bits from k = 7 to
k = 1 while keeping the spatial resolution constant at 452x374 pixels. The 256-, 128-, and 64-level
images are visually identical for all practical purposes. The 32-level image shown in Fig. 2.21(d),
however, has an almost imperceptible set of very fine ridge-like structures in areas of smooth
gray levels (particularly in the skull). This effect, caused by the use of an insufficient number of
gray levels in smooth areas of a digital image, is called false contouring, so named because the
ridges resemble topographic contours in a map. False contouring generally is quite visible in
images displayed using 16 or fewer uniformly spaced gray levels, as the images in Figs. 2.21(e)
through (h) show.
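Reducing an 8-bit image to 2^k equally spaced levels while keeping the spatial resolution fixed amounts to discarding low-order bits; the smooth ramp image below is an illustrative stand-in that makes the resulting banding (false contouring) easy to verify.

```python
import numpy as np

img = np.tile(np.arange(256, dtype=np.uint8), (8, 1))   # smooth 8x256 ramp

def reduce_levels(image, k):
    """Requantize an 8-bit image to 2**k equally spaced gray levels."""
    shift = 8 - k
    return (image >> shift) << shift    # drop low bits, keep level spacing

img_k4 = reduce_levels(img, 4)   # 16 levels: banding appears in the ramp
img_k1 = reduce_levels(img, 1)   # 2 levels: a binary image
```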
Iso Preference Curves
The results in the above examples illustrate the effects produced on image quality by varying
N and k independently. However, these results only partially answer the question of how varying
N and k affects images, because we have not considered yet any relationships that might exist
between these two parameters. An early study attempted to quantify experimentally the
effects on image quality produced by varying N and k simultaneously. The experiment consisted
of a set of subjective tests. Images similar to those shown in Fig. 2.22 were used. The woman's
face is representative of an image with relatively little detail; the picture of the cameraman
contains an intermediate amount of detail; and the crowd picture contains, by comparison, a large
amount of detail.
Sets of these three types of images were generated by varying N and k, and observers
were then asked to rank them according to their subjective quality. Results were summarized in
the form of so-called isopreference curves in the Nk-plane. Each point in the Nk-plane represents
an image having values of N and k equal to the coordinates of that point. Points lying on an
isopreference curve correspond to images of equal subjective quality. It was found in the course
of the experiments that the isopreference curves tended to shift right and upward, but their shapes
in each of the three image categories were similar to those shown in Fig. 2.23. This is not
unexpected, since a shift up and right in the curves simply means larger values for N and k,
which implies better picture quality.
The key point of interest here is that isopreference curves tend to become more vertical as
the detail in the image increases. This result suggests that for images with a large amount of
detail only a few gray levels may be needed. For example, the isopreference curve in Fig. 2.23
corresponding to the crowd is nearly vertical. This indicates that, for a fixed value of N, the
perceived quality for this type of image is nearly independent of the number of gray levels used
(for the range of gray levels shown in Fig. 2.23). It is also of interest to note that perceived
quality in the other two image categories remained the same in some intervals in which the
spatial resolution was increased, but the number of gray levels actually decreased. The most
likely reason for this result is that a decrease in k tends to increase the apparent contrast of an
image, a visual effect that humans often perceive as improved quality in an image.
NON-UNIFORM SAMPLING AND QUANTIZATION
Non-Uniform Sampling
For a fixed value of spatial resolution, the appearance of an image can be improved in
many cases by using an adaptive scheme where the sampling process depends on the
characteristics of the image. In general, fine sampling is required in the neighborhood of sharp
gray-level transitions, whereas coarse sampling may be utilized in relatively smooth regions.
Consider, for example, a simple image consisting of a face superimposed on a uniform
background. Clearly, the background carries little detailed information and can be quite
adequately represented by coarse sampling. The face, however, contains considerably more
detail. If the additional samples not used in the background are used in the detailed region of the
image, the overall result would tend to improve. In distributing the samples, greater sample
concentration should be used at gray-level transition boundaries, such as the boundary between
the face and the background.
Disadvantages or Drawbacks:
The necessity of having to identify boundaries is a definite drawback of the non-uniform
sampling approach.
This method is also not practical for images containing relatively small uniform regions
(e.g., the crowd image).
Non-Uniform Quantization
When the number of gray levels must be kept small, the use of unequally spaced levels in the
quantization process usually is desirable. A method similar to the non-uniform sampling
technique may be used for the distribution of gray levels in an image. As the eye is relatively
poor at estimating shades of gray near abrupt level changes, the approach in this case is to use
few gray levels in the neighborhood of boundaries. The remaining levels can then be used in
regions where gray-level variations are smooth, thus avoiding or reducing the false contours that
often appear in these regions if they are too coarsely quantized.
Disadvantages
This method is subject to the preceding observations about boundary detection and
detail content.
An alternative technique that is particularly attractive for distributing gray levels consists of
computing the frequency of occurrence of allowed levels. If gray levels in a certain range occur
frequently, while others occur rarely, the quantization levels are finely spaced in this range and
coarsely spaced outside of it. This method is sometimes called TAPERED QUANTIZATION.
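As a minimal sketch of the idea (the function name and the equal-population placement rule are illustrative choices, not prescribed by the text), one simple way to space levels by frequency of occurrence is to give each reconstruction level an equal share of the observed pixel population:

```python
def tapered_levels(pixels, n_levels):
    """Sketch of tapered quantization: place each reconstruction level at the
    median of an equal-population bucket of the sorted samples, so levels crowd
    together where gray values occur often and spread out where they are rare."""
    ordered = sorted(pixels)
    n = len(ordered)
    # integer index of the midpoint of bucket i
    return [ordered[(2 * i + 1) * n // (2 * n_levels)] for i in range(n_levels)]

# A mostly-dark sample set: two of the three levels land in the dark range
print(tapered_levels([8, 9, 10, 10, 11, 12, 200, 250], 3))  # [9, 11, 200]
```

The bright outliers get only one coarse level, while the frequently occurring dark values receive finely spaced levels.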
SOME BASIC RELATIONSHIPS BETWEEN PIXELS
An image is denoted by f(x, y). When referring in this section to a particular pixel, we use
lowercase letters, such as p and q. A subset of pixels of f(x, y) is denoted by S.
Neighbors of a Pixel
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a unit distance
from (x, y), and some of the neighbors of p lie outside the digital image if (x, y) is on the border
of the image.
The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors
of p, denoted by N8(p). As before, some of the points in ND(p) and N8(p) fall outside the image if
(x, y) is on the border of the image.
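The neighbor sets above can be sketched in a few lines (a minimal illustration; the helper names are mine, and border handling is left to the caller, as in the text):

```python
def n4(p):
    """4-neighbors of p = (x, y): the horizontal and vertical neighbors."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """The four diagonal neighbors of p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbors: the union of N4(p) and ND(p)."""
    return n4(p) | nd(p)

print(sorted(n4((1, 1))))  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

Note that, as the text observes, some returned coordinates fall outside the image when p lies on the border.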
Connectivity, Adjacency, Regions & Boundaries
Connectivity between pixels is a fundamental concept that simplifies the definition of
numerous digital image concepts, such as regions and boundaries. To establish if two pixels are
connected, it must be determined if they are neighbors and if their gray levels satisfy a specified
criterion of similarity (say, if their gray levels are equal). For instance, in a binary image with
values 0 and 1, two pixels may be 4-neighbors, but they are said to be connected only if they
have the same value.
Let V be the set of gray-level values used to define connectivity. In a binary image,
V={1} if we are referring to connectivity of pixels with value 1. In a grayscale image, the idea is
the same, but set V typically contains more elements. For example, in the connectivity of pixels
with a range of intensity values, say, 32 to 64, it follows that V = {32, 33, …, 63, 64}. We consider
three types of connectivity:
(a) 4-connectivity: Two pixels p and q with values from V are 4-connected if q is in the set N4(p).
(b) 8-connectivity: Two pixels p and q with values from V are 8-connected if q is in the set N8(p).
(c) m-connectivity (mixed connectivity): Two pixels p and q with values from V are m-connected if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Mixed connectivity is a modification of 8-connectivity. It is introduced to eliminate the
ambiguities (multiple path connections) that often arise when 8-adjacency is used. For example,
consider the pixel arrangement shown in figure (a). For V = {1}, the three pixels at the top of
the figure show multiple (ambiguous) 8-adjacency, as indicated by the dashed lines in (b). This
ambiguity is removed by using m-adjacency, as shown in (c).
[Figure: (a) a pixel arrangement; (b) the ambiguous 8-adjacency paths, shown dashed; (c) the unique m-adjacency path]
A pixel p is adjacent to a pixel q if they are connected. Accordingly, there are three types of adjacency:
(a) 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(b) 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(c) m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
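The two-part m-adjacency test can be sketched directly from the definition (a minimal, self-contained illustration; the function name, the row-list image layout, and the sample arrangement are my own choices):

```python
def m_adjacent(p, q, img, V):
    """True if pixels p and q (both with values in V) are m-adjacent.
    img is a list of rows; a pixel (x, y) has value img[y][x]."""
    def value(pt):
        x, y = pt
        if 0 <= y < len(img) and 0 <= x < len(img[0]):
            return img[y][x]
        return None  # outside the image

    def n4(pt):
        x, y = pt
        return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

    def nd(pt):
        x, y = pt
        return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

    if value(p) not in V or value(q) not in V:
        return False
    if q in n4(p):                                   # condition (i)
        return True
    if q in nd(p):                                   # condition (ii)
        return not any(value(r) in V for r in n4(p) & n4(q))
    return False

# A small binary arrangement, V = {1}
img = [[0, 1, 1],
       [0, 1, 0],
       [0, 0, 1]]
print(m_adjacent((1, 0), (2, 0), img, {1}))  # True: q is in N4(p)
print(m_adjacent((1, 1), (2, 0), img, {1}))  # False: already connected through (1, 0)
```

The second call shows the ambiguity removal: the diagonal link is suppressed because a 4-connected route through a pixel of V already exists.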
A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is
a sequence of distinct pixels with coordinates
(x0,y0),(x1,y1),…,(xn,yn)
where (x0,y0) = (x,y), (xn,yn) = (s,t), and (xi,yi) is adjacent to (xi-1,yi-1) for 1 ≤ i ≤ n. In this case, n is
the length of the path.
If (x0,y0) = (xn,yn), the path is a closed path.
We can define 4-, 8-, or m- paths depending on the type of adjacency specified. For
example, the paths shown in above figure (b) between the northeast and southeast points are 8-
paths, and the path in figure (c) is an m-path.
Let S represent a subset of pixels in an image. Two pixels p and q are said to be
connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel
p in S, the set of pixels that are connected to it in S is called a connected component of S. If it
only has one connected component, then set S is called a connected set.
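The connected-component idea above can be sketched as a breadth-first flood fill (an illustrative sketch; the function name, the row-list image layout, and the assumption that the seed pixel has a value in V are mine):

```python
from collections import deque

def connected_component(img, seed, V, connectivity=8):
    """All pixels connected to seed within value set V, under 4- or
    8-connectivity, found by breadth-first search. The seed pixel is
    assumed to have a value in V."""
    h, w = len(img), len(img[0])
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if connectivity == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    component, frontier = {seed}, deque([seed])
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in offsets:
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in component \
                    and img[ny][nx] in V:
                component.add((nx, ny))
                frontier.append((nx, ny))
    return component

img = [[1, 1, 0],
       [0, 0, 0],
       [0, 0, 1]]
print(len(connected_component(img, (0, 0), {1})))  # 2: the two 1-pixels on top
```

Here the image has two connected components of value 1; the set S = {all 1-pixels} is therefore not a connected set.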
Let R be a subset of pixels in an image. We call R a region of the image if R is a
connected set. The boundary (also called border or contour) of a region R is the set of pixels in
the region that have one or more neighbors that are not in R. If R happens to be an entire image
(which we recall is a rectangular set of pixels), then its boundary is defined as the set of pixels in
the first and last rows and columns of the image. This extra definition is required because an
image has no neighbors beyond its border. Normally, when we refer to a region, we are referring
to a subset of an image, and any pixels in the boundary of the region that happen to coincide with
the border of the image are included implicitly as part of the region boundary.
The concept of an edge is found frequently in discussions dealing with regions and
boundaries. There is a key difference between these concepts, however. The boundary of a finite
region forms a closed path and is thus a “global” concept. Edges are formed from pixels with
derivative values that exceed a preset threshold. Thus, the idea of an edge is a “local” concept
that is based on a measure of gray-level discontinuity at a point. It is possible to link edge points
into edge segments, and sometimes these segments are linked in such a way that they correspond to
boundaries, but this is not always the case. The one exception in which edges and boundaries
correspond is in binary images. Depending on the type of connectivity and edge operators used,
the edge extracted from a binary region will be the same as the region boundary. Conceptually, it
is helpful to think of edges as intensity discontinuities and boundaries as closed paths.
Distance Measures
For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w), respectively, D is a
distance function or metric if
(a) D(p, q) ≥ 0 (D(p, q)=0 if and only if p=q),
(b) D(p, q)=D(q, p), and
(c) D(p, z) ≤D(p, q)+D(q, z).
The Euclidean distance between p and q is defined as
De(p,q) = [(x-s)^2 + (y-t)^2]^(1/2)
For this distance measure, the pixels having a distance less than or equal to some value r from
(x,y) are the points contained in a disk of radius r centered at (x, y).
The D4 distance (also called city-block distance) between p and q is defined as
D4(p,q) = |x-s| + |y-t|
In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a
diamond centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) (the center
point) form the following contours of constant distance:

        2
      2 1 2
    2 1 0 1 2
      2 1 2
        2

The pixels with D4 = 1 are the 4-neighbors of (x, y).
The D8 distance (also called chessboard distance) between p and q is defined as
D8(p,q) = max(|x-s|, |y-t|)
In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a
square centered at (x, y). For example, the pixels with D8 distance ≤ 2 from (x, y) (the center
point) form the following contours of constant distance:

    2 2 2 2 2
    2 1 1 1 2
    2 1 0 1 2
    2 1 1 1 2
    2 2 2 2 2
The pixels with D8 = 1 are the 8-neighbors of (x, y).
Note that the D4 and D8 distances between p and q are independent of any paths that
might exist between the points because these distances involve only the coordinates of the points.
If we elect to consider m-adjacency, however, the Dm distance between two points is defined as
the shortest m-path between the points. In this case, the distance between two pixels will depend
on the values of the pixels along the path, as well as the values of their neighbors.
For instance, consider the following arrangement of pixels, and assume that p, p2, and
p4 have value 1 and that p1 and p3 can have a value of 0 or 1:

       p3  p4
   p1  p2
   p
Suppose that we consider adjacency of pixels valued 1 (i.e.,V={1}). If p1 and p3 are 0,
the length of the shortest m-path (the Dm distance) between p and p4 is 2. If p1 is 1, then p2 and
p will no longer be m-adjacent (see the definition of m-adjacency) and the length of the shortest
m-path becomes 3 (the path goes through the points p-p1-p2-p4). Similar comments apply if p3 is
1 (and p1 is 0); in this case, the length of the shortest m-path also is 3. Finally, if both p1 and p3
are 1, the length of the shortest m-path between p and p4 is 4. In this case, the path goes through
the sequence of points p-p1-p2-p3-p4.
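The three coordinate-based distance measures defined above (De, D4, D8) can be sketched as follows (a minimal illustration; the function names are mine — the Dm distance is omitted because it also depends on pixel values along the path):

```python
def d_e(p, q):
    """Euclidean distance between pixels p = (x, y) and q = (s, t)."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def d4(p, q):
    """City-block distance: |x - s| + |y - t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance: max(|x - s|, |y - t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_e(p, q), d4(p, q), d8(p, q))  # 5.0 7 4
```

As expected, D4 ≥ D8 always, and the pixels with d4 = 1 or d8 = 1 are exactly the 4- and 8-neighbors.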
IMAGING GEOMETRY
SOME BASIC TRANSFORMATIONS
In this section, all transformations are expressed in a three-dimensional Cartesian coordinate
system in which a point has coordinates denoted as (X, Y, Z).
Translation
Suppose that the task is to translate a point with coordinates (X, Y, Z) to a new location by using
displacement (X0, Y0, Z0). The translation is easily accomplished by using the equations:
X* = X + X0
Y* = Y + Y0
Z* = Z + Z0
where (X*, Y*, Z*) are the coordinates of the new point. The above equations can be represented in
matrix form as

  [X*]   [1 0 0 X0] [X]
  [Y*] = [0 1 0 Y0] [Y]
  [Z*]   [0 0 1 Z0] [Z]
                    [1]
It is often useful to concatenate several transformations to produce a composite result, such as
translation, followed by scaling and then rotation. The use of square matrices simplifies the
notational representation of this process considerably. So, the above can be modified as:

  [X*]   [1 0 0 X0] [X]
  [Y*] = [0 1 0 Y0] [Y]
  [Z*]   [0 0 1 Z0] [Z]
  [1 ]   [0 0 0 1 ] [1]
Let us consider the unified matrix representation
v* = Av
where A is a 4x4 transformation matrix, v is the column vector containing the original
coordinates,

      [X]
  v = [Y]
      [Z]
      [1]

and v* is a column vector whose components are the transformed coordinates:

       [X*]
  v* = [Y*]
       [Z*]
       [1 ]
With this notation, the matrix used for translation is

      [1 0 0 X0]
  T = [0 1 0 Y0]
      [0 0 1 Z0]
      [0 0 0 1 ]
and the translation process is accomplished by the equation v* = Tv.
Scaling
Scaling by factors Sx, Sy, Sz along the X, Y, and Z axes is given by the transformation matrix

      [Sx 0  0  0]
  S = [0  Sy 0  0]
      [0  0  Sz 0]
      [0  0  0  1]
Rotation
The transformations used for 3-D rotation are more complex. The simplest form of these
transformations is for rotation of a point about the coordinate axes. To rotate a point about
another arbitrary point in space requires three transformations: the first translates the arbitrary
point to the origin, the second performs the rotation, and the third translates the point back to
the original position.
Rotation of a point about the Z coordinate axis by an angle θ is achieved by using the
transformation

       [ cosθ  sinθ  0  0]
  Rθ = [-sinθ  cosθ  0  0]
       [  0     0    1  0]
       [  0     0    0  1]
The rotation angle θ is measured clockwise when looking at the origin from a point on the +Z axis.
This transformation affects only the values of the X and Y coordinates.
Rotation of a point about the X axis by an angle α is performed by using the transformation

       [1    0     0    0]
  Rα = [0   cosα  sinα  0]
       [0  -sinα  cosα  0]
       [0    0     0    1]
Finally, the rotation of a point about the Y axis by an angle β is achieved using the transformation

       [cosβ  0  -sinβ  0]
  Rβ = [ 0    1    0    0]
       [sinβ  0   cosβ  0]
       [ 0    0    0    1]
Concatenation and Inverse transformations
The application of several transformations can be represented by a single 4x4 transformation
matrix. For example, translation, scaling, and rotation about Z axis of a point v is given by
v* = Rθ(S(Tv)) = Av
where A is a 4x4 matrix, A = RθST. These matrices generally do not commute, so the order of
application is important.
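The matrices above and their concatenation can be sketched in plain Python (an illustrative sketch; the function names and the worked numbers are mine, and only the Z-axis rotation is shown):

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def translation(x0, y0, z0):
    return [[1, 0, 0, x0], [0, 1, 0, y0], [0, 0, 1, z0], [0, 0, 0, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Composite A = Rθ S T applied to the homogeneous point v = [X Y Z 1]^T
A = matmul(rotation_z(math.pi / 2), matmul(scaling(2, 2, 2), translation(1, 0, 0)))
v = [[0], [0], [0], [1]]            # the origin, in homogeneous coordinates
v_star = matmul(A, v)               # translate to (1,0,0), scale to (2,0,0), rotate
print([round(c[0], 6) for c in v_star])  # [0.0, -2.0, 0.0, 1.0]
```

Swapping the order of the factors in A gives a different result, which is exactly the non-commutativity noted above.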
The same ideas can be extended to transforming a set of m points simultaneously by using a single
transformation. Let v1, v2, …, vm represent the coordinates of the m points. For the 4xm matrix V
whose columns are these vectors, the simultaneous transformation of all the points by a 4x4
transformation matrix A is given by
V* = AV
The resulting matrix V* is 4xm. Its ith column, vi*, contains the coordinates of the
transformed point corresponding to vi.
Many of the transformations discussed above have inverse matrices that perform the
opposite transformation and can be obtained by inspection. For example, the inverse
translation matrix is

         [1 0 0 -X0]
  T^-1 = [0 1 0 -Y0]
         [0 0 1 -Z0]
         [0 0 0  1 ]
QUESTIONS & ANSWERS
1. Why do we process images?
Image Processing has been developed in response to three major problems concerned with
pictures:
� Picture digitization and coding to facilitate transmission, printing and storage of pictures.
� Picture enhancement and restoration in order, for example, to interpret more easily
pictures of the surface of other planets taken by various probes.
� Picture segmentation and description as an early stage in Machine Vision.
2. What is the brightness of an image at a pixel position?
Each pixel of an image corresponds to a part of a physical object in the 3D world. This
physical object is illuminated by some light which is partly reflected and partly absorbed by it.
Part of the reflected light reaches the sensor used to image the scene and is responsible for the
value recorded for the specific pixel. The recorded value of course, depends on the type of sensor
used to image the scene, and the way this sensor responds to the spectrum of the reflected light.
However, as a whole scene is imaged by the same sensor, we usually ignore these details. What
is important to remember is that the brightness values of different pixels have significance only
relative to each other and they are meaningless in absolute terms. So, pixel values between
different images should only be compared if either care has been taken for the physical processes
used to form the two images to be identical, or the brightness values of the two images have
somehow been normalized so that the effects of the different physical processes have been
removed.
3. Why are images often quoted as being 512 X 512, 256 X 256, 128 X 128 etc?
Many calculations with images are simplified when the size of the image is a power of 2.
4. How many bits do we need to store an image?
The number of bits, b, we need to store an image of size N x N with 2^m different grey levels is:
b = N x N x m
So, for a typical 512 X 512 image with 256 grey levels (m = 8) we need 2,097,152 bits or
262,144 8-bit bytes. That is why we often try to reduce m and N, without significant loss in the
quality of the picture.
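The storage formula b = N x N x m is trivial to check (the function name is my own):

```python
def image_bits(N, m):
    """Bits needed for an N x N image with 2**m gray levels: b = N * N * m."""
    return N * N * m

b = image_bits(512, 8)   # 512 x 512 image with 256 gray levels
print(b, b // 8)         # 2097152 bits = 262144 bytes
```

Halving N quarters the storage, while reducing m shrinks it only linearly, which is why spatial resolution dominates file size.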
5. Consider the two image subsets, S1 and S2, shown in the following figure. For V={1},
determine whether these two subsets are (a) 4-adjacent, (b) 8-adjacent, or (c) m-adjacent.
Let p and q be as shown in Figure. Then,
(a) S1 and S2 are not 4-connected because q is not in the set N4(p).
(b) S1 and S2 are 8-connected because q is in the set N8(p).
(c) S1 and S2 are m-connected because (i) q is in ND(p), and (ii) the set N4(p) ∩ N4(q) is empty.
6. Consider the image segment shown.
(a) Let V={0, 1} and compute the lengths of the shortest 4-, 8-, and m-path between p and q. If
a particular path does not exist between these two points, explain why.
(b) Repeat for V={1, 2}.
(a) When V = {0, 1}, a 4-path does not exist between p and q because it is impossible to get from p
to q by traveling along points that are both 4-adjacent and also have values from V. The figure
below shows this condition; it is not possible to get to q.
The shortest 8-path is shown in Figure below; its length is 4. The length of shortest m-path
(shown dashed) is 5. Both of these shortest paths are unique in this case.
(b) One possibility for the shortest 4-path when V = {1, 2} is shown in the figure below; its length
is 6. It is easily verified that another 4-path of the same length exists between p and q.
One possibility for the shortest 8-path (it is not unique) is shown in the figure below; its length is 4.
The length of a shortest m-path (shown dashed) is 6. This path is not unique.
7. (a) Give the condition(s) under which the D4 distance between two points p and q is equal to the shortest 4-path between these points. (b) Is this path unique?
A shortest 4-path between a point p with coordinates (x, y) and a point q with coordinates
(s, t) is shown in Fig below, where the assumption is that all points along the path are from V.
The lengths of the segments of the path are |x - s| and |y - t|, respectively.
The total path length is |x - s| + |y - t|, which we recognize as the definition of the D4
distance. This distance is independent of any paths that may exist between the points. The D4
distance obviously is equal to the length of the shortest 4-path when the length of the path is
|x - s| + |y - t|. This occurs whenever we can get from p to q by following a path whose elements
(1) are from V; and (2) are arranged in such a way that we can traverse the path from p to q by making
turns in at most two directions (e.g., right and up).
(b) The path may or may not be unique, depending on V and the values of the points along the way.
8. Develop an algorithm for converting a one-pixel-thick 8-path to a 4-path.
The solution to this problem consists of determining all possible neighborhood shapes to
go from a diagonal segment to a corresponding 4-connected segment, as shown in the figure. The
algorithm then simply looks for the appropriate match every time a diagonal segment is
encountered in the boundary.
9. Explain the basic principle of imaging in different bands of electromagnetic spectrum.
Today, there is almost no area of technical endeavor that is not impacted in some way by
digital image processing. One of the simplest ways to develop a basic understanding of the extent
of image processing applications is to categorize images according to their source (e.g., visual,
X-ray, and so on). The principal energy source for images in use today is the electromagnetic
energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic
(in the form of electron beams used in electron microscopy). Images based on radiation from the
EM spectrum are the most familiar, especially images in the X-ray and visual bands of the
spectrum. Electromagnetic waves can be conceptualized as propagating sinusoidal waves of
varying wavelengths, or they can be thought of as a stream of massless particles, each traveling
in a wavelike pattern and moving at the speed of light. If spectral bands are grouped according to
energy per photon, we obtain the spectrum shown in figure, ranging from gamma rays (highest
energy) at one end to radio waves (lowest energy) at the other.
Gamma-Ray Imaging:
Major uses of imaging based on gamma rays include nuclear medicine and astronomical
observations. In nuclear medicine, the approach is to inject a patient with a radioactive isotope
that emits gamma rays as it decays. Images are produced from the emissions collected by gamma
ray detectors.
X-ray Imaging:
X-rays are among the oldest sources of EM radiation used for imaging. The best known
use of X-rays is medical diagnostics, but they also are used extensively in industry and other
areas, like astronomy. X-rays for medical and industrial imaging are generated using an X-ray
tube, which is a vacuum tube with a cathode and anode. In digital radiography, digital images are
obtained by one of two methods: (1) by digitizing X-ray films; or (2) by having the X-rays that
pass through the patient fall directly onto devices (such as a phosphor screen) that convert X-rays
to light. The light signal in turn is captured by a light-sensitive digitizing system. Angiography is
another major application in an area called contrast enhancement radiography. This procedure is
used to obtain images (called angiograms) of blood vessels.
Imaging in the Ultraviolet Band:
Applications of ultraviolet “light” are varied. They include lithography, industrial
inspection, microscopy, lasers, biological imaging, and astronomical observations. Ultraviolet
light is used in fluorescence microscopy, one of the fastest growing areas of microscopy.
Fluorescence microscopy is an excellent method for studying materials that can be made to
fluoresce, either in their natural form (primary fluorescence) or when treated with chemicals
capable of fluorescing (secondary fluorescence).
Imaging in the Visible and Infrared Bands:
Considering that the visual band of the electromagnetic spectrum is the most familiar in
all our activities, it is not surprising that imaging in this band outweighs by far all the others in
terms of scope of application. The infrared band often is used in conjunction with visual
imaging. The examples range from pharmaceuticals and micro inspection to materials
characterization. Another major area of visual processing is remote sensing, which usually
includes several bands in the visual and infrared regions of the spectrum. A major area of
imaging in the visual spectrum is in automated visual inspection of manufactured goods.
Imaging in the Microwave Band
The dominant application of imaging in the microwave band is radar. The unique feature
of imaging radar is its ability to collect data over virtually any region at any time, regardless of
weather or ambient lighting conditions. Some radar waves can penetrate clouds, and under
certain conditions can also see through vegetation, ice, and extremely dry sand. In many cases,
radar is the only way to explore inaccessible regions of the Earth’s surface. Instead of a camera
lens, a radar uses an antenna and digital computer processing to record its images. In a radar
image, one can see only the microwave energy that was reflected back toward the radar antenna.
Imaging in the Radio Band:
As in the case of imaging at the other end of the spectrum (gamma rays), the major
applications of imaging in the radio band are in medicine and astronomy. In medicine, radio
waves are used in magnetic resonance imaging (MRI). This technique places a patient in a
powerful magnet and passes radio waves through his or her body in short pulses. Each pulse
causes a responding pulse of radio waves to be emitted by the patient’s tissues. The location from
which these signals originate and their strength are determined by a computer, which produces a
two-dimensional picture of a section of the patient. MRI can produce pictures in any plane.
UNIT-III
IMAGE ENHANCEMENT IN SPATIAL DOMAIN
The principal objective of enhancement is to process an image so that the result is more
suitable than the original image for a specific application. Image enhancement approaches fall
into two broad categories: spatial domain methods and frequency domain methods. The term
spatial domain refers to the image plane itself, and approaches in this category are based on
direct manipulation of pixels in an image. Frequency domain processing techniques are based on
modifying the Fourier transform of an image.
BACKGROUND
The term spatial domain refers to the aggregate of pixels composing an image. Spatial
domain methods are procedures that operate directly on these pixels. Spatial domain processes
will be denoted by the expression
g(x, y) = T[f(x, y)]
where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f,
defined over some neighborhood of (x, y).
The principal approach in defining a neighborhood about a point (x, y) is to use a square
or rectangular sub-image area centered at (x, y), as Fig. 3.1 shows.
The center of the sub-image is moved from pixel to pixel starting, say, at the top left
corner. The operator T is applied at each location (x, y) to yield the output, g, at that location.
The process utilizes only the pixels in the area of the image spanned by the neighborhood.
Although other neighborhood shapes, such as approximations to a circle, sometimes are used,
square and rectangular arrays are by far the most predominant because of their ease of
implementation.
POINT PROCESSING TECHNIQUES
The simplest form of T is when the neighborhood is of size 1x1 (that is, a single pixel). In
this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an
intensity or mapping) transformation function of the form
s=T(r)
where, for simplicity in notation, r and s are variables denoting, respectively, the gray level of
f(x, y) and g(x, y) at any point (x, y).
For example, if T(r) has the form shown in Fig. 3.2(a), the effect of this transformation
would be to produce an image of higher contrast than the original by darkening the levels below
m and brightening the levels above m in the original image. In this technique, known as contrast
stretching, the values of r below m are compressed by the transformation function into a narrow
range of s, toward black. The opposite effect takes place for values of r above m.
In the limiting case shown in Fig. 3.2(b), T(r) produces a two-level (binary) image. A
mapping of this form is called a thresholding function. Some fairly simple, yet powerful,
processing approaches can be formulated with gray-level transformations. Because enhancement
at any point in an image depends only on the gray level at that point, techniques in this category
often are referred to as point processing.
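The limiting thresholding case of Fig. 3.2(b) is the simplest point-processing example to write down (a minimal sketch; the function name and the convention that r = m maps to white are my own choices):

```python
def threshold(img, m, L=256):
    """Point processing s = T(r): the limiting contrast-stretching case that
    maps levels below m to black (0) and levels at or above m to white (L-1),
    producing a two-level (binary) image."""
    return [[0 if r < m else L - 1 for r in row] for row in img]

print(threshold([[10, 200], [128, 90]], 128))  # [[0, 255], [255, 0]]
```

Because T depends only on the gray level r at each pixel, the whole image can be processed with a single pass and no neighborhood bookkeeping.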
Larger neighborhoods allow considerably more flexibility. The general approach is to use
a function of the values of f in a predefined neighborhood of (x, y) to determine the value of g at
(x, y). One of the principal approaches in this formulation is based on the use of so-called masks
(also referred to as filters, kernels, templates, or windows). Basically, a mask is a small (say, 3x3)
2-D array, such as the one shown in Fig. 3.1, in which the values of the mask coefficients
determine the nature of the process, such as image sharpening. Enhancement techniques based
on this type of approach often are referred to as mask processing or filtering.
SOME BASIC GRAY LEVEL TRANSFORMATIONS
These are among the simplest of all image enhancement techniques. The values of pixels,
before and after processing, will be denoted by r and s, respectively. These values are related
by an expression of the form s=T(r), where T is a transformation that maps a pixel value r into a
pixel value s. As an introduction to gray-level transformations, consider Fig. 3.3, which shows
three basic types of functions used frequently for image enhancement: linear (negative and
identity transformations), logarithmic (log and inverse-log transformations), and power-law (nth
power and nth root transformations). The identity function is the trivial case in which output
intensities are identical to input intensities.
Image Negatives
The negative of an image with gray levels in the range [0, L-1] is obtained by using the
negative transformation shown in Fig. 3.3, which is given by the expression.
s = L-1-r
Reversing the intensity levels of an image in this manner produces the equivalent of a
photographic negative. This type of processing is particularly suited for enhancing white or gray
detail embedded in dark regions of an image, especially when the black areas are dominant in
size.
An example is shown in Fig. 3.4. The original image is a digital mammogram showing a
small lesion. In spite of the fact that the visual content is the same in both images, note how
much easier it is to analyze the breast tissue in the negative image in this particular case.
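The negative transformation is a one-line point operation (an illustrative sketch; the function name is mine):

```python
def negative(img, L=256):
    """Negative transformation s = L - 1 - r, applied pixel by pixel."""
    return [[L - 1 - r for r in row] for row in img]

print(negative([[0, 255], [100, 200]]))  # [[255, 0], [155, 55]]
```

Applying the function twice returns the original image, since L - 1 - (L - 1 - r) = r.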
Log Transformations
The general form of the log transformation shown in Fig. 3.3 is
s = c log (1 + r)
where c is a constant, and it is assumed that r ≥ 0. The shape of the log curve in Fig. 3.3 shows
that this transformation maps a narrow range of low gray-level values in the input image into a
wider range of output levels. The opposite is true of higher values of input levels. We would use
a transformation of this type to expand the values of dark pixels in an image while compressing
the higher-level values. The opposite is true of the inverse log transformation. The log function
has the important characteristic that it compresses the dynamic range of images with large
variations in pixel values. A classic illustration of an application in which pixel values have a
large dynamic range is the Fourier spectrum. It is not unusual to encounter spectrum values that
range from 0 to 10^6 or higher. While processing numbers such as these presents no problems for a
computer, image display systems generally will not be able to reproduce faithfully such a wide
range of intensity values. The net effect is that a significant degree of detail will be lost in the
display of a typical Fourier spectrum.
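The log transformation can be sketched as follows (an illustrative sketch; the function name is mine, and choosing c so that the maximum input maps to L - 1 is one common normalization, not mandated by the text):

```python
import math

def log_transform(img, L=256):
    """s = c log(1 + r), with c chosen so the largest input maps to L - 1."""
    r_max = max(max(row) for row in img)
    c = (L - 1) / math.log(1 + r_max)
    return [[c * math.log(1 + r) for r in row] for row in img]

out = log_transform([[0, 15, 255]])
print([round(v, 1) for v in out[0]])  # [0.0, 127.5, 255.0]
```

Note how the dark value 15 is pushed up to the middle of the output range: a narrow band of dark inputs is spread over a wide band of output levels, which is exactly the dynamic-range compression described above.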
Power-Law Transformations
Power-law transformations have the basic form
s = c r^γ
where c and γ are positive constants. Sometimes the above equation is written as s = c(r + ε)^γ to
account for an offset (that is, a measurable output when the input is zero). However, offsets
typically are an issue of display calibration and as a result they are normally ignored. Plots of s
versus r for various values of γ are shown in Fig. 3.6.
As in the case of the log transformation, power-law curves with fractional values of γ
map a narrow range of dark input values into a wider range of output values, with the opposite
being true for higher values of input levels. Unlike the log function, however, we notice here a
family of possible transformation curves obtained simply by varying γ.
A variety of devices used for image capture, printing, and display respond according to a
power law. By convention, the exponent in the power-law equation is referred to as gamma. The
process used to correct these power-law response phenomena is called gamma correction. In
addition to gamma correction, power-law transformations are useful for general-purpose contrast
manipulation.
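A minimal sketch of the power-law transformation in Python with NumPy (assumed available); the display gamma of 2.5 used below is illustrative, and intensities are normalized to [0, 1] before applying the exponent, as is conventional:

```python
import numpy as np

def power_law(r, gamma, c=1.0, L=256):
    """s = c * r**gamma, applied to intensities normalized to [0, 1],
    then rescaled back to [0, L-1]."""
    rn = np.asarray(r, dtype=np.float64) / (L - 1)
    return c * rn ** gamma * (L - 1)

# gamma = 1 leaves intensities unchanged:
identity = power_law(np.array([0, 128, 255]), gamma=1.0)

# A display with an assumed gamma of 2.5 darkens mid-tones;
# pre-correcting with the inverse exponent 1/2.5 brightens them
# so the displayed image appears correct (gamma correction).
corrected = power_law(np.array([128]), gamma=1 / 2.5)
```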
ARITHMETIC & LOGICAL OPERATIONS
Arithmetic/logic operations involving images are performed on a pixel-by-pixel basis
between two or more images (this excludes the logic operation NOT, which is performed on a
single image). As an example, subtraction of two images results in a new image whose pixel at
coordinates (x, y) is the difference between the pixels in that same location in the two images
being subtracted. When dealing with logic operations on gray-scale images, pixel values are
processed as strings of binary numbers. For example, performing the NOT operation on a black,
8-bit pixel (a string of eight 0’s) produces a white pixel (a string of eight 1’s). Intermediate
values are processed the same way, changing all 1’s to 0’s and vice versa. Thus, the NOT logic
operator performs the same function as the negative transformation. The AND and OR operations
are used for masking; that is, for selecting sub-images in an image. In the AND and OR image
masks, light represents a binary 1 and dark represents a binary 0. Masking sometimes is referred
to as region of interest (ROI) processing. In terms of enhancement, masking is used primarily to
isolate an area for processing. This is done to highlight that area and differentiate
it from the rest of the image. Of the four arithmetic operations, subtraction and addition (in that
order) are the most useful for image enhancement.
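The NOT-as-negative and AND-masking behavior described above can be demonstrated directly on 8-bit arrays in Python with NumPy (assumed available); the tiny image and mask are illustrative:

```python
import numpy as np

img = np.array([[0, 100, 255],
                [30, 200, 60]], dtype=np.uint8)

# Bitwise NOT on an 8-bit pixel flips every bit, so r becomes 255 - r:
# exactly the negative transformation described in the text.
negative = ~img                      # same as np.bitwise_not(img)

# AND with a binary mask (255 = light = keep, 0 = dark = discard)
# isolates a region of interest (ROI).
mask = np.array([[255, 255, 0],
                 [255, 0,   0]], dtype=np.uint8)
roi = img & mask
```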
IMAGE SUBTRACTION
The difference between two images f(x, y) and h(x, y), expressed as
g(x, y) = f(x, y) − h(x, y)
is obtained by computing the difference between all pairs of corresponding pixels from f and h.
The key usefulness of subtraction is the enhancement of differences between images.
In practice, most images are displayed using 8 bits (even 24-bit color images consist of
three separate 8-bit channels). Thus, we expect image values not to be outside the range from 0
to 255. The values in a difference image can range from a minimum of –255 to a maximum of
255, so some sort of scaling is required to display the results. There are two principal ways to
scale a difference image.
� One method is to add 255 to every pixel and then divide by 2. It is not guaranteed that
the values will cover the entire 8-bit range from 0 to 255, but all pixel values
definitely will be within this range. This method is fast and simple to implement, but
it has the limitations that the full range of the display may not be utilized and,
potentially more serious; the truncation inherent in the division by 2 will generally
cause loss in accuracy.
� If more accuracy and full coverage of the 8-bit range are desired, then we can resort
to another approach. First, the value of the minimum difference is obtained and its
negative added to all the pixels in the difference image (this will create a modified
difference image whose minimum value is 0). Then, all the pixels in the image are
scaled to the interval [0, 255] by multiplying each pixel by the quantity 255/Max,
where Max is the maximum pixel value in the modified difference image. It is evident
that this approach is considerably more complex and difficult to implement.
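Both scaling methods can be sketched in Python with NumPy (assumed available); the function names and the two-pixel test images are illustrative:

```python
import numpy as np

def scale_diff_fast(f, h):
    """Method 1: add 255 and halve. Fast; results land in [0, 255]
    but may not cover the full range, and the division loses accuracy."""
    d = f.astype(np.int16) - h.astype(np.int16)   # range [-255, 255]
    return ((d + 255) // 2).astype(np.uint8)

def scale_diff_full(f, h):
    """Method 2: shift the minimum to 0, then stretch by 255/Max,
    where Max is the maximum of the shifted difference image."""
    d = f.astype(np.float64) - h.astype(np.float64)
    d -= d.min()                                  # minimum becomes 0
    mx = d.max()
    if mx > 0:
        d *= 255.0 / mx                           # full 8-bit coverage
    return d.astype(np.uint8)

f = np.array([[0, 255]], dtype=np.uint8)
h = np.array([[255, 0]], dtype=np.uint8)
fast = scale_diff_fast(f, h)
full = scale_diff_full(f, h)
```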
Application
One of the most commercially successful and beneficial uses of image subtraction is in
the area of medical imaging called mask mode radiography. In this case h(x, y), the mask, is an
X-ray image of a region of a patient’s body captured by an intensified TV camera (instead of
traditional X-ray film) located opposite an X-ray source. The procedure consists of injecting a
contrast medium into the patient’s bloodstream, taking a series of images of the same anatomical
region as h(x, y), and subtracting this mask from the series of incoming images after injection of
the contrast medium. The net effect of subtracting the mask from each sample in the incoming
stream of TV images is that the areas that are different between f(x, y) and h(x, y) appear in the
output image as enhanced detail. Because images can be captured at TV rates, this procedure in
essence gives a movie showing how the contrast medium propagates through the various arteries
in the area being observed.
IMAGE AVERAGING
Consider a noisy image g(x, y) formed by the addition of noise η(x, y) to an original
image f(x, y); that is,
g(x, y) = f(x, y) + η(x, y)
where the assumption is that at every pair of coordinates (x, y) the noise is uncorrelated and has
zero average value. The objective of the following procedure is to reduce the noise content by
adding a set of noisy images, {gi(x, y)}.
If the noise satisfies the constraints just stated, it can be shown that if an image ḡ(x, y) is formed
by averaging K different noisy images,
ḡ(x, y) = (1/K) Σ (i=1 to K) gi(x, y)
As K increases, Eqs. (3.4-5) and (3.4-6) indicate that the variability (noise) of the pixel values at
each location (x, y) decreases. Because E{ḡ(x, y)} = f(x, y), this means that ḡ(x, y) approaches
f(x, y) as the number of noisy images used in the averaging process increases.
An important application of image averaging is in the field of astronomy, where imaging
with very low light levels is routine, causing sensor noise frequently to render single images
virtually useless for analysis.
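The noise-reduction effect of averaging K images is easy to verify numerically in Python with NumPy (assumed available); the flat test image, noise level, and K are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                 # noise-free "true" image

# K noisy observations g_i = f + eta, with eta zero-mean and
# uncorrelated between images, as the text assumes.
K = 100
noisy = [f + rng.normal(0.0, 20.0, f.shape) for _ in range(K)]

g_bar = np.mean(noisy, axis=0)               # averaged image

# The noise standard deviation should drop by roughly sqrt(K):
err_single = np.std(noisy[0] - f)            # close to 20
err_avg = np.std(g_bar - f)                  # close to 20 / sqrt(100) = 2
```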
HISTOGRAM PROCESSING
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete
function h(rk) = nk, where rk is the kth gray level and nk is the number of pixels in the image having
gray level rk. It is common practice to normalize a histogram by dividing each of its values by
the total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by
p(rk)=nk/n, for k=0, 1,… ,L-1. Loosely speaking, p(rk) gives an estimate of the probability of
occurrence of gray level rk. Note that the sum of all components of a normalized histogram is
equal to 1. The horizontal axis of each histogram plot corresponds to gray level values, rk. The
vertical axis corresponds to values of h(rk) = nk, or p(rk) = nk/n if the values are normalized.
Histograms are the basis for numerous spatial domain processing techniques, and histogram
manipulation can be used effectively for image enhancement.
For a dark image the components of the histogram are concentrated on the low (dark)
side of the gray scale. Similarly, the components of the histogram of the bright image are biased
toward the high side of the gray scale. An image with low contrast has a histogram that will be
narrow and will be centered toward the middle of the gray scale. For a monochrome image this
implies a dull, washed-out gray look. Finally, we see that the components of the histogram in the
high-contrast image cover a broad range of the gray scale and, further, that the distribution of
pixels is not too far from uniform, with very few vertical lines being much higher than the others.
Intuitively, it is reasonable to conclude that an image, whose pixels tend to occupy the entire
range of possible gray levels and, in addition, tend to be distributed uniformly, will have an
appearance of high contrast and will exhibit a large variety of gray tones. The net effect will be
an image that shows a great deal of gray-level detail and has high dynamic range.
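Computing the normalized histogram p(rk) = nk/n described above is a one-liner in Python with NumPy (assumed available); the toy 2x3 image and L = 4 gray levels are illustrative:

```python
import numpy as np

def normalized_histogram(img, L=256):
    """p(r_k) = n_k / n for k = 0, 1, ..., L-1."""
    n_k = np.bincount(img.ravel(), minlength=L)   # pixel count per level
    return n_k / img.size                         # divide by total pixels n

img = np.array([[0, 0, 1],
                [1, 1, 3]], dtype=np.uint8)
p = normalized_histogram(img, L=4)                # counts: 2, 3, 0, 1 out of 6
```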
Histogram Equalization (Histogram Linearization)
Consider for a moment continuous functions, and let the variable r represent the gray
levels of the image to be enhanced. Assume that r has been normalized to the interval [0, 1], with
r =0 representing black and r=1 representing white. Later, we consider a discrete formulation and
allow pixel values to be in the interval [0, L-1].
For any r satisfying the aforementioned conditions, we focus attention on transformations of the
form
s = T(r)
which produce a gray level s for every pixel value r in the original image.
Assume that the transformation function T(r) satisfies the following conditions:
(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The requirement in condition (a) that T(r) be single valued is needed to guarantee that the
inverse transformation will exist, and the monotonicity condition preserves the increasing order
from black to white in the output image. A transformation function that is not monotonically
increasing could result in at least a section of the intensity range being inverted, thus producing
some inverted gray levels in the output image. Finally, condition (b) guarantees that the output
gray levels will be in the same range as the input levels. Figure 3.16 gives an example of a
transformation function that satisfies these two conditions.
The inverse transformation from s back to r is denoted r = T⁻¹(s), 0 ≤ s ≤ 1.
The gray levels in an image may be viewed as random variables in the interval [0, 1]. One of the
most fundamental descriptors of a random variable is its probability density function (PDF). Let
pr(r) and ps(s) denote the probability density functions of random variables r and s, respectively,
where the subscripts on p are used to denote that pr and ps are different functions. A basic result
from elementary probability theory is that, if pr(r) and T(r) are known and T⁻¹(s) satisfies
condition (a), then the probability density function ps(s) of the transformed variable s can be
obtained using a rather simple formula:
ps(s) = pr(r) |dr/ds|  … (3.3-3)
Thus, the probability density function of the transformed variable, s, is determined by the gray-
level PDF of the input image and by the chosen transformation function.
A transformation function of particular importance in image processing has the form
s = T(r) = ∫₀ʳ pr(w) dw  … (3.3-4)
where w is a dummy variable of integration.
Given transformation function T(r), we find ps(s) by applying Eq. (3.3-3). We know from basic
calculus (Leibniz’s rule) that the derivative of a definite integral with respect to its upper limit is
simply the integrand evaluated at that limit. In other words,
ds/dr = dT(r)/dr = pr(r)
Substituting this result for dr/ds into Eq. (3.3-3), and keeping in mind that all probability values
are positive, yields
ps(s) = pr(r) |dr/ds| = pr(r) · [1/pr(r)] = 1,  0 ≤ s ≤ 1  … (3.3-6)
Because ps(s) is a probability density function, it follows that it must be zero outside the interval
[0, 1] in this case because its integral over all values of s must equal 1. We recognize the form of
ps(s) given in Eq. (3.3-6) as a uniform probability density function. Simply stated, we have
demonstrated that performing the transformation function given in Eq. (3.3-4) yields a random
variable s characterized by a uniform probability density function. It is important to note from
Eq. (3.3-4) that T(r) depends on pr(r), but, as indicated by Eq. (3.3-6), the resulting ps(s) always
is uniform, independent of the form of pr(r).
For discrete values we deal with probabilities and summations instead of probability density
functions and integrals. The probability of occurrence of gray level rk in an image is
approximated by
pr(rk) = nk/n,  k = 0, 1, 2, … , L-1
where, n is the total number of pixels in the image, nk is the number of pixels that have gray
level rk, and L is the total number of possible gray levels in the image. The discrete version of the
transformation function given in Eq. (3.3-4) is
sk = T(rk) = Σ (j=0 to k) pr(rj) = Σ (j=0 to k) nj/n,  k = 0, 1, 2, … , L-1  … (3.3-8)
Thus, a processed (output) image is obtained by mapping each pixel with level rk in the
input image into a corresponding pixel with level sk in the output image via Eq. (3.3-8). As
indicated earlier, a plot of pr(rk) versus rk is called a histogram. The transformation (mapping)
given in Eq. (3.3-8) is called histogram equalization or histogram linearization.
It cannot be proved in general that this discrete transformation will produce the discrete
equivalent of a uniform probability density function, which would be a uniform histogram. However, as will be
seen shortly, use of Eq. (3.3-8) does have the general tendency of spreading the histogram of the
input image so that the levels of the histogram-equalized image will span a fuller range of the
gray scale. In addition to producing gray levels that have this tendency, the method just derived
has the additional advantage that it is fully “automatic.” In other words, given an image, the
process of histogram equalization consists simply of implementing Eq. (3.3-8), which is based on
information that can be extracted directly from the given image, without the need for further
parameter specifications.
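Discrete histogram equalization via Eq. (3.3-8) can be sketched in Python with NumPy (assumed available). Note one implementation convention: Eq. (3.3-8) yields values in [0, 1], so the cumulative sum is rescaled by L-1 here to land back on the gray scale; the small dark test image is illustrative:

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: s_k = (L-1) * sum_{j=0..k} n_j / n."""
    p = np.bincount(img.ravel(), minlength=L) / img.size   # p_r(r_k)
    cdf = np.cumsum(p)                                     # T(r_k)
    s = np.round((L - 1) * cdf).astype(np.uint8)           # rescale to [0, L-1]
    return s[img]                                          # map each pixel r_k -> s_k

# A dark, low-contrast image gets spread over a fuller gray range.
dark = np.array([[10, 10, 20],
                 [20, 30, 30]], dtype=np.uint8)
eq = equalize(dark)
```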
Histogram Matching (Histogram Specification)
Histogram equalization automatically determines a transformation function that seeks to produce
an output image that has a uniform histogram. When automatic enhancement is desired, this is a
good approach because the results from this technique are predictable and the method is simple
to implement. But there are applications in which attempting to base enhancement on a uniform
histogram is not the best approach. In particular, it is useful sometimes to be able to specify the
shape of the histogram that we wish the processed image to have. The method used to generate a
processed image that has a specified histogram is called histogram matching or histogram
specification.
Development of the method
Let r and z denote continuous gray levels, and let pr(r) and pz(z) denote their corresponding
continuous probability density functions. In this notation, r and z denote the gray levels of the
input and output (processed) images, respectively. We can estimate pr(r) from the given input
image, while pz(z) is the specified probability density function that we wish the output image to
have.
Let s be a random variable with the property
s = T(r) = ∫₀ʳ pr(w) dw  … (3.3-10)
where w is a dummy variable of integration. We recognize this expression as the continuous
version of histogram equalization given in Eq. (3.3-4). Suppose next that we define a random
variable z with the property
G(z) = ∫₀ᶻ pz(t) dt = s  … (3.3-11)
where t is a dummy variable of integration. It then follows from these two equations that
G(z) = T(r) and, therefore, that z must satisfy the condition
z = G⁻¹(s) = G⁻¹[T(r)]
The transformation T(r) can be obtained from Eq. (3.3-10) once pr(r) has been estimated from the
input image. Similarly, the transformation function G(z) can be obtained using Eq. (3.3-11)
because pz(z) is given.
The discrete formulation of Eq. (3.3-10) is
sk = T(rk) = Σ (j=0 to k) pr(rj) = Σ (j=0 to k) nj/n,  k = 0, 1, 2, … , L-1  …. (1)
where n is the total number of pixels in the image, nj is the number of pixels with gray level rj,
and L is the number of discrete gray levels. Similarly, the discrete formulation of Eq. (3.3-11) is
obtained from the given histogram pz(zi), i=0, 1, 2, … , L-1, and has the form
vk = G(zk) = Σ (i=0 to k) pz(zi) = sk,  k = 0, 1, 2, … , L-1  … (2)
Finally, the discrete inverse transformation is
zk = G⁻¹(sk),  k = 0, 1, 2, … , L-1  …(3)
or, equivalently, combining it with Eq. (1),
zk = G⁻¹[T(rk)],  k = 0, 1, 2, … , L-1  ….(4)
Equations (1) to (3) are the foundation for implementing histogram matching for digital
images. Equation (1) is a mapping from the levels in the original image into corresponding levels
sk based on the histogram of the original image, which we compute from the pixels in the image.
Equation (2) computes a transformation function G from the given histogram pz(z). Finally, Eq.
(3) or its equivalent, Eq. (4), gives us (an approximation of) the desired levels of the image with
that histogram. The above equations show that an image with a specified probability density
function can be obtained from an input image by using the following procedure:
(1) Obtain the transformation function T(r) using Eq. (1).
(2) Use Eq. (2) to obtain the transformation function G(z).
(3) Obtain the inverse transformation function G⁻¹.
(4) Obtain the output image by applying Eq. (3) to all the pixels in the input image. The
result of this procedure will be an image whose gray levels, z, have the specified
probability density function pz(z).
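The four-step procedure above can be sketched in Python with NumPy (assumed available). Since G is a discrete step function, its inverse is not unique; taking the smallest z with G(z) ≥ sk, implemented here with `np.searchsorted`, is one standard way to realize G⁻¹, and the tiny test image is illustrative:

```python
import numpy as np

def match_histogram(img, p_z, L=256):
    """Histogram specification per Eqs. (1)-(4): z_k = G^{-1}(T(r_k))."""
    p_r = np.bincount(img.ravel(), minlength=L) / img.size
    T = np.cumsum(p_r)                 # Eq. (1): s_k = T(r_k)
    G = np.cumsum(p_z)                 # Eq. (2): G(z_k)
    # Eq. (3)/(4): for each s_k pick the smallest z with G(z) >= s_k.
    z = np.searchsorted(G, T, side='left').clip(0, L - 1)
    return z[img].astype(np.uint8)

img = np.array([[0, 1],
                [2, 3]], dtype=np.uint8)      # one pixel per level
spike = np.array([0.0, 0.0, 0.0, 1.0])        # all mass at level 3
matched = match_histogram(img, spike, L=4)    # every pixel driven to 3
```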
Local Enhancement
The histogram processing methods discussed in the previous two sections are global, in
the sense that pixels are modified by a transformation function based on the gray-level content of
an entire image. Although this global approach is suitable for overall enhancement, there are
cases in which it is necessary to enhance details over small areas in an image. The number of
pixels in these areas may have negligible influence on the computation of a global transformation
whose shape does not necessarily guarantee the desired local enhancement. The solution is to
devise transformation functions based on the gray-level distribution or other properties in the
neighborhood of every pixel in the image.
The histogram processing techniques previously described are easily adaptable to local
enhancement. The procedure is to define a square or rectangular neighborhood and move the
center of this area from pixel to pixel. At each location, the histogram of the points in the
neighborhood is computed and either a histogram equalization or histogram specification
transformation function is obtained. This function is finally used to map the gray level of the
pixel centered in the neighborhood. The center of the neighborhood region is then moved to an
adjacent pixel location and the procedure is repeated. Since only one new row or column of the
neighborhood changes during a pixel-to-pixel translation of the region, updating the histogram
obtained in the previous location with the new data introduced at each motion step is possible.
This approach has obvious advantages over repeatedly computing the histogram over all pixels
in the neighborhood region each time the region is moved one pixel location. Another approach
used sometimes to reduce computation is to utilize non-overlapping regions, but this method
usually produces an undesirable checkerboard effect.
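A direct (unoptimized) version of the sliding-neighborhood procedure can be sketched in Python with NumPy (assumed available). It recomputes the full neighborhood histogram at every pixel, so it is O(size²) per pixel; the incremental row/column update described above would be faster. Edge replication for border pixels is an assumption:

```python
import numpy as np

def local_equalize(img, size=7, L=256):
    """Local histogram equalization: equalize each pixel using the
    histogram of its size x size neighborhood (edges replicated)."""
    half = size // 2
    padded = np.pad(img, half, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            win = padded[x:x + size, y:y + size]
            # Equalization transform built from the local histogram only.
            cdf = np.cumsum(np.bincount(win.ravel(), minlength=L)) / win.size
            out[x, y] = np.round((L - 1) * cdf[img[x, y]])
    return out

flat = np.full((5, 5), 7, dtype=np.uint8)
out = local_equalize(flat)      # a constant region maps to the top level
```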
Figure 3.23(a) shows an image that has been slightly blurred to reduce its noise content
(see Section 3.6.1 regarding blurring).Figure 3.23(b) shows the result of global histogram
equalization. As is often the case when this technique is applied to smooth, noisy areas, Fig.
3.23(b) shows considerable enhancement of the noise, with a slight increase in contrast. Note
that no new structural details were brought out by this method. However, local histogram
equalization using a 7*7 neighborhood revealed the presence of small squares inside the larger
dark squares. The small squares were too close in gray level to the larger ones, and their sizes
were too small to influence global histogram equalization significantly. Note also the finer noise
texture in Fig. 3.23(c), a result of local processing using relatively small neighborhoods.
SPATIAL FILTERING
Some neighborhood operations work with the values of the image pixels in the
neighborhood and the corresponding values of a sub-image that has the same dimensions as the
neighborhood. The sub-image is called a filter, mask, kernel, template, or window. The values in
a filter sub-image are referred to as coefficients, rather than pixels.
The mechanics of spatial filtering are illustrated in Fig. 3.32. The process consists simply
of moving the filter mask from point to point in an image. At each point (x, y), the response of
the filter at that point is calculated using a predefined relationship. For linear spatial filtering,
the response is given by a sum of products of the filter coefficients and the corresponding image
pixels in the area spanned by the filter mask. For the 3x3 mask shown in Fig. 3.32, the result (or
response), R, of linear filtering with the filter mask at a point (x, y) in the image is
R = w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + … + w(0,0) f(x, y) + … + w(1,0) f(x+1, y) + w(1,1) f(x+1, y+1)
In general, linear filtering of an image f of size MxN with a filter mask of size mxn is
given by the expression:
g(x, y) = Σ (s=-a to a) Σ (t=-b to b) w(s, t) f(x+s, y+t)
where, a=(m-1)/2 and b=(n-1)/2. To generate a complete filtered image this equation must be
applied for x=0, 1, 2, … , M-1 and y=0, 1, 2, … , N-1.
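The double-sum above (a correlation, i.e. the sum of products of mask coefficients and underlying pixels) can be sketched in Python with NumPy (assumed available). For simplicity the sketch evaluates the response only where the full mask fits inside the image, so the output is slightly smaller than the input; border handling is discussed below:

```python
import numpy as np

def linear_filter(f, w):
    """g(x,y) = sum_s sum_t w(s,t) f(x+s, y+t), computed only where
    the full m x n mask fits, so the output shrinks by m-1 rows
    and n-1 columns (the simplest border policy)."""
    m, n = w.shape
    M, N = f.shape
    g = np.zeros((M - m + 1, N - n + 1))
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            g[x, y] = np.sum(w * f[x:x + m, y:y + n])
    return g

box = np.ones((3, 3)) / 9.0                    # 3x3 averaging mask
f = np.arange(25, dtype=float).reshape(5, 5)
g = linear_filter(f, box)                      # each output = local 3x3 mean
```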
The process of linear filtering given in above equation is similar to a frequency domain
concept called convolution. For this reason, linear spatial filtering often is referred to as
“convolving a mask with an image.” Similarly, filter masks are sometimes called convolution
masks. The term convolution kernel also is in common use.
When interest lies on the response, R, of an mxn mask at any point (x, y), and not on the
mechanics of implementing mask convolution, it is common practice to simplify the notation by
using the following expression:
R = w1z1 + w2z2 + … + wmnzmn = Σ (i=1 to mn) wizi
where the w’s are mask coefficients, the z’s are the values of the image gray levels
corresponding to those coefficients, and mn is the total number of coefficients in the mask. For
the 3x3 general mask shown in figure below, the response at any point (x, y) in the image is
given by
R = w1z1 + w2z2 + … + w9z9 = Σ (i=1 to 9) wizi
An important consideration in implementing neighborhood operations for spatial filtering is the
issue of what happens when the center of the filter approaches the border of the image. Consider
for simplicity a square mask of size nxn. At least one edge of such a mask will coincide with the
border of the image when the center of the mask is at a distance of (n-1)/2 pixels away from the
border of the image. If the center of the mask moves any closer to the border, one or more rows
or columns of the mask will be located outside the image plane. There are several ways to handle
this situation.
� The simplest is to limit the excursions of the center of the mask to be at a distance no less
than (n-1)/2 pixels from the border. The resulting filtered image will be smaller than the
original, but all the pixels in the filtered image will have been processed with the full
mask.
� If the result is required to be the same size as the original, then the approach typically
employed is to filter all pixels only with the section of the mask that is fully contained in
the image. With this approach, there will be bands of pixels near the border that will have
been processed with a partial filter mask.
� Other approaches include “padding” the image by adding rows and columns of 0’s (or
other constant gray level), or padding by replicating rows or columns. The padding is then
stripped off at the end of the process. This keeps the size of the filtered image the same as
the original, but the values of the padding will have an effect near the edges that becomes
more prevalent as the size of the mask increases.
� The only way to obtain a perfectly filtered result is to accept a somewhat smaller filtered
image by limiting the excursions of the center of the filter mask to a distance no less than
(n-1)/2 pixels from the border of the original image.
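The padding options in the last bullets map directly onto NumPy's `np.pad` modes (NumPy assumed available); the 4x4 test image and single-pixel pad width, matching (n-1)/2 for a 3x3 mask, are illustrative:

```python
import numpy as np

f = np.arange(16, dtype=float).reshape(4, 4)

# For a 3x3 mask, (n-1)/2 = 1 row/column of padding keeps the
# filtered image the same size as the original once the pad is
# stripped off after filtering.
zero_pad = np.pad(f, 1, mode='constant', constant_values=0)  # pad with 0's
replicate = np.pad(f, 1, mode='edge')                        # repeat border rows/cols
```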
SMOOTHING SPATIAL FILTERS
Smoothing filters are used for blurring and for noise reduction. Blurring is used in
preprocessing steps, such as removal of small details from an image prior to (large) object
extraction, and bridging of small gaps in lines or curves. Noise reduction can be accomplished by
blurring with a linear filter and also by nonlinear filtering.
Smoothing Linear Filters
The output (response) of a smoothing, linear spatial filter is simply the average of the
pixels contained in the neighborhood of the filter mask. These filters sometimes are called
averaging filters. They also are referred to as lowpass filters.
The idea behind smoothing filters is straightforward. By replacing the value of every
pixel in an image by the average of the gray levels in the neighborhood defined by the filter
mask, this process results in an image with reduced “sharp” transitions in gray levels. Because
random noise typically consists of sharp transitions in gray levels, the most obvious application
of smoothing is noise reduction. However, edges (which almost always are desirable features of
an image) also are characterized by sharp transitions in gray levels, so averaging filters have the
undesirable side effect that they blur edges. Another application of this type of process includes
the smoothing of false contours that result from using an insufficient number of gray levels. A
major use of averaging filters is in the reduction of “irrelevant” detail in an image. By
“irrelevant” we mean pixel regions that are small with respect to the size of the filter mask.
Above figure shows two 3x3 smoothing filters. Use of the first filter yields the standard average
of the pixels under the mask. The response of the first filter is given by
R = (1/9) Σ (i=1 to 9) zi
which is the average of the gray levels of the pixels in the 3x3 neighborhood defined by the
mask. A spatial averaging filter in which all coefficients are equal is sometimes called a box
filter.
The second mask shown in above figure is a little more interesting. This mask yields a
so-called weighted average, terminology used to indicate that pixels are multiplied by different
coefficients, thus giving more importance (weight) to some pixels at the expense of others. In the
second mask shown, the pixel at the center of the mask is multiplied by a higher value than any
other, thus giving this pixel more importance in the calculation of the average. The other pixels
are inversely weighted as a function of their distance from the center of the mask. The diagonal
terms are farther away from the center than the orthogonal neighbors (by a factor of √2) and, thus,
are weighted less than these immediate neighbors of the center pixel. The basic strategy behind
weighting the center point the highest and then reducing the value of the coefficients as a function
of increasing distance from the origin is simply an attempt to reduce blurring in the smoothing
process.
In practice, it is difficult in general to see differences between images smoothed by using
either of the masks in above figure, or similar arrangements, because the area these masks span
at any one location in an image is so small.
The general implementation for filtering an MxN image with a weighted averaging filter
of size mxn (m and n odd) is given by the expression:
g(x, y) = [Σ (s=-a to a) Σ (t=-b to b) w(s, t) f(x+s, y+t)] / [Σ (s=-a to a) Σ (t=-b to b) w(s, t)]
The denominator in above equation is simply the sum of the mask coefficients and,
therefore, it is a constant that needs to be computed only once.
The effects of smoothing as a function of filter size are illustrated in Fig. 3.35, which
shows an original image and the corresponding smoothed results obtained using square
averaging filters of sizes n=3, 5, 9, 15, and 35 pixels, respectively. The principal features of
these results are as follows:
� For n=3, we note a general slight blurring throughout the entire image but, as expected,
details that are of approximately the same size as the filter mask are affected considerably
more. For example, the 3*3 and 5*5 squares, the small letter “a,” and the fine grain noise
show significant blurring when compared to the rest of the image. A positive result is that
the noise is less pronounced. Note that the jagged borders of the characters and gray
circles have been pleasingly smoothed.
� The result for n=5 is somewhat similar, with a slight further increase in blurring.
� For n=9 we see considerably more blurring, and the 20% black circle is not nearly as
distinct from the background as in the previous three images, illustrating the blending
effect that blurring has on objects whose gray level content is close to that of its
neighboring pixels. Note the significant further smoothing of the noisy rectangles.
� The results for n=15 and 35 are extreme with respect to the sizes of the objects in the
image. This type of excessive blurring is generally used to eliminate small objects from an
image. For instance, the three small squares, two of the circles, and most of the noisy
rectangle areas have been blended into the background of the image in Fig. 3.35(f). Note
also in this figure the pronounced black border. This is a result of padding the border of
the original image with 0’s (black) and then trimming off the padded area. Some of the
black was blended into all filtered images, but became truly objectionable for the images
smoothed with the larger filters.
An important application of spatial averaging is to blur an image for the purpose of getting a
gross representation of objects of interest, such that the intensity of smaller objects blends with
the background and larger objects become “blob-like” and easy to detect. The size of the mask
establishes the relative size of the objects that will be blended with the background.
Smoothing Non-Linear Filters (Order-Statistics Filters)
Order-statistics filters are nonlinear spatial filters whose response is based on ordering
(ranking) the pixels contained in the image area encompassed by the filter, and then replacing the
value of the center pixel with the value determined by the ranking result. The best-known
example in this category is the median filter, which, as its name implies, replaces the value of a
pixel by the median of the gray levels in the neighborhood of that pixel (the original value of the
pixel is included in the computation of the median). Median filters are quite popular because, for
certain types of random noise, they provide excellent noise-reduction capabilities, with
considerably less blurring than linear smoothing filters of similar size. Median filters are
particularly effective in the presence of impulse noise, also called salt-and-pepper noise because
of its appearance as white and black dots superimposed on an image.
The median, ξ, of a set of values is such that half the values in the set are less than or
equal to ξ, and half are greater than or equal to ξ. In order to perform median filtering at a point
in an image, we first sort the values of the pixel in question and its neighbors, determine their
median, and assign this value to that pixel. For example, in a 3*3 neighborhood the median is the
5th largest value, in a 5*5 neighborhood the 13th largest value, and so on.
The principal function of median filters is to force points with distinct gray levels to be
more like their neighbors. In fact, isolated clusters of pixels that are light or dark with respect to
their neighbors, and whose area is less than n²/2 (one-half the filter area), are eliminated by an
nxn median filter. In this case “eliminated” means forced to the median intensity of the
neighbors. Larger clusters are affected considerably less.
Figure 3.37(a) shows an X-ray image of a circuit board heavily corrupted by salt-and-
pepper noise. To illustrate the point about the superiority of median filtering over average
filtering in situations such as this, we show in Fig. 3.37(b) the result of processing the noisy
image with a 3*3 neighborhood averaging mask, and in Fig. 3.37(c) the result of using a 3*3
median filter. The image processed with the averaging filter has less visible noise, but the price
paid is significant blurring. The superiority in all respects of median over average filtering in this
case is quite evident. In general, median filtering is much better suited than averaging for the
removal of additive salt-and-pepper noise.
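The median filter's behavior on impulse noise can be sketched in Python with NumPy (assumed available); edge replication for border pixels and the single-impulse test image are illustrative choices:

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighborhood
    (edges replicated so the output keeps the input size)."""
    half = size // 2
    padded = np.pad(img, half, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            # The pixel's own value is included in the median, as in the text.
            out[x, y] = np.median(padded[x:x + size, y:y + size])
    return out

# A single salt impulse in a flat region is removed completely,
# whereas a 3x3 averaging filter would only spread it around.
flat = np.full((5, 5), 50, dtype=np.uint8)
flat[2, 2] = 255                              # salt noise
clean = median_filter(flat)
```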
SHARPENING SPATIAL FILTERS
The principal objective of sharpening is to highlight fine detail in an image or to enhance
detail that has been blurred, either in error or as a natural effect of a particular method of image
acquisition. Uses of image sharpening vary and include applications ranging from electronic
printing and medical imaging to industrial inspection and autonomous guidance in military
systems.
We saw that image blurring could be accomplished in the spatial domain by pixel
averaging in a neighborhood. Since averaging is analogous to integration, it is logical to
conclude that sharpening could be accomplished by spatial differentiation. Fundamentally, the
strength of the response of a derivative operator is proportional to the degree of discontinuity of
the image at the point at which the operator is applied. Thus, image differentiation enhances
edges and other discontinuities (such as noise) and deemphasizes areas with slowly varying gray-
level values. We consider in some detail sharpening filters that are based on first- and second-
order derivatives, respectively. Before proceeding with that discussion, however, we stop to look
at some of the fundamental properties of these derivatives in a digital context. To simplify the
explanation, we focus attention on one-dimensional derivatives. In particular, we are interested
in the behavior of these derivatives in areas of constant gray level (flat segments), at the onset
and end of discontinuities (step and ramp discontinuities), and along gray-level ramps. These
types of discontinuities can be used to model noise points, lines, and edges in an image. The
behavior of derivatives during transitions into and out of these image features also is of interest.
The derivatives of a digital function are defined in terms of differences. There are various ways
to define these differences. However, we require that any definition we use for a first derivative
(1) must be zero in flat segments (areas of constant gray-level values);
(2) must be nonzero at the onset of a gray-level step or ramp; and
(3) must be nonzero along ramps.
Similarly, any definition of a second derivative
(1) must be zero in flat areas;
(2) must be nonzero at the onset and end of a gray-level step or ramp; and
(3) must be zero along ramps of constant slope.
Since we are dealing with digital quantities whose values are finite, the maximum possible gray-
level change also is finite, and the shortest distance over which that change can occur is between
adjacent pixels.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference
∂f/∂x = f(x+1) - f(x)
We used a partial derivative here in order to keep the notation the same as when we consider an
image function of two variables, f(x, y), at which time we will be dealing with partial derivatives
along the two spatial axes. Similarly, we define a second-order derivative as the difference
∂²f/∂x² = f(x+1) + f(x-1) - 2f(x)
It is easily verified that these two definitions satisfy the conditions stated previously regarding
derivatives of the first and second order. To see this, and also to highlight the fundamental
similarities and differences between first- and second- order derivatives in the context of image
processing, consider the example shown below:
Figure 3.38(a) shows a simple image that contains various solid objects, a line, and a
single noise point. Figure 3.38(b) shows a horizontal gray-level profile (scan line) of the image
along the center and including the noise point. This profile is the one-dimensional function we
will use for illustrations regarding this figure. Figure 3.38(c) shows a simplification of the
profile, with just enough numbers to make it possible for us to analyze how the first- and second-
order derivatives behave as they encounter a noise point, a line, and then the edge of an object. In
our simplified diagram the transition in the ramp spans four pixels, the noise point is a single
pixel, the line is three pixels thick, and the transition into the gray-level step takes place between
adjacent pixels. The number of gray levels was simplified to only eight levels.
Let us consider the properties of the first and second derivatives as we traverse the profile
from left to right. First, we note that the first-order derivative is nonzero along the entire ramp,
while the second-order derivative is nonzero only at the onset and end of the ramp. Because
edges in an image resemble this type of transition, we conclude that first-order derivatives
produce “thick” edges and second-order derivatives, much finer ones. Next we encounter the
isolated noise point. Here, the response at and around the point is much stronger for the second-
than for the first-order derivative. Of course, this is not unexpected. A second-order derivative is
much more aggressive than a first-order derivative in enhancing sharp changes. Thus, we can
expect a second-order derivative to enhance fine detail (including noise) much more than a first-
order derivative. The thin line is a fine detail, and we see essentially the same difference between
the two derivatives. If the maximum gray level of the line had been the same as the isolated
point, the response of the second derivative would have been stronger for the latter. Finally, in
this case, the response of the two derivatives is the same at the gray-level step (in most cases
when the transition into a step is not from zero, the second derivative will be weaker). We also
note that the second derivative has a transition from positive back to negative. In an image, this
shows as a thin double line. This “double-edge” effect is an issue that will be important when
we use derivatives for edge detection. It is of interest also to note that if the gray level of the thin
line had been the same as the step, the response of the second derivative would have been
stronger for the line than for the step.
In summary, comparing the response between first- and second-order derivatives, we
arrive at the following conclusions.
(1) First-order derivatives generally produce thicker edges in an image.
(2) Second-order derivatives have a stronger response to fine detail, such as thin lines and
isolated points.
(3) First-order derivatives generally have a stronger response to a gray-level step.
(4) Second-order derivatives produce a double response at step changes in gray level.
We also note of second-order derivatives that, for similar changes in gray-level values in
an image, their response is stronger to a line than to a step, and to a point than to a line.
In most applications, the second derivative is better suited than the first derivative for
image enhancement because of the ability of the former to enhance fine detail.
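These conclusions can be checked numerically on a one-dimensional profile, using the difference definitions given earlier. In the sketch below the specific gray-level values are illustrative assumptions (not the profile of Fig. 3.38): a downward ramp, an isolated noise point, and a gray-level step.

```python
import numpy as np

# Simplified scan line: a ramp (6..1), an isolated noise point (6),
# and a gray-level step (1 -> 7). Values are illustrative.
f = np.array([6, 5, 4, 3, 2, 1, 1, 1, 6, 1, 1, 1, 1, 7, 7, 7], dtype=int)

first = f[1:] - f[:-1]                 # f(x+1) - f(x)
second = f[2:] + f[:-2] - 2 * f[1:-1]  # f(x+1) + f(x-1) - 2 f(x)

# The first derivative is nonzero along the whole ramp ("thick" edges);
# the second is zero inside the ramp and nonzero only at its ends.
# At the noise point (x = 8) the second derivative responds more strongly
# than the first, and at the step (x = 13) the second derivative gives a
# positive/negative double response.
```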
Image Enhancement using Second Derivatives–The Laplacian
The approach basically consists of defining a discrete formulation of the second-order
derivative and then constructing a filter mask based on that formulation. We are interested in
isotropic filters, whose response is independent of the direction of the discontinuities in the
image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the
sense that rotating the image and then applying the filter gives the same result as applying the
filter to the image first and then rotating the result.
Development of the method:
It can be shown that the simplest isotropic derivative operator is the Laplacian, which, for
a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y² … (3.7-1)
Because derivatives of any order are linear operations, the Laplacian is a linear operator. In order
to be useful for digital image processing, this equation needs to be expressed in discrete
form. There are several ways to define a digital Laplacian using neighborhoods. The partial
second-order derivative in the x-direction is
∂²f/∂x² = f(x+1, y) + f(x-1, y) - 2f(x, y) … (3.7-2)
and, similarly, in the y-direction,
∂²f/∂y² = f(x, y+1) + f(x, y-1) - 2f(x, y) … (3.7-3)
The digital implementation of the two-dimensional Laplacian is obtained by summing these two components:
∇²f = [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)] - 4f(x, y) … (3.7-4)
This equation can be implemented using the mask shown in Fig. 3.39(a), which gives an
isotropic result for rotations in increments of 90°.
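A minimal NumPy sketch of this 4-neighbor Laplacian (edge replication at the borders and the function name are my own assumptions; the arithmetic is equivalent to correlating with the mask of Fig. 3.39(a)):

```python
import numpy as np

def laplacian4(f):
    """f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)."""
    p = np.pad(f.astype(float), 1, mode="edge")
    return (p[2:, 1:-1] + p[:-2, 1:-1]      # vertical neighbors
            + p[1:-1, 2:] + p[1:-1, :-2]    # horizontal neighbors
            - 4.0 * p[1:-1, 1:-1])
```

On a constant image the response is zero everywhere, as expected of a derivative operator; an isolated bright pixel produces a strong negative response at its own position and positive responses at its neighbors.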
The diagonal directions can be incorporated in the definition of the digital Laplacian by
adding two more terms to Eq (3.7-4), one for each of the two diagonal directions. The form of
each new term is the same as either Eq. (3.7-2) or (3.7-3), but the coordinates are along the
diagonals. Since each diagonal term also contains a –2f(x, y) term, the total subtracted from the
difference terms now would be –8f(x, y). The mask used to implement this new definition is
shown in Fig. 3.39(b). This mask yields isotropic results for increments of 45°. The other two
masks shown in Fig. 3.39 also are used frequently in practice. They are based on a definition of
the Laplacian that is the negative of the one we used here. As such, they yield equivalent results,
but the difference in sign must be kept in mind when combining (by addition or subtraction) a
Laplacian-filtered image with another image.
Because the Laplacian is a derivative operator, its use highlights gray-level
discontinuities in an image and deemphasizes regions with slowly varying gray levels. This will
tend to produce images that have grayish edge lines and other discontinuities, all superimposed
on a dark, featureless background. Background features can be “recovered” while still preserving
the sharpening effect of the Laplacian operation simply by adding the original and Laplacian
images. If the definition used has a negative center coefficient, then we subtract, rather than add,
the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the
Laplacian for image enhancement is as follows:
Simplifications:
Previously, we implemented Eq. (3.7-5) by first computing the Laplacian-filtered image
and then subtracting it from the original image. In practice, Eq. (3.7-5) is usually implemented
with one pass of a single mask. The coefficients of the single mask are easily obtained by
substituting Eq. (3.7-4) for ∇²f(x, y) in the first line of Eq. (3.7-5):
g(x,y) = 5f(x,y) - [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)]
This equation can be implemented using the mask shown below. The second mask shown
below would be used if the diagonal neighbors also were included in the calculation of the
Laplacian. Identical masks would have resulted if we had substituted the negative of Eq. (3.7-4)
into the second line of Eq. (3.7-5).
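The one-pass version without the diagonal terms corresponds to the composite mask [[0,-1,0],[-1,5,-1],[0,-1,0]], and can be sketched as follows (edge-replicated borders and the function name are assumptions on my part):

```python
import numpy as np

def sharpen(f):
    """One-pass Laplacian sharpening: g = 5 f(x,y) - (4-neighbor sum),
    i.e. correlation with the mask [[0,-1,0],[-1,5,-1],[0,-1,0]]."""
    p = np.pad(f.astype(float), 1, mode="edge")
    return (5.0 * p[1:-1, 1:-1]
            - p[2:, 1:-1] - p[:-2, 1:-1]
            - p[1:-1, 2:] - p[1:-1, :-2])
```

Flat areas pass through unchanged (the coefficients sum to 1), while a gray-level step produces an undershoot on its dark side and an overshoot on its bright side, which is what makes the edge look sharper.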
The results obtainable with the mask containing the diagonal terms usually are a little
sharper than those obtained with the more basic mask of Fig. 3.41(a). This property is illustrated
by the Laplacian-filtered images shown in Figs. 3.41(d) and (e), which were obtained by using
the masks in Figs. 3.41(a) and (b), respectively. By comparing the filtered images with the
original image shown in Fig. 3.41(c), we note that both masks produced effective enhancement,
but the result using the mask in Fig. 3.41(b) is visibly sharper.
Unsharp masking and high-boost filtering
A process used for many years in the publishing industry to sharpen images consists of
subtracting a blurred version of an image from the image itself. This process, called unsharp
masking, is expressed as
fs(x,y) = f(x,y) - f̄(x,y)
where fs(x, y) denotes the sharpened image obtained by unsharp masking, and f̄(x, y) is a blurred
version of f(x, y). The origin of unsharp masking is in darkroom photography, where it consists of
clamping together a blurred negative with a corresponding positive film and then developing this
combination to produce a sharper image.
A slight further generalization of unsharp masking is called high-boost filtering. A high-
boost filtered image, fhb, is defined at any point (x, y) as
fhb(x,y) = A f(x,y) - f̄(x,y),  where A ≥ 1
High-boost filtering can be implemented with one pass using either of the two masks
shown in Fig. 3.42. Note that, when A=1, high-boost filtering becomes “standard” Laplacian
sharpening. As the value of A increases past 1, the contribution of the sharpening process
becomes less and less important. Eventually, if A is large enough, the high-boost image will be
approximately equal to the original image multiplied by a constant.
One of the principal applications of high-boost filtering is when the input image is darker than
desired. By varying the boost coefficient, it generally is possible to obtain an overall increase in
average gray level of the image, thus helping to brighten the final result.
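A spatial-domain sketch of these two operations, using a 3x3 box blur as the blurred image (the blur choice and function names are my own assumptions):

```python
import numpy as np

def box_blur(f, k=3):
    """k x k neighborhood average used as the blurred image f_bar."""
    p = np.pad(f.astype(float), k // 2, mode="edge")
    out = np.zeros(f.shape)
    for di in range(k):
        for dj in range(k):
            out += p[di:di + f.shape[0], dj:dj + f.shape[1]]
    return out / (k * k)

def high_boost(f, A=1.0):
    """fhb = A f - f_bar; with A = 1 this is plain unsharp masking."""
    return A * f.astype(float) - box_blur(f)
```

As A grows, the A·f term dominates and the output approaches a scaled copy of the original, matching the remark above about large A.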
Image Enhancement using First Derivatives–The Gradient
First derivatives in image processing are implemented using the magnitude of the
gradient. For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the two-
dimensional column vector
∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T … (3.7-12)
The magnitude of this vector is
∇f = mag(∇f) = [Gx² + Gy²]^(1/2) … (3.7-13)
The components of the gradient vector itself are linear operators, but the magnitude of
this vector obviously is not because of the squaring and square root operations. On the other
hand, the partial derivatives in Eq. (3.7-12) are not rotation invariant (isotropic), but the
magnitude of the gradient vector is. Although it is not strictly correct, the magnitude of the
gradient vector often is referred to as the gradient.
The computational burden of implementing Eq. (3.7-13) over an entire image is not
trivial, and it is common practice to approximate the magnitude of the gradient by using absolute
values instead of squares and square roots:
∇f ≈ |Gx| + |Gy| … (3.7-14)
This equation is simpler to compute and it still preserves relative changes in gray levels,
but the isotropic feature property is lost in general. However, as in the case of the Laplacian, the
isotropic properties of the digital gradient defined are preserved only for a limited number of
rotational increments that depend on the masks used to approximate the derivatives. As it turns
out, the most popular masks used to approximate the gradient give the same result only for
vertical and horizontal edges and thus the isotropic properties of the gradient are preserved only
for multiples of 90°. These results are independent of whether Eq. (3.7-13) or (3.7-14) is used, so
nothing of significance is lost in using the simpler of the two equations.
As in the case of the Laplacian, we now define digital approximations to the preceding
equations, and from there formulate the appropriate filter masks. In order to simplify the
discussion that follows, we will use the notation in Fig. 3.44(a) to denote image points in a 3x3
region. For example, the center point, z5 , denotes f(x, y), z1 denotes f(x-1, y-1), and so on.
Roberts cross-gradient operators
The simplest approximations to a first-order derivative that satisfy the conditions stated
earlier are Gx=(z8-z5) and Gy=(z6-z5). Two other definitions, proposed by Roberts in the
early development of digital image processing, use cross differences:
Gx = (z9 - z5) and Gy = (z8 - z6)
With these definitions, the gradient is approximated as
∇f ≈ |z9 - z5| + |z8 - z6| … (3.7-17)
This equation can be implemented with the two masks shown below.These masks are
referred to as the Roberts cross-gradient operators.
Sobel Operators
Masks of even size are awkward to implement. The smallest filter mask in which we are
interested is of size 3x3. An approximation using absolute values, still at point z5, but using a 3x3
mask, is
∇f ≈ |(z7 + 2z8 + z9) - (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) - (z1 + 2z4 + z7)| … (3.7-18)
The difference between the third and first rows of the 3x3 image region approximates the
derivative in the x-direction, and the difference between the third and first columns approximates
the derivative in the y-direction. The masks shown above, called the Sobel operators, can be
used to implement Eq. (3.7-18). The idea behind using a weight value of 2 is to achieve some
smoothing by giving more importance to the center point. Note that the coefficients in all the
masks shown above sum to 0, indicating that they would give a response of 0 in an area of
constant gray level, as expected of a derivative operator.
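A sketch of the Sobel gradient magnitude using the |Gx| + |Gy| approximation (edge-replicated borders and the function name are my own assumptions):

```python
import numpy as np

def sobel_magnitude(f):
    """|Gx| + |Gy| with the 3x3 Sobel masks (center row/column weighted by 2)."""
    p = np.pad(f.astype(float), 1, mode="edge")
    # Gx: third row minus first row of each 3x3 window.
    gx = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    # Gy: third column minus first column of each 3x3 window.
    gy = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    return np.abs(gx) + np.abs(gy)
```

Because the mask coefficients sum to zero, a constant region gives a response of exactly zero, while a vertical gray-level step produces a strong response along the edge.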
The gradient is used frequently in industrial inspection, either to aid humans in the
detection of defects or, what is more common, as a preprocessing step in automated inspection.
The gradient can be used to enhance defects and eliminate slowly changing background features.
UNIT-IV
IMAGE ENHANCEMENT IN FREQUENCY DOMAIN
The frequency domain is nothing more than the space defined by values of the Fourier
transform and its frequency variables (u,v).
Some basic properties of the frequency domain
Each term of F(u,v) contains all values of f(x,y), modified by the values of the exponential
terms. Some general statements can be made about the relationship between the frequency
components of the Fourier transform and spatial characteristics of an image. For instance, since
frequency is directly related to rate of change, we can associate frequencies in the Fourier
transform with patterns of intensity variations in an image.
• The slowest varying frequency component (u = v = 0) corresponds to the average gray level
of an image.
• As we move away from the origin of the transform, the low frequencies correspond to
the slowly varying components of an image. In an image of a room, for example, these
might correspond to smooth gray-level variations on the walls and floor.
• As we move further away from the origin, the higher frequencies begin to correspond to
faster and faster gray-level changes in the image. These are edges of objects and other
components of an image characterized by abrupt changes in gray level, such as noise.
Basics of filtering in the frequency domain
Filtering in the frequency domain consists of the following steps:
1. Multiply the input image by (-1)^(x+y) to center the transform, as shown below.
2. Compute F(u,v), the Discrete Fourier Transform of the image from step 1.
3. Multiply F(u,v) by a filter function H(u,v).
4. Compute the inverse Discrete Fourier Transform of the result in step 3.
5. Obtain the real part of the result in step 4.
6. Multiply the result in step 5 by (-1)^(x+y).
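The six steps can be sketched with NumPy's FFT (the function name and the test filter are my own; `np.fft.fft2` computes the DFT without a 1/MN factor, which does not affect the round trip):

```python
import numpy as np

def filter_frequency(f, H):
    """Steps 1-6: center with (-1)^(x+y), take the DFT, multiply by H(u,v),
    inverse DFT, keep the real part, undo the centering."""
    M, N = f.shape
    x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    c = (-1.0) ** (x + y)              # step 1 (and reused in step 6)
    F = np.fft.fft2(f * c)             # step 2: centered transform
    G = H * F                          # step 3
    g = np.real(np.fft.ifft2(G))       # steps 4-5
    return g * c                       # step 6
```

With H identically 1 the image is returned unchanged; zeroing only the center coefficient H[M//2, N//2] removes the average gray level, the notch-filter behavior discussed later in this unit.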
H(u,v) is called a filter because it suppresses certain frequencies in the
transform while leaving others unchanged. In equation form, let f(x,y) represent the input image
[Figure: the image f(x,y) with spatial axes (x, y), and its centered spectrum F(u - M/2, v - N/2) with frequency axes (u, v).]
in step 1 and F(u,v) its Fourier transform. Then the Fourier transform of the output image is
given by
G(u,v) = H(u,v) F(u,v)
In general, the components of F are complex quantities, but the filters with which we deal are real. In
this case, each component of H multiplies both the real and imaginary parts of the corresponding
component in F. Such filters are called zero-phase-shift filters. As their name implies, these
filters do not change the phase of the transform.
The filtered image is obtained by simply taking the inverse Fourier transform of G(u,v):
g(x,y) = ℑ⁻¹[G(u,v)]
The final image is obtained by taking the real part of this result and multiplying it by (-1)^(x+y) to
cancel the multiplication of the input image by this quantity.
In addition to the (-1)^(x+y) centering step, other pre-processing functions may include
cropping of the input image to its closest even dimensions, gray-level scaling, conversion to
floating point on input, and conversion to an 8-bit integer format on the output. Multiple filtering
stages and other pre- and post-processing functions are possible. The important point is that the
filtering process is based on modifying the transform of an image in some way via a filter
function, and then taking the inverse of the result to obtain the processed output image.
Some Basic filters and their properties
According to the definition of the 2-D Fourier transform, the average value of an image is given by F(0,0):
F(0,0) = (1/MN) Σx Σy f(x,y)
If we set this term to zero in the frequency domain and take the inverse transform, then the
average value of the resulting image will be zero. Assuming that the transform has been
centered, we can do this operation by multiplying all values of F(u,v) by the following filter
function:
H(u,v) = 0 if (u,v) = (M/2, N/2)
         1 otherwise
The result of processing any image with the above transfer function is the drop in overall average
gray level resulting from forcing the average value to zero.
Low frequencies in the Fourier transform are responsible for the general gray-level appearance
of an image over smooth areas, while high frequencies are responsible for detail, such as edges
and noise.
• A filter that attenuates high frequencies while passing low frequencies is called a lowpass
filter. A lowpass-filtered image has less sharp detail than the original image because the
high frequencies have been attenuated. Such an image will appear smoother.
• A filter that attenuates low frequencies while passing high frequencies is called a highpass
filter. A highpass-filtered image has less gray-level variation in smooth areas and
emphasized transitional gray-level detail. Such an image will appear sharper.
SMOOTHING FREQUENCY-DOMAIN FILTERS
Edges and other sharp transitions in the gray levels of an image contribute significantly to
the high-frequency content of its Fourier transform. Hence smoothing (blurring) is achieved in
the frequency domain by attenuating a specified range of high-frequency components in the
transform of a given image.
Our basic model for filtering in the frequency domain is given by
G(u,v)=H(u,v).F(u,v)
where F(u,v) is the Fourier transform of the image to be smoothed. The objective is to select a
filter transfer function H(u,v) that yields G(u,v) by attenuating the high-frequency components of
F(u,v). We consider three types of lowpass filters:
1) Ideal Lowpass Filter (ILPF)
2) Butterworth Lowpass Filter (BLPF)
3) Gaussian Lowpass Filter (GLPF)
Ideal Lowpass Filter
The simplest lowpass filter we can visualize is a filter that cuts off all high-frequency
components of the Fourier transform that are at a distance greater than a specified
distance D0 from the origin of the (centered) transform. Such a filter is called a two-dimensional
ideal lowpass filter and has the transfer function
H(u,v) = 1 if D(u,v) ≤ D0
         0 if D(u,v) > D0
where D0 is a specified non-negative quantity, and D(u,v) is the distance from point (u,v) to the
origin of the frequency rectangle. If the image is of size MxN, we know that its transform also is
of the same size, so the center of the frequency rectangle is at (u,v) = (M/2, N/2). In this case, the
distance from any point (u,v) to the center (origin) of the Fourier transform is given by
D(u,v) = [(u - M/2)² + (v - N/2)²]^(1/2)
The name ideal filter indicates that all frequencies inside a circle of radius D0 are passed with no
attenuation, whereas all frequencies outside this circle are completely attenuated. This filter is
radially symmetric about the origin. The complete filter transfer function can be visualized by
rotating the cross section 360° about the origin. For an ideal lowpass filter cross section, the
point of transition between H(u,v) = 1 and H(u,v) = 0 is called the cutoff frequency, D0.
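The ILPF transfer function can be built directly from the distance D(u,v) to the center of the frequency rectangle (a sketch; the function name is my own assumption):

```python
import numpy as np

def ideal_lowpass(M, N, D0):
    """H(u,v) = 1 where D(u,v) <= D0, 0 elsewhere (centered rectangle)."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return (D <= D0).astype(float)
```

The sharp 1-to-0 discontinuity at D(u,v) = D0 is exactly what causes the ringing discussed below.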
The lowpass filters can be compared by studying their behavior as a function of the same cutoff
frequencies. One way to establish a set of standard cutoff frequency loci is to compute circles
that enclose specified amounts of total image power PT. This quantity is obtained by summing
the components of the power spectrum at each point (u,v), for u = 0, 1, 2, …, M-1 and v =
0, 1, 2, …, N-1:
PT = Σu Σv P(u,v)
where P(u,v) is the power spectrum. If the transform has been centered, a circle of radius r with
origin at the center of the frequency rectangle encloses α percent of the power, where
α = 100 [ Σu Σv P(u,v) / PT ]
and the summation is taken over points (u,v) inside the circle.
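The enclosed-power percentage can be sketched as follows (the function name is mine; `np.fft.fftshift` performs the centering):

```python
import numpy as np

def enclosed_power_percent(f, r):
    """alpha: percentage of total image power P_T inside a circle of
    radius r centered on the shifted spectrum."""
    P = np.abs(np.fft.fftshift(np.fft.fft2(f))) ** 2   # power spectrum P(u,v)
    M, N = f.shape
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return 100.0 * P[D <= r].sum() / P.sum()
```

alpha grows monotonically with r and reaches 100 once the circle covers the whole frequency rectangle.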
It is clear from this example that ideal lowpass filtering is not very practical. The blurring and
ringing properties of the ideal lowpass filter can be explained by reference to the convolution
theorem. The Fourier transforms of the original image f(x,y) and the blurred image g(x,y) are
related in the frequency domain by the equation
G(u,v) = H(u,v) F(u,v)
where H(u,v) is the filter function, and F and G are the Fourier transforms of the two images just
mentioned. The convolution theorem states that the corresponding process in the spatial domain
is
g(x,y) = h(x,y) * f(x,y)
where h(x,y) is the inverse Fourier transform of the filter transfer function H(u,v). This h(x,y) has
two major distinctive characteristics: a dominant component at the origin, and concentric,
circular components about the center component. The center component is primarily responsible
for blurring. The concentric components are responsible primarily for the ringing characteristics
of ideal filters. Both the radius of the center component and the number of circles per unit
distance from the origin are inversely proportional to the value of the cutoff frequency of the
ideal filter. So, as the cutoff frequency increases, the blurring and ringing effects decrease.
Butterworth Lowpass Filter
The Butterworth filter has a parameter, called the filter order. For high values of this
parameter the Butterworth filter approaches the form of the ideal filter. For lower-order values,
the Butterworth filter has a smooth form similar to the Gaussian filter.
The transfer function of the Butterworth lowpass filter of order n, and with cutoff frequency at a
distance D0 from the origin, is defined as
H(u,v) = 1 / [1 + (D(u,v)/D0)^(2n)]
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
Unlike the Ideal lowpass filter, the BLPF transfer function does not have a sharp
discontinuity that establishes a clear cutoff between passed and filtered frequencies. For filters
with smooth transfer functions, defining a cutoff frequency locus at points for which H(u,v) is
down to a certain fraction of its maximum value is customary. In this case H(u,v) = 0.5 when
D(u,v) = D0.
A Butterworth-filtered image has a smooth transition in blurring as a function of increasing
cutoff frequency. A Butterworth filter of order 1 has no ringing. Ringing generally is
imperceptible in filters of order 2, but can become a significant factor in filters of higher order.
Spatial representations of BLPFs for n = 1, 2, 5, and 20 are shown below.
The BLPF of order 1 has neither ringing nor negative values. The filter of order 2 does
show mild ringing and small negative values, but these are certainly less pronounced than in the
ILPF. As the remaining images show, ringing in the BLPF becomes significant for higher-order
filters. A Butterworth filter of order 20 exhibits the characteristics of the ILPF. In general, BLPFs
of order 2 are a good compromise between effective lowpass filtering and acceptable ringing
characteristics.
Gaussian Lowpass Filter
The transfer function of a two-dimensional Gaussian lowpass filter is given by
H(u,v) = e^(-D²(u,v)/2σ²)
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle, and σ is a
measure of the spread of the Gaussian curve. By letting σ = D0, the transfer function becomes
H(u,v) = e^(-D²(u,v)/2D0²)
When D(u,v) = D0, the filter is down to 0.607 of its maximum value.
The inverse Fourier transform of the Gaussian lowpass filter is also Gaussian. A spatial Gaussian
filter, obtained by computing the inverse Fourier transform of the above equation, will therefore have no
ringing.
The Gaussian lowpass filter does not achieve as much smoothing as the BLPF of order 2 for the same
value of cutoff frequency. This is because the profile of the GLPF is not as tight as the profile of
the BLPF of order 2.
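Both transfer functions, with their characteristic values at D = D0, can be sketched as follows (function names are my own assumptions):

```python
import numpy as np

def butterworth_lowpass(M, N, D0, n):
    """H = 1 / (1 + (D/D0)^(2n)); H = 0.5 exactly at D = D0."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

def gaussian_lowpass(M, N, D0):
    """H = exp(-D^2 / (2 D0^2)); H = 0.607 at D = D0, no ringing."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D2 = (u - M / 2) ** 2 + (v - N / 2) ** 2
    return np.exp(-D2 / (2.0 * D0 ** 2))
```

Raising the Butterworth order n steepens the transition toward the ideal filter, as the text notes: at n = 20 the transfer function is already nearly 0 just outside the cutoff circle.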
SHARPENING FREQUENCY-DOMAIN FILTERS
An image can be blurred by attenuating the high frequency components of its Fourier
transform. Because edges and other abrupt changes in gray levels are associated with high-
frequency components, image sharpening can be achieved in the frequency domain by a highpass
filtering process, which attenuates the low frequency components without disturbing the high
frequency information in the Fourier transform.
Because the intended function of the highpass filter is to perform the reverse operation of
the lowpass filter, the transfer function of a highpass filter, Hhp(u,v), can be obtained from the
corresponding lowpass transfer function Hlp(u,v) as
Hhp(u,v) = 1 - Hlp(u,v)
We consider three types of highpass filters:
1) Ideal Highpass Filter (IHPF)
2) Butterworth Highpass Filter (BHPF)
3) Gaussian Highpass Filter (GHPF)
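The Hhp(u,v) = 1 - Hlp(u,v) relation in code, shown for the ideal case (a sketch; the function name is mine):

```python
import numpy as np

def ideal_highpass(M, N, D0):
    """Highpass counterpart of the ideal lowpass: H_hp = 1 - H_lp."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    H_lp = (D <= D0).astype(float)
    return 1.0 - H_lp
```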
Ideal Highpass Filter
The transfer function of a 2-D ideal highpass filter is defined as
H(u,v) = 0 if D(u,v) ≤ D0
         1 if D(u,v) > D0
where D0 is the cutoff distance measured from the origin of the frequency rectangle, and D(u,v)
is the distance from point (u,v) to the origin of the frequency rectangle. This filter is the opposite
of the ideal lowpass filter in the sense that it sets to zero all frequencies inside a circle of radius
D0 while passing, without attenuation, all frequencies outside the circle.
As in the case of the ILPF, the ideal highpass filter also has a ringing effect, because the spatial
representation of the IHPF contains rings. It also contains a black spot at the center. Smaller
objects in the image cannot be filtered properly because of this black spot in the spatial
representation of the IHPF. Distortion of the edges is also a main problem with the ideal
highpass filter. As the cutoff frequency increases, distortion in the output image decreases and
the spot size in h(x,y) also decreases, resulting in better filtering of smaller objects in the
image f(x,y).
Butterworth Highpass Filter
The Butterworth Filter represents a transition between the sharpness of the ideal filter and
the total smoothness of the Gaussian filter. The transfer function of a Butterworth highpass filter
of order n and with cutoff frequency locus at a distance D0 from the origin is given by
H(u,v) = 1 / [1 + (D0/D(u,v))^(2n)]
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
Butterworth filters behave more smoothly than ideal highpass filters, and the distortion is
less than that of the IHPF. Since the center spot sizes in the spatial representations of the IHPF and
BHPF are similar, the performance of the two filters in terms of filtering the smaller objects is
comparable. The transition into higher values of cutoff frequencies is much smoother with the
BHPF.
Gaussian Highpass Filter
The transfer function of the Gaussian highpass filter with cutoff frequency locus at a distance of
D0 from the origin is given by
H(u,v) = 1 - e^(-D²(u,v)/2D0²)
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
The results obtained with the Gaussian highpass filter are smoother than those of the IHPF and BHPF.
Even the filtering of smaller objects and thin bars is cleaner with the Gaussian filter.
Laplacian in the Frequency Domain
It can be shown that
ℑ[d^n f(x)/dx^n] = (ju)^n F(u) … (1)
From the above expression, it follows that
ℑ[∂²f(x,y)/∂x² + ∂²f(x,y)/∂y²] = -(u² + v²) F(u,v) … (2)
The expression inside the brackets on the left side of the above equation is nothing but the Laplacian
of f(x,y). Thus we have the important result
ℑ[∇²f(x,y)] = -(u² + v²) F(u,v) … (3)
The above equation says that the Laplacian can be implemented in the frequency domain by
using the filter
H(u,v) = -(u² + v²) … (4)
But we generally center F(u,v) by performing the operation f(x,y)(-1)^(x+y) prior to taking the
transform of the image. If f and F are of size MxN, this operation shifts the center of the transform so
that (u,v) = (0,0) is at point (M/2, N/2) in the frequency rectangle. So, the center of the filter function
also needs to be shifted:
H(u,v) = -[(u - M/2)² + (v - N/2)²] … (5)
The Laplacian-filtered image in the spatial domain is obtained by computing the inverse Fourier
transform of H(u,v)F(u,v):
∇²f(x,y) = ℑ⁻¹{ -[(u - M/2)² + (v - N/2)²] F(u,v) } … (6)
Conversely, computing the Laplacian in the spatial domain and taking the Fourier transform of the
result is equivalent to multiplying F(u,v) by H(u,v), as in (6):
ℑ[∇²f(x,y)] = -[(u - M/2)² + (v - N/2)²] F(u,v) … (7)
The spatial domain Laplacian filter function is obtained by taking the inverse Fourier transform of equation (5); the corresponding spatial mask is the familiar Laplacian mask with center coefficient -4 and the four horizontal and vertical neighbors equal to 1.

The enhanced image g(x,y) can be obtained by subtracting the Laplacian image from the original image:

g(x,y) = f(x,y) - ∇²f(x,y) … (8)

Instead of enhancing the image in two steps (first calculating the Laplacian image, then subtracting it from the original image), the entire operation can be performed with a single filter, obtained by substituting (7) in (8):

H(u,v) = 1 + (u - M/2)² + (v - N/2)²

from which the enhanced image can be obtained with a single transformation operation:

g(x,y) = ℑ⁻¹{[1 + (u - M/2)² + (v - N/2)²]F(u,v)}
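The single-transformation procedure can be sketched in numpy as follows. This is an illustrative implementation of the combined filter 1 + (u - M/2)² + (v - N/2)²; no scaling is applied, so the absolute sharpening strength is not calibrated:

```python
import numpy as np

def laplacian_sharpen(f):
    # g = inverse DFT of [1 + (u - M/2)^2 + (v - N/2)^2] F(u,v),
    # with F centered by premultiplying f(x,y) with (-1)^(x+y)
    M, N = f.shape
    x = np.arange(M)[:, None]
    y = np.arange(N)[None, :]
    center = (-1.0) ** (x + y)
    F = np.fft.fft2(f * center)          # centered transform
    u = np.arange(M)[:, None] - M / 2
    v = np.arange(N)[None, :] - N / 2
    H = 1.0 + u**2 + v**2                # combined sharpening filter
    g = np.real(np.fft.ifft2(H * F))
    return g * center                    # undo the centering
```

Since H is 1 at the center (u,v) = (0,0), the zero-frequency term passes unchanged, so the mean gray level of the image is preserved.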
Unsharp Masking, High Boost Filtering, High Frequency Emphasis Filtering
The average background intensity in a highpass filtered image is near to black. This is
due to the fact that the highpass filters eliminate the zero-frequency component of their Fourier
transforms. The solution to this problem consists of adding a portion of the image back to the
filtered result as in Laplacian technique. Sometimes it is advantageous to increase the
contribution made by the original image to the overall filtered result. This approach is called
high-boost filtering, which is a generalization of unsharp masking.
Unsharp masking consists of generating a sharp image by subtracting a blurred (lowpass filtered) version of an image from the image itself; that is, obtaining a highpass filtered image as

fhp(x,y) = f(x,y) - flp(x,y) … (1)

High-boost filtering generalizes this by multiplying f(x,y) by a constant A ≥ 1:

fhb(x,y) = Af(x,y) - flp(x,y) … (2)

Thus, high-boost filtering gives us the flexibility to increase the contribution made by the image to the overall enhanced image. The above equation can be rewritten as

fhb(x,y) = (A-1)f(x,y) + f(x,y) - flp(x,y)
=> fhb(x,y) = (A-1)f(x,y) + fhp(x,y) … (3)

This result expresses high-boost filtering in terms of a highpass rather than a lowpass image. When A = 1, high-boost filtering reduces to regular highpass filtering. As A increases past 1, the contribution made by the image itself becomes more dominant.
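Equations (1)-(3) can be demonstrated with any lowpass filter playing the role of flp; the sketch below uses a simple box blur, which is an assumption for illustration, not the text's filter:

```python
import numpy as np

def box_blur(f, k=3):
    # simple k x k box lowpass (zero-padded borders), standing in for f_lp
    M, N = f.shape
    p = k // 2
    fp = np.pad(f, p)
    out = np.zeros((M, N))
    for dx in range(k):
        for dy in range(k):
            out += fp[dx:dx + M, dy:dy + N]
    return out / (k * k)

def high_boost(f, A=1.5, k=3):
    # f_hb = (A - 1) f + f_hp, with f_hp = f - f_lp; A = 1 gives plain unsharp masking
    f = f.astype(float)
    fhp = f - box_blur(f, k)
    return (A - 1.0) * f + fhp
```

With A = 1 the output is exactly the highpass image f - flp; larger A adds back more of the original image, brightening the result toward the input.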
We know,
Flp(u,v) = Hlp(u,v)F(u,v)…(4)
Fhp(u,v) = Hhp(u,v)F(u,v)…(5)
where Hlp(u,v) is the transfer function of a lowpass filter and Hhp(u,v) is the transfer function of a highpass filter.
Converting equation (1) into frequency domain
Fhp(u,v) = F(u,v) - Flp(u,v)…(6)
Substituting equation (4) in equation (6)
Fhp(u,v) = F(u,v) - Hlp(u,v)F(u,v)
=> Fhp(u,v) = F(u,v) (1- Hlp(u,v))… (7)
Therefore, unsharp masking can be obtained directly in the frequency domain by using the
composite filter
Hhp(u,v) = (1- Hlp(u,v))…(8)
Converting equation (3) into frequency domain
Fhb = (A-1)F(u,v)-Fhp(u,v)…(9)
Substituting equation (5) in equation (9)
Fhb = (A-1)F(u,v)- Hhp(u,v)F(u,v)
=> Fhb = F(u,v) ((A-1) - Hhp(u,v))… (10)
Therefore, high-boost filtering can be obtained directly in the frequency domain by using the
composite filter
Hhb(u,v) = (A-1) - Hhp(u,v)…(11)
Sometimes it is advantageous to accentuate the contribution to enhancement made by the high-frequency components of an image. In this case, we simply multiply a highpass filter function by a constant and add an offset so that the zero-frequency term is not eliminated by the filter. This process is called high-frequency emphasis filtering. It has a transfer function given by

Hhfe(u,v) = a + b·Hhp(u,v)

where a ≥ 0 and b > a. Typical values of a range from 0.25 to 0.5 and typical values of b range from 1.5 to 2.0. High-frequency emphasis filtering reduces to high-boost filtering when a = (A-1) and b = 1. When b > 1, the high frequencies are emphasized (highlighted), thus giving the procedure its name.
HOMOMORPHIC FILTERING
The illumination-reflectance model can be used to develop a frequency domain procedure for
improving the appearance of an image by simultaneous gray-level range compression and
contrast enhancement. An image f(x,y) can be expressed as the product of illumination and
reflectance components:
f(x,y) = i(x,y) r(x,y) … (1)

The above equation cannot be used directly to operate separately on the frequency components of illumination and reflectance, because the Fourier transform of the product of two functions is not separable.

So, let us define

z(x,y) = ln f(x,y) = ln i(x,y) + ln r(x,y) … (2)

Then

ℑ[z(x,y)] = ℑ[ln i(x,y)] + ℑ[ln r(x,y)] … (3)

or

Z(u,v) = Fi(u,v) + Fr(u,v) … (4)

where Fi(u,v) and Fr(u,v) are the Fourier transforms of ln i(x,y) and ln r(x,y), respectively.

If we process Z(u,v) by means of a filter function H(u,v), then

S(u,v) = H(u,v)Z(u,v) = H(u,v)Fi(u,v) + H(u,v)Fr(u,v) … (5)

where S(u,v) is the Fourier transform of the result s(x,y). In the spatial domain,

s(x,y) = ℑ⁻¹[H(u,v)Fi(u,v)] + ℑ⁻¹[H(u,v)Fr(u,v)] … (6)

By letting

i'(x,y) = ℑ⁻¹[H(u,v)Fi(u,v)] … (7)

and

r'(x,y) = ℑ⁻¹[H(u,v)Fr(u,v)] … (8)

equation (6) can be expressed as

s(x,y) = i'(x,y) + r'(x,y) … (9)

Finally, as z(x,y) was formed by taking the logarithm of the original image f(x,y), the inverse (exponential) operation yields the desired enhanced image, denoted by g(x,y):

g(x,y) = e^s(x,y) = e^i'(x,y) e^r'(x,y) = i0(x,y) r0(x,y)

where i0(x,y) = e^i'(x,y) and r0(x,y) = e^r'(x,y) are the illumination and reflectance of the output image.

The above operations can be represented in the form of a block diagram:

f(x,y) → ln → DFT → H(u,v) → (DFT)⁻¹ → exp → g(x,y)
This method is based on a special case of a class of systems known as homomorphic systems. In this particular application, the key approach is the separation of the illumination and reflectance components achieved in the form shown in equation (4). The homomorphic filter function H(u,v) can then operate on these components separately, as in equation (5).
The illumination component of an image is generally characterized by slow spatial variations,
while the reflectance component tends to vary abruptly, particularly at the junctions of dissimilar
objects. So, we can associate the low frequencies of the Fourier transform of the logarithm of an
image with the illumination and the high frequencies with reflectance.
A good deal of control can be gained over the illumination and reflectance components with a
homomorphic filter. This control requires specification of a filter function H(u,v) that affects the low- and high-frequency components of the Fourier transform in different ways. The figure below shows the cross section of such a filter.
If the parameters γL and γH are chosen so that γL <1 and γH>1, the filter function tends to decrease
the contribution made by low frequencies (illumination) and amplifies the contribution made by
high frequencies (reflectance). The net result is simultaneous dynamic range compression (by log
function) and contrast enhancement (by H(u,v)).
Figure 4.33 is typical of the results that can be obtained with the homomorphic filter function. In the original image shown in Figure 4.33(a), the details inside the shelter are obscured by the glare from the outside walls. Figure 4.33(b) shows the result of processing this image by homomorphic filtering, with γL = 0.5 and γH = 2. A reduction of dynamic range in the brightness, together with an increase in contrast, brought out the details of objects inside the shelter and balanced the gray levels of the outside wall. The enhanced image is also sharper.
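The ln → DFT → H(u,v) → inverse DFT → exp pipeline can be sketched compactly in numpy. The Gaussian-shaped H(u,v) and the use of log1p/expm1 (to tolerate zero-valued pixels) are implementation choices, not from the text:

```python
import numpy as np

def homomorphic(f, gamma_l=0.5, gamma_h=2.0, D0=30.0, c=1.0):
    # ln -> DFT -> H(u,v) -> inverse DFT -> exp
    M, N = f.shape
    z = np.log1p(f.astype(float))            # ln(1 + f): tolerates zero pixels
    Z = np.fft.fftshift(np.fft.fft2(z))      # center the spectrum
    u = np.arange(M)[:, None] - M / 2
    v = np.arange(N)[None, :] - N / 2
    D2 = u**2 + v**2
    # H rises from gamma_l at low frequencies to gamma_h at high frequencies
    H = (gamma_h - gamma_l) * (1.0 - np.exp(-c * D2 / D0**2)) + gamma_l
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    return np.expm1(s)                       # exp(s) - 1 undoes log1p
```

With γL < 1 the slowly varying illumination (low frequencies of the log image) is attenuated, compressing dynamic range, while γH > 1 boosts reflectance detail.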
UNIT-VI
IMAGE RESTORATION
Restoration attempts to reconstruct or recover an image that has been degraded by using a
priori knowledge of the degradation phenomenon. Thus restoration techniques are oriented
towards modeling the degradation and applying the inverse process in order to recover the
original image.
We consider the restoration problem only from the point where a degraded, digital image is given; thus topics dealing with sensor, digitizer, and display degradations are treated only superficially.
A MODEL OF THE IMAGE DEGRADATION/RESTORATION PROCESS
The degradation process is modeled as a degradation function that, together with an additive noise term, operates on an input image f(x,y) to produce a degraded image g(x,y). Given g(x,y), some knowledge about the degradation function H, and some knowledge about the additive noise term η(x,y), the objective of restoration is to obtain an estimate f̂(x,y) of the original image. We want the estimate to be as close as possible to the original image and, in general, the more we know about H and η, the closer f̂(x,y) will be to f(x,y).
If H is linear, position invariant process then the degraded image is given in the spatial
domain by
g(x,y) = h(x,y) * f(x,y) + η(x,y) … (1)

where h(x,y) is the spatial representation of the degradation function and the symbol * indicates spatial convolution. Convolution in the spatial domain is equal to multiplication in the frequency domain, so we may write the model in an equivalent frequency domain representation:

G(u,v) = H(u,v)F(u,v) + N(u,v) … (2)

where the terms in capital letters are the Fourier transforms of the corresponding terms in equation (1).
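The equivalence of the two forms of the model can be checked numerically. The sketch below uses circular convolution via the DFT, which is the convolution the discrete model actually assumes; the arrays are arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 8
f = rng.random((M, N))           # original image
h = rng.random((M, N))           # degradation function (spatial)
eta = 0.01 * rng.random((M, N))  # additive noise

# spatial domain: g(x,y) = h(x,y) * f(x,y) + eta(x,y)  (circular convolution)
g = np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(f))) + eta

# frequency domain: G(u,v) = H(u,v) F(u,v) + N(u,v), term by term
G = np.fft.fft2(h) * np.fft.fft2(f) + np.fft.fft2(eta)
```

Transforming g back to the frequency domain reproduces G, confirming that the two representations describe the same model.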
NOISE MODELS
The principal sources of noise in digital images arise during image acquisition
(digitization) and/or transmission.
• The performance of imaging sensors is affected by a variety of factors, such as
environmental conditions during image acquisition, and by the quality of the sensing
elements themselves. For instance, in acquiring image with a camera, light levels and
sensor temperature are major factors affecting the amount of noise in the resulting image.
• Images are corrupted during transmission principally due to interference in the channel used for transmission. For example, an image transmitted over a wireless network might be corrupted as a result of lightning or other atmospheric disturbance.
Spatial characteristics of noise refer to whether the noise is correlated with the image. Frequency properties refer to the frequency content of noise in the Fourier sense. For example, when the Fourier spectrum of noise is constant, the noise is usually called white noise. This terminology is a carryover from the physical properties of white light, which contains nearly all frequencies in the visible spectrum in equal proportions.
The noise models we consider here are:
1. Gaussian noise
2. Rayleigh noise
3. Erlang (gamma) noise
4. Exponential noise
5. Uniform noise
6. Impulse (salt-and-pepper) noise
7. Periodic noise

With the exception of periodic noise, we assume that noise is independent of spatial coordinates and uncorrelated with respect to the image itself (that is, there is no correlation between pixel values and the values of noise components), because noise that is spatially dependent and correlated is difficult to deal with.
Gaussian Noise
Because of its mathematical tractability in both the spatial and frequency domains, Gaussian (also called normal) noise models are used frequently in practice. In fact, this tractability is so convenient that it often results in Gaussian models being used in situations in which they are marginally applicable at best.

The PDF of a Gaussian random variable, z, is given by

p(z) = [1/(√(2π)σ)] e^(-(z-µ)²/(2σ²))

where z represents gray level, µ is the mean (average) value of z, and σ is its standard deviation. The standard deviation squared, σ², is called the variance of z. When z is described by the above equation, approximately 70% of its values will be in the range [(µ-σ), (µ+σ)], and about 95% will be in the range [(µ-2σ), (µ+2σ)].
Rayleigh Noise
The PDF of Rayleigh noise is given by

p(z) = (2/b)(z-a) e^(-(z-a)²/b)   for z ≥ a
p(z) = 0                          for z < a

The mean and variance of this density are given by

µ = a + √(πb/4)   and   σ² = b(4-π)/4

Note the displacement from the origin and the fact that the basic shape of this density is skewed to the right. The Rayleigh density can be quite useful for approximating skewed histograms.
Erlang (Gamma) Noise
The PDF of Erlang noise is given by

p(z) = [a^b z^(b-1) / (b-1)!] e^(-az)   for z ≥ 0
p(z) = 0                                for z < 0

where the parameters are such that a > 0, b is a positive integer, and ! indicates factorial. The mean and variance of this density are given by

µ = b/a   and   σ² = b/a²

Although the above equation is often referred to as the gamma density, strictly speaking this is correct only when the denominator is the gamma function, Γ(b). When the denominator is as shown, the density is more appropriately called the Erlang density.
Exponential Noise
The PDF of exponential noise is given by

p(z) = a e^(-az)   for z ≥ 0
p(z) = 0           for z < 0

where a > 0. The mean and variance of this density function are given by

µ = 1/a   and   σ² = 1/a²

The PDF of exponential noise is a special case of the Erlang PDF, with b = 1.
Uniform Noise
The PDF of uniform noise is given by

p(z) = 1/(b-a)   for a ≤ z ≤ b
p(z) = 0         otherwise

The mean and variance of this density function are given by

µ = (a+b)/2   and   σ² = (b-a)²/12
Impulse (Salt-and-Pepper) Noise
The PDF of (bipolar) impulse noise is given by

p(z) = Pa   for z = a
p(z) = Pb   for z = b
p(z) = 0    otherwise

If b > a, gray level b will appear as a light dot in the image and level a will appear as a dark dot. If either Pa or Pb is zero, the impulse noise is called unipolar. If neither probability is zero, and especially if they are approximately equal, impulse noise values will resemble salt-and-pepper granules randomly distributed over the image; bipolar impulse noise is also called shot noise or spike noise.
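The noise models above can be sampled with numpy's random generators. The parameter values below are arbitrary, and the Rayleigh draw maps numpy's one-parameter Rayleigh (scale σ, with σ² = b/2) onto the text's (a, b) form:

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (256, 256)

gaussian = rng.normal(loc=0.0, scale=10.0, size=shape)           # mu = 0, sigma = 10

a, b = 5.0, 50.0                                                 # Rayleigh parameters (text's form)
rayleigh = a + rng.rayleigh(scale=np.sqrt(b / 2.0), size=shape)  # mean = a + sqrt(pi*b/4)

erlang = rng.gamma(shape=3, scale=1 / 0.5, size=shape)           # b = 3, a = 0.5 -> mean b/a = 6
exponential = rng.exponential(scale=1 / 0.25, size=shape)        # a = 0.25 -> mean 1/a = 4
uniform = rng.uniform(low=-2.0, high=2.0, size=shape)            # mean (a+b)/2 = 0

# bipolar impulse (salt-and-pepper) noise: Pa = Pb = 0.05 on a mid-gray image
img = np.full(shape, 128.0)
r = rng.random(shape)
img[r < 0.05] = 0.0      # pepper (level a)
img[r > 0.95] = 255.0    # salt (level b)
```

Comparing the sample means against the formulas for µ given in each subsection is a quick consistency check on the parameterizations.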
DEGRADATION MODEL
The degradation process can be modeled as an operator or system H, which together with
an additive noise term η(x,y) operates on an input image f(x,y) to produce a degraded image
g(x,y). Image restoration may be viewed as the process of obtaining an approximation to f(x,y),
given g(x,y) and a knowledge of the degradation in the form of the operator H.
The input-output relation of this model (the image f(x,y) passes through H and is then summed with the noise η(x,y) to give g(x,y)) is expressed as
g(x,y) = H[f(x,y)]+ η(x,y)…(1)
For a moment, let us assume that η(x,y)=0, so that g(x,y) = H[f(x,y)]
The operator H is said to be linear if
H[k1f1(x,y) + k2f2(x,y)] = k1H[f1(x,y)] + k2H[f2(x,y)] … (2)
where k1 and k2 are constants and f1(x,y) and f2(x,y) are two input images.
If k1 = k2 = 1, then equation (2) becomes
H[f1(x,y) + f2(x,y)] = H[f1(x,y)] + H[f2(x,y)] … (3)
The above equation is called the property of additivity; this property simply says that, if H is a linear operator, the response to a sum of two inputs is equal to the sum of the two responses.
When f2(x,y) = 0, equation (2) becomes
H[k1f1(x,y)] = k1H[f1(x,y)] … (4)
The above equation is called the property of homogeneity. It says that the response to a constant multiple of any input is equal to the response to that input multiplied by the same constant. Thus a linear operator possesses both the property of additivity and the property of homogeneity.
An operator having the input-output relation g(x,y) = H[f(x,y)] is said to be position (or space)
invariant if
H[f(x-α,y-β)] = g(x-α,y-β)…(5)
This definition indicates that the response at any point in the image depends only on the value of
the input at that point and not on the position of the point.
Degradation model for Continuous case
f(x,y) can be expressed in impulse form as

f(x,y) = ∫∫ f(α,β) δ(x-α, y-β) dα dβ … (6)

where both integrals run from -∞ to ∞ (as they do throughout this section). Then, if η(x,y) = 0, substituting equation (6) in (1) gives

g(x,y) = H[f(x,y)] = H[ ∫∫ f(α,β) δ(x-α, y-β) dα dβ ] … (7)

If H is a linear operator, extending the additivity property to integrals changes the above equation to

g(x,y) = ∫∫ H[f(α,β) δ(x-α, y-β)] dα dβ … (8)

Since f(α,β) is independent of x and y, using the homogeneity property,

g(x,y) = ∫∫ f(α,β) H[δ(x-α, y-β)] dα dβ … (9)

The term H[δ(x-α, y-β)] is called the impulse response of H and is denoted

h(x, α, y, β) = H[δ(x-α, y-β)] … (10)

From equations (9) and (10) we can write

g(x,y) = ∫∫ f(α,β) h(x, α, y, β) dα dβ … (11)

The above equation is called the superposition (or Fredholm) integral of the first kind. It states that if the response of H to an impulse is known, the response to any input f(α,β) can be calculated by means of equation (11).

If H is position invariant, then from equation (5),

H[δ(x-α, y-β)] = h(x-α, y-β) … (12)

Now, from equations (10), (11), and (12),

g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ … (13)

which is nothing but the convolution integral. In the presence of additive noise, the expression describing the linear degradation model becomes

g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ + η(x,y) … (14)
Many types of degradations can be approximated by linear, position invariant processes. The
advantage of this approach is that the extensive tools of linear system theory then become
available for the solution of image restoration problems.
Degradation model for Discrete case
Suppose that f(x) and h(x) are sampled uniformly to form arrays of dimensions A and B, respectively. In this case x is a discrete variable in the range 0, 1, 2, …, A-1 for f(x) and 0, 1, 2, …, B-1 for h(x).

The discrete convolution is based on the assumption that the sampled functions are periodic, with a period M. Overlap in the individual periods of the resulting convolution is avoided by choosing M ≥ A+B-1 and extending the functions with zeroes so that their length is equal to M.

Let fe(x) and he(x) represent the extended functions. Their convolution is given by

ge(x) = Σ_{m=0}^{M-1} fe(m) he(x-m) … (1)

for x = 0, 1, 2, …, M-1. As both fe(x) and he(x) are assumed to have period equal to M, ge(x) also has the same period.
The above equation can be represented in matrix form as
g = Hf …(2)
where f and g are M-dimensional column vectors:

f = [fe(0)  fe(1)  …  fe(M-1)]ᵀ … (3)

g = [ge(0)  ge(1)  …  ge(M-1)]ᵀ … (4)
and H is an MxM matrix
      | he(0)     he(-1)    he(-2)    …  he(-M+1) |
      | he(1)     he(0)     he(-1)    …  he(-M+2) |
H =   | he(2)     he(1)     he(0)     …  he(-M+3) |
      |   ⋮          ⋮          ⋮               ⋮     |
      | he(M-1)   he(M-2)   he(M-3)   …  he(0)    |
Because of the periodicity assumption on he(x), it follows that he(x) = he (M+x). Using this
property the above matrix can be changed as
      | he(0)     he(M-1)   he(M-2)   …  he(1) |
      | he(1)     he(0)     he(M-1)   …  he(2) |
H =   | he(2)     he(1)     he(0)     …  he(3) |
      |   ⋮          ⋮          ⋮             ⋮   |
      | he(M-1)   he(M-2)   he(M-3)   …  he(0) |
In the above matrix, the rows are related by a circular shift to the right; that is the right-most
element in one row is equal to the left-most element in the row immediately below. The shift is
called circular because an element shifted off the right end of row reappears at the left end of the
next row. Moreover, the circularity of H is complete in the sense that it extends from the last row back to the first row. A square matrix in which each row is a circular shift of the preceding row, and the first row is a circular shift of the last row, is called a circulant matrix.
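The circulant structure can be verified numerically: building H from a zero-extended he(x) with M ≥ A+B-1 makes the matrix product Hf reproduce ordinary (linear) convolution. A small illustrative sketch with made-up sequences:

```python
import numpy as np

def circulant(h):
    # M x M circulant matrix whose (i, j) entry is he((i - j) mod M)
    M = len(h)
    return np.array([[h[(i - j) % M] for j in range(M)] for i in range(M)])

f = np.array([1.0, 2.0, 3.0])        # A = 3 samples of f(x)
h = np.array([1.0, -1.0])            # B = 2 samples of h(x)
M = len(f) + len(h) - 1              # M >= A + B - 1 avoids wraparound overlap
fe = np.pad(f, (0, M - len(f)))      # zero-extended functions
he = np.pad(h, (0, M - len(h)))

H = circulant(he)
g = H @ fe                           # g = Hf

# the same result from the convolution sum ge(x) = sum_m fe(m) he(x - m)
g_sum = np.array([sum(fe[m] * he[(x - m) % M] for m in range(M)) for x in range(M)])
```

Because the zero padding prevents the periods from overlapping, the circular result also equals the ordinary linear convolution of f and h.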
Extension of the discussion to a 2D, discrete degradation model is straightforward. For two
digitized images f(x,y) and h(x,y) of sizes AxB and CxD respectively, extended sizes of MxN
may be formed by padding the above functions with zeroes. That is
fe(x,y) = f(x,y) 0 ≤ x ≤ A-1 and 0 ≤ y ≤ B-1
= 0 A ≤ x ≤ M-1 or B ≤ y ≤ N-1
and
he(x,y) = h(x,y) 0 ≤ x ≤ C-1 and 0 ≤ y ≤ D-1
= 0 C ≤ x ≤ M-1 or D ≤ y ≤ N-1
Treating the extended functions fe(x,y) and he(x,y) as periodic in two dimensions, with periods M and N in the x and y directions, respectively, their 2-D convolution is

ge(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} fe(m,n) he(x-m, y-n)

for x = 0, 1, 2, …, M-1 and y = 0, 1, 2, …, N-1.

The convolution function ge(x,y) is periodic with the same period as fe(x,y) and he(x,y). Overlap of the individual convolution periods is avoided by choosing M ≥ A+C-1 and N ≥ B+D-1.
Now, the complete discrete degradation model can be given by adding an MxN extended discrete
noise term ηe(x,y) to the above equation
ge(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} fe(m,n) he(x-m, y-n) + ηe(x,y)

for x = 0, 1, 2, …, M-1 and y = 0, 1, 2, …, N-1.

The above equation can be represented in matrix form as

g = Hf + n

where f, g, and n are MN-dimensional column vectors formed by stacking the rows of the M×N functions fe(x,y), ge(x,y), and ηe(x,y). The first N elements of f, for example, are the elements in the first row of fe(x,y), the next N elements are from the second row, and so on for all M rows of fe(x,y). So, f, g, and n are of dimension MN×1 and H is of dimension MN×MN. This matrix consists of M² partitions, each partition being of size N×N and ordered according to
      | H0     HM-1   HM-2   …  H1 |
      | H1     H0     HM-1   …  H2 |
H =   | H2     H1     H0     …  H3 |
      |  ⋮       ⋮       ⋮           ⋮  |
      | HM-1   HM-2   HM-3   …  H0 |
Each partition Hj is constructed from the jth row of the extended function he(x,y) as follows
       | he(j,0)     he(j,N-1)   he(j,N-2)   …  he(j,1) |
       | he(j,1)     he(j,0)     he(j,N-1)   …  he(j,2) |
Hj =   | he(j,2)     he(j,1)     he(j,0)     …  he(j,3) |
       |    ⋮            ⋮            ⋮                ⋮    |
       | he(j,N-1)   he(j,N-2)   he(j,N-3)   …  he(j,0) |
Here, Hj is a circulant matrix, and the blocks of H are subscripted in a circular manner. For these
reasons, the matrix H is called a Block-Circulant Matrix.
ALGEBRAIC APPROACH TO RESTORATION
The objective of image restoration is to estimate an original image f from a degraded image g
and some knowledge or assumption about H and n. Central to the algebraic approach is the
concept of seeking an estimate of f, denoted f̂ , that minimizes a predefined criterion of
performance. Because of its simplicity, the least squares method is used here.
Unconstrained Restoration
From g=Hf+n, the noise term in the degradation model is
n=g-Hf … (1)
In the absence of any knowledge of n, a meaningful criterion is to seek an f̂ such that Hf̂ approximates g in a least squares sense, i.e. such that the norm of the noise term is as small as possible. In other words, we want to find an f̂ such that

‖n‖² = ‖g - Hf̂‖² … (2)

is minimum, where

‖n‖² = nᵀn   and   ‖g - Hf̂‖² = (g - Hf̂)ᵀ(g - Hf̂)

are the squared norms of n and (g - Hf̂), respectively.

Equation (2) allows the equivalent view of this problem as one of minimizing the criterion function

J(f̂) = ‖g - Hf̂‖² … (3)

with respect to f̂. Aside from the requirement that it should minimize equation (3), f̂ is not constrained in any other way.

To find the f̂ that minimizes J(f̂), we simply differentiate J with respect to f̂ and set the result equal to the zero vector:

∂J(f̂)/∂f̂ = 0 = -2Hᵀ(g - Hf̂)

Solving the above equation for f̂,

-2Hᵀg + 2HᵀHf̂ = 0
=> HᵀHf̂ = Hᵀg
=> f̂ = (HᵀH)⁻¹Hᵀg

Letting M = N so that H is a square matrix and assuming that H⁻¹ exists, the above equation reduces to

f̂ = H⁻¹(Hᵀ)⁻¹Hᵀg
=> f̂ = H⁻¹g
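With no noise (n = 0) and an invertible H, the pseudo-inverse estimate and the direct inverse agree and recover f exactly; a small numerical check with an arbitrary, made-up H:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 8
f = rng.random(M)                          # "original image", stacked as a vector
H = np.eye(M) + 0.1 * rng.random((M, M))   # made-up, diagonally dominant (hence invertible) H
g = H @ f                                  # noise-free degradation: n = 0

f_hat = np.linalg.inv(H.T @ H) @ H.T @ g   # f_hat = (H^T H)^-1 H^T g
f_inv = np.linalg.inv(H) @ g               # f_hat = H^-1 g (square, invertible H)
```

In practice one would not invert H explicitly (a linear solve is preferred numerically), but the explicit inverses mirror the derivation above.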
Constrained Restoration
In this section, we consider the least squares restoration problem as one of minimizing functions of the form ‖Qf̂‖², where Q is a linear operator on f̂, subject to the constraint

‖g - Hf̂‖² = ‖n‖²

This approach introduces considerable flexibility in the restoration process because it yields different solutions for different choices of Q.

The addition of an equality constraint to the minimization problem can be handled without difficulty by the method of Lagrange multipliers. The procedure calls for expressing the constraint in the form α(‖g - Hf̂‖² - ‖n‖²) and appending it to the function ‖Qf̂‖². In other words, we seek an f̂ that minimizes the criterion function

J(f̂) = ‖Qf̂‖² + α(‖g - Hf̂‖² - ‖n‖²)

where α is a constant called the Lagrange multiplier. After the constraint has been appended, minimization is carried out in the usual way.

Differentiating the above equation with respect to f̂ and setting the result equal to the zero vector yields

∂J(f̂)/∂f̂ = 0 = 2QᵀQf̂ - 2αHᵀ(g - Hf̂)

Now, solving for f̂,

f̂ = (HᵀH + γQᵀQ)⁻¹Hᵀg

where γ = 1/α. The quantity γ must be adjusted so that the constraint is satisfied.
INVERSE FILTERING
The simplest approach to restoration is direct inverse filtering, where we compute an estimate F̂(u,v) of the transform of the original image simply by dividing the transform of the degraded image, G(u,v), by the degradation function:

F̂(u,v) = G(u,v) / H(u,v)

But we know that G(u,v) = F(u,v)H(u,v) + N(u,v). Substituting this in the above equation gives

F̂(u,v) = F(u,v) + N(u,v) / H(u,v)
The image restoration approach in the above equations is commonly referred to as the inverse filtering method. This terminology arises from considering H(u,v) as a filter function that multiplies F(u,v) to produce the transform of the degraded image g(x,y).

The second equation tells us that even if we know the degradation function, we cannot recover the undegraded image exactly, because N(u,v) is a random function whose Fourier transform is not known.

If the degradation function has zero or very small values, the ratio N(u,v)/H(u,v) can easily dominate the estimate F̂(u,v). One approach to get around this zero or small-value problem is to limit the filter frequencies to values near the origin; by limiting the analysis to frequencies near the origin, we reduce the probability of encountering zero values.
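The dominance of N(u,v)/H(u,v) where H is small, and the benefit of keeping only frequencies near the origin, can be seen in a small synthetic experiment (the Gaussian-shaped H, the noise level, and the cutoff radius are all arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 32, 32
f = rng.random((M, N))

# Gaussian-shaped degradation H(u,v): near zero at high frequencies (illustrative)
u = np.arange(M)[:, None] - M / 2
v = np.arange(N)[None, :] - N / 2
H = np.fft.ifftshift(np.exp(-(u**2 + v**2) / (2 * 4.0**2)))   # origin moved to (0,0)
G = H * np.fft.fft2(f) + np.fft.fft2(0.001 * rng.standard_normal((M, N)))

naive = G / H                              # F + N/H: N/H blows up where H is tiny
D = np.fft.ifftshift(np.sqrt(u**2 + v**2))
limited = np.where(D < 8, G / H, 0.0)      # keep only frequencies near the origin
f_naive = np.real(np.fft.ifft2(naive))
f_limited = np.real(np.fft.ifft2(limited))

err_naive = np.abs(f_naive - f).max()
err_limited = np.abs(f_limited - f).max()
```

Even a tiny amount of noise makes the unrestricted division useless, while the frequency-limited estimate stays close to f at the cost of discarding high-frequency detail.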
LEAST MEAN SQUARE FILTER/
MINIMUM MEAN SQUARE ERROR (WIENER) FILTERING
Inverse filtering makes no explicit provision for handling noise. The Wiener filtering method incorporates both the degradation function and the statistical characteristics of noise, treating images and noise as random processes; the objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between them is minimized. This error measure is given by

e² = E{(f - f̂)²} … (1)

where E{·} is the expected value of the argument. It is assumed that the noise and the image are uncorrelated; that one or the other has zero mean; and that the gray levels in the estimate are a linear function of the levels in the degraded image. Based on these conditions, the minimum of the error function in the above equation is given in the frequency domain by the expression

F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / (|H(u,v)|² + Sη(u,v)/Sf(u,v)) ] G(u,v) … (2)

The terms in the above equation are as follows:
H(u,v) = degradation function
H*(u,v) = complex conjugate of H(u,v)
|H(u,v)|² = H*(u,v)H(u,v)
Sη(u,v) = |N(u,v)|² = power spectrum of the noise
Sf(u,v) = |F(u,v)|² = power spectrum of the undegraded image
The result in equation (2) is known as the Wiener filter. It is also referred to as the minimum mean square error filter or the least square error filter. It does not have the same problem as the inverse filter with zeroes in the degradation function, unless both H(u,v) and Sη(u,v) are zero for the same values of u and v.
If the noise is zero, then the noise power spectrum vanishes and the Wiener filter reduces to the inverse filter. When we are dealing with spectrally white noise, the spectrum |N(u,v)|² is a constant, which simplifies things considerably. However, the power spectrum of the undegraded image is seldom known, so the ratio Sη/Sf is often approximated by a constant, and the above equation can be written as

F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / (|H(u,v)|² + K) ] G(u,v)

where K is a specified constant.

In general, the Wiener filter works better than inverse filtering in the presence of both noise and a degradation function.
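The constant-K form is easy to state in code. The toy degradation below is noise-free precisely so that the K = 0 case can be checked against the inverse filter (all parameters are illustrative):

```python
import numpy as np

def wiener(G, H, K):
    # F_hat(u,v) = [ (1/H) * |H|^2 / (|H|^2 + K) ] G(u,v)
    H2 = np.abs(H) ** 2
    return (H2 / (H * (H2 + K))) * G

rng = np.random.default_rng(3)
M = 32
f = rng.random((M, M))
u = np.arange(M)[:, None] - M / 2
v = np.arange(M)[None, :] - M / 2
H = np.fft.ifftshift(np.exp(-(u**2 + v**2) / (2 * 6.0**2)))   # toy degradation, H > 0
G = H * np.fft.fft2(f)                                        # noise-free: N(u,v) = 0

f_hat = np.real(np.fft.ifft2(wiener(G, H, K=0.0)))            # K = 0 -> inverse filter
```

For K > 0 the factor |H|²/(|H|² + K) is strictly less than 1, so the filter attenuates exactly those frequencies where |H| is small and amplified noise would otherwise dominate.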
CONSTRAINED LEAST SQUARES FILTERING
The difficulty with Wiener filtering is that the power spectra of the undegraded image and of the noise must be known. The constrained least squares filtering method requires knowledge of only the mean and variance of the noise. These parameters can be calculated from a given degraded image, so this is an important advantage. Another difference is that the Wiener filter is based on minimizing a statistical criterion and, as such, is optimal only in an average sense, whereas this method has the notable feature that it yields an optimal result for each image to which it is applied.
The degraded image can be represented in matrix form as

g = Hf + η … (1)

The problem here is that H is highly sensitive to noise. One way to alleviate the noise sensitivity problem is to base optimality of restoration on a measure of smoothness, such as the second derivative of an image. To be meaningful, the restoration must be constrained by the parameters of the problem. Thus, what is desired is to find the minimum of a criterion function C, defined as

C = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [∇²f(x,y)]² … (2)

subject to the constraint

‖g - Hf̂‖² = ‖η‖² … (3)

where ‖·‖ is the vector norm and f̂ is the estimate of the undegraded image.

The frequency domain solution to this optimization problem is given by the expression

F̂(u,v) = [ H*(u,v) / (|H(u,v)|² + γ|P(u,v)|²) ] G(u,v)

where γ is a parameter that must be adjusted so that the constraint in equation (3) is satisfied, and P(u,v) is the Fourier transform of the function

p(x,y) = |  0  -1   0 |
         | -1   4  -1 |
         |  0  -1   0 |

which we recognize as the Laplacian operator.
By comparing the constrained least squares and Wiener results, it is noted that the former
yielded slightly better results for the high and medium noise cases. It is not unexpected that the
constrained least squares filter would outperform the Wiener filter when selecting the parameters
manually for better visual results. The parameter γ is a scalar, while the value of K in Wiener filtering is an approximation to the ratio of two unknown frequency domain functions, whose ratio seldom is constant. Thus, it stands to reason that a result based on manually selecting γ would be a more accurate estimate of the undegraded image. The differences between Wiener filtering and the constrained least squares restoration method are:
1. The Wiener filter is designed to optimize the restoration in an average statistical sense over a
large ensemble of similar images. The constrained matrix inversion deals with one image only
and imposes constraints on the solution sought.
2. The Wiener filter is based on the assumption that the random fields involved are homogeneous
with known spectral densities. In the constrained matrix inversion it is assumed that we know
only some statistical property of the noise.
In the constrained matrix restoration approach, various filters may be constructed from the same formulation by simply changing the smoothing criterion.
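A frequency-domain sketch of the constrained least squares filter; H here is an arbitrary synthetic (strictly nonzero) degradation function, and γ = 0 is used only to sanity-check the formula against inverse filtering:

```python
import numpy as np

def cls_filter(G, H, P, gamma):
    # F_hat(u,v) = [ H*(u,v) / (|H(u,v)|^2 + gamma |P(u,v)|^2) ] G(u,v)
    return (np.conj(H) / (np.abs(H) ** 2 + gamma * np.abs(P) ** 2)) * G

rng = np.random.default_rng(4)
M = 16
f = rng.random((M, M))
H = 0.5 + 0.5 * rng.random((M, M))   # synthetic, strictly nonzero degradation (assumed)
G = H * np.fft.fft2(f)               # noise-free degraded spectrum

# P(u,v): DFT of the 3x3 Laplacian mask p(x,y), zero-padded to M x M
p = np.zeros((M, M))
p[:3, :3] = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]
P = np.fft.fft2(p)

f0 = np.real(np.fft.ifft2(cls_filter(G, H, P, gamma=0.0)))   # gamma = 0: inverse filter
```

For γ > 0 the |P(u,v)|² term penalizes frequencies where the Laplacian response is large, which is how the smoothness criterion enters the solution.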
RESTORATION IN THE PRESENCE OF NOISE ONLY - SPATIAL FILTERING
We know that the general equations for the degradation process in the spatial and frequency domains are given by

g(x,y) = h(x,y) * f(x,y) + η(x,y)
G(u,v) = H(u,v)F(u,v) + N(u,v)

When the only degradation present in an image is noise, the above equations become

g(x,y) = f(x,y) + η(x,y)
G(u,v) = F(u,v) + N(u,v)

The noise terms are unknown, so subtracting them from g(x,y) or G(u,v) is not a realistic option. Spatial filtering is the method of choice in situations where only additive noise is present.
MEAN FILTERS
Arithmetic Mean Filter
This is the simplest of the mean filters. Let Sxy represent the set of coordinates in a rectangular subimage window of size m×n, centered at point (x,y). The arithmetic mean filtering process computes the average value of the corrupted image g(x,y) in the area defined by Sxy. The value of the restored image f̂ at any point (x,y) is simply the arithmetic mean computed using the pixels in the region defined by Sxy:

f̂(x,y) = (1/mn) Σ_{(s,t)∈Sxy} g(s,t)

This operation can be implemented using a convolution mask in which all coefficients have value 1/mn. A mean filter simply smoothes local variations in an image, and noise is reduced as a result of blurring.
Geometric Mean Filter
An image restored using a geometric mean filter is given by the expression

f̂(x,y) = [ Π_{(s,t)∈Sxy} g(s,t) ]^(1/mn)

Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it tends to lose less image detail in the process.
Harmonic Mean Filter
The harmonic mean filtering operation is given by the expression

f̂(x,y) = mn / Σ_{(s,t)∈Sxy} [1/g(s,t)]

The harmonic mean filter works well for salt noise, but fails for pepper noise. It also does well with other types of noise, like Gaussian noise.
Contraharmonic Mean Filter
The contraharmonic mean filtering operation yields a restored image based on the expression

f̂(x,y) = Σ_{(s,t)∈Sxy} g(s,t)^(Q+1) / Σ_{(s,t)∈Sxy} g(s,t)^Q

where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effects of salt-and-pepper noise.
For positive values of Q, the filter eliminates pepper noise.
For negative values of Q, the filter eliminates salt noise.
For Q = 0, this filter reduces to the arithmetic mean filter.
For Q = −1, this filter reduces to the harmonic mean filter.
In general, the arithmetic mean and geometric mean filters are well suited for random
noise such as Gaussian or uniform noise. The contraharmonic filter is well suited for impulse noise,
but it has the disadvantage that it must be known whether the noise is dark or light in order to
select the proper sign for Q. Choosing the wrong sign for Q can have disastrous results.
ORDER-STATISTICS FILTERS
Order-statistics filters are spatial filters whose response is based on ordering the pixels
contained in the image area encompassed by the filter. The response of the filter at any point is
determined by the ranking result.
Median Filter
It replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel:
f̂(x,y) = median(s,t)∈Sxy { g(s,t) }
For certain types of noise, median filters provide excellent noise-reduction capabilities, with
considerably less blurring than linear smoothing filters of similar size. Median filters are
particularly effective in the presence of both bipolar and unipolar impulse noise.
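A minimal median-filter sketch (NumPy; the edge padding is an illustrative choice):

```python
import numpy as np

def median_filter(g, m=3, n=3):
    """Replace each pixel by the median of its m x n neighborhood."""
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = np.median(padded[x:x + m, y:y + n])
    return out
```

An isolated salt pixel (255) and an isolated pepper pixel (0) in a flat region are both replaced by the surrounding value, with no blurring of the rest of the region.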
Max and Min Filters
The median filter represents the 50th percentile of a ranked set of numbers. The 100th
percentile result is represented by the Max filter, given by
f̂(x,y) = max(s,t)∈Sxy { g(s,t) }
The Max filter is useful for finding the brightest points in an image. It can be used to reduce
pepper noise. However, it removes (sets to a lighter gray level) some dark pixels from the
borders of dark objects.
The 0th percentile result is represented by the Min filter, given by
f̂(x,y) = min(s,t)∈Sxy { g(s,t) }
The Min filter is useful for finding the darkest points in an image. It can be used to reduce salt
noise. However, it removes white points around the borders of light objects.
Midpoint Filter
The midpoint filter simply computes the midpoint between the maximum and minimum
values in the area encompassed by the filter:
f̂(x,y) = (1/2) [ max(s,t)∈Sxy { g(s,t) } + min(s,t)∈Sxy { g(s,t) } ]
This filter combines order statistics and averaging. It works best for randomly distributed
noise such as Gaussian noise.
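The Max, Min, and midpoint filters above can be sketched with one helper (NumPy; the `kind` argument and the padding are illustrative choices, not from the text):

```python
import numpy as np

def order_stat_filter(g, kind, m=3, n=3):
    """Max, Min, or midpoint filter over an m x n window."""
    ops = {'max': np.max,
           'min': np.min,
           'midpoint': lambda w: 0.5 * (w.max() + w.min())}
    op = ops[kind]
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = op(padded[x:x + m, y:y + n])
    return out
```

Note how the max filter removes a pepper pixel, the min filter removes a salt pixel, and the midpoint of a window containing both 0 and 100 is 50.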
Alpha-Trimmed mean Filter
Suppose that we delete the d/2 lowest and the d/2 highest gray-level values of g(s,t) in the
neighborhood Sxy. Let gr(s,t) represent the remaining mn−d pixels. A filter formed by averaging
these remaining pixels is called the alpha-trimmed mean filter:
f̂(x,y) = [1/(mn−d)] Σ(s,t)∈Sxy gr(s,t)
where the value of d can range from 0 to mn−1.
When d = 0, this filter reduces to the arithmetic mean filter.
When d = mn−1, this filter becomes a median filter.
For other values of d, the alpha-trimmed filter is useful in situations involving multiple types of
noise, such as a combination of salt-and-pepper and Gaussian noise.
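A sketch of the alpha-trimmed mean (NumPy; assumes d is even, as the d/2 trimming implies):

```python
import numpy as np

def alpha_trimmed_mean(g, d, m=3, n=3):
    """Average each m x n window after deleting its d/2 lowest and d/2
    highest values. d = 0 gives the arithmetic mean; d = mn - 1 leaves
    only the middle value, i.e. the median."""
    half = d // 2
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = np.sort(padded[x:x + m, y:y + n].ravel())
            out[x, y] = w[half:len(w) - half].mean()
    return out
```

With one salt outlier and one pepper outlier in the same 3x3 window, d = 4 trims both extremes and returns the clean background value, which the plain arithmetic mean (d = 0) would not.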
ADAPTIVE FILTERS
Once selected, the mean filters and order-statistics filters are applied to an image without
regard for how image characteristics vary from one point to another. Adaptive filters are filters
whose behavior changes based on the statistical characteristics of the image inside the filter region
defined by the m x n rectangular window Sxy. Adaptive filters are capable of performance superior
to that of the other filters, but at the cost of increased filter complexity.
Adaptive Local Noise Reduction Filter
The simplest statistical measures of a random variable are its mean and variance. These
are reasonable parameters on which to base an adaptive filter because they are quantities closely
related to the appearance of an image. The mean gives a measure of average gray level in the
region over which the mean is computed, and the variance gives a measure of average contrast in
that region.
Our filter is to operate in a local region Sxy. The response of the filter at any point (x,y) on which
the region is centered is to be based on four quantities :
i) g(x,y), the value of the noisy image at (x,y);
ii) ση², the variance of the noise corrupting f(x,y) to form g(x,y);
iii) mL, the local mean of the pixels in Sxy; and
iv) σL², the local variance of the pixels in Sxy.
The behavior of the filter is to be as follows:
1. If ση² is zero, the filter should return simply the value of g(x,y). This is the trivial, zero-noise case.
2. If the local variance σL² is high relative to ση², the filter should return a value close to g(x,y). A high local variance is typically associated with edges, which should be preserved.
3. If the two variances are equal, the filter returns the arithmetic mean of the pixels in Sxy. This occurs when the local area has the same properties as the overall image, and local noise is reduced by averaging.
An adaptive expression for obtaining f̂(x,y) based on the above assumptions may be written as
f̂(x,y) = g(x,y) − (ση²/σL²) [ g(x,y) − mL ]
The only quantity that needs to be known or estimated is the variance of the overall noise, ση².
The other parameters are computed from the pixels in Sxy at each location (x,y) on which the
filter window is centered. An implicit assumption in the above expression is that ση² ≤ σL², because
the noise in our model is additive and position independent.
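A sketch of this adaptive filter (NumPy; clipping the variance ratio at 1 is a common way to enforce the assumption ση² ≤ σL² when the noise-variance estimate is too large):

```python
import numpy as np

def adaptive_local_noise_filter(g, noise_var, m=7, n=7):
    """f_hat = g - (noise_var / local_var) * (g - local_mean), with the
    ratio clipped to 1 so the correction never overshoots the local mean."""
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = padded[x:x + m, y:y + n]
            mL, vL = w.mean(), w.var()
            ratio = 1.0 if vL == 0 else min(noise_var / vL, 1.0)
            out[x, y] = g[x, y] - ratio * (g[x, y] - mL)
    return out
```

With noise_var = 0 the image is returned unchanged (case 1 above); in a perfectly flat region every pixel becomes the local mean (case 3).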
Adaptive Median Filter
The median filter performs well as long as the spatial density of the impulse noise is not
large. Adaptive median filtering can handle impulse noise even with large probabilities. An
additional advantage of the adaptive median filter is that it seeks to preserve detail while
smoothing non-impulse noise, something that the traditional median filter does not do. The
adaptive filter also works in a rectangular window area Sxy. Unlike the other filters, the adaptive
median filter changes (increases) the size of Sxy during filter operation, depending on certain
conditions.
Consider the following notation:
zmin = minimum gray level in Sxy
zmax = maximum gray level in Sxy
zmed = median of the gray levels in Sxy
zxy = gray level at coordinates (x,y)
Smax = maximum allowed size of Sxy
The adaptive median filtering algorithm works in two levels, denoted level A and level B, as
follows:
Level A: A1 = zmed − zmin and A2 = zmed − zmax.
If A1 > 0 AND A2 < 0, go to level B;
else increase the window size. If the window size ≤ Smax, repeat level A; else output zxy.
Level B: B1 = zxy − zmin and B2 = zxy − zmax.
If B1 > 0 AND B2 < 0, output zxy; else output zmed.
Adaptive median filtering has three main purposes:
1. To remove salt-and-pepper (impulse) noise,
2. To provide smoothing of other noise that may not be impulsive, and
3. To reduce distortion, such as excessive thinning or thickening of object boundaries.
Every time the algorithm outputs a value, the window Sxy is moved to the next location in the
image. The algorithm is then reinitialized and applied to the pixels in the new location.
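A sketch of levels A and B in code (NumPy; growing the window from 3x3 and using edge padding are implementation choices, not from the text):

```python
import numpy as np

def adaptive_median_filter(g, s_max=7):
    """Adaptive median filter: the window grows from 3x3 up to s_max x s_max."""
    g = g.astype(float)
    pad = s_max // 2
    padded = np.pad(g, pad, mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            size = 3
            cx, cy = x + pad, y + pad          # center in the padded image
            zxy = padded[cx, cy]
            while True:
                h = size // 2
                w = padded[cx - h:cx + h + 1, cy - h:cy + h + 1]
                zmin, zmax, zmed = w.min(), w.max(), np.median(w)
                if zmin < zmed < zmax:         # level A: median is not an impulse
                    # level B: keep zxy unless it is itself an impulse
                    out[x, y] = zxy if zmin < zxy < zmax else zmed
                    break
                size += 2
                if size > s_max:               # window limit reached
                    out[x, y] = zxy
                    break
    return out
```

On a gradient background, a salt impulse is replaced by the local median while a normal pixel passes through unchanged, which is exactly the detail-preserving behavior described above.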
PERIODIC NOISE REDUCTION BY FREQUENCY DOMAIN FILTERING
Periodic Noise
Periodic noise in an image typically arises from electrical or electromechanical
interference during image acquisition. It is the only type of spatially dependent noise considered here.
Periodic noise can be reduced significantly with frequency domain filtering.
Band Reject Filters
Band Pass Filters
Notch Filters
Optimum Notch Filtering/ Interactive Restoration
Clearly defined interference patterns are not common. Images derived from electro-optical
scanners, such as those used in space and aerial imaging, are sometimes corrupted by coupling
and amplification of low-level signals in the scanners' electronic circuitry. The resulting images
tend to contain pronounced 2-D periodic structures superimposed on the scene data, with more
complex patterns.
When several interference components are present, methods such as band-reject and band-pass
filtering are not always acceptable because they may remove too much image information in the
filtering process. The method discussed here is optimum, in the sense that it minimizes local
variances of the restored image f̂(x,y).
The procedure consists of first isolating the principal contributions of the interference
pattern and then subtracting a variable, weighted portion of the pattern from the corrupted image.
QUESTION AND ANSWERS
1. What is image restoration?
Image restoration is the improvement of an image using objective criteria and prior knowledge
as to what the image should look like.
2. What is the difference between image enhancement and image restoration?
In image enhancement we try to improve the image using subjective criteria, while in image
restoration we try to reverse specific damage suffered by the image, using objective
criteria.
3. Why may an image require restoration?
An image may be degraded because the grey values of individual pixels may be altered, or it may
be distorted because the position of individual pixels may be shifted away from their correct
position. The second case is the subject of geometric restoration.
Geometric restoration is also called image registration because it helps in finding corresponding
points between two images of the same region taken from different viewing angles. Image
registration is very important in remote sensing when aerial photographs have to be registered
against the map, or two aerial photographs of the same region have to be registered with each
other.
4. What is the problem of image restoration?
The problem of image restoration is: given the degraded image g, recover the original
undegraded image f .
5. How can the problem of image restoration be solved?
The problem of image restoration can be solved if we have prior knowledge of the point spread
function or its Fourier transform (the transfer function) of the degradation process.
6. The white bars in the test pattern shown in figure are 7 pixels wide and 210 pixels high.
The separation between bars is 17 pixels. What would this image look like after application
of different filters of different sizes?
Solution:
The matrix representation of a portion of the given image at any end of a vertical bar is
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
a) A 3x3 Min Filter:
b) A 5x5 Min Filter:
c) A 7x7 Min Filter:
d) A 9x9 Min Filter:
Explanation:
The 0th percentile result is represented by the Min filter, given by
f̂(x,y) = min(s,t)∈Sxy { g(s,t) }
The Min filter is useful for finding the darkest points in an image. It can be used to reduce salt
noise, but it removes white points around the borders of light objects. For the given image, the
effect of the Min filter is a decrease in the width and height of the white vertical bars. As the
filter size increases, the width and height of the bars decrease.
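The shrinking can be checked numerically on a single row of the test pattern; this 1-D sketch (illustrative, not from the text) reproduces the bar widths claimed in (a)-(d):

```python
import numpy as np

def min_filter_1d(row, k):
    """1-D min filter of width k; enough to measure bar width along one row."""
    h = k // 2
    padded = np.pad(row, h, mode='edge')
    return np.array([padded[i:i + k].min() for i in range(len(row))])

# One image row through a 7-pixel-wide white bar on a black background.
row = np.zeros(17)
row[5:12] = 255
# Remaining white width for each filter size: 7 - (k - 1), floored at 0.
widths = {k: int((min_filter_1d(row, k) == 255).sum()) for k in (3, 5, 7, 9)}
```

The computed widths are 5, 3, 1, and 0 pixels for the 3x3, 5x5, 7x7, and 9x9 filters, matching the matrices shown below.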
[Figure: filtered images for cases (a), (c) and (d)]
a) The resulting image consists of vertical bars 5 pixels wide and 208 pixels high. There will
be no deformation of the corners. The matrix after the application of 3x3 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
b) The resulting image consists of vertical bars 3 pixels wide and 206 pixels high. There will
be no deformation of the corners. The matrix after the application of 5x5 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
c) The resulting image consists of vertical bars 1 pixel wide and 204 pixels high. There will
be no deformation of the corners. The matrix after the application of 7x7 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
d) The white bars completely disappear from the image, since the 9x9 window is wider than the
7-pixel bars (the resulting bar width would be 0 pixels). The matrix after the application of the
9x9 Min filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e) A 3x3 Max Filter:
f) A 5x5 Max Filter:
g) A 7x7 Max Filter:
h) A 9x9 Max Filter:
Explanation
The Max filter is useful for finding the brightest points in an image. It can be used to reduce
pepper noise. However, it removes (sets to a lighter gray level) some dark pixels from the
borders of dark objects. For the given image, the effect of the Max filter is an increase in the
width and height of the white vertical bars. As the filter size increases, the width and height
of the bars also increase.
[Figure: filtered images for cases (e), (f) and (g)]
e) The resulting image consists of vertical bars 9 pixels wide and 212 pixels high. There will
be no deformation of the corners. The matrix after the application of 3x3 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
f) The resulting image consists of vertical bars 11 pixels wide and 214 pixels high. There
will be no deformation of the corners. The matrix after the application of 5x5 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
g) The resulting image consists of vertical bars 13 pixels wide and 216 pixels high. There
will be no deformation of the corners. The matrix after the application of 7x7 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
h) The resulting image consists of vertical bars 15 pixels wide and 218 pixels high. There
will be no deformation of the corners. The matrix after the application of 9x9 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
i) A 3x3 Arithmetic Mean Filter:
j) A 5x5 Arithmetic Mean Filter:
k) A 7x7 Arithmetic Mean Filter:
l) A 9x9 Arithmetic Mean Filter:
[Figure: filtered images for cases (i), (j) and (k)]
Explanation:
The arithmetic mean filter causes blurring, and the blurring increases with the size of the mask.
i) Since each vertical bar is 7 pixels wide, a 3x3 arithmetic mean filter slightly distorts the edges
of the bars; as a result, the edges become a bit darker. There will be some deformation at the
corners of the bars: they become rounded.
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 28 113 170 170 170 170 113 28 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
j) As the size of the mask increases, the vertical bars distort more and blurring increases. Since
the mask here is 5x5, after the application of the filter only the 3 centre columns of each vertical
bar remain white. As we move from the centre of a bar toward either edge, the pixels become
darker. There will be some deformation at the corners of the bars: they become rounded.
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 122 163 163 163 163 163 122 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
k) As the size of the mask increases, the vertical bars distort more and blurring increases. Since
the mask here is 7x7, after the application of the filter only the centre column of each vertical
bar remains white. As we move from the centre of a bar toward either edge, the pixels become
darker. There will be some deformation at the corners of the bars: they become rounded.
l) As the size of the mask is larger than the width of the bars, the vertical bars are completely
distorted. The blurring also increases compared to the previous case. The corners become
more rounded and deformed.
m) A 3x3 Geometric Mean Filter
n) A 5x5 Geometric Mean Filter
o) A 7x7 Geometric Mean Filter
p) A 9x9 Geometric Mean Filter
Explanation
An image restored using a geometric mean filter is given by the expression
f̂(x,y) = [ Π(s,t)∈Sxy g(s,t) ]^(1/mn)
Here, each restored pixel is given by the product of the pixels in the subimage window, raised to
the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean
filter, but it tends to lose less image detail in the process. For the given image, however, the
geometric mean filter behaves like the Min filter: any zero-valued (black) pixel in the window
drives the product, and hence the restored pixel, to zero. The width and height of the white
vertical bars therefore decrease, and they decrease further as the filter size increases.
[Figure: filtered images for cases (n), (o) and (p)]
m) The resulting image consists of vertical bars 5 pixels wide and 208 pixels high. There
will be no deformation of the corners. The matrix after the application of 3x3 Geometric Mean
filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
n) The resulting image consists of vertical bars 3 pixels wide and 206 pixels high. There will
be no deformation of the corners. The matrix after the application of 5x5 Geometric Mean filter
is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
o) The resulting image consists of vertical bars 1 pixel wide and 204 pixels high. There will
be no deformation of the corners. The matrix after the application of 7x7 Geometric Mean Filter
is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
p) The white bars completely disappear from the image, since the 9x9 window is wider than the
7-pixel bars. The matrix after the application of the 9x9 Geometric Mean filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
UNIT-VI
IMAGE SEGMENTATION
DETECTION OF DISCONTINUITIES
There are three types of gray-level discontinuities in an image: points, lines, and edges.
Point Detection
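Isolated points are usually detected with a Laplacian-type mask whose absolute response is compared against a threshold T; the mask and threshold test below are the common textbook choices, sketched here since the original figure is not reproduced:

```python
import numpy as np

# Laplacian-type point-detection mask: strong response where a pixel
# differs from its 8 neighbours, zero response in flat regions.
MASK = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]], dtype=float)

def detect_points(img, T):
    """Mark pixels where |mask response| >= T."""
    padded = np.pad(img.astype(float), 1, mode='edge')
    out = np.zeros(img.shape, dtype=bool)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            r = (padded[x:x + 3, y:y + 3] * MASK).sum()
            out[x, y] = abs(r) >= T
    return out
```

A single bright pixel of value 100 in a dark background gives a response of 800 at the point itself and only ±100 at its neighbours, so a threshold between those values isolates the point.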
Line Detection
Edge Detection
Gradient Operators
The Laplacian
THRESHOLDING
Global Thresholding
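A common iterative scheme for choosing a single global threshold (a sketch under the usual textbook formulation: split the pixels at T, recompute T as the midpoint of the two group means, and stop when T stabilizes; `eps` is an illustrative stopping parameter):

```python
import numpy as np

def global_threshold(img, eps=0.5):
    """Iteratively refine a global threshold until it changes by < eps."""
    t = img.mean()                       # initial estimate
    while True:
        g1 = img[img > t]                # pixels above the threshold
        g2 = img[img <= t]               # pixels at or below the threshold
        new_t = 0.5 * (g1.mean() + g2.mean())
        if abs(new_t - t) < eps:
            return new_t
        t = new_t
```

For a clearly bimodal gray-level distribution (e.g. object pixels near 200 and background near 20), the procedure converges to a threshold midway between the two modes.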
Adaptive/Local Thresholding
REGION BASED SEGMENTATION
Basic Formulation
Region Growing
Region Splitting and Merging
QUESTION AND ANSWERS
1. Suppose that an image has the intensity distributions shown in Figure 7, where p1(z)
corresponds to the intensity of the objects and p2(z) corresponds to the intensity of the
background. Assume that P1 = P2 and find the optimal threshold between object and
background pixels. [16]
Solution
2. A binary image contains straight lines oriented horizontally, vertically, at 45° and at −45°. Give a set of 3×3 masks that can be used to detect 1-pixel-long breaks in these lines. Assume that the gray level of the lines is 1 and that the gray level of the background is 0.
3. What exactly is the purpose of image segmentation and edge detection?
The purpose of image segmentation and edge detection is to extract the outlines of different
regions in the image, i.e. to divide the image into regions made up of pixels that have
something in common. For example, they may have similar brightness or colour, which may
indicate that they belong to the same object or facet of an object.
4. Are there any segmentation methods that take into consideration the spatial proximity of
pixels?
Yes, they are called region growing methods. In general, one starts from some seed pixels
and attaches neighbouring pixels to them provided the attributes of the pixels in the region
created in this way vary within a predefined range. So, each seed grows gradually by
accumulating more and more neighbouring pixels until all pixels in the image have been
assigned to a region.
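The seed-growing procedure described above can be sketched as a breadth-first search (Python; the 4-connectivity and the fixed tolerance around the seed value are illustrative choices, not the only possible criterion):

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10):
    """Grow a region from a seed pixel, attaching 4-connected neighbours
    whose gray level is within tol of the seed value."""
    rows, cols = img.shape
    seed_val = float(img[seed])
    mask = np.zeros_like(img, dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if (0 <= nx < rows and 0 <= ny < cols and not mask[nx, ny]
                    and abs(float(img[nx, ny]) - seed_val) <= tol):
                mask[nx, ny] = True
                queue.append((nx, ny))
    return mask
```

Starting from a seed inside a bright 3x3 square on a dark background, the region grows to exactly the 9 pixels of the square and stops at its boundary, since the background falls outside the tolerance.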
5. How can one choose the seed pixels?
There is no clear answer to this question, and this is the most important drawback of this
type of method. In some applications the choice of seeds is easy. For example, in target
tracking in infrared images, the target will appear bright, and one can use as seeds the few
brightest pixels. A method which does not need a predetermined number of regions or
seeds is that of split and merge.
6. Is it possible to segment an image by considering the dissimilarities between regions, as
opposed to considering the similarities between pixels?
Yes, in such an approach we examine the differences between neighbouring pixels and say
that pixels with different attribute values belong to different regions and therefore we
postulate a boundary separating them. Such a boundary is called an edge and the process is
called edge detection.
7. Are Sobel masks appropriate for all images?
Sobel masks are appropriate for images with low levels of noise. They are inadequate for
noisy images.
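As a companion sketch, the Sobel gradient magnitude can be computed directly from the two 3x3 masks (|gx| + |gy| is the common absolute-value approximation to the true magnitude):

```python
import numpy as np

def sobel_gradient(img):
    """Gradient magnitude via the 3x3 Sobel masks (|gx| + |gy|)."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # responds to vertical edges
    ky = kx.T                                   # responds to horizontal edges
    padded = np.pad(img.astype(float), 1, mode='edge')
    gx = np.zeros(img.shape)
    gy = np.zeros(img.shape)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            w = padded[x:x + 3, y:y + 3]
            gx[x, y] = (w * kx).sum()
            gy[x, y] = (w * ky).sum()
    return np.abs(gx) + np.abs(gy)
```

On a clean vertical step edge the response is large along the edge and zero in the flat regions; on a noisy image, every noise pixel also produces a response, which is why Sobel masks are said to be inadequate for noisy images.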