IV B.Tech I Semester ECE
www.asrece.com
ASR INSTITUTE OF TECHNOLOGY, PRATHIPADU, TADEPALLIGUDEM
DEPARTMENT OF ECE
DIGITAL IMAGE PROCESSING

CONTENTS

S.NO.   UNIT     PAGE NO.
1       UNIT-1   2-30
2       UNIT-3   31-56
3       UNIT-4   57-69
4       UNIT-6   70-97
5       UNIT-7   98-111

Download this study material from www.asrece.com
UNIT-I
Image processing involves changing the nature of an image in order to either:
1. Improve its pictorial information for human interpretation; or
2. Render it more suitable for processing, storage, transmission, and representation for
autonomous machine perception.
Examples of condition 1 may include:
• Enhancing the edges of an image to make it appear sharper
• Removing noise from an image
• Removing motion blur from an image
Examples of condition 2 may include:
• Obtaining the edges of an image
• Removing detail from an image
ASPECTS OF IMAGE PROCESSING
Digital image processing is the use of computer algorithms to perform image processing
on digital images. It is convenient to subdivide different image-processing algorithms into broad
subclasses:
Image Enhancement: Processing an image so that the result is more suitable for a particular
application is called image enhancement. Examples include:
• Sharpening or deblurring an out-of-focus image
• Highlighting edges
• Improving image contrast or brightening an image and
• Removing noise
Image Restoration: Image restoration undoes damage done to an image by a known cause, for
example:
• Removing blur caused by linear motion
• Removing optical distortions and
• Removing periodic interference
Image Segmentation: Segmentation involves subdividing an image into constituent parts or
isolating certain aspects of an image, including:
• Finding lines, circles, or particular shapes in an image and
• Identifying cars, trees, buildings, or roads in aerial photographs
DIGITAL IMAGE REPRESENTATION
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the
intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all
finite, discrete quantities, we call the image a digital image. The field of digital image processing
refers to processing digital images by means of a digital computer. A digital image is composed
of a finite number of elements, each of which has a particular location and value. These elements
are referred to as picture elements, image elements, pels, and pixels. Pixel is the term most
widely used to denote the elements of a digital image.
A Digital image can be considered as a matrix whose row and column indices identify a
point in the image and the corresponding matrix element value identifies the gray level at that
point.
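The matrix view can be sketched with NumPy; the values below are arbitrary 8-bit gray levels, not taken from any particular image.

```python
import numpy as np

# A tiny 3x3 digital image as a matrix: row and column indices identify
# a point, and the element value is the gray level at that point.
# (The values are arbitrary 8-bit examples.)
img = np.array([[ 12,  50, 200],
                [ 30, 128, 255],
                [  0,  90,  64]], dtype=np.uint8)

gray_at_1_2 = img[1, 2]   # gray level at row 1, column 2 -> 255
```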
FUNDAMENTAL STEPS IN IMAGE PROCESSING
Let us consider a simple application of image processing techniques: automatically
reading the address on pieces of mail. The overall objective is to produce a result from a
problem domain by means of image processing, through a sequence of stages (image
acquisition, preprocessing, segmentation, representation and description, and recognition
and interpretation), all guided by a knowledge base.
The problem domain in this example consists of pieces of mail, and the objective is to
read the address on each piece. Thus the desired output in this case is a stream of alphanumeric
characters.
Image acquisition
The first step in this process is Image Acquisition- that is, to acquire a digital image. To
do so requires imaging sensor and the capability to digitize the signal produced by the sensor.
The sensor could be a TV camera that produces an entire image of the problem domain every
1/30 sec. The imaging sensor can also be a line-scan camera that produces a single image line at
a time. If the output of the camera or other imaging sensor is not already in digital form, an
analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are
determined by the application.
Preprocessing
After a digital image has been obtained, the next step is preprocessing that image. The
key function of preprocessing is to improve the image in ways that increase the chances for
success of the other processes. Basically, the idea behind enhancement techniques is to bring out
detail that is obscured, or simply to highlight certain features of interest in an image.
In this example, preprocessing typically deals with techniques for enhancing contrast,
removing noise, and isolating regions whose texture indicates a likelihood of
alphanumeric information.
Segmentation
The next stage deals with Segmentation. Segmentation partitions an input image into its
constituent parts or objects. In general, autonomous segmentation is one of the most difficult
tasks in digital image processing. A rugged segmentation procedure brings the process a long
way toward successful solution of imaging problems that require objects to be identified
individually. On the other hand, weak or erratic segmentation algorithms almost always
guarantee eventual failure. In general, the more accurate the segmentation, the more likely
recognition is to succeed.
In our example of character recognition, the key role of segmentation is to extract
individual characters and words from the background.
Representation and Description
The output of the segmentation stage usually is raw pixel data, constituting either the
boundary of a region (i.e., the set of pixels separating one image region from another) or all the
points in the region itself. In either case, converting the data to a form suitable for computer
processing is necessary. The first decision that must be made is whether the data should be
represented as a boundary or as a complete region. Boundary representation is appropriate when
the focus is on external shape characteristics, such as corners and inflections. Regional
representation is appropriate when the focus is on internal properties, such as texture or skeletal
shape. In some applications, these representations complement each other.
Our character recognition example requires algorithms based on boundary shape as well
as skeletons and other internal properties.
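The boundary-versus-region distinction can be sketched on a toy segmented region; the 3x3 square region and the 4-neighbor boundary test below are illustrative assumptions, not part of the text.

```python
import numpy as np

# A segmented region represented two ways: as a complete binary mask
# (regional representation), and as its boundary (region pixels with at
# least one 4-neighbor outside the region).
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True                      # regional representation

padded = np.pad(mask, 1)                   # pad so edge pixels need no special case
interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
            padded[1:-1, :-2] & padded[1:-1, 2:])
boundary = mask & ~interior                # boundary representation
```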
Choosing a representation is only part of the solution for transforming raw data into a
form suitable for subsequent computer processing. A method must also be specified for
describing the data so that features of interest are highlighted. Description, also called feature
selection, deals with extracting attributes that result in some quantitative information of interest
or are basic for differentiating one class of objects from another.
In terms of character recognition, descriptors such as lakes (holes) and bays are
powerful features that help differentiate one part of the alphabet from another.
Recognition and Interpretation
The last stage involves recognition and interpretation. Recognition is the process that
assigns a label to an object based on its descriptors. Interpretation involves assigning meaning to
an ensemble of recognized objects.
In terms of our example, identifying a character as, say, a c requires associating the
descriptors for that character with the label c. Interpretation attempts to assign meaning
to a set of labeled entities. For example, a string of six numbers can be interpreted to be
a ZIP code.
So far we have said nothing about the need for prior knowledge or about the interaction
between the knowledge base and the processing modules. Knowledge about a problem domain is
coded into an image processing system in the form of a knowledge database. In addition to
guiding the operation of each processing module, the knowledge base also controls the
interaction between modules.
ELEMENTS OF DIGITAL IMAGE PROCESSING SYSTEM
IMAGE ACQUISITION
Two elements are required to acquire digital images. The first is a physical device that is
sensitive to a band in the electromagnetic energy spectrum (such as x-ray, ultraviolet, visible or
infrared bands) and that produces an electrical signal output proportional to the level of energy
sensed. The second, called a Digitizer, is a device for converting the electrical output of the
physical sensing device into digital form.
The types of images in which we are interested are generated by the combination of an
“illumination” source and the reflection or absorption of energy from that source by the elements
of the “scene” being imaged. We enclose illumination and scene in quotes to emphasize the fact
that they are considerably more general than the familiar situation in which a visible light source
illuminates a common everyday 3-D (three-dimensional) scene. For example, the illumination
may originate from a source of electromagnetic energy such as radar, infrared, or X-ray energy.
Figure shows the three principal sensor arrangements used to transform illumination
energy into digital images- single imaging sensor, Line sensor and Array sensor.
The idea is simple: incoming energy is transformed into a voltage by the combination of
input electrical power and sensor material that is responsive to the particular type of energy
being detected. The output voltage waveform is the response of the sensor(s), and a digital
quantity is obtained from each sensor by digitizing its response.
Image Acquisition Using a Single Sensor
The most familiar sensor of this type is the photodiode, which is constructed of silicon
materials and whose output voltage waveform is proportional to light. The use of a filter in front
of a sensor improves selectivity. In order to generate a 2-D image using a single sensor, there has
to be relative displacements in both the x- and y-directions between the sensor and the area to be
imaged.
Figure shows an arrangement used in high-precision scanning, where a film negative is
mounted onto a drum whose mechanical rotation provides displacement in one dimension. The
single sensor is mounted on a lead screw that provides motion in the perpendicular direction.
Microdensitometers
In microdensitometers the transparency or photograph is mounted on a flat bed or
wrapped around a drum. Scanning is accomplished by focusing a beam of light (which could be
a laser) on the image and translating the bed or rotating the drum in relation to the beam. In the
case of transparencies, the beam passes through the film; in photographs the beam is reflected
from the surface of the image. In both cases, the beam is focused on a photodetector and the gray
level at any point in the image is obtained by allowing only discrete values of intensity and
position in the output.
Image Acquisition Using Sensor Strips
A geometry that is used much more frequently than single sensors consists of an in-line
arrangement of sensors in the form of a sensor strip.
The strip provides imaging elements in one direction. Motion perpendicular to the strip
provides imaging in the other direction as above figure shows. This is the type of arrangement
used in most flatbed scanners. Solid-state arrays are composed of discrete silicon imaging
elements, called photosites that have a voltage output proportional to the intensity of the incident
light. The figure below shows a typical line scan sensor containing a row of photosites, two
transfer gates used to clock the contents of the imaging elements into transport registers, and an
output gate used to clock the contents of the transport registers into an amplifier. The amplifier
outputs a voltage signal proportional to the contents of the row of photosites.
Sensor strips mounted in a ring configuration are used in medical and industrial imaging
to obtain cross-sectional (“slice”) images of 3-D objects, as the figure shows. A rotating X-ray
source provides illumination, and the portion of the sensors opposite the source collects the X-ray
energy that passes through the object (the sensors obviously have to be sensitive to X-ray energy).
A 3-D digital volume consisting of stacked images is generated as the object is moved in a
direction perpendicular to the sensor ring.
Image Acquisition Using Sensor Arrays
Numerous electromagnetic and some ultrasonic sensing devices frequently are arranged
in an array format. This is also the predominant arrangement found in digital cameras. A typical
sensor for these cameras is a CCD array. Charge-coupled device (CCD) arrays are similar to line-scan
sensors, except that the photosites are arranged in a matrix form and a gate/transport-register
combination separates columns of photosites.
The principal manner in which array sensors are used is shown in figure. This figure
shows the energy from an illumination source being reflected from a scene element, but, as
mentioned at the beginning of this section, the energy also could be transmitted through the
scene elements. The first function performed by the imaging system shown in figure (c) is to
collect the incoming energy and focus it onto an image plane. If the illumination is light, the
front end of the imaging system is a lens, which projects the viewed scene onto the lens focal
plane, as figure (d) shows. The sensor array, which is coincident with the focal plane, produces
outputs proportional to the integral of the light received at each sensor. Digital and analog
circuitry sweeps these outputs and converts them to a video signal, which is then digitized by
another section of the imaging system.
STORAGE
Digital storage for image processing applications falls into three principal categories:
i) Short-term storage for use during processing
ii) On-line storage for relatively fast recall and
iii) Archival storage, characterized by infrequent access
• One method for providing short-term storage is computer memory. Another is
specialized boards, called frame buffers, that store one or more images and can be
accessed rapidly.
• On-line storage generally takes the form of magnetic disks. Jukeboxes that hold 30-100
optical disks provide an effective solution for large-scale, on-line storage applications that
require read-write capability.
• Archival storage is characterized by massive storage requirements but infrequent need
for access. Magnetic tapes and optical disks are the usual media for archival applications.
PROCESSING
Processing of digital images involves procedures that are usually expressed in
algorithmic form. Most image processing functions are implemented in software. The only reason
for specialized image processing hardware is the need for speed in some applications or to
overcome some fundamental computer limitations. Image processing is characterized by specific
solutions; hence techniques that work well in one area can be totally inadequate in another.
COMMUNICATION
Communication in digital image processing primarily involves local communication
between image processing systems and remote communication from one point to another.
Hardware and software for local communication are readily available for most computers.
Communication of images across vast distances presents a more serious challenge. A voice-
grade telephone line is cheaper to use but slower. Wireless links using intermediate stations, such
as satellites, are much faster, but they also cost considerably more.
DISPLAY
Monochrome and color monitors are the principal display devices used in modern image
processing systems. Monitors are driven by the outputs of an image display module in the
backplane of the host computer. The signals of the display module can also be fed into an image
recording device that produces a hard copy of the image being viewed in the monitor screen.
Other display media include random access cathode ray tubes (CRTs) and printing devices.
A SIMPLE IMAGE FORMATION MODEL
The term image refers to a two dimensional light-intensity function, denoted by f(x,y),
where the value or amplitude of f at spatial coordinates (x,y) gives the intensity (brightness) of
the image at that point. As light is a form of energy, f(x,y) must be nonzero and finite, that is,
0 < f(x,y) < ∞
The images people perceive in everyday visual activities normally consist of light
reflected from objects. The function f(x, y) may be characterized by two components: (1) the
amount of source illumination incident on the scene being viewed, and (2) the amount of
illumination reflected by the objects in the scene. Appropriately, these are called the illumination
and reflectance components and are denoted by i(x, y) and r(x, y), respectively. The two
functions combine as a product to form f(x, y):
f(x, y) = i(x, y) r(x, y)
where
0 < i(x, y) < ∞
and
0 < r(x, y) < 1
The above equation indicates that reflectance is bounded by 0 (total absorption) and 1 (total
reflectance). The nature of i(x, y) is determined by the illumination source, and r(x, y) is
determined by the characteristics of the imaged objects.
The values given in the above equations are theoretical bounds. The following average
numerical figures illustrate some typical ranges of i(x, y) for visible light. On a clear day, the sun
may produce in excess of 90,000 foot-candles of illumination on the surface of the Earth. This
figure decreases to less than 10,000 foot-candles on a cloudy day. On a clear evening, a full
moon yields about 0.1 foot-candles of illumination. The typical illumination level in a
commercial office is about 1000 foot-candles. Similarly, the following are some typical values of
r(x, y): 0.01 for black velvet, 0.65 for stainless steel, 0.80 for flat-white wall paint, 0.90 for
silver-plated metal, and 0.93 for snow.
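Plugging the figures above into the formation model f(x, y) = i(x, y) r(x, y) gives a feel for the numbers involved; pairing office illumination with these two reflectances is just an illustration.

```python
# Image formation model: f = i * r, using typical values quoted above.
i_office = 1000.0                  # office illumination, foot-candles

f_velvet = i_office * 0.01         # black velvet reflectance -> 10.0
f_snow   = i_office * 0.93         # snow reflectance         -> 930.0

# Both values satisfy the theoretical bounds 0 < f < infinity.
ok = 0.0 < f_velvet < float("inf") and 0.0 < f_snow < float("inf")
```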
The intensity of a monochrome image at any coordinates (x, y) is called the gray level (l) of the
image at that point. That is,
l = f(x, y)
From the equations of illumination and reflectance, it is evident that l lies in the range
Lmin ≤ l ≤ Lmax
In theory, the only requirement on Lmin is that it be positive, and on Lmax that it be finite. In
practice, Lmin = imin rmin and Lmax = imax rmax. Using the preceding average office illumination and
range of reflectance values as guidelines, we may expect Lmin ≈ 10 and Lmax ≈ 1000 to be typical
limits for indoor values in the absence of additional illumination. The interval [Lmin, Lmax] is called
the gray scale. Common practice is to shift this interval numerically to the interval [0, L-1], where
l = 0 is considered black and l = L-1 is considered white on the gray scale. All intermediate values
are shades of gray varying from black to white.
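Shifting [Lmin, Lmax] to [0, L-1] can be sketched as a linear mapping; L = 256 and the indoor limits Lmin ≈ 10 and Lmax ≈ 1000 are taken from the discussion above, and the linear form of the mapping is an assumption.

```python
L = 256
Lmin, Lmax = 10.0, 1000.0          # typical indoor limits from the text

def to_gray_level(l):
    """Linearly map an intensity l in [Lmin, Lmax] to an integer in [0, L-1]."""
    return int(round((l - Lmin) / (Lmax - Lmin) * (L - 1)))

black = to_gray_level(Lmin)        # -> 0   (black)
white = to_gray_level(Lmax)        # -> 255 (white)
```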
UNIFORM SAMPLING & QUANTIZATION
There are numerous ways to acquire images, but our objective in all is the same: to
generate digital images from sensed data. The output of most sensors is a continuous voltage
waveform whose amplitude and spatial behavior are related to the physical phenomenon being
sensed. To create a digital image, we need to convert the continuous sensed data into digital
form. This involves two processes: sampling and quantization.
Basic Concepts in Sampling and Quantization
The basic idea behind sampling and quantization is illustrated in following figure. Figure
(a) shows a continuous image, f(x, y), that we want to convert to digital form. An image may be
continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to
digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the
coordinate values is called sampling. Digitizing the amplitude values is called quantization.
The one-dimensional function shown in Fig. (b) is a plot of amplitude (gray level) values
of the continuous image along the line segment AB in Fig.(a).The random variations are due to
image noise. To sample this function, we take equally spaced samples along line AB, as shown
in Fig. (c).The location of each sample is given by a vertical tick mark in the bottom part of the
figure. The samples are shown as small white squares superimposed on the function. The set of
these discrete locations gives the sampled function. However, the values of the samples still span
(vertically) a continuous range of gray-level values. In order to form a digital function, the gray-
level values also must be converted (quantized) into discrete quantities. The right side of Fig. (c)
shows the gray-level scale divided into eight discrete levels, ranging from black to white. The
vertical tick marks indicate the specific value assigned to each of the eight gray levels. The
continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to
each sample. The assignment is made depending on the vertical proximity of a sample to a
vertical tick mark. The digital samples resulting from both sampling and quantization are shown
in Fig.(d). Starting at the top of the image and carrying out this procedure line by line produces a
two-dimensional digital image. Quantization of the sensor outputs completes the process of
generating a digital image.
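The two steps can be sketched on a 1-D signal standing in for the gray-level profile along line AB; the sinusoidal signal, the 16 sample locations, and the 8 quantization levels are illustrative assumptions.

```python
import numpy as np

def profile(x):
    # A continuous amplitude profile in (0, 1), standing in for the
    # gray levels along line AB.
    return 0.5 + 0.4 * np.sin(2 * np.pi * x)

# Sampling: digitize the coordinate values (equally spaced locations).
xs = np.linspace(0.0, 1.0, 16)
samples = profile(xs)              # amplitudes are still continuous

# Quantization: digitize the amplitudes by assigning each sample to the
# nearest of 8 discrete levels (integers 0..7).
levels = 8
quantized = np.round(samples * (levels - 1)).astype(int)
```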
Representing Digital Images
To be suitable for computer processing, an image function f(x,y) must be digitized both
spatially and in amplitude. Digitization of the spatial coordinates (x,y) is called image sampling,
and amplitude digitization is called gray-level quantization.
The result of sampling and quantization is a matrix of real numbers. Suppose that a
continuous image f(x, y) is approximated by equally spaced samples arranged in the form of an
N x M matrix, where each element of the array is a discrete quantity:

f(x, y) ≈ [ f(0, 0)     f(0, 1)     ...  f(0, M-1)
            f(1, 0)     f(1, 1)     ...  f(1, M-1)
            ...
            f(N-1, 0)   f(N-1, 1)   ...  f(N-1, M-1) ]

The right side of this equation is by definition a digital image. Each element of this matrix array
is called an image element, picture element, pixel, or pel. The terms image and pixel will be used
throughout the rest of our discussions to denote a digital image and its elements.
Expressing sampling and quantization in more formal mathematical terms can be useful
at times. Let Z and R denote the set of integers and the set of real numbers, respectively. The
sampling process may be viewed as partitioning the xy plane into a grid, with the coordinates of
the center of each grid cell being a pair of elements from the Cartesian product Z2 (Z x Z), which is
the set of all ordered pairs of elements (zi, zj), with zi and zj being integers from Z. Hence, f(x, y) is a
digital image if (x, y) are integers from Z2 and f is a function that assigns a gray-level value (that
is, a real number from the set of real numbers, R) to each distinct pair of coordinates (x, y). This
functional assignment obviously is the quantization process described earlier. If the gray levels
also are integers (as usually is the case in this and subsequent chapters), Z replaces R, and a
digital image then becomes a 2-D function whose coordinates and amplitude values are integers.
This digitization process requires decisions about values for M, N, and for the number, L,
of discrete gray levels allowed for each pixel. There are no requirements on M and N, other than
that they have to be positive integers. However, due to processing, storage, and sampling
hardware considerations, the number of gray levels typically is an integer power of 2:
L = 2^k
We assume that the discrete levels are equally spaced and that they are integers in the interval [0,
L-1]. Sometimes the range of values spanned by the gray scale is called the dynamic range of an
image, and we refer to images whose gray levels span a significant portion of the gray scale as
having a high dynamic range. When an appreciable number of pixels exhibit this property, the
image will have high contrast. Conversely, an image with low dynamic range tends to have a
dull, washed out gray look.
The number, b, of bits required to store a digitized image is
b = M x N x k
When M = N, this equation becomes
b = N^2 x k
Table shows the number of bits required to store square images with various values of N
and k. The number of gray levels corresponding to each value of k is shown in parentheses:
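The entries of such a table follow directly from b = N^2 x k, so they can be regenerated with a few lines of code (the particular N and k values chosen below are illustrative):

```python
# Bits required to store a square N x N image with k bits per pixel,
# from b = N^2 * k (each k allows L = 2^k gray levels).
def bits_required(N, k):
    return N * N * k

for N in (32, 64, 128, 256, 512, 1024):
    row = {k: bits_required(N, k) for k in (1, 2, 4, 8)}
    print(N, row)
```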
Spatial and Gray-Level Resolution
The resolution of an image strongly depends on two parameters: no. of samples and gray
levels. Sampling is the principal factor determining the spatial resolution of an image. Basically,
spatial resolution is the smallest discernible detail in an image. Gray-level resolution similarly
refers to the smallest discernible change in gray level. Due to hardware considerations, the
number of gray levels is usually an integer power of 2.
Now, let us consider the effect that variations in N and k have on the image quality
Effect of Reducing the Spatial Resolution:
Figure 2.19 shows an image of size 1024x1024 pixels whose gray levels are represented
by 8 bits. The other images shown in Fig. 2.19 are the results of sub-sampling the 1024x1024
image. The sub-sampling was accomplished by deleting the appropriate number of rows and
columns from the original image. For example, the 512x512 image was obtained by deleting
every other row and column from the 1024x1024 image. The 256x256 image was generated by
deleting every other row and column in the 512x512 image, and so on. The number of allowed
gray levels was kept at 256.
The simplest way to compare these effects is to bring all the sub-sampled images up to
size 1024x1024 by row and column pixel replication. In the 512x512 image, the level of detail
lost is simply too fine to be seen on the printed page at the scale in which these images are
shown. Next, the 256x256 image in Fig. 2.20(c) shows a very slight fine checkerboard pattern in
the borders between flower petals and the black background. A slightly more pronounced
graininess throughout the image also is beginning to appear. These effects are much more visible
in the 128x128 image in Fig. 2.20(d), and they become pronounced in the 64x64 and 32x32
images in Figs. 2.20(e) and (f), respectively.
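The procedure described above (delete every other row and column, then compare after pixel replication) can be sketched directly with array slicing; the small 16x16 array stands in for the 1024x1024 image.

```python
import numpy as np

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # stand-in image

half = img[::2, ::2]           # delete every other row and column
quarter = half[::2, ::2]       # repeat to halve the resolution again

# Pixel replication brings a sub-sampled image back to the original
# size for side-by-side comparison.
replicated = half.repeat(2, axis=0).repeat(2, axis=1)
```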
Effect of Reducing Gray Level Resolution (GRAY TO BINARY CONVERSION)
In this example, we keep the number of samples constant and reduce the number of gray
levels from 256 to 2, in integer powers of 2. Figure 2.21(a) is a 452x374 image, displayed with
k = 8 (256 gray levels).
Figures 2.21(b) through (h) were obtained by reducing the number of bits from k = 7 to
k = 1 while keeping the spatial resolution constant at 452x374 pixels. The 256-, 128-, and 64-level
images are visually identical for all practical purposes. The 32-level image shown in Fig. 2.21(d),
however, has an almost imperceptible set of very fine ridge-like structures in areas of smooth
gray levels (particularly in the skull). This effect, caused by the use of an insufficient number of
gray levels in smooth areas of a digital image, is called false contouring, so named because the
ridges resemble topographic contours in a map. False contouring generally is quite visible in
images displayed using 16 or fewer uniformly spaced gray levels, as the images in Figs. 2.21(e)
through (h) show.
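Reducing an 8-bit image to 2^k equally spaced levels while keeping the spatial resolution fixed amounts to discarding low-order bits; the smooth ramp image below is an illustrative stand-in that makes the resulting banding (false contouring) easy to verify.

```python
import numpy as np

img = np.tile(np.arange(256, dtype=np.uint8), (8, 1))   # smooth 8x256 ramp

def reduce_levels(image, k):
    """Requantize an 8-bit image to 2**k equally spaced gray levels."""
    shift = 8 - k
    return (image >> shift) << shift    # drop low bits, keep level spacing

img_k4 = reduce_levels(img, 4)   # 16 levels: banding appears in the ramp
img_k1 = reduce_levels(img, 1)   # 2 levels: a binary image
```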
Iso Preference Curves
The results in the above examples illustrate the effects produced on image quality by varying
N and k independently. However, these results only partially answer the question of how varying
N and k affects images, because we have not considered yet any relationships that might exist
between these two parameters. An early study attempted to quantify experimentally the
effects on image quality produced by varying N and k simultaneously. The experiment consisted
of a set of subjective tests. Images similar to those shown in Fig. 2.22 were used. The woman's
face is representative of an image with relatively little detail; the picture of the cameraman
contains an intermediate amount of detail; and the crowd picture contains, by comparison, a large
amount of detail.
Sets of these three types of images were generated by varying N and k, and observers
were then asked to rank them according to their subjective quality. Results were summarized in
the form of so-called isopreference curves in the Nk-plane. Each point in the Nk-plane represents
an image having values of N and k equal to the coordinates of that point. Points lying on an
isopreference curve correspond to images of equal subjective quality. It was found in the course
of the experiments that the isopreference curves tended to shift right and upward, but their shapes
in each of the three image categories were similar to those shown in Fig. 2.23. This is not
unexpected, since a shift up and right in the curves simply means larger values for N and k,
which implies better picture quality.
The key point of interest here is that isopreference curves tend to become more vertical as
the detail in the image increases. This result suggests that for images with a large amount of
detail only a few gray levels may be needed. For example, the isopreference curve in Fig. 2.23
corresponding to the crowd is nearly vertical. This indicates that, for a fixed value of N, the
perceived quality for this type of image is nearly independent of the number of gray levels used
(for the range of gray levels shown in Fig. 2.23). It is also of interest to note that perceived
quality in the other two image categories remained the same in some intervals in which the
spatial resolution was increased, but the number of gray levels actually decreased. The most
likely reason for this result is that a decrease in k tends to increase the apparent contrast of an
image, a visual effect that humans often perceive as improved quality in an image.
NON-UNIFORM SAMPLING AND QUANTIZATION
Non-Uniform Sampling
For a fixed value of spatial resolution, the appearance of an image can be improved in
many cases by using an adaptive scheme where the sampling process depends on the
characteristics of the image. In general, fine sampling is required in the neighborhood of sharp
gray-level transitions, whereas coarse sampling may be utilized in relatively smooth regions.
Consider, for example, a simple image consisting of a face superimposed on a uniform
background. Clearly, the background carries little detailed information and can be quite
adequately represented by coarse sampling. The face, however, contains considerably more
detail. If the additional samples not used in the background are used in the detailed region of the
image, the overall result would tend to improve. In distributing the samples, greater sample
concentration should be used at gray-level transition boundaries, such as the boundary between
the face and the background.
Disadvantages or Drawbacks:
The necessity of having to identify boundaries is a definite drawback of the non-uniform
sampling approach.
This method is also not practical for images containing relatively small uniform regions
(e.g., the crowd image).
Non-Uniform Quantization
When the number of gray levels must be kept small, the use of unequally spaced levels in the
quantization process usually is desirable. A method similar to the non-uniform sampling
technique may be used for the distribution of gray levels in an image. As the eye is relatively
poor at estimating shades of gray near abrupt level changes, the approach in this case is to use
few gray levels in the neighborhood of boundaries. The remaining levels can then be used in
regions where gray-level variations are smooth, thus avoiding or reducing the false contours that
often appear in these regions if they are too coarsely quantized.
Disadvantages
This method is subject to the preceding observations about boundary detection and
detail content.
An alternative technique that is particularly attractive for distributing gray levels consists of
computing the frequency of occurrence of allowed levels. If gray levels in a certain range occur
frequently, while others occur rarely, the quantization levels are finely spaced in this range and
coarsely spaced outside of it. This method is sometimes called TAPERED QUANTIZATION.
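As a minimal sketch of the idea (the function name and the equal-population placement rule are illustrative choices, not prescribed by the text), one simple way to space levels by frequency of occurrence is to give each reconstruction level an equal share of the observed pixel population:

```python
def tapered_levels(pixels, n_levels):
    """Sketch of tapered quantization: place each reconstruction level at the
    median of an equal-population bucket of the sorted samples, so levels crowd
    together where gray values occur often and spread out where they are rare."""
    ordered = sorted(pixels)
    n = len(ordered)
    # integer index of the midpoint of bucket i
    return [ordered[(2 * i + 1) * n // (2 * n_levels)] for i in range(n_levels)]

# A mostly-dark sample set: two of the three levels land in the dark range
print(tapered_levels([8, 9, 10, 10, 11, 12, 200, 250], 3))  # [9, 11, 200]
```

The bright outliers get only one coarse level, while the frequently occurring dark values receive finely spaced levels.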
SOME BASIC RELATIONSHIPS BETWEEN PIXELS
An image is denoted by f(x, y). When referring in this section to a particular pixel, we use
lowercase letters, such as p and q. A subset of pixels of f(x, y) is denoted by S.
Neighbors of a Pixel
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a unit distance
from (x, y), and some of the neighbors of p lie outside the digital image if (x, y) is on the border
of the image.
The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors
of p, denoted by N8(p). As before, some of the points in ND(p) and N8(p) fall outside the image if
(x, y) is on the border of the image.
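The neighbor sets above can be sketched in a few lines (a minimal illustration; the helper names are mine, and border handling is left to the caller, as in the text):

```python
def n4(p):
    """4-neighbors of p = (x, y): the horizontal and vertical neighbors."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """The four diagonal neighbors of p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbors: the union of N4(p) and ND(p)."""
    return n4(p) | nd(p)

print(sorted(n4((1, 1))))  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

Note that, as the text observes, some returned coordinates fall outside the image when p lies on the border.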
Connectivity, Adjacency, Regions & Boundaries
Connectivity between pixels is a fundamental concept that simplifies the definition of
numerous digital image concepts, such as regions and boundaries. To establish if two pixels are
connected, it must be determined if they are neighbors and if their gray levels satisfy a specified
criterion of similarity (say, if their gray levels are equal). For instance, in a binary image with
values 0 and 1, two pixels may be 4-neighbors, but they are said to be connected only if they
have the same value.
Let V be the set of gray-level values used to define connectivity. In a binary image,
V={1} if we are referring to connectivity of pixels with value 1. In a grayscale image, the idea is
the same, but set V typically contains more elements. For example, in the connectivity of pixels
with a range of intensity values, say, 32 to 64, it follows that V = {32, 33, …, 63, 64}. We consider
three types of connectivity:
(a) 4-connectivity: Two pixels p and q with values from V are 4-connected if q is in the set N4(p).
(b) 8-connectivity: Two pixels p and q with values from V are 8-connected if q is in the set N8(p).
(c) m-connectivity (mixed connectivity): Two pixels p and q with values from V are m-connected if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Mixed connectivity is a modification of 8-connectivity. It is introduced to eliminate the
ambiguities (multiple path connections) that often arise when 8-adjacency is used. For example,
consider the pixel arrangement shown in figure (a). For V = {1}, the three pixels at the top of
the figure show multiple (ambiguous) 8-adjacency, as indicated by the dashed lines in (b). This
ambiguity is removed by using m-adjacency, as shown in (c).
[Figure: (a) a pixel arrangement; (b) the ambiguous 8-adjacency paths, shown dashed; (c) the unique m-adjacency path]
A pixel p is adjacent to a pixel q if they are connected. Accordingly, there are three types of adjacency:
(a) 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(b) 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(c) m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
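The two-part m-adjacency test can be sketched directly from the definition (a minimal, self-contained illustration; the function name, the row-list image layout, and the sample arrangement are my own choices):

```python
def m_adjacent(p, q, img, V):
    """True if pixels p and q (both with values in V) are m-adjacent.
    img is a list of rows; a pixel (x, y) has value img[y][x]."""
    def value(pt):
        x, y = pt
        if 0 <= y < len(img) and 0 <= x < len(img[0]):
            return img[y][x]
        return None  # outside the image

    def n4(pt):
        x, y = pt
        return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

    def nd(pt):
        x, y = pt
        return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

    if value(p) not in V or value(q) not in V:
        return False
    if q in n4(p):                                   # condition (i)
        return True
    if q in nd(p):                                   # condition (ii)
        return not any(value(r) in V for r in n4(p) & n4(q))
    return False

# A small binary arrangement, V = {1}
img = [[0, 1, 1],
       [0, 1, 0],
       [0, 0, 1]]
print(m_adjacent((1, 0), (2, 0), img, {1}))  # True: q is in N4(p)
print(m_adjacent((1, 1), (2, 0), img, {1}))  # False: already connected through (1, 0)
```

The second call shows the ambiguity removal: the diagonal link is suppressed because a 4-connected route through a pixel of V already exists.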
A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is
a sequence of distinct pixels with coordinates
(x0,y0),(x1,y1),…,(xn,yn)
where (x0,y0) = (x,y), (xn,yn) = (s,t), and (xi,yi) is adjacent to (xi-1,yi-1) for 1 ≤ i ≤ n. In this case, n is
the length of the path.
If (x0,y0) = (xn,yn), the path is a closed path.
We can define 4-, 8-, or m- paths depending on the type of adjacency specified. For
example, the paths shown in above figure (b) between the northeast and southeast points are 8-
paths, and the path in figure (c) is an m-path.
Let S represent a subset of pixels in an image. Two pixels p and q are said to be
connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel
p in S, the set of pixels that are connected to it in S is called a connected component of S. If it
only has one connected component, then set S is called a connected set.
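The connected-component idea above can be sketched as a breadth-first flood fill (an illustrative sketch; the function name, the row-list image layout, and the assumption that the seed pixel has a value in V are mine):

```python
from collections import deque

def connected_component(img, seed, V, connectivity=8):
    """All pixels connected to seed within value set V, under 4- or
    8-connectivity, found by breadth-first search. The seed pixel is
    assumed to have a value in V."""
    h, w = len(img), len(img[0])
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if connectivity == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    component, frontier = {seed}, deque([seed])
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in offsets:
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in component \
                    and img[ny][nx] in V:
                component.add((nx, ny))
                frontier.append((nx, ny))
    return component

img = [[1, 1, 0],
       [0, 0, 0],
       [0, 0, 1]]
print(len(connected_component(img, (0, 0), {1})))  # 2: the two 1-pixels on top
```

Here the image has two connected components of value 1; the set S = {all 1-pixels} is therefore not a connected set.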
Let R be a subset of pixels in an image. We call R a region of the image if R is a
connected set. The boundary (also called border or contour) of a region R is the set of pixels in
the region that have one or more neighbors that are not in R. If R happens to be an entire image
(which we recall is a rectangular set of pixels), then its boundary is defined as the set of pixels in
the first and last rows and columns of the image. This extra definition is required because an
image has no neighbors beyond its border. Normally, when we refer to a region, we are referring
to a subset of an image, and any pixels in the boundary of the region that happen to coincide with
the border of the image are included implicitly as part of the region boundary.
The concept of an edge is found frequently in discussions dealing with regions and
boundaries. There is a key difference between these concepts, however. The boundary of a finite
region forms a closed path and is thus a “global” concept. Edges are formed from pixels with
derivative values that exceed a preset threshold. Thus, the idea of an edge is a “local” concept
that is based on a measure of gray-level discontinuity at a point. It is possible to link edge points
into edge segments, and sometimes these segments are linked in such a way that they correspond to
boundaries, but this is not always the case. The one exception in which edges and boundaries
correspond is in binary images. Depending on the type of connectivity and edge operators used,
the edge extracted from a binary region will be the same as the region boundary. Conceptually, it
is helpful to think of edges as intensity discontinuities and boundaries as closed paths.
Distance Measures
For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w), respectively, D is a
distance function or metric if
(a) D(p, q) ≥ 0 (D(p, q)=0 if and only if p=q),
(b) D(p, q)=D(q, p), and
(c) D(p, z) ≤D(p, q)+D(q, z).
The Euclidean distance between p and q is defined as
De(p,q) = [(x-s)^2 + (y-t)^2]^(1/2)
For this distance measure, the pixels having a distance less than or equal to some value r from
(x,y) are the points contained in a disk of radius r centered at (x, y).
The D4 distance (also called city-block distance) between p and q is defined as
D4(p,q) = |x-s| + |y-t|
In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a
diamond centered at (x, y). For example, the pixels with D4 distance ≤ 2 from (x, y) (the center
point) form the following contours of constant distance:

        2
      2 1 2
    2 1 0 1 2
      2 1 2
        2

The pixels with D4 = 1 are the 4-neighbors of (x, y).
The D8 distance (also called chessboard distance) between p and q is defined as
D8(p,q) = max(|x-s|, |y-t|)
In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a
square centered at (x, y). For example, the pixels with D8 distance ≤ 2 from (x, y) (the center
point) form the following contours of constant distance:

    2 2 2 2 2
    2 1 1 1 2
    2 1 0 1 2
    2 1 1 1 2
    2 2 2 2 2
The pixels with D8 = 1 are the 8-neighbors of (x, y).
Note that the D4 and D8 distances between p and q are independent of any paths that
might exist between the points because these distances involve only the coordinates of the points.
If we elect to consider m-adjacency, however, the Dm distance between two points is defined as
the shortest m-path between the points. In this case, the distance between two pixels will depend
on the values of the pixels along the path, as well as the values of their neighbors.
For instance, consider the following arrangement of pixels, and assume that p, p2, and
p4 have value 1 and that p1 and p3 can have a value of 0 or 1:

       p3  p4
   p1  p2
   p
Suppose that we consider adjacency of pixels valued 1 (i.e.,V={1}). If p1 and p3 are 0,
the length of the shortest m-path (the Dm distance) between p and p4 is 2. If p1 is 1, then p2 and
p will no longer be m-adjacent (see the definition of m-adjacency) and the length of the shortest
m-path becomes 3 (the path goes through the points p-p1-p2-p4). Similar comments apply if p3 is
1 (and p1 is 0); in this case, the length of the shortest m-path also is 3. Finally, if both p1 and p3
are 1, the length of the shortest m-path between p and p4 is 4. In this case, the path goes through
the sequence of points p-p1-p2-p3-p4.
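The three coordinate-based distance measures defined above (De, D4, D8) can be sketched as follows (a minimal illustration; the function names are mine — the Dm distance is omitted because it also depends on pixel values along the path):

```python
def d_e(p, q):
    """Euclidean distance between pixels p = (x, y) and q = (s, t)."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def d4(p, q):
    """City-block distance: |x - s| + |y - t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance: max(|x - s|, |y - t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_e(p, q), d4(p, q), d8(p, q))  # 5.0 7 4
```

As expected, D4 ≥ D8 always, and the pixels with d4 = 1 or d8 = 1 are exactly the 4- and 8-neighbors.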
IMAGING GEOMETRY
SOME BASIC TRANSFORMATIONS
In this section, all transformations are expressed in a three-dimensional Cartesian coordinate
system in which a point has coordinates denoted as (X, Y, Z).
Translation
Suppose that the task is to translate a point with coordinates (X, Y, Z) to a new location by using
displacement (X0, Y0, Z0). The translation is easily accomplished by using the equations:
X* = X + X0
Y* = Y + Y0
Z* = Z + Z0
where (X*, Y*, Z*) are the coordinates of the new point. The above equations can be represented in
matrix form as

  [X*]   [1 0 0 X0] [X]
  [Y*] = [0 1 0 Y0] [Y]
  [Z*]   [0 0 1 Z0] [Z]
                    [1]
It is often useful to concatenate several transformations to produce a composite result, such as
translation, followed by scaling and then rotation. The use of square matrices simplifies the
notational representation of this process considerably. So, the above can be modified as:

  [X*]   [1 0 0 X0] [X]
  [Y*] = [0 1 0 Y0] [Y]
  [Z*]   [0 0 1 Z0] [Z]
  [1 ]   [0 0 0 1 ] [1]
Let us consider the unified matrix representation
v* = Av
where A is a 4x4 transformation matrix, v is the column vector containing the original
coordinates,

      [X]
  v = [Y]
      [Z]
      [1]

and v* is a column vector whose components are the transformed coordinates:

       [X*]
  v* = [Y*]
       [Z*]
       [1 ]
With this notation, the matrix used for translation is

      [1 0 0 X0]
  T = [0 1 0 Y0]
      [0 0 1 Z0]
      [0 0 0 1 ]
and the translation process is accomplished by the equation v* = Tv.
Scaling
Scaling by factors Sx, Sy, Sz along the X, Y, and Z axes is given by the transformation matrix

      [Sx 0  0  0]
  S = [0  Sy 0  0]
      [0  0  Sz 0]
      [0  0  0  1]
Rotation
The transformations used for 3-D rotation are more complex. The simplest form of these
transformations is for rotation of a point about the coordinate axes. To rotate a point about
another arbitrary point in space requires three transformations: the first translates the arbitrary
point to the origin, the second performs the rotation, and the third translates the point back to
the original position.
Rotation of a point about the Z coordinate axis by an angle θ is achieved by using the
transformation

       [ cosθ  sinθ  0  0]
  Rθ = [-sinθ  cosθ  0  0]
       [  0     0    1  0]
       [  0     0    0  1]
The rotation angle θ is measured clockwise when looking at the origin from a point on the +Z axis.
This transformation affects only the values of the X and Y coordinates.
Rotation of a point about the X axis by an angle α is performed by using the transformation

       [1    0     0    0]
  Rα = [0   cosα  sinα  0]
       [0  -sinα  cosα  0]
       [0    0     0    1]
Finally, the rotation of a point about the Y axis by an angle β is achieved using the transformation

       [cosβ  0  -sinβ  0]
  Rβ = [ 0    1    0    0]
       [sinβ  0   cosβ  0]
       [ 0    0    0    1]
Concatenation and Inverse transformations
The application of several transformations can be represented by a single 4x4 transformation
matrix. For example, translation, scaling, and rotation about Z axis of a point v is given by
v* = Rθ(S(Tv)) = Av
where A is a 4x4 matrix, A = RθST. These matrices generally do not commute, so the order of
application is important.
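The matrices above and their concatenation can be sketched in plain Python (an illustrative sketch; the function names and the worked numbers are mine, and only the Z-axis rotation is shown):

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def translation(x0, y0, z0):
    return [[1, 0, 0, x0], [0, 1, 0, y0], [0, 0, 1, z0], [0, 0, 0, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Composite A = Rθ S T applied to the homogeneous point v = [X Y Z 1]^T
A = matmul(rotation_z(math.pi / 2), matmul(scaling(2, 2, 2), translation(1, 0, 0)))
v = [[0], [0], [0], [1]]            # the origin, in homogeneous coordinates
v_star = matmul(A, v)               # translate to (1,0,0), scale to (2,0,0), rotate
print([round(c[0], 6) for c in v_star])  # [0.0, -2.0, 0.0, 1.0]
```

Swapping the order of the factors in A gives a different result, which is exactly the non-commutativity noted above.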
The same ideas can be extended to transforming a set of m points simultaneously by using a single
transformation. Let v1, v2, …, vm represent the coordinates of the m points. For the 4xm matrix V
whose columns are these vectors, the simultaneous transformation of all the points by a 4x4
transformation matrix A is given by
V* = AV
The resulting matrix V* is 4xm. Its ith column, vi*, contains the coordinates of the
transformed point corresponding to vi.
Many of the transformations discussed above have inverse matrices that perform the
opposite transformation and can be obtained by inspection. For example, the inverse
translation matrix is

         [1 0 0 -X0]
  T^-1 = [0 1 0 -Y0]
         [0 0 1 -Z0]
         [0 0 0  1 ]
QUESTIONS & ANSWERS
1. Why do we process images?
Image Processing has been developed in response to three major problems concerned with
pictures:
� Picture digitization and coding to facilitate transmission, printing and storage of pictures.
� Picture enhancement and restoration in order, for example, to interpret more easily
pictures of the surface of other planets taken by various probes.
� Picture segmentation and description as an early stage in Machine Vision.
2. What is the brightness of an image at a pixel position?
Each pixel of an image corresponds to a part of a physical object in the 3D world. This
physical object is illuminated by some light which is partly reflected and partly absorbed by it.
Part of the reflected light reaches the sensor used to image the scene and is responsible for the
value recorded for the specific pixel. The recorded value of course, depends on the type of sensor
used to image the scene, and the way this sensor responds to the spectrum of the reflected light.
However, as a whole scene is imaged by the same sensor, we usually ignore these details. What
is important to remember is that the brightness values of different pixels have significance only
relative to each other and they are meaningless in absolute terms. So, pixel values between
different images should only be compared if either care has been taken for the physical processes
used to form the two images to be identical, or the brightness values of the two images have
somehow been normalized so that the effects of the different physical processes have been
removed.
3. Why are images often quoted as being 512 X 512, 256 X 256, 128 X 128 etc?
Many calculations with images are simplified when the size of the image is a power of 2.
4. How many bits do we need to store an image?
The number of bits, b, we need to store an image of size N x N with 2^m different grey levels is:
b = N x N x m
So, for a typical 512 X 512 image with 256 grey levels (m = 8) we need 2,097,152 bits or
262,144 8-bit bytes. That is why we often try to reduce m and N, without significant loss in the
quality of the picture.
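The storage formula b = N x N x m is trivial to check (the function name is my own):

```python
def image_bits(N, m):
    """Bits needed for an N x N image with 2**m gray levels: b = N * N * m."""
    return N * N * m

b = image_bits(512, 8)   # 512 x 512 image with 256 gray levels
print(b, b // 8)         # 2097152 bits = 262144 bytes
```

Halving N quarters the storage, while reducing m shrinks it only linearly, which is why spatial resolution dominates file size.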
5. Consider the two image subsets, S1 and S2, shown in the following figure. For V={1},
determine whether these two subsets are (a) 4-adjacent, (b) 8-adjacent, or (c) m-adjacent.
Let p and q be as shown in Figure. Then,
(a) S1 and S2 are not 4-connected because q is not in the set N4(p).
(b) S1 and S2 are 8-connected because q is in the set N8(p).
(c) S1 and S2 are m-connected because (i) q is in ND(p), and (ii) the set N4(p) ∩ N4(q) is empty.
6. Consider the image segment shown.
(a) Let V={0, 1} and compute the lengths of the shortest 4-, 8-, and m-path between p and q. If
a particular path does not exist between these two points, explain why.
(b) Repeat for V={1, 2}.
(a) When V = {0, 1}, a 4-path does not exist between p and q because it is impossible to get from p
to q by traveling along points that are both 4-adjacent and also have values from V. The figure
below shows this condition; it is not possible to get to q.
The shortest 8-path is shown in Figure below; its length is 4. The length of shortest m-path
(shown dashed) is 5. Both of these shortest paths are unique in this case.
(b) One possibility for the shortest 4-path when V = {1, 2} is shown in the figure below; its length
is 6. It is easily verified that another 4-path of the same length exists between p and q.
One possibility for the shortest 8-path (it is not unique) is shown in the figure below; its length is 4.
The length of a shortest m-path (shown dashed) is 6. This path is not unique.
7. (a) Give the condition(s) under which the D4 distance between two points p and q is equal to the shortest 4-path between these points. (b) Is this path unique?
A shortest 4-path between a point p with coordinates (x, y) and a point q with coordinates
(s, t) is shown in Fig below, where the assumption is that all points along the path are from V.
The lengths of the segments of the path are |x - s| and |y - t|, respectively.
The total path length is |x - s| + |y - t|, which we recognize as the definition of the D4
distance. This distance is independent of any paths that may exist between the points. The D4
distance obviously is equal to the length of the shortest 4-path when the length of the path is
|x - s| + |y - t|. This occurs whenever we can get from p to q by following a path whose elements
(1) are from V; and (2) are arranged in such a way that we can traverse the path from p to q by making
turns in at most two directions (e.g., right and up).
(b) The path may or may not be unique, depending on V and the values of the points along the way.
8. Develop an algorithm for converting a one-pixel-thick 8-path to a 4-path.
The solution to this problem consists of determining all possible neighborhood shapes to
go from a diagonal segment to a corresponding 4-connected segment, as shown in the figure. The
algorithm then simply looks for the appropriate match every time a diagonal segment is
encountered in the boundary.
9. Explain the basic principle of imaging in different bands of electromagnetic spectrum.
Today, there is almost no area of technical endeavor that is not impacted in some way by
digital image processing. One of the simplest ways to develop a basic understanding of the extent
of image processing applications is to categorize images according to their source (e.g., visual,
X-ray, and so on). The principal energy source for images in use today is the electromagnetic
energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic
(in the form of electron beams used in electron microscopy). Images based on radiation from the
EM spectrum are the most familiar, especially images in the X-ray and visual bands of the
spectrum. Electromagnetic waves can be conceptualized as propagating sinusoidal waves of
varying wavelengths, or they can be thought of as a stream of massless particles, each traveling
in a wavelike pattern and moving at the speed of light. If spectral bands are grouped according to
energy per photon, we obtain the spectrum shown in figure, ranging from gamma rays (highest
energy) at one end to radio waves (lowest energy) at the other.
Gamma-Ray Imaging:
Major uses of imaging based on gamma rays include nuclear medicine and astronomical
observations. In nuclear medicine, the approach is to inject a patient with a radioactive isotope
that emits gamma rays as it decays. Images are produced from the emissions collected by gamma
ray detectors.
X-ray Imaging:
X-rays are among the oldest sources of EM radiation used for imaging. The best known
use of X-rays is medical diagnostics, but they also are used extensively in industry and other
areas, like astronomy. X-rays for medical and industrial imaging are generated using an X-ray
tube, which is a vacuum tube with a cathode and anode. In digital radiography, digital images are
obtained by one of two methods: (1) by digitizing X-ray films; or (2) by having the X-rays that
pass through the patient fall directly onto devices (such as a phosphor screen) that convert X-rays
to light. The light signal in turn is captured by a light-sensitive digitizing system. Angiography is
another major application in an area called contrast enhancement radiography. This procedure is
used to obtain images (called angiograms) of blood vessels.
Imaging in the Ultraviolet Band:
Applications of ultraviolet “light” are varied. They include lithography, industrial
inspection, microscopy, lasers, biological imaging, and astronomical observations. Ultraviolet
light is used in fluorescence microscopy, one of the fastest growing areas of microscopy.
Fluorescence microscopy is an excellent method for studying materials that can be made to
fluoresce, either in their natural form (primary fluorescence) or when treated with chemicals
capable of fluorescing (secondary fluorescence).
Imaging in the Visible and Infrared Bands:
Considering that the visual band of the electromagnetic spectrum is the most familiar in
all our activities, it is not surprising that imaging in this band outweighs by far all the others in
terms of scope of application. The infrared band often is used in conjunction with visual
imaging. The examples range from pharmaceuticals and micro inspection to materials
characterization. Another major area of visual processing is remote sensing, which usually
includes several bands in the visual and infrared regions of the spectrum. A major area of
imaging in the visual spectrum is in automated visual inspection of manufactured goods.
Imaging in the Microwave Band
The dominant application of imaging in the microwave band is radar. The unique feature
of imaging radar is its ability to collect data over virtually any region at any time, regardless of
weather or ambient lighting conditions. Some radar waves can penetrate clouds, and under
certain conditions can also see through vegetation, ice, and extremely dry sand. In many cases,
radar is the only way to explore inaccessible regions of the Earth’s surface. Instead of a camera
lens, a radar uses an antenna and digital computer processing to record its images. In a radar
image, one can see only the microwave energy that was reflected back toward the radar antenna.
Imaging in the Radio Band:
As in the case of imaging at the other end of the spectrum (gamma rays), the major
applications of imaging in the radio band are in medicine and astronomy. In medicine, radio
waves are used in magnetic resonance imaging (MRI). This technique places a patient in a
powerful magnet and passes radio waves through his or her body in short pulses. Each pulse
causes a responding pulse of radio waves to be emitted by the patient’s tissues. The location from
which these signals originate and their strength are determined by a computer, which produces a
two-dimensional picture of a section of the patient. MRI can produce pictures in any plane.
UNIT-III
IMAGE ENHANCEMENT IN SPATIAL DOMAIN
The principal objective of enhancement is to process an image so that the result is more
suitable than the original image for a specific application. Image enhancement approaches fall
into two broad categories: spatial domain methods and frequency domain methods. The term
spatial domain refers to the image plane itself, and approaches in this category are based on
direct manipulation of pixels in an image. Frequency domain processing techniques are based on
modifying the Fourier transform of an image.
BACKGROUND
The term spatial domain refers to the aggregate of pixels composing an image. Spatial
domain methods are procedures that operate directly on these pixels. Spatial domain processes
will be denoted by the expression
g(x, y) = T[f(x, y)]
where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f,
defined over some neighborhood of (x, y).
The principal approach in defining a neighborhood about a point (x, y) is to use a square
or rectangular sub-image area centered at (x, y), as Fig. 3.1 shows.
The center of the sub-image is moved from pixel to pixel starting, say, at the top left
corner. The operator T is applied at each location (x, y) to yield the output, g, at that location.
The process utilizes only the pixels in the area of the image spanned by the neighborhood.
Although other neighborhood shapes, such as approximations to a circle, sometimes are used,
square and rectangular arrays are by far the most predominant because of their ease of
implementation.
POINT PROCESSING TECHNIQUES
The simplest form of T is when the neighborhood is of size 1x1 (that is, a single pixel). In
this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an
intensity or mapping) transformation function of the form
s=T(r)
where, for simplicity in notation, r and s are variables denoting, respectively, the gray level of
f(x, y) and g(x, y) at any point (x, y).
For example, if T(r) has the form shown in Fig. 3.2(a), the effect of this transformation
would be to produce an image of higher contrast than the original by darkening the levels below
m and brightening the levels above m in the original image. In this technique, known as contrast
stretching, the values of r below m are compressed by the transformation function into a narrow
range of s, toward black. The opposite effect takes place for values of r above m.
In the limiting case shown in Fig. 3.2(b), T(r) produces a two-level (binary) image. A
mapping of this form is called a thresholding function. Some fairly simple, yet powerful,
processing approaches can be formulated with gray-level transformations. Because enhancement
at any point in an image depends only on the gray level at that point, techniques in this category
often are referred to as point processing.
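The limiting thresholding case of Fig. 3.2(b) is the simplest point-processing example to write down (a minimal sketch; the function name and the convention that r = m maps to white are my own choices):

```python
def threshold(img, m, L=256):
    """Point processing s = T(r): the limiting contrast-stretching case that
    maps levels below m to black (0) and levels at or above m to white (L-1),
    producing a two-level (binary) image."""
    return [[0 if r < m else L - 1 for r in row] for row in img]

print(threshold([[10, 200], [128, 90]], 128))  # [[0, 255], [255, 0]]
```

Because T depends only on the gray level r at each pixel, the whole image can be processed with a single pass and no neighborhood bookkeeping.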
Larger neighborhoods allow considerably more flexibility. The general approach is to use
a function of the values of f in a predefined neighborhood of (x, y) to determine the value of g at
(x, y). One of the principal approaches in this formulation is based on the use of so-called masks
(also referred to as filters, kernels, templates, or windows). Basically, a mask is a small (say, 3x3)
2-D array, such as the one shown in Fig. 3.1, in which the values of the mask coefficients
determine the nature of the process, such as image sharpening. Enhancement techniques based
on this type of approach often are referred to as mask processing or filtering.
SOME BASIC GRAY LEVEL TRANSFORMATIONS
These are among the simplest of all image enhancement techniques. The values of pixels,
before and after processing, will be denoted by r and s, respectively. These values are related
by an expression of the form s=T(r), where T is a transformation that maps a pixel value r into a
pixel value s. As an introduction to gray-level transformations, consider Fig. 3.3, which shows
three basic types of functions used frequently for image enhancement: linear (negative and
identity transformations), logarithmic (log and inverse-log transformations), and power-law (nth
power and nth root transformations). The identity function is the trivial case in which output
intensities are identical to input intensities.
Image Negatives
The negative of an image with gray levels in the range [0, L-1] is obtained by using the
negative transformation shown in Fig. 3.3, which is given by the expression.
s = L-1-r
Reversing the intensity levels of an image in this manner produces the equivalent of a
photographic negative. This type of processing is particularly suited for enhancing white or gray
detail embedded in dark regions of an image, especially when the black areas are dominant in
size.
An example is shown in Fig. 3.4. The original image is a digital mammogram showing a
small lesion. In spite of the fact that the visual content is the same in both images, note how
much easier it is to analyze the breast tissue in the negative image in this particular case.
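The negative transformation is a one-line point operation (an illustrative sketch; the function name is mine):

```python
def negative(img, L=256):
    """Negative transformation s = L - 1 - r, applied pixel by pixel."""
    return [[L - 1 - r for r in row] for row in img]

print(negative([[0, 255], [100, 200]]))  # [[255, 0], [155, 55]]
```

Applying the function twice returns the original image, since L - 1 - (L - 1 - r) = r.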
Log Transformations
The general form of the log transformation shown in Fig. 3.3 is
s = c log (1 + r)
where c is a constant, and it is assumed that r ≥ 0. The shape of the log curve in Fig. 3.3 shows
that this transformation maps a narrow range of low gray-level values in the input image into a
wider range of output levels. The opposite is true of higher values of input levels. We would use
a transformation of this type to expand the values of dark pixels in an image while compressing
the higher-level values. The opposite is true of the inverse log transformation. The log function
has the important characteristic that it compresses the dynamic range of images with large
variations in pixel values. A classic illustration of an application in which pixel values have a
large dynamic range is the Fourier spectrum. It is not unusual to encounter spectrum values that
range from 0 to 10^6 or higher. While processing numbers such as these presents no problems for a
computer, image display systems generally will not be able to reproduce faithfully such a wide
range of intensity values. The net effect is that a significant degree of detail will be lost in the
display of a typical Fourier spectrum.
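The log transformation can be sketched as follows (an illustrative sketch; the function name is mine, and choosing c so that the maximum input maps to L - 1 is one common normalization, not mandated by the text):

```python
import math

def log_transform(img, L=256):
    """s = c log(1 + r), with c chosen so the largest input maps to L - 1."""
    r_max = max(max(row) for row in img)
    c = (L - 1) / math.log(1 + r_max)
    return [[c * math.log(1 + r) for r in row] for row in img]

out = log_transform([[0, 15, 255]])
print([round(v, 1) for v in out[0]])  # [0.0, 127.5, 255.0]
```

Note how the dark value 15 is pushed up to the middle of the output range: a narrow band of dark inputs is spread over a wide band of output levels, which is exactly the dynamic-range compression described above.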
Power-Law Transformations
Power-law transformations have the basic form
s = c r^γ
where c and γ are positive constants. Sometimes the above equation is written as s = c(r + ε)^γ to
account for an offset (that is, a measurable output when the input is zero). However, offsets
typically are an issue of display calibration and as a result they are normally ignored. Plots of s
versus r for various values of γ are shown in Fig. 3.6.
As in the case of the log transformation, power-law curves with fractional values of γ
map a narrow range of dark input values into a wider range of output values, with the opposite
being true for higher values of input levels. Unlike the log function, however, we notice here a
family of possible transformation curves obtained simply by varying γ.
A variety of devices used for image capture, printing, and display respond according to a
power law. By convention, the exponent in the power-law equation is referred to as gamma. The
process used to correct these power-law response phenomena is called gamma correction. In
addition to gamma correction, power-law transformations are useful for general-purpose contrast
manipulation.
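A minimal sketch of the power-law transformation in Python with NumPy (assumed available); the display gamma of 2.5 used below is illustrative, and intensities are normalized to [0, 1] before applying the exponent, as is conventional:

```python
import numpy as np

def power_law(r, gamma, c=1.0, L=256):
    """s = c * r**gamma, applied to intensities normalized to [0, 1],
    then rescaled back to [0, L-1]."""
    rn = np.asarray(r, dtype=np.float64) / (L - 1)
    return c * rn ** gamma * (L - 1)

# gamma = 1 leaves intensities unchanged:
identity = power_law(np.array([0, 128, 255]), gamma=1.0)

# A display with an assumed gamma of 2.5 darkens mid-tones;
# pre-correcting with the inverse exponent 1/2.5 brightens them
# so the displayed image appears correct (gamma correction).
corrected = power_law(np.array([128]), gamma=1 / 2.5)
```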
ARITHMETIC & LOGICAL OPERATIONS
Arithmetic/logic operations involving images are performed on a pixel-by-pixel basis
between two or more images (this excludes the logic operation NOT, which is performed on a
single image). As an example, subtraction of two images results in a new image whose pixel at
coordinates (x, y) is the difference between the pixels in that same location in the two images
being subtracted. When dealing with logic operations on gray-scale images, pixel values are
processed as strings of binary numbers. For example, performing the NOT operation on a black,
8-bit pixel (a string of eight 0’s) produces a white pixel (a string of eight 1’s). Intermediate
values are processed the same way, changing all 1’s to 0’s and vice versa. Thus, the NOT logic
operator performs the same function as the negative transformation. The AND and OR operations
are used for masking; that is, for selecting sub-images in an image. In the AND and OR image
masks, light represents a binary 1 and dark represents a binary 0. Masking sometimes is referred
to as region of interest (ROI) processing. In terms of enhancement, masking is used primarily to
isolate an area for processing. This is done to highlight that area and differentiate
it from the rest of the image. Of the four arithmetic operations, subtraction and addition (in that
order) are the most useful for image enhancement.
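The NOT-as-negative and AND-masking behavior described above can be demonstrated directly on 8-bit arrays in Python with NumPy (assumed available); the tiny image and mask are illustrative:

```python
import numpy as np

img = np.array([[0, 100, 255],
                [30, 200, 60]], dtype=np.uint8)

# Bitwise NOT on an 8-bit pixel flips every bit, so r becomes 255 - r:
# exactly the negative transformation described in the text.
negative = ~img                      # same as np.bitwise_not(img)

# AND with a binary mask (255 = light = keep, 0 = dark = discard)
# isolates a region of interest (ROI).
mask = np.array([[255, 255, 0],
                 [255, 0,   0]], dtype=np.uint8)
roi = img & mask
```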
IMAGE SUBTRACTION
The difference between two images f(x, y) and h(x, y), expressed as
g(x, y) = f(x, y) − h(x, y)
is obtained by computing the difference between all pairs of corresponding pixels from f and h.
The key usefulness of subtraction is the enhancement of differences between images.
In practice, most images are displayed using 8 bits (even 24-bit color images consist of
three separate 8-bit channels). Thus, we expect image values not to be outside the range from 0
to 255. The values in a difference image can range from a minimum of –255 to a maximum of
255, so some sort of scaling is required to display the results. There are two principal ways to
scale a difference image.
� One method is to add 255 to every pixel and then divide by 2. It is not guaranteed that
the values will cover the entire 8-bit range from 0 to 255, but all pixel values
definitely will be within this range. This method is fast and simple to implement, but
it has the limitations that the full range of the display may not be utilized and,
potentially more serious; the truncation inherent in the division by 2 will generally
cause loss in accuracy.
� If more accuracy and full coverage of the 8-bit range are desired, then we can resort
to another approach. First, the value of the minimum difference is obtained and its
negative added to all the pixels in the difference image (this will create a modified
difference image whose minimum value is 0). Then, all the pixels in the image are
scaled to the interval [0, 255] by multiplying each pixel by the quantity 255/Max,
where Max is the maximum pixel value in the modified difference image. It is evident
that this approach is considerably more complex and difficult to implement.
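Both scaling methods can be sketched in Python with NumPy (assumed available); the function names and the two-pixel test images are illustrative:

```python
import numpy as np

def scale_diff_fast(f, h):
    """Method 1: add 255 and halve. Fast; results land in [0, 255]
    but may not cover the full range, and the division loses accuracy."""
    d = f.astype(np.int16) - h.astype(np.int16)   # range [-255, 255]
    return ((d + 255) // 2).astype(np.uint8)

def scale_diff_full(f, h):
    """Method 2: shift the minimum to 0, then stretch by 255/Max,
    where Max is the maximum of the shifted difference image."""
    d = f.astype(np.float64) - h.astype(np.float64)
    d -= d.min()                                  # minimum becomes 0
    mx = d.max()
    if mx > 0:
        d *= 255.0 / mx                           # full 8-bit coverage
    return d.astype(np.uint8)

f = np.array([[0, 255]], dtype=np.uint8)
h = np.array([[255, 0]], dtype=np.uint8)
fast = scale_diff_fast(f, h)
full = scale_diff_full(f, h)
```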
Application
One of the most commercially successful and beneficial uses of image subtraction is in
the area of medical imaging called mask mode radiography. In this case h(x, y), the mask, is an
X-ray image of a region of a patient’s body captured by an intensified TV camera (instead of
traditional X-ray film) located opposite an X-ray source. The procedure consists of injecting a
contrast medium into the patient’s bloodstream, taking a series of images of the same anatomical
region as h(x, y), and subtracting this mask from the series of incoming images after injection of
the contrast medium. The net effect of subtracting the mask from each sample in the incoming
stream of TV images is that the areas that are different between f(x, y) and h(x, y) appear in the
output image as enhanced detail. Because images can be captured at TV rates, this procedure in
essence gives a movie showing how the contrast medium propagates through the various arteries
in the area being observed.
IMAGE AVERAGING
Consider a noisy image g(x, y) formed by the addition of noise η(x, y) to an original
image f(x, y); that is,
g(x, y) = f(x, y) + η(x, y)
where the assumption is that at every pair of coordinates (x, y) the noise is uncorrelated and has
zero average value. The objective of the following procedure is to reduce the noise content by
adding a set of noisy images, {gi(x, y)}.
If the noise satisfies the constraints just stated, it can be shown that if an image ḡ(x, y) is formed
by averaging K different noisy images,
ḡ(x, y) = (1/K) Σ (i=1 to K) gi(x, y)
As K increases, Eqs. (3.4-5) and (3.4-6) indicate that the variability (noise) of the pixel values at
each location (x, y) decreases. Because E{ḡ(x, y)} = f(x, y), this means that ḡ(x, y) approaches
f(x, y) as the number of noisy images used in the averaging process increases.
An important application of image averaging is in the field of astronomy, where imaging
with very low light levels is routine, causing sensor noise frequently to render single images
virtually useless for analysis.
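The noise-reduction effect of averaging K images is easy to verify numerically in Python with NumPy (assumed available); the flat test image, noise level, and K are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 100.0)                 # noise-free "true" image

# K noisy observations g_i = f + eta, with eta zero-mean and
# uncorrelated between images, as the text assumes.
K = 100
noisy = [f + rng.normal(0.0, 20.0, f.shape) for _ in range(K)]

g_bar = np.mean(noisy, axis=0)               # averaged image

# The noise standard deviation should drop by roughly sqrt(K):
err_single = np.std(noisy[0] - f)            # close to 20
err_avg = np.std(g_bar - f)                  # close to 20 / sqrt(100) = 2
```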
HISTOGRAM PROCESSING
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete
function h(rk) = nk, where rk is the kth gray level and nk is the number of pixels in the image having
gray level rk. It is common practice to normalize a histogram by dividing each of its values by
the total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by
p(rk)=nk/n, for k=0, 1,… ,L-1. Loosely speaking, p(rk) gives an estimate of the probability of
occurrence of gray level rk. Note that the sum of all components of a normalized histogram is
equal to 1. The horizontal axis of each histogram plot corresponds to gray level values, rk. The
vertical axis corresponds to values of h(rk) = nk, or p(rk) = nk/n if the values are normalized.
Histograms are the basis for numerous spatial domain processing techniques, and histogram
manipulation can be used effectively for image enhancement.
For a dark image the components of the histogram are concentrated on the low (dark)
side of the gray scale. Similarly, the components of the histogram of the bright image are biased
toward the high side of the gray scale. An image with low contrast has a histogram that will be
narrow and will be centered toward the middle of the gray scale. For a monochrome image this
implies a dull, washed-out gray look. Finally, we see that the components of the histogram in the
high-contrast image cover a broad range of the gray scale and, further, that the distribution of
pixels is not too far from uniform, with very few vertical lines being much higher than the others.
Intuitively, it is reasonable to conclude that an image, whose pixels tend to occupy the entire
range of possible gray levels and, in addition, tend to be distributed uniformly, will have an
appearance of high contrast and will exhibit a large variety of gray tones. The net effect will be
an image that shows a great deal of gray-level detail and has high dynamic range.
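Computing the normalized histogram p(rk) = nk/n described above is a one-liner in Python with NumPy (assumed available); the toy 2x3 image and L = 4 gray levels are illustrative:

```python
import numpy as np

def normalized_histogram(img, L=256):
    """p(r_k) = n_k / n for k = 0, 1, ..., L-1."""
    n_k = np.bincount(img.ravel(), minlength=L)   # pixel count per level
    return n_k / img.size                         # divide by total pixels n

img = np.array([[0, 0, 1],
                [1, 1, 3]], dtype=np.uint8)
p = normalized_histogram(img, L=4)                # counts: 2, 3, 0, 1 out of 6
```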
Histogram Equalization (Histogram Linearization)
Consider for a moment continuous functions, and let the variable r represent the gray
levels of the image to be enhanced. Assume that r has been normalized to the interval [0, 1], with
r =0 representing black and r=1 representing white. Later, we consider a discrete formulation and
allow pixel values to be in the interval [0, L-1].
For any r satisfying the aforementioned conditions, we focus attention on transformations of the
form
s = T(r)
which produce a gray level s for every pixel value r in the original image.
Assume that the transformation function T(r) satisfies the following conditions:
(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The requirement in condition (a) that T(r) be single valued is needed to guarantee that the
inverse transformation will exist, and the monotonicity condition preserves the increasing order
from black to white in the output image. A transformation function that is not monotonically
increasing could result in at least a section of the intensity range being inverted, thus producing
some inverted gray levels in the output image. Finally, condition (b) guarantees that the output
gray levels will be in the same range as the input levels. Figure 3.16 gives an example of a
transformation function that satisfies these two conditions.
The inverse transformation from s back to r is denoted r = T⁻¹(s), 0 ≤ s ≤ 1.
The gray levels in an image may be viewed as random variables in the interval [0, 1]. One of the
most fundamental descriptors of a random variable is its probability density function (PDF). Let
pr(r) and ps(s) denote the probability density functions of random variables r and s, respectively,
where the subscripts on p are used to denote that pr and ps are different functions. A basic result
from elementary probability theory is that, if pr(r) and T(r) are known and T⁻¹(s) satisfies
condition (a), then the probability density function ps(s) of the transformed variable s can be
obtained using a rather simple formula:
ps(s) = pr(r) |dr/ds|  … (3.3-3)
Thus, the probability density function of the transformed variable, s, is determined by the gray-
level PDF of the input image and by the chosen transformation function.
A transformation function of particular importance in image processing has the form
s = T(r) = ∫₀ʳ pr(w) dw  … (3.3-4)
where w is a dummy variable of integration.
Given transformation function T(r), we find ps(s) by applying Eq. (3.3-3). We know from basic
calculus (Leibniz’s rule) that the derivative of a definite integral with respect to its upper limit is
simply the integrand evaluated at that limit. In other words,
ds/dr = dT(r)/dr = pr(r)
Substituting this result for dr/ds into Eq. (3.3-3), and keeping in mind that all probability values
are positive, yields
ps(s) = pr(r) |dr/ds| = pr(r) · [1/pr(r)] = 1,  0 ≤ s ≤ 1  … (3.3-6)
Because ps(s) is a probability density function, it follows that it must be zero outside the interval
[0, 1] in this case because its integral over all values of s must equal 1. We recognize the form of
ps(s) given in Eq. (3.3-6) as a uniform probability density function. Simply stated, we have
demonstrated that performing the transformation function given in Eq. (3.3-4) yields a random
variable s characterized by a uniform probability density function. It is important to note from
Eq. (3.3-4) that T(r) depends on pr(r), but, as indicated by Eq. (3.3-6), the resulting ps(s) always
is uniform, independent of the form of pr(r).
For discrete values we deal with probabilities and summations instead of probability density
functions and integrals. The probability of occurrence of gray level rk in an image is
approximated by
pr(rk) = nk/n,  k = 0, 1, 2, … , L-1
where, n is the total number of pixels in the image, nk is the number of pixels that have gray
level rk, and L is the total number of possible gray levels in the image. The discrete version of the
transformation function given in Eq. (3.3-4) is
sk = T(rk) = Σ (j=0 to k) pr(rj) = Σ (j=0 to k) nj/n,  k = 0, 1, 2, … , L-1  … (3.3-8)
Thus, a processed (output) image is obtained by mapping each pixel with level rk in the
input image into a corresponding pixel with level sk in the output image via Eq. (3.3-8). As
indicated earlier, a plot of pr(rk) versus rk is called a histogram. The transformation (mapping)
given in Eq. (3.3-8) is called histogram equalization or histogram linearization.
It cannot be proved in general that this discrete transformation will produce the discrete
equivalent of a uniform probability density function, which would be a uniform histogram. However, as will be
seen shortly, use of Eq. (3.3-8) does have the general tendency of spreading the histogram of the
input image so that the levels of the histogram-equalized image will span a fuller range of the
gray scale. In addition to producing gray levels that have this tendency, the method just derived
has the additional advantage that it is fully “automatic.” In other words, given an image, the
process of histogram equalization consists simply of implementing Eq. (3.3-8), which is based on
information that can be extracted directly from the given image, without the need for further
parameter specifications.
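Discrete histogram equalization via Eq. (3.3-8) can be sketched in Python with NumPy (assumed available). Note one implementation convention: Eq. (3.3-8) yields values in [0, 1], so the cumulative sum is rescaled by L-1 here to land back on the gray scale; the small dark test image is illustrative:

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: s_k = (L-1) * sum_{j=0..k} n_j / n."""
    p = np.bincount(img.ravel(), minlength=L) / img.size   # p_r(r_k)
    cdf = np.cumsum(p)                                     # T(r_k)
    s = np.round((L - 1) * cdf).astype(np.uint8)           # rescale to [0, L-1]
    return s[img]                                          # map each pixel r_k -> s_k

# A dark, low-contrast image gets spread over a fuller gray range.
dark = np.array([[10, 10, 20],
                 [20, 30, 30]], dtype=np.uint8)
eq = equalize(dark)
```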
Histogram Matching (Histogram Specification)
Histogram equalization automatically determines a transformation function that seeks to produce
an output image that has a uniform histogram. When automatic enhancement is desired, this is a
good approach because the results from this technique are predictable and the method is simple
to implement. But there are applications in which attempting to base enhancement on a uniform
histogram is not the best approach. In particular, it is useful sometimes to be able to specify the
shape of the histogram that we wish the processed image to have. The method used to generate a
processed image that has a specified histogram is called histogram matching or histogram
specification.
Development of the method
Let r and z denote continuous gray levels, and let pr(r) and pz(z) denote their corresponding
continuous probability density functions. In this notation, r and z denote the gray levels of the
input and output (processed) images, respectively. We can estimate pr(r) from the given input
image, while pz(z) is the specified probability density function that we wish the output image to
have.
Let s be a random variable with the property
s = T(r) = ∫₀ʳ pr(w) dw  … (3.3-10)
where w is a dummy variable of integration. We recognize this expression as the continuous
version of histogram equalization given in Eq. (3.3-4). Suppose next that we define a random
variable z with the property
G(z) = ∫₀ᶻ pz(t) dt = s  … (3.3-11)
where t is a dummy variable of integration. It then follows from these two equations that
G(z) = T(r) and, therefore, that z must satisfy the condition
z = G⁻¹(s) = G⁻¹[T(r)]
The transformation T(r) can be obtained from Eq. (3.3-10) once pr(r) has been estimated from the
input image. Similarly, the transformation function G(z) can be obtained using Eq. (3.3-11)
because pz(z) is given.
The discrete formulation of Eq. (3.3-10) is
sk = T(rk) = Σ (j=0 to k) pr(rj) = Σ (j=0 to k) nj/n,  k = 0, 1, 2, … , L-1  …. (1)
where n is the total number of pixels in the image, nj is the number of pixels with gray level rj,
and L is the number of discrete gray levels. Similarly, the discrete formulation of Eq. (3.3-11) is
obtained from the given histogram pz(zi), i=0, 1, 2, … , L-1, and has the form
vk = G(zk) = Σ (i=0 to k) pz(zi) = sk,  k = 0, 1, 2, … , L-1  … (2)
Finally, the discrete inverse transformation is
zk = G⁻¹(sk),  k = 0, 1, 2, … , L-1  …(3)
or, equivalently, combining it with Eq. (1),
zk = G⁻¹[T(rk)],  k = 0, 1, 2, … , L-1  ….(4)
Equations (1) to (3) are the foundation for implementing histogram matching for digital
images. Equation (1) is a mapping from the levels in the original image into corresponding levels
sk based on the histogram of the original image, which we compute from the pixels in the image.
Equation (2) computes a transformation function G from the given histogram pz(z). Finally, Eq.
(3) or its equivalent, Eq. (4), gives us (an approximation of) the desired levels of the image with
that histogram. The above equations show that an image with a specified probability density
function can be obtained from an input image by using the following procedure:
(1) Obtain the transformation function T(r) using Eq. (1).
(2) Use Eq. (2) to obtain the transformation function G(z).
(3) Obtain the inverse transformation function G⁻¹.
(4) Obtain the output image by applying Eq. (3) to all the pixels in the input image. The
result of this procedure will be an image whose gray levels, z, have the specified
probability density function pz(z).
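The four-step procedure above can be sketched in Python with NumPy (assumed available). Since G is a discrete step function, its inverse is not unique; taking the smallest z with G(z) ≥ sk, implemented here with `np.searchsorted`, is one standard way to realize G⁻¹, and the tiny test image is illustrative:

```python
import numpy as np

def match_histogram(img, p_z, L=256):
    """Histogram specification per Eqs. (1)-(4): z_k = G^{-1}(T(r_k))."""
    p_r = np.bincount(img.ravel(), minlength=L) / img.size
    T = np.cumsum(p_r)                 # Eq. (1): s_k = T(r_k)
    G = np.cumsum(p_z)                 # Eq. (2): G(z_k)
    # Eq. (3)/(4): for each s_k pick the smallest z with G(z) >= s_k.
    z = np.searchsorted(G, T, side='left').clip(0, L - 1)
    return z[img].astype(np.uint8)

img = np.array([[0, 1],
                [2, 3]], dtype=np.uint8)      # one pixel per level
spike = np.array([0.0, 0.0, 0.0, 1.0])        # all mass at level 3
matched = match_histogram(img, spike, L=4)    # every pixel driven to 3
```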
Local Enhancement
The histogram processing methods discussed in the previous two sections are global, in
the sense that pixels are modified by a transformation function based on the gray-level content of
an entire image. Although this global approach is suitable for overall enhancement, there are
cases in which it is necessary to enhance details over small areas in an image. The number of
pixels in these areas may have negligible influence on the computation of a global transformation
whose shape does not necessarily guarantee the desired local enhancement. The solution is to
devise transformation functions based on the gray-level distribution or other properties in the
neighborhood of every pixel in the image.
The histogram processing techniques previously described are easily adaptable to local
enhancement. The procedure is to define a square or rectangular neighborhood and move the
center of this area from pixel to pixel. At each location, the histogram of the points in the
neighborhood is computed and either a histogram equalization or histogram specification
transformation function is obtained. This function is finally used to map the gray level of the
pixel centered in the neighborhood. The center of the neighborhood region is then moved to an
adjacent pixel location and the procedure is repeated. Since only one new row or column of the
neighborhood changes during a pixel-to-pixel translation of the region, updating the histogram
obtained in the previous location with the new data introduced at each motion step is possible.
This approach has obvious advantages over repeatedly computing the histogram over all pixels
in the neighborhood region each time the region is moved one pixel location. Another approach
used sometimes to reduce computation is to utilize non-overlapping regions, but this method
usually produces an undesirable checkerboard effect.
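A direct (unoptimized) version of the sliding-neighborhood procedure can be sketched in Python with NumPy (assumed available). It recomputes the full neighborhood histogram at every pixel, so it is O(size²) per pixel; the incremental row/column update described above would be faster. Edge replication for border pixels is an assumption:

```python
import numpy as np

def local_equalize(img, size=7, L=256):
    """Local histogram equalization: equalize each pixel using the
    histogram of its size x size neighborhood (edges replicated)."""
    half = size // 2
    padded = np.pad(img, half, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            win = padded[x:x + size, y:y + size]
            # Equalization transform built from the local histogram only.
            cdf = np.cumsum(np.bincount(win.ravel(), minlength=L)) / win.size
            out[x, y] = np.round((L - 1) * cdf[img[x, y]])
    return out

flat = np.full((5, 5), 7, dtype=np.uint8)
out = local_equalize(flat)      # a constant region maps to the top level
```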
Figure 3.23(a) shows an image that has been slightly blurred to reduce its noise content
(see Section 3.6.1 regarding blurring).Figure 3.23(b) shows the result of global histogram
equalization. As is often the case when this technique is applied to smooth, noisy areas, Fig.
3.23(b) shows considerable enhancement of the noise, with a slight increase in contrast. Note
that no new structural details were brought out by this method. However, local histogram
equalization using a 7*7 neighborhood revealed the presence of small squares inside the larger
dark squares. The small squares were too close in gray level to the larger ones, and their sizes
were too small to influence global histogram equalization significantly. Note also the finer noise
texture in Fig. 3.23(c), a result of local processing using relatively small neighborhoods.
SPATIAL FILTERING
Some neighborhood operations work with the values of the image pixels in the
neighborhood and the corresponding values of a sub-image that has the same dimensions as the
neighborhood. The sub-image is called a filter, mask, kernel, template, or window. The values in
a filter sub-image are referred to as coefficients, rather than pixels.
The mechanics of spatial filtering are illustrated in Fig. 3.32. The process consists simply
of moving the filter mask from point to point in an image. At each point (x, y), the response of
the filter at that point is calculated using a predefined relationship. For linear spatial filtering,
the response is given by a sum of products of the filter coefficients and the corresponding image
pixels in the area spanned by the filter mask. For the 3x3 mask shown in Fig. 3.32, the result (or
response), R, of linear filtering with the filter mask at a point (x, y) in the image is
R = w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + … + w(0,0) f(x, y) + … + w(1,0) f(x+1, y) + w(1,1) f(x+1, y+1)
In general, linear filtering of an image f of size MxN with a filter mask of size mxn is
given by the expression:
g(x, y) = Σ (s=-a to a) Σ (t=-b to b) w(s, t) f(x+s, y+t)
where, a=(m-1)/2 and b=(n-1)/2. To generate a complete filtered image this equation must be
applied for x=0, 1, 2, … , M-1 and y=0, 1, 2, … , N-1.
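The double-sum above (a correlation, i.e. the sum of products of mask coefficients and underlying pixels) can be sketched in Python with NumPy (assumed available). For simplicity the sketch evaluates the response only where the full mask fits inside the image, so the output is slightly smaller than the input; border handling is discussed below:

```python
import numpy as np

def linear_filter(f, w):
    """g(x,y) = sum_s sum_t w(s,t) f(x+s, y+t), computed only where
    the full m x n mask fits, so the output shrinks by m-1 rows
    and n-1 columns (the simplest border policy)."""
    m, n = w.shape
    M, N = f.shape
    g = np.zeros((M - m + 1, N - n + 1))
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            g[x, y] = np.sum(w * f[x:x + m, y:y + n])
    return g

box = np.ones((3, 3)) / 9.0                    # 3x3 averaging mask
f = np.arange(25, dtype=float).reshape(5, 5)
g = linear_filter(f, box)                      # each output = local 3x3 mean
```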
The process of linear filtering given in above equation is similar to a frequency domain
concept called convolution. For this reason, linear spatial filtering often is referred to as
“convolving a mask with an image.” Similarly, filter masks are sometimes called convolution
masks. The term convolution kernel also is in common use.
When interest lies on the response, R, of an mxn mask at any point (x, y), and not on the
mechanics of implementing mask convolution, it is common practice to simplify the notation by
using the following expression:
R = w1z1 + w2z2 + … + wmnzmn = Σ (i=1 to mn) wizi
where the w’s are mask coefficients, the z’s are the values of the image gray levels
corresponding to those coefficients, and mn is the total number of coefficients in the mask. For
the 3x3 general mask shown in figure below, the response at any point (x, y) in the image is
given by
R = w1z1 + w2z2 + … + w9z9 = Σ (i=1 to 9) wizi
An important consideration in implementing neighborhood operations for spatial filtering is the
issue of what happens when the center of the filter approaches the border of the image. Consider
for simplicity a square mask of size nxn. At least one edge of such a mask will coincide with the
border of the image when the center of the mask is at a distance of (n-1)/2 pixels away from the
border of the image. If the center of the mask moves any closer to the border, one or more rows
or columns of the mask will be located outside the image plane. There are several ways to handle
this situation.
� The simplest is to limit the excursions of the center of the mask to be at a distance no less
than (n-1)/2 pixels from the border. The resulting filtered image will be smaller than the
original, but all the pixels in the filtered image will have been processed with the full
mask.
� If the result is required to be the same size as the original, then the approach typically
employed is to filter all pixels only with the section of the mask that is fully contained in
the image. With this approach, there will be bands of pixels near the border that will have
been processed with a partial filter mask.
� Other approaches include “padding” the image by adding rows and columns of 0’s (or
other constant gray level), or padding by replicating rows or columns. The padding is then
stripped off at the end of the process. This keeps the size of the filtered image the same as
the original, but the values of the padding will have an effect near the edges that becomes
more prevalent as the size of the mask increases.
� The only way to obtain a perfectly filtered result is to accept a somewhat smaller filtered
image by limiting the excursions of the center of the filter mask to a distance no less than
(n-1)/2 pixels from the border of the original image.
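The padding options in the last bullets map directly onto NumPy's `np.pad` modes (NumPy assumed available); the 4x4 test image and single-pixel pad width, matching (n-1)/2 for a 3x3 mask, are illustrative:

```python
import numpy as np

f = np.arange(16, dtype=float).reshape(4, 4)

# For a 3x3 mask, (n-1)/2 = 1 row/column of padding keeps the
# filtered image the same size as the original once the pad is
# stripped off after filtering.
zero_pad = np.pad(f, 1, mode='constant', constant_values=0)  # pad with 0's
replicate = np.pad(f, 1, mode='edge')                        # repeat border rows/cols
```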
SMOOTHING SPATIAL FILTERS
Smoothing filters are used for blurring and for noise reduction. Blurring is used in
preprocessing steps, such as removal of small details from an image prior to (large) object
extraction, and bridging of small gaps in lines or curves. Noise reduction can be accomplished by
blurring with a linear filter and also by nonlinear filtering.
Smoothing Linear Filters
The output (response) of a smoothing, linear spatial filter is simply the average of the
pixels contained in the neighborhood of the filter mask. These filters sometimes are called
averaging filters. They also are referred to as lowpass filters.
The idea behind smoothing filters is straightforward. By replacing the value of every
pixel in an image by the average of the gray levels in the neighborhood defined by the filter
mask, this process results in an image with reduced “sharp” transitions in gray levels. Because
random noise typically consists of sharp transitions in gray levels, the most obvious application
of smoothing is noise reduction. However, edges (which almost always are desirable features of
an image) also are characterized by sharp transitions in gray levels, so averaging filters have the
undesirable side effect that they blur edges. Another application of this type of process includes
the smoothing of false contours that result from using an insufficient number of gray levels. A
major use of averaging filters is in the reduction of “irrelevant” detail in an image. By
“irrelevant” we mean pixel regions that are small with respect to the size of the filter mask.
Above figure shows two 3x3 smoothing filters. Use of the first filter yields the standard average
of the pixels under the mask. The response of the first filter is given by
R = (1/9) Σ (i=1 to 9) zi
which is the average of the gray levels of the pixels in the 3x3 neighborhood defined by the
mask. A spatial averaging filter in which all coefficients are equal is sometimes called a box
filter.
The second mask shown in above figure is a little more interesting. This mask yields a
so-called weighted average, terminology used to indicate that pixels are multiplied by different
coefficients, thus giving more importance (weight) to some pixels at the expense of others. In the
second mask shown, the pixel at the center of the mask is multiplied by a higher value than any
other, thus giving this pixel more importance in the calculation of the average. The other pixels
are inversely weighted as a function of their distance from the center of the mask. The diagonal
terms are farther away from the center than the orthogonal neighbors (by a factor of √2) and, thus,
are weighted less than these immediate neighbors of the center pixel. The basic strategy behind
weighting the center point the highest and then reducing the value of the coefficients as a function
of increasing distance from the origin is simply an attempt to reduce blurring in the smoothing
process.
In practice, it is difficult in general to see differences between images smoothed by using
either of the masks in above figure, or similar arrangements, because the area these masks span
at any one location in an image is so small.
The general implementation for filtering an MxN image with a weighted averaging filter
of size mxn (m and n odd) is given by the expression:
g(x, y) = [Σ (s=-a to a) Σ (t=-b to b) w(s, t) f(x+s, y+t)] / [Σ (s=-a to a) Σ (t=-b to b) w(s, t)]
The denominator in above equation is simply the sum of the mask coefficients and,
therefore, it is a constant that needs to be computed only once.
The effects of smoothing as a function of filter size are illustrated in Fig. 3.35, which
shows an original image and the corresponding smoothed results obtained using square
averaging filters of sizes n=3, 5, 9, 15, and 35 pixels, respectively. The principal features of
these results are as follows:
� For n=3, we note a general slight blurring throughout the entire image but, as expected,
details that are of approximately the same size as the filter mask are affected considerably
more. For example, the 3*3 and 5*5 squares, the small letter “a,” and the fine grain noise
show significant blurring when compared to the rest of the image. A positive result is that
the noise is less pronounced. Note that the jagged borders of the characters and gray
circles have been pleasingly smoothed.
� The result for n=5 is somewhat similar, with a slight further increase in blurring.
� For n=9 we see considerably more blurring, and the 20% black circle is not nearly as
distinct from the background as in the previous three images, illustrating the blending
effect that blurring has on objects whose gray level content is close to that of its
neighboring pixels. Note the significant further smoothing of the noisy rectangles.
� The results for n=15 and 35 are extreme with respect to the sizes of the objects in the
image. This type of excessive blurring is generally used to eliminate small objects from an
image. For instance, the three small squares, two of the circles, and most of the noisy
rectangle areas have been blended into the background of the image in Fig. 3.35(f). Note
also in this figure the pronounced black border. This is a result of padding the border of
the original image with 0’s (black) and then trimming off the padded area. Some of the
black was blended into all filtered images, but became truly objectionable for the images
smoothed with the larger filters.
An important application of spatial averaging is to blur an image for the purpose of getting a
gross representation of objects of interest, such that the intensity of smaller objects blends with
the background and larger objects become “blob-like” and easy to detect. The size of the mask
establishes the relative size of the objects that will be blended with the background.
Smoothing Non-Linear Filters (Order-Statistics Filters)
Order-statistics filters are nonlinear spatial filters whose response is based on ordering
(ranking) the pixels contained in the image area encompassed by the filter, and then replacing the
value of the center pixel with the value determined by the ranking result. The best-known
example in this category is the median filter, which, as its name implies, replaces the value of a
pixel by the median of the gray levels in the neighborhood of that pixel (the original value of the
pixel is included in the computation of the median). Median filters are quite popular because, for
certain types of random noise, they provide excellent noise-reduction capabilities, with
considerably less blurring than linear smoothing filters of similar size. Median filters are
particularly effective in the presence of impulse noise, also called salt-and-pepper noise because
of its appearance as white and black dots superimposed on an image.
The median, ξ, of a set of values is such that half the values in the set are less than or
equal to ξ, and half are greater than or equal to ξ. In order to perform median filtering at a point
in an image, we first sort the values of the pixel in question and its neighbors, determine their
median, and assign this value to that pixel. For example, in a 3*3 neighborhood the median is the
5th largest value, in a 5*5 neighborhood the 13th largest value, and so on.
The principal function of median filters is to force points with distinct gray levels to be
more like their neighbors. In fact, isolated clusters of pixels that are light or dark with respect to
their neighbors, and whose area is less than n²/2 (one-half the filter area), are eliminated by an
nxn median filter. In this case “eliminated” means forced to the median intensity of the
neighbors. Larger clusters are affected considerably less.
Figure 3.37(a) shows an X-ray image of a circuit board heavily corrupted by salt-and-
pepper noise. To illustrate the point about the superiority of median filtering over average
filtering in situations such as this, we show in Fig. 3.37(b) the result of processing the noisy
image with a 3*3 neighborhood averaging mask, and in Fig. 3.37(c) the result of using a 3*3
median filter. The image processed with the averaging filter has less visible noise, but the price
paid is significant blurring. The superiority in all respects of median over average filtering in this
case is quite evident. In general, median filtering is much better suited than averaging for the
removal of additive salt-and-pepper noise.
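The median filter's behavior on impulse noise can be sketched in Python with NumPy (assumed available); edge replication for border pixels and the single-impulse test image are illustrative choices:

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighborhood
    (edges replicated so the output keeps the input size)."""
    half = size // 2
    padded = np.pad(img, half, mode='edge')
    out = np.empty_like(img)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            # The pixel's own value is included in the median, as in the text.
            out[x, y] = np.median(padded[x:x + size, y:y + size])
    return out

# A single salt impulse in a flat region is removed completely,
# whereas a 3x3 averaging filter would only spread it around.
flat = np.full((5, 5), 50, dtype=np.uint8)
flat[2, 2] = 255                              # salt noise
clean = median_filter(flat)
```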
SHARPENING SPATIAL FILTERS
The principal objective of sharpening is to highlight fine detail in an image or to enhance
detail that has been blurred, either in error or as a natural effect of a particular method of image
acquisition. Uses of image sharpening vary and include applications ranging from electronic
printing and medical imaging to industrial inspection and autonomous guidance in military
systems.
We saw that image blurring could be accomplished in the spatial domain by pixel
averaging in a neighborhood. Since averaging is analogous to integration, it is logical to
conclude that sharpening could be accomplished by spatial differentiation. Fundamentally, the
strength of the response of a derivative operator is proportional to the degree of discontinuity of
the image at the point at which the operator is applied. Thus, image differentiation enhances
edges and other discontinuities (such as noise) and deemphasizes areas with slowly varying gray-
level values. We consider in some detail sharpening filters that are based on first- and second-
order derivatives, respectively. Before proceeding with that discussion, however, we stop to look
at some of the fundamental properties of these derivatives in a digital context. To simplify the
explanation, we focus attention on one-dimensional derivatives. In particular, we are interested
in the behavior of these derivatives in areas of constant gray level (flat segments), at the onset
and end of discontinuities (step and ramp discontinuities), and along gray-level ramps. These
types of discontinuities can be used to model noise points, lines, and edges in an image. The
behavior of derivatives during transitions into and out of these image features also is of interest.
The derivatives of a digital function are defined in terms of differences. There are various ways
to define these differences. However, we require that any definition we use for a first derivative
(1) must be zero in flat segments (areas of constant gray-level values);
(2) must be nonzero at the onset of a gray-level step or ramp; and
(3) must be nonzero along ramps.
Similarly, any definition of a second derivative
(1) must be zero in flat areas;
(2) must be nonzero at the onset and end of a gray-level step or ramp; and
(3) must be zero along ramps of constant slope.
Since we are dealing with digital quantities whose values are finite, the maximum possible gray-
level change also is finite, and the shortest distance over which that change can occur is between
adjacent pixels.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference
∂f/∂x = f(x+1) - f(x)
We used a partial derivative here in order to keep the notation the same as when we consider an
image function of two variables, f(x, y), at which time we will be dealing with partial derivatives
along the two spatial axes. Similarly, we define a second-order derivative as the difference
∂²f/∂x² = f(x+1) + f(x-1) - 2f(x)
It is easily verified that these two definitions satisfy the conditions stated previously regarding
derivatives of the first and second order. To see this, and also to highlight the fundamental
similarities and differences between first- and second- order derivatives in the context of image
processing, consider the example shown below:
Figure 3.38(a) shows a simple image that contains various solid objects, a line, and a
single noise point. Figure 3.38(b) shows a horizontal gray-level profile (scan line) of the image
along the center and including the noise point. This profile is the one-dimensional function we
will use for illustrations regarding this figure. Figure 3.38(c) shows a simplification of the
profile, with just enough numbers to make it possible for us to analyze how the first- and second-
order derivatives behave as they encounter a noise point, a line, and then the edge of an object. In
our simplified diagram the transition in the ramp spans four pixels, the noise point is a single
pixel, the line is three pixels thick, and the transition into the gray-level step takes place between
adjacent pixels. The number of gray levels was simplified to only eight levels.
Let us consider the properties of the first and second derivatives as we traverse the profile
from left to right. First, we note that the first-order derivative is nonzero along the entire ramp,
while the second-order derivative is nonzero only at the onset and end of the ramp. Because
edges in an image resemble this type of transition, we conclude that first-order derivatives
produce “thick” edges and second-order derivatives, much finer ones. Next we encounter the
isolated noise point. Here, the response at and around the point is much stronger for the second-
than for the first-order derivative. Of course, this is not unexpected. A second-order derivative is
much more aggressive than a first-order derivative in enhancing sharp changes. Thus, we can
expect a second-order derivative to enhance fine detail (including noise) much more than a first-
order derivative. The thin line is a fine detail, and we see essentially the same difference between
the two derivatives. If the maximum gray level of the line had been the same as the isolated
point, the response of the second derivative would have been stronger for the latter. Finally, in
this case, the response of the two derivatives is the same at the gray-level step (in most cases
when the transition into a step is not from zero, the second derivative will be weaker). We also
note that the second derivative has a transition from positive back to negative. In an image, this
shows as a thin double line. This “double-edge” effect is an issue that will be important when
we use derivatives for edge detection. It is of interest also to note that if the gray level of the thin
line had been the same as the step, the response of the second derivative would have been
stronger for the line than for the step.
In summary, comparing the response between first- and second-order derivatives, we
arrive at the following conclusions.
(1) First-order derivatives generally produce thicker edges in an image.
(2) Second-order derivatives have a stronger response to fine detail, such as thin lines and
isolated points.
(3) First-order derivatives generally have a stronger response to a gray-level step.
(4) Second-order derivatives produce a double response at step changes in gray level.
We also note of second-order derivatives that, for similar changes in gray-level values in
an image, their response is stronger to a line than to a step, and to a point than to a line.
In most applications, the second derivative is better suited than the first derivative for
image enhancement because of the ability of the former to enhance fine detail.
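These conclusions can be checked numerically on a one-dimensional profile, using the difference definitions given earlier. In the sketch below the specific gray-level values are illustrative assumptions (not the profile of Fig. 3.38): a downward ramp, an isolated noise point, and a gray-level step.

```python
import numpy as np

# Simplified scan line: a ramp (6..1), an isolated noise point (6),
# and a gray-level step (1 -> 7). Values are illustrative.
f = np.array([6, 5, 4, 3, 2, 1, 1, 1, 6, 1, 1, 1, 1, 7, 7, 7], dtype=int)

first = f[1:] - f[:-1]                 # f(x+1) - f(x)
second = f[2:] + f[:-2] - 2 * f[1:-1]  # f(x+1) + f(x-1) - 2 f(x)

# The first derivative is nonzero along the whole ramp ("thick" edges);
# the second is zero inside the ramp and nonzero only at its ends.
# At the noise point (x = 8) the second derivative responds more strongly
# than the first, and at the step (x = 13) the second derivative gives a
# positive/negative double response.
```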
Image Enhancement using Second Derivatives–The Laplacian
The approach basically consists of defining a discrete formulation of the second-order
derivative and then constructing a filter mask based on that formulation. We are interested in
isotropic filters, whose response is independent of the direction of the discontinuities in the
image to which the filter is applied. In other words, isotropic filters are rotation invariant, in the
sense that rotating the image and then applying the filter gives the same result as applying the
filter to the image first and then rotating the result.
Development of the method:
It can be shown that the simplest isotropic derivative operator is the Laplacian, which, for
a function (image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y² … (3.7-1)
Because derivatives of any order are linear operations, the Laplacian is a linear operator. In order
to be useful for digital image processing, this equation needs to be expressed in discrete
form. There are several ways to define a digital Laplacian using neighborhoods. The partial
second-order derivative in the x-direction is
∂²f/∂x² = f(x+1, y) + f(x-1, y) - 2f(x, y) … (3.7-2)
and, similarly, in the y-direction,
∂²f/∂y² = f(x, y+1) + f(x, y-1) - 2f(x, y) … (3.7-3)
The digital implementation of the two-dimensional Laplacian is obtained by summing these two components:
∇²f = [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)] - 4f(x, y) … (3.7-4)
This equation can be implemented using the mask shown in Fig. 3.39(a), which gives an
isotropic result for rotations in increments of 90°.
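A minimal NumPy sketch of this 4-neighbor Laplacian (edge replication at the borders and the function name are my own assumptions; the arithmetic is equivalent to correlating with the mask of Fig. 3.39(a)):

```python
import numpy as np

def laplacian4(f):
    """f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)."""
    p = np.pad(f.astype(float), 1, mode="edge")
    return (p[2:, 1:-1] + p[:-2, 1:-1]      # vertical neighbors
            + p[1:-1, 2:] + p[1:-1, :-2]    # horizontal neighbors
            - 4.0 * p[1:-1, 1:-1])
```

On a constant image the response is zero everywhere, as expected of a derivative operator; an isolated bright pixel produces a strong negative response at its own position and positive responses at its neighbors.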
The diagonal directions can be incorporated in the definition of the digital Laplacian by
adding two more terms to Eq (3.7-4), one for each of the two diagonal directions. The form of
each new term is the same as either Eq. (3.7-2) or (3.7-3), but the coordinates are along the
diagonals. Since each diagonal term also contains a –2f(x, y) term, the total subtracted from the
difference terms now would be –8f(x, y). The mask used to implement this new definition is
shown in Fig. 3.39(b). This mask yields isotropic results for increments of 45°. The other two
masks shown in Fig. 3.39 also are used frequently in practice. They are based on a definition of
the Laplacian that is the negative of the one we used here. As such, they yield equivalent results,
but the difference in sign must be kept in mind when combining (by addition or subtraction) a
Laplacian-filtered image with another image.
Because the Laplacian is a derivative operator, its use highlights gray-level
discontinuities in an image and deemphasizes regions with slowly varying gray levels. This will
tend to produce images that have grayish edge lines and other discontinuities, all superimposed
on a dark, featureless background. Background features can be “recovered” while still preserving
the sharpening effect of the Laplacian operation simply by adding the original and Laplacian
images. If the definition used has a negative center coefficient, then we subtract, rather than add,
the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the
Laplacian for image enhancement is as follows:
Simplifications:
Previously, we implemented Eq. (3.7-5) by first computing the Laplacian-filtered image
and then subtracting it from the original image. In practice, Eq. (3.7-5) is usually implemented
with one pass of a single mask. The coefficients of the single mask are easily obtained by
substituting Eq. (3.7-4) for ∇²f(x, y) in the first line of Eq. (3.7-5):
g(x,y) = 5f(x,y) - [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1)]
This equation can be implemented using the mask shown below. The second mask shown
below would be used if the diagonal neighbors also were included in the calculation of the
Laplacian. Identical masks would have resulted if we had substituted the negative of Eq. (3.7-4)
into the second line of Eq. (3.7-5).
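The one-pass version without the diagonal terms corresponds to the composite mask [[0,-1,0],[-1,5,-1],[0,-1,0]], and can be sketched as follows (edge-replicated borders and the function name are assumptions on my part):

```python
import numpy as np

def sharpen(f):
    """One-pass Laplacian sharpening: g = 5 f(x,y) - (4-neighbor sum),
    i.e. correlation with the mask [[0,-1,0],[-1,5,-1],[0,-1,0]]."""
    p = np.pad(f.astype(float), 1, mode="edge")
    return (5.0 * p[1:-1, 1:-1]
            - p[2:, 1:-1] - p[:-2, 1:-1]
            - p[1:-1, 2:] - p[1:-1, :-2])
```

Flat areas pass through unchanged (the coefficients sum to 1), while a gray-level step produces an undershoot on its dark side and an overshoot on its bright side, which is what makes the edge look sharper.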
The results obtainable with the mask containing the diagonal terms usually are a little
sharper than those obtained with the more basic mask of Fig. 3.41(a). This property is illustrated
by the Laplacian-filtered images shown in Figs. 3.41(d) and (e), which were obtained by using
the masks in Figs. 3.41(a) and (b), respectively. By comparing the filtered images with the
original image shown in Fig. 3.41(c), we note that both masks produced effective enhancement,
but the result using the mask in Fig. 3.41(b) is visibly sharper.
Unsharp masking and high-boost filtering
A process used for many years in the publishing industry to sharpen images consists of
subtracting a blurred version of an image from the image itself. This process, called unsharp
masking, is expressed as
fs(x,y) = f(x,y) - f̄(x,y)
where fs(x, y) denotes the sharpened image obtained by unsharp masking, and f̄(x, y) is a blurred
version of f(x, y). The origin of unsharp masking is in darkroom photography, where it consists of
clamping together a blurred negative with a corresponding positive film and then developing this
combination to produce a sharper image.
A slight further generalization of unsharp masking is called high-boost filtering. A high-
boost filtered image, fhb, is defined at any point (x, y) as
fhb(x,y) = A f(x,y) - f̄(x,y),  where A ≥ 1
High-boost filtering can be implemented with one pass using either of the two masks
shown in Fig. 3.42. Note that, when A=1, high-boost filtering becomes “standard” Laplacian
sharpening. As the value of A increases past 1, the contribution of the sharpening process
becomes less and less important. Eventually, if A is large enough, the high-boost image will be
approximately equal to the original image multiplied by a constant.
One of the principal applications of high-boost filtering is when the input image is darker than
desired. By varying the boost coefficient, it generally is possible to obtain an overall increase in
average gray level of the image, thus helping to brighten the final result.
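A spatial-domain sketch of these two operations, using a 3x3 box blur as the blurred image (the blur choice and function names are my own assumptions):

```python
import numpy as np

def box_blur(f, k=3):
    """k x k neighborhood average used as the blurred image f_bar."""
    p = np.pad(f.astype(float), k // 2, mode="edge")
    out = np.zeros(f.shape)
    for di in range(k):
        for dj in range(k):
            out += p[di:di + f.shape[0], dj:dj + f.shape[1]]
    return out / (k * k)

def high_boost(f, A=1.0):
    """fhb = A f - f_bar; with A = 1 this is plain unsharp masking."""
    return A * f.astype(float) - box_blur(f)
```

As A grows, the A·f term dominates and the output approaches a scaled copy of the original, matching the remark above about large A.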
Image Enhancement using First Derivatives–The Gradient
First derivatives in image processing are implemented using the magnitude of the
gradient. For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the two-
dimensional column vector
∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T … (3.7-12)
The magnitude of this vector is
∇f = mag(∇f) = [Gx² + Gy²]^(1/2) … (3.7-13)
The components of the gradient vector itself are linear operators, but the magnitude of
this vector obviously is not because of the squaring and square root operations. On the other
hand, the partial derivatives in Eq. (3.7-12) are not rotation invariant (isotropic), but the
magnitude of the gradient vector is. Although it is not strictly correct, the magnitude of the
gradient vector often is referred to as the gradient.
The computational burden of implementing Eq. (3.7-13) over an entire image is not
trivial, and it is common practice to approximate the magnitude of the gradient by using absolute
values instead of squares and square roots:
∇f ≈ |Gx| + |Gy| … (3.7-14)
This equation is simpler to compute and it still preserves relative changes in gray levels,
but the isotropic feature property is lost in general. However, as in the case of the Laplacian, the
isotropic properties of the digital gradient defined are preserved only for a limited number of
rotational increments that depend on the masks used to approximate the derivatives. As it turns
out, the most popular masks used to approximate the gradient give the same result only for
vertical and horizontal edges and thus the isotropic properties of the gradient are preserved only
for multiples of 90°. These results are independent of whether Eq. (3.7-13) or (3.7-14) is used, so
nothing of significance is lost in using the simpler of the two equations.
As in the case of the Laplacian, we now define digital approximations to the preceding
equations, and from there formulate the appropriate filter masks. In order to simplify the
discussion that follows, we will use the notation in Fig. 3.44(a) to denote image points in a 3x3
region. For example, the center point, z5 , denotes f(x, y), z1 denotes f(x-1, y-1), and so on.
Roberts cross-gradient operators
The simplest approximations to a first-order derivative that satisfy the conditions stated
earlier are Gx=(z8-z5) and Gy=(z6-z5). Two other definitions, proposed by Roberts in the
early development of digital image processing, use cross differences:
Gx = (z9 - z5) and Gy = (z8 - z6)
With these definitions, the gradient is approximated as
∇f ≈ |z9 - z5| + |z8 - z6| … (3.7-17)
This equation can be implemented with the two masks shown below.These masks are
referred to as the Roberts cross-gradient operators.
Sobel Operators
Masks of even size are awkward to implement. The smallest filter mask in which we are
interested is of size 3x3. An approximation using absolute values, still at point z5, but using a 3x3
mask, is
∇f ≈ |(z7 + 2z8 + z9) - (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) - (z1 + 2z4 + z7)| … (3.7-18)
The difference between the third and first rows of the 3x3 image region approximates the
derivative in the x-direction, and the difference between the third and first columns approximates
the derivative in the y-direction. The masks shown above, called the Sobel operators, can be
used to implement Eq. (3.7-18). The idea behind using a weight value of 2 is to achieve some
smoothing by giving more importance to the center point. Note that the coefficients in all the
masks shown above sum to 0, indicating that they would give a response of 0 in an area of
constant gray level, as expected of a derivative operator.
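A sketch of the Sobel gradient magnitude using the |Gx| + |Gy| approximation (edge-replicated borders and the function name are my own assumptions):

```python
import numpy as np

def sobel_magnitude(f):
    """|Gx| + |Gy| with the 3x3 Sobel masks (center row/column weighted by 2)."""
    p = np.pad(f.astype(float), 1, mode="edge")
    # Gx: third row minus first row of each 3x3 window.
    gx = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    # Gy: third column minus first column of each 3x3 window.
    gy = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    return np.abs(gx) + np.abs(gy)
```

Because the mask coefficients sum to zero, a constant region gives a response of exactly zero, while a vertical gray-level step produces a strong response along the edge.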
The gradient is used frequently in industrial inspection, either to aid humans in the
detection of defects or, what is more common, as a preprocessing step in automated inspection.
The gradient can be used to enhance defects and eliminate slowly changing background features.
UNIT-IV
IMAGE ENHANCEMENT IN FREQUENCY DOMAIN
The frequency domain is nothing more than the space defined by values of the Fourier
transform and its frequency variables (u,v).
Some basic properties of the frequency domain
Each term of F(u,v) contains all values of f(x,y), modified by the values of the exponential
terms. Some general statements can be made about the relationship between the frequency
components of the Fourier transform and spatial characteristics of an image. For instance, since
frequency is directly related to rate of change, we can associate frequencies in the Fourier
transform with patterns of intensity variations in an image.
• The slowest varying frequency component (u = v = 0) corresponds to the average gray level
of an image.
• As we move away from the origin of the transform, the low frequencies correspond to
the slowly varying components of an image. In an image of a room, for example, these
might correspond to smooth gray-level variations on the walls and floor.
• As we move further away from the origin, the higher frequencies begin to correspond to
faster and faster gray-level changes in the image. These are edges of objects and other
components of an image characterized by abrupt changes in gray level, such as noise.
Basics of filtering in the frequency domain
Filtering in the frequency domain consists of the following steps:
1. Multiply the input image by (-1)^(x+y) to center the transform, as shown below.
2. Compute F(u,v), the Discrete Fourier Transform of the image from step 1.
3. Multiply F(u,v) by a filter function H(u,v).
4. Compute the inverse Discrete Fourier Transform of the result in step 3.
5. Obtain the real part of the result in step 4.
6. Multiply the result in step 5 by (-1)^(x+y).
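The six steps can be sketched with NumPy's FFT (the function name and the test filter are my own; `np.fft.fft2` computes the DFT without a 1/MN factor, which does not affect the round trip):

```python
import numpy as np

def filter_frequency(f, H):
    """Steps 1-6: center with (-1)^(x+y), take the DFT, multiply by H(u,v),
    inverse DFT, keep the real part, undo the centering."""
    M, N = f.shape
    x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    c = (-1.0) ** (x + y)              # step 1 (and reused in step 6)
    F = np.fft.fft2(f * c)             # step 2: centered transform
    G = H * F                          # step 3
    g = np.real(np.fft.ifft2(G))       # steps 4-5
    return g * c                       # step 6
```

With H identically 1 the image is returned unchanged; zeroing only the center coefficient H[M//2, N//2] removes the average gray level, the notch-filter behavior discussed later in this unit.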
H(u,v) is called a filter because it suppresses certain frequencies in the
transform while leaving others unchanged. In equation form, let f(x,y) represent the input image
[Figure: the image f(x,y) with spatial axes (x, y), and its centered spectrum F(u - M/2, v - N/2) with frequency axes (u, v).]
in step 1 and F(u,v) its Fourier transform. Then the Fourier transform of the output image is
given by
G(u,v) = H(u,v) F(u,v)
In general, the components of F are complex quantities, but the filters with which we deal are real. In
this case, each component of H multiplies both the real and imaginary parts of the corresponding
component in F. Such filters are called zero-phase-shift filters. As their name implies, these
filters do not change the phase of the transform.
The filtered image is obtained by simply taking the inverse Fourier transform of G(u,v):
g(x,y) = ℑ⁻¹[G(u,v)]
The final image is obtained by taking the real part of this result and multiplying it by (-1)^(x+y) to
cancel the multiplication of the input image by this quantity.
In addition to the (-1)^(x+y) centering step, other pre-processing functions may include
cropping of the input image to its closest even dimensions, gray-level scaling, conversion to
floating point on input, and conversion to an 8-bit integer format on the output. Multiple filtering
stages and other pre- and post-processing functions are possible. The important point is that the
filtering process is based on modifying the transform of an image in some way via a filter
function, and then taking the inverse of the result to obtain the processed output image.
Some Basic filters and their properties
According to the definition of the 2-D Fourier transform, the average value of an image is given by F(0,0):
F(0,0) = (1/MN) Σx Σy f(x,y)
If we set this term to zero in the frequency domain and take the inverse transform, then the
average value of the resulting image will be zero. Assuming that the transform has been
centered, we can do this operation by multiplying all values of F(u,v) by the following filter
function:
H(u,v) = 0 if (u,v) = (M/2, N/2)
         1 otherwise
The result of processing any image with the above transfer function is the drop in overall average
gray level resulting from forcing the average value to zero.
Low frequencies in the Fourier transform are responsible for the general gray-level appearance
of an image over smooth areas, while high frequencies are responsible for detail, such as edges
and noise.
• A filter that attenuates high frequencies while passing low frequencies is called a lowpass
filter. A lowpass-filtered image has less sharp detail than the original image because the
high frequencies have been attenuated. Such an image will appear smoother.
• A filter that attenuates low frequencies while passing high frequencies is called a highpass
filter. A highpass-filtered image has less gray-level variation in smooth areas and
emphasized transitional gray-level detail. Such an image will appear sharper.
SMOOTHING FREQUENCY-DOMAIN FILTERS
Edges and other sharp transitions in the gray levels of an image contribute significantly to
the high-frequency content of its Fourier transform. Hence smoothing (blurring) is achieved in
the frequency domain by attenuating a specified range of high-frequency components in the
transform of a given image.
Our basic model for filtering in the frequency domain is given by
G(u,v)=H(u,v).F(u,v)
where F(u,v) is the Fourier transform of the image to be smoothed. The objective is to select a
filter transfer function H(u,v) that yields G(u,v) by attenuating the high-frequency components of
F(u,v). We consider three types of lowpass filters:
1) Ideal Lowpass Filter (ILPF)
2) Butterworth Lowpass Filter (BLPF)
3) Gaussian Lowpass Filter (GLPF)
Ideal Lowpass Filter
The simplest lowpass filter we can visualize is a filter that cuts off all high-frequency
components of the Fourier transform that are at a distance greater than a specified
distance D0 from the origin of the (centered) transform. Such a filter is called a two-dimensional
ideal lowpass filter and has the transfer function
H(u,v) = 1 if D(u,v) ≤ D0
         0 if D(u,v) > D0
where D0 is a specified non-negative quantity, and D(u,v) is the distance from point (u,v) to the
origin of the frequency rectangle. If the image is of size MxN, we know that its transform also is
of the same size, so the center of the frequency rectangle is at (u,v) = (M/2, N/2). In this case, the
distance from any point (u,v) to the center (origin) of the Fourier transform is given by
D(u,v) = [(u - M/2)² + (v - N/2)²]^(1/2)
The name ideal filter indicates that all frequencies inside a circle of radius D0 are passed with no
attenuation, whereas all frequencies outside this circle are completely attenuated. This filter is
radially symmetric about the origin. The complete filter transfer function can be visualized by
rotating the cross section 360° about the origin. For an ideal lowpass filter cross section, the
point of transition between H(u,v) = 1 and H(u,v) = 0 is called the cutoff frequency, D0.
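The ILPF transfer function can be built directly from the distance D(u,v) to the center of the frequency rectangle (a sketch; the function name is my own assumption):

```python
import numpy as np

def ideal_lowpass(M, N, D0):
    """H(u,v) = 1 where D(u,v) <= D0, 0 elsewhere (centered rectangle)."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return (D <= D0).astype(float)
```

The sharp 1-to-0 discontinuity at D(u,v) = D0 is exactly what causes the ringing discussed below.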
The lowpass filters can be compared by studying their behavior as a function of the same cutoff
frequencies. One way to establish a set of standard cutoff frequency loci is to compute circles
that enclose specified amounts of total image power PT. This quantity is obtained by summing
the components of the power spectrum at each point (u,v), for u = 0, 1, 2, …, M-1 and v =
0, 1, 2, …, N-1:
PT = Σu Σv P(u,v)
where P(u,v) is the power spectrum. If the transform has been centered, a circle of radius r with
origin at the center of the frequency rectangle encloses α percent of the power, where
α = 100 [ Σu Σv P(u,v) / PT ]
and the summation is taken over points (u,v) inside the circle.
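The enclosed-power percentage can be sketched as follows (the function name is mine; `np.fft.fftshift` performs the centering):

```python
import numpy as np

def enclosed_power_percent(f, r):
    """alpha: percentage of total image power P_T inside a circle of
    radius r centered on the shifted spectrum."""
    P = np.abs(np.fft.fftshift(np.fft.fft2(f))) ** 2   # power spectrum P(u,v)
    M, N = f.shape
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return 100.0 * P[D <= r].sum() / P.sum()
```

alpha grows monotonically with r and reaches 100 once the circle covers the whole frequency rectangle.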
It is clear from this example that ideal lowpass filtering is not very practical. The blurring and
ringing properties of the ideal lowpass filter can be explained by reference to the convolution
theorem. The Fourier transforms of the original image f(x,y) and the blurred image g(x,y) are
related in the frequency domain by the equation
G(u,v) = H(u,v) F(u,v)
where H(u,v) is the filter function, and F and G are the Fourier transforms of the two images just
mentioned. The convolution theorem states that the corresponding process in the spatial domain
is
g(x,y) = h(x,y) * f(x,y)
where h(x,y) is the inverse Fourier transform of the filter transfer function H(u,v). This h(x,y) has
two major distinctive characteristics: a dominant component at the origin, and concentric,
circular components about the center component. The center component is primarily responsible
for blurring. The concentric components are responsible primarily for the ringing characteristics
of ideal filters. Both the radius of the center component and the number of circles per unit
distance from the origin are inversely proportional to the value of the cutoff frequency of the
ideal filter. So, as the cutoff frequency increases, the blurring and ringing effects decrease.
Butterworth Lowpass Filter
The Butterworth filter has a parameter, called the filter order. For high values of this
parameter the Butterworth filter approaches the form of the ideal filter. For lower-order values,
the Butterworth filter has a smooth form similar to the Gaussian filter.
The transfer function of the Butterworth lowpass filter of order n, and with cutoff frequency at a
distance D0 from the origin, is defined as
H(u,v) = 1 / [1 + (D(u,v)/D0)^(2n)]
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
Unlike the Ideal lowpass filter, the BLPF transfer function does not have a sharp
discontinuity that establishes a clear cutoff between passed and filtered frequencies. For filters
with smooth transfer functions, defining a cutoff frequency locus at points for which H(u,v) is
down to a certain fraction of its maximum value is customary. In this case H(u,v) = 0.5 when
D(u,v) = D0.
A Butterworth-filtered image has a smooth transition in blurring as a function of increasing
cutoff frequency. A Butterworth filter of order 1 has no ringing. Ringing generally is
imperceptible in filters of order 2, but can become a significant factor in filters of higher order.
Spatial representations of BLPFs for n = 1, 2, 5, and 20 are shown below.
The BLPF of order 1 has neither ringing nor negative values. The filter of order 2 does
show mild ringing and small negative values, but these are certainly less pronounced than in the
ILPF. As the remaining images show, ringing in the BLPF becomes significant for higher-order
filters. A Butterworth filter of order 20 exhibits the characteristics of the ILPF. In general, BLPFs
of order 2 are a good compromise between effective lowpass filtering and acceptable ringing
characteristics.
Gaussian Lowpass Filter
The transfer function of a two-dimensional Gaussian lowpass filter is given by
H(u,v) = e^(-D²(u,v)/2σ²)
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle, and σ is a
measure of the spread of the Gaussian curve. By letting σ = D0, the transfer function becomes
H(u,v) = e^(-D²(u,v)/2D0²)
When D(u,v) = D0, the filter is down to 0.607 of its maximum value.
The inverse Fourier transform of the Gaussian lowpass filter is also Gaussian. A spatial Gaussian
filter, obtained by computing the inverse Fourier transform of the above equation, will therefore have no
ringing.
The Gaussian lowpass filter does not achieve as much smoothing as the BLPF of order 2 for the same
value of cutoff frequency. This is because the profile of the GLPF is not as tight as the profile of
the BLPF of order 2.
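Both transfer functions, with their characteristic values at D = D0, can be sketched as follows (function names are my own assumptions):

```python
import numpy as np

def butterworth_lowpass(M, N, D0, n):
    """H = 1 / (1 + (D/D0)^(2n)); H = 0.5 exactly at D = D0."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

def gaussian_lowpass(M, N, D0):
    """H = exp(-D^2 / (2 D0^2)); H = 0.607 at D = D0, no ringing."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D2 = (u - M / 2) ** 2 + (v - N / 2) ** 2
    return np.exp(-D2 / (2.0 * D0 ** 2))
```

Raising the Butterworth order n steepens the transition toward the ideal filter, as the text notes: at n = 20 the transfer function is already nearly 0 just outside the cutoff circle.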
SHARPENING FREQUENCY-DOMAIN FILTERS
An image can be blurred by attenuating the high frequency components of its Fourier
transform. Because edges and other abrupt changes in gray levels are associated with high-
frequency components, image sharpening can be achieved in the frequency domain by a highpass
filtering process, which attenuates the low frequency components without disturbing the high
frequency information in the Fourier transform.
Because the intended function of the highpass filter is to perform the reverse operation of
the lowpass filter, the transfer function of a highpass filter, Hhp(u,v), can be obtained from the
corresponding lowpass transfer function Hlp(u,v) as
Hhp(u,v) = 1 - Hlp(u,v)
We consider three types of highpass filters:
1) Ideal Highpass Filter (IHPF)
2) Butterworth Highpass Filter (BHPF)
3) Gaussian Highpass Filter (GHPF)
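The Hhp(u,v) = 1 - Hlp(u,v) relation in code, shown for the ideal case (a sketch; the function name is mine):

```python
import numpy as np

def ideal_highpass(M, N, D0):
    """Highpass counterpart of the ideal lowpass: H_hp = 1 - H_lp."""
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
    H_lp = (D <= D0).astype(float)
    return 1.0 - H_lp
```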
Ideal Highpass Filter
The transfer function of a 2-D ideal highpass filter is defined as
H(u,v) = 0 if D(u,v) ≤ D0
         1 if D(u,v) > D0
where D0 is the cutoff distance measured from the origin of the frequency rectangle, and D(u,v)
is the distance from point (u,v) to the origin of the frequency rectangle. This filter is the opposite
of the ideal lowpass filter in the sense that it sets to zero all frequencies inside a circle of radius
D0 while passing, without attenuation, all frequencies outside the circle.
As in the case of the ILPF, the ideal highpass filter also has a ringing effect, because the spatial
representation of the IHPF contains rings. It also contains a black spot at the center. Smaller
objects in the image cannot be filtered properly because of this black spot in the spatial
representation of the IHPF. Distortion of the edges is also a main problem with the ideal
highpass filter. As the cutoff frequency increases, distortion in the output image decreases and
the spot size in h(x,y) also decreases, resulting in better filtering of smaller objects in the
image f(x,y).
Butterworth Highpass Filter
The Butterworth Filter represents a transition between the sharpness of the ideal filter and
the total smoothness of the Gaussian filter. The transfer function of a Butterworth highpass filter
of order n and with cutoff frequency locus at a distance D0 from the origin is given by
H(u,v) = 1 / [1 + (D0/D(u,v))^(2n)]
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
Butterworth filters behave more smoothly than ideal highpass filters, and the distortion is
less than that of the IHPF. Since the center spot sizes in the spatial representations of the IHPF and
BHPF are similar, the performance of the two filters in terms of filtering the smaller objects is
comparable. The transition into higher values of cutoff frequencies is much smoother with the
BHPF.
Gaussian Highpass Filter
The transfer function of the Gaussian highpass filter with cutoff frequency locus at a distance of
D0 from the origin is given by
H(u,v) = 1 - e^(-D²(u,v)/2D0²)
where D(u,v) is the distance from point (u,v) to the origin of the frequency rectangle.
The results obtained with the Gaussian highpass filter are smoother than those of the IHPF and BHPF.
Even the filtering of smaller objects and thin bars is cleaner with the Gaussian filter.
Laplacian in the Frequency Domain
It can be shown that
ℑ[d^n f(x)/dx^n] = (ju)^n F(u) … (1)
From the above expression, it follows that
ℑ[∂²f(x,y)/∂x² + ∂²f(x,y)/∂y²] = -(u² + v²) F(u,v) … (2)
The expression inside the brackets on the left side of the above equation is nothing but the Laplacian
of f(x,y). Thus we have the important result
ℑ[∇²f(x,y)] = -(u² + v²) F(u,v) … (3)
The above equation says that the Laplacian can be implemented in the frequency domain by
using the filter
H(u,v) = -(u² + v²) … (4)
But we generally center F(u,v) by performing the operation f(x,y)(-1)^(x+y) prior to taking the
transform of the image. If f and F are of size MxN, this operation shifts the center of the transform so
that (u,v) = (0,0) is at point (M/2, N/2) in the frequency rectangle. So, the center of the filter function
also needs to be shifted:
H(u,v) = -[(u - M/2)² + (v - N/2)²] … (5)
The Laplacian-filtered image in the spatial domain is obtained by computing the inverse Fourier
transform of H(u,v)F(u,v):
∇²f(x,y) = ℑ⁻¹{ -[(u - M/2)² + (v - N/2)²] F(u,v) } … (6)
Conversely, computing the Laplacian in the spatial domain and taking the Fourier transform of the
result is equivalent to multiplying F(u,v) by H(u,v), as in (6):
ℑ[∇²f(x,y)] = -[(u - M/2)² + (v - N/2)²] F(u,v) … (7)
The spatial domain Laplacian filter function is obtained by taking the inverse Fourier transform of equation (5); the corresponding spatial mask is the familiar Laplacian mask with center coefficient -4 and the four horizontal and vertical neighbors equal to 1.

The enhanced image g(x,y) can be obtained by subtracting the Laplacian image from the original image:

g(x,y) = f(x,y) - ∇²f(x,y) … (8)

Instead of enhancing the image in two steps (first calculating the Laplacian image, then subtracting it from the original image), the entire operation can be performed with a single filter, obtained by substituting (7) in (8):

H(u,v) = 1 + (u - M/2)² + (v - N/2)²

from which the enhanced image can be obtained with a single transformation operation:

g(x,y) = ℑ⁻¹{[1 + (u - M/2)² + (v - N/2)²]F(u,v)}
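The single-transformation procedure can be sketched in numpy as follows. This is an illustrative implementation of the combined filter 1 + (u - M/2)² + (v - N/2)²; no scaling is applied, so the absolute sharpening strength is not calibrated:

```python
import numpy as np

def laplacian_sharpen(f):
    # g = inverse DFT of [1 + (u - M/2)^2 + (v - N/2)^2] F(u,v),
    # with F centered by premultiplying f(x,y) with (-1)^(x+y)
    M, N = f.shape
    x = np.arange(M)[:, None]
    y = np.arange(N)[None, :]
    center = (-1.0) ** (x + y)
    F = np.fft.fft2(f * center)          # centered transform
    u = np.arange(M)[:, None] - M / 2
    v = np.arange(N)[None, :] - N / 2
    H = 1.0 + u**2 + v**2                # combined sharpening filter
    g = np.real(np.fft.ifft2(H * F))
    return g * center                    # undo the centering
```

Since H is 1 at the center (u,v) = (0,0), the zero-frequency term passes unchanged, so the mean gray level of the image is preserved.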
Unsharp Masking, High Boost Filtering, High Frequency Emphasis Filtering
The average background intensity in a highpass filtered image is near to black. This is
due to the fact that the highpass filters eliminate the zero-frequency component of their Fourier
transforms. The solution to this problem consists of adding a portion of the image back to the
filtered result as in Laplacian technique. Sometimes it is advantageous to increase the
contribution made by the original image to the overall filtered result. This approach is called
high-boost filtering, which is a generalization of unsharp masking.
Unsharp masking consists of generating a sharp image by subtracting a blurred (lowpass filtered) version of an image from the image itself; that is, obtaining a highpass filtered image as

fhp(x,y) = f(x,y) - flp(x,y) … (1)

High-boost filtering generalizes this by multiplying f(x,y) by a constant A ≥ 1:

fhb(x,y) = Af(x,y) - flp(x,y) … (2)

Thus, high-boost filtering gives us the flexibility to increase the contribution made by the image to the overall enhanced image. The above equation can be rewritten as

fhb(x,y) = (A-1)f(x,y) + f(x,y) - flp(x,y)
=> fhb(x,y) = (A-1)f(x,y) + fhp(x,y) … (3)

This result expresses high-boost filtering in terms of a highpass rather than a lowpass image. When A = 1, high-boost filtering reduces to regular highpass filtering. As A increases past 1, the contribution made by the image itself becomes more dominant.
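Equations (1)-(3) can be demonstrated with any lowpass filter playing the role of flp; the sketch below uses a simple box blur, which is an assumption for illustration, not the text's filter:

```python
import numpy as np

def box_blur(f, k=3):
    # simple k x k box lowpass (zero-padded borders), standing in for f_lp
    M, N = f.shape
    p = k // 2
    fp = np.pad(f, p)
    out = np.zeros((M, N))
    for dx in range(k):
        for dy in range(k):
            out += fp[dx:dx + M, dy:dy + N]
    return out / (k * k)

def high_boost(f, A=1.5, k=3):
    # f_hb = (A - 1) f + f_hp, with f_hp = f - f_lp; A = 1 gives plain unsharp masking
    f = f.astype(float)
    fhp = f - box_blur(f, k)
    return (A - 1.0) * f + fhp
```

With A = 1 the output is exactly the highpass image f - flp; larger A adds back more of the original image, brightening the result toward the input.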
We know,
Flp(u,v) = Hlp(u,v)F(u,v)…(4)
Fhp(u,v) = Hhp(u,v)F(u,v)…(5)
where Hlp(u,v) is the transfer function of a lowpass filter and Hhp(u,v) is the transfer function of a highpass filter.
Converting equation (1) into frequency domain
Fhp(u,v) = F(u,v) - Flp(u,v)…(6)
Substituting equation (4) in equation (6)
Fhp(u,v) = F(u,v) - Hlp(u,v)F(u,v)
=> Fhp(u,v) = F(u,v) (1- Hlp(u,v))… (7)
Therefore, unsharp masking can be obtained directly in the frequency domain by using the
composite filter
Hhp(u,v) = (1- Hlp(u,v))…(8)
Converting equation (3) into frequency domain
Fhb = (A-1)F(u,v)-Fhp(u,v)…(9)
Substituting equation (5) in equation (9)
Fhb = (A-1)F(u,v)- Hhp(u,v)F(u,v)
=> Fhb = F(u,v) ((A-1) - Hhp(u,v))… (10)
Therefore, high-boost filtering can be obtained directly in the frequency domain by using the
composite filter
Hhb(u,v) = (A-1) - Hhp(u,v)…(11)
Sometimes it is advantageous to accentuate the contribution to enhancement made by the high-frequency components of an image. In this case, we simply multiply a highpass filter function by a constant and add an offset so that the zero-frequency term is not eliminated by the filter. This process is called high-frequency emphasis filtering. It has a transfer function given by

Hhfe(u,v) = a + b·Hhp(u,v)

where a ≥ 0 and b > a. Typical values of a range from 0.25 to 0.5 and typical values of b range from 1.5 to 2.0. High-frequency emphasis filtering reduces to high-boost filtering when a = (A-1) and b = 1. When b > 1, the high frequencies are emphasized (highlighted), thus giving the procedure its name.
HOMOMORPHIC FILTERING
The illumination-reflectance model can be used to develop a frequency domain procedure for
improving the appearance of an image by simultaneous gray-level range compression and
contrast enhancement. An image f(x,y) can be expressed as the product of illumination and
reflectance components:
f(x,y) = i(x,y) r(x,y) … (1)

The above equation cannot be used directly to operate separately on the frequency components of illumination and reflectance, because the Fourier transform of the product of two functions is not separable.

So, let us define

z(x,y) = ln f(x,y) = ln i(x,y) + ln r(x,y) … (2)

Then

ℑ[z(x,y)] = ℑ[ln i(x,y)] + ℑ[ln r(x,y)] … (3)

or

Z(u,v) = Fi(u,v) + Fr(u,v) … (4)

where Fi(u,v) and Fr(u,v) are the Fourier transforms of ln i(x,y) and ln r(x,y), respectively.

If we process Z(u,v) by means of a filter function H(u,v), then

S(u,v) = H(u,v)Z(u,v) = H(u,v)Fi(u,v) + H(u,v)Fr(u,v) … (5)

where S(u,v) is the Fourier transform of the result s(x,y). In the spatial domain,

s(x,y) = ℑ⁻¹[H(u,v)Fi(u,v)] + ℑ⁻¹[H(u,v)Fr(u,v)] … (6)

By letting

i'(x,y) = ℑ⁻¹[H(u,v)Fi(u,v)] … (7)

and

r'(x,y) = ℑ⁻¹[H(u,v)Fr(u,v)] … (8)

equation (6) can be expressed as

s(x,y) = i'(x,y) + r'(x,y) … (9)

Finally, as z(x,y) was formed by taking the logarithm of the original image f(x,y), the inverse (exponential) operation yields the desired enhanced image, denoted by g(x,y):

g(x,y) = e^s(x,y) = e^i'(x,y) e^r'(x,y) = i0(x,y) r0(x,y)

where i0(x,y) = e^i'(x,y) and r0(x,y) = e^r'(x,y) are the illumination and reflectance of the output image.

The above operations can be represented in the form of a block diagram:

f(x,y) → ln → DFT → H(u,v) → (DFT)⁻¹ → exp → g(x,y)
This method is based on a special case of a class of systems known as homomorphic systems. In this particular application, the key approach is the separation of the illumination and reflectance components achieved in the form shown in equation (4). The homomorphic filter function H(u,v) can then operate on these components separately, as in equation (5).
The illumination component of an image is generally characterized by slow spatial variations,
while the reflectance component tends to vary abruptly, particularly at the junctions of dissimilar
objects. So, we can associate the low frequencies of the Fourier transform of the logarithm of an
image with the illumination and the high frequencies with reflectance.
A good deal of control can be gained over the illumination and reflectance components with a
homomorphic filter. This control requires specification of a filter function H(u,v) that affects the low- and high-frequency components of the Fourier transform in different ways. The figure below shows the cross section of such a filter.
If the parameters γL and γH are chosen so that γL <1 and γH>1, the filter function tends to decrease
the contribution made by low frequencies (illumination) and amplifies the contribution made by
high frequencies (reflectance). The net result is simultaneous dynamic range compression (by log
function) and contrast enhancement (by H(u,v)).
Figure 4.33 is typical of the results that can be obtained with the homomorphic filter function. In the original image shown in Figure 4.33(a), the details inside the shelter are obscured by the glare from the outside walls. Figure 4.33(b) shows the result of processing this image by homomorphic filtering, with γL = 0.5 and γH = 2. A reduction of dynamic range in the brightness, together with an increase in contrast, brought out the details of objects inside the shelter and balanced the gray levels of the outside wall. The enhanced image is also sharper.
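The ln → DFT → H(u,v) → inverse DFT → exp pipeline can be sketched compactly in numpy. The Gaussian-shaped H(u,v) and the use of log1p/expm1 (to tolerate zero-valued pixels) are implementation choices, not from the text:

```python
import numpy as np

def homomorphic(f, gamma_l=0.5, gamma_h=2.0, D0=30.0, c=1.0):
    # ln -> DFT -> H(u,v) -> inverse DFT -> exp
    M, N = f.shape
    z = np.log1p(f.astype(float))            # ln(1 + f): tolerates zero pixels
    Z = np.fft.fftshift(np.fft.fft2(z))      # center the spectrum
    u = np.arange(M)[:, None] - M / 2
    v = np.arange(N)[None, :] - N / 2
    D2 = u**2 + v**2
    # H rises from gamma_l at low frequencies to gamma_h at high frequencies
    H = (gamma_h - gamma_l) * (1.0 - np.exp(-c * D2 / D0**2)) + gamma_l
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    return np.expm1(s)                       # exp(s) - 1 undoes log1p
```

With γL < 1 the slowly varying illumination (low frequencies of the log image) is attenuated, compressing dynamic range, while γH > 1 boosts reflectance detail.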
UNIT-VI
IMAGE RESTORATION
Restoration attempts to reconstruct or recover an image that has been degraded by using a
priori knowledge of the degradation phenomenon. Thus restoration techniques are oriented
towards modeling the degradation and applying the inverse process in order to recover the
original image.
We consider the restoration problem only from the point where a degraded, digital image is given; thus topics dealing with sensor, digitizer, and display degradations are treated only superficially.
A MODEL OF THE IMAGE DEGRADATION/RESTORATION PROCESS
The degradation process is modeled as a degradation function that, together with an additive noise term, operates on an input image f(x,y) to produce a degraded image g(x,y). Given g(x,y), some knowledge about the degradation function H, and some knowledge about the additive noise term η(x,y), the objective of restoration is to obtain an estimate f̂(x,y) of the original image. We want the estimate to be as close as possible to the original image and, in general, the more we know about H and η, the closer f̂(x,y) will be to f(x,y).
If H is linear, position invariant process then the degraded image is given in the spatial
domain by
g(x,y) = h(x,y) * f(x,y) + η(x,y) … (1)

where h(x,y) is the spatial representation of the degradation function and the symbol * indicates spatial convolution. Convolution in the spatial domain is equal to multiplication in the frequency domain, so we may write the model in an equivalent frequency domain representation:

G(u,v) = H(u,v)F(u,v) + N(u,v) … (2)

where the terms in capital letters are the Fourier transforms of the corresponding terms in equation (1).
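The equivalence of the two forms of the model can be checked numerically. The sketch below uses circular convolution via the DFT, which is the convolution the discrete model actually assumes; the arrays are arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 8
f = rng.random((M, N))           # original image
h = rng.random((M, N))           # degradation function (spatial)
eta = 0.01 * rng.random((M, N))  # additive noise

# spatial domain: g(x,y) = h(x,y) * f(x,y) + eta(x,y)  (circular convolution)
g = np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(f))) + eta

# frequency domain: G(u,v) = H(u,v) F(u,v) + N(u,v), term by term
G = np.fft.fft2(h) * np.fft.fft2(f) + np.fft.fft2(eta)
```

Transforming g back to the frequency domain reproduces G, confirming that the two representations describe the same model.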
NOISE MODELS
The principal sources of noise in digital images arise during image acquisition
(digitization) and/or transmission.
• The performance of imaging sensors is affected by a variety of factors, such as
environmental conditions during image acquisition, and by the quality of the sensing
elements themselves. For instance, in acquiring image with a camera, light levels and
sensor temperature are major factors affecting the amount of noise in the resulting image.
• Images are corrupted during transmission principally due to interference in the channel used for transmission. For example, an image transmitted over a wireless network might be corrupted as a result of lightning or other atmospheric disturbance.
Spatial characteristics of noise refer to whether the noise is correlated with the image. Frequency properties refer to the frequency content of noise in the Fourier sense. For example, when the Fourier spectrum of noise is constant, the noise is usually called white noise. This terminology is a carryover from the physical properties of white light, which contains nearly all frequencies in the visible spectrum in equal proportions.
The noise models we consider here are:
1. Gaussian noise
2. Rayleigh noise
3. Erlang (gamma) noise
4. Exponential noise
5. Uniform noise
6. Impulse (salt-and-pepper) noise
7. Periodic noise

With the exception of periodic noise, we assume that noise is independent of spatial coordinates and uncorrelated with respect to the image itself (that is, there is no correlation between pixel values and the values of noise components), because noise that is spatially dependent and correlated is difficult to deal with.
Gaussian Noise
Because of its mathematical tractability in both the spatial and frequency domains, Gaussian (also called normal) noise models are used frequently in practice. In fact, this tractability is so convenient that it often results in Gaussian models being used in situations in which they are marginally applicable at best.

The PDF of a Gaussian random variable, z, is given by

p(z) = [1/(√(2π)σ)] e^(-(z-µ)²/(2σ²))

where z represents gray level, µ is the mean (average) value of z, and σ is its standard deviation. The standard deviation squared, σ², is called the variance of z. When z is described by the above equation, approximately 70% of its values will be in the range [(µ-σ), (µ+σ)], and about 95% will be in the range [(µ-2σ), (µ+2σ)].
Rayleigh Noise
The PDF of Rayleigh noise is given by

p(z) = (2/b)(z-a) e^(-(z-a)²/b)   for z ≥ a
p(z) = 0                          for z < a

The mean and variance of this density are given by

µ = a + √(πb/4)   and   σ² = b(4-π)/4

Note the displacement from the origin and the fact that the basic shape of this density is skewed to the right. The Rayleigh density can be quite useful for approximating skewed histograms.
Erlang (Gamma) Noise
The PDF of Erlang noise is given by

p(z) = [a^b z^(b-1) / (b-1)!] e^(-az)   for z ≥ 0
p(z) = 0                                for z < 0

where the parameters are such that a > 0, b is a positive integer, and ! indicates factorial. The mean and variance of this density are given by

µ = b/a   and   σ² = b/a²

Although the above equation is often referred to as the gamma density, strictly speaking this is correct only when the denominator is the gamma function, Γ(b). When the denominator is as shown, the density is more appropriately called the Erlang density.
Exponential Noise
The PDF of exponential noise is given by

p(z) = a e^(-az)   for z ≥ 0
p(z) = 0           for z < 0

where a > 0. The mean and variance of this density function are given by

µ = 1/a   and   σ² = 1/a²

The PDF of exponential noise is a special case of the Erlang PDF, with b = 1.
Uniform Noise
The PDF of uniform noise is given by

p(z) = 1/(b-a)   for a ≤ z ≤ b
p(z) = 0         otherwise

The mean and variance of this density function are given by

µ = (a+b)/2   and   σ² = (b-a)²/12
Impulse (Salt-and-Pepper) Noise
The PDF of (bipolar) impulse noise is given by

p(z) = Pa   for z = a
p(z) = Pb   for z = b
p(z) = 0    otherwise

If b > a, gray level b will appear as a light dot in the image and level a will appear as a dark dot. If either Pa or Pb is zero, the impulse noise is called unipolar. If neither probability is zero, and especially if they are approximately equal, impulse noise values will resemble salt-and-pepper granules randomly distributed over the image; bipolar impulse noise is also called shot noise or spike noise.
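The noise models above can be sampled with numpy's random generators. The parameter values below are arbitrary, and the Rayleigh draw maps numpy's one-parameter Rayleigh (scale σ, with σ² = b/2) onto the text's (a, b) form:

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (256, 256)

gaussian = rng.normal(loc=0.0, scale=10.0, size=shape)           # mu = 0, sigma = 10

a, b = 5.0, 50.0                                                 # Rayleigh parameters (text's form)
rayleigh = a + rng.rayleigh(scale=np.sqrt(b / 2.0), size=shape)  # mean = a + sqrt(pi*b/4)

erlang = rng.gamma(shape=3, scale=1 / 0.5, size=shape)           # b = 3, a = 0.5 -> mean b/a = 6
exponential = rng.exponential(scale=1 / 0.25, size=shape)        # a = 0.25 -> mean 1/a = 4
uniform = rng.uniform(low=-2.0, high=2.0, size=shape)            # mean (a+b)/2 = 0

# bipolar impulse (salt-and-pepper) noise: Pa = Pb = 0.05 on a mid-gray image
img = np.full(shape, 128.0)
r = rng.random(shape)
img[r < 0.05] = 0.0      # pepper (level a)
img[r > 0.95] = 255.0    # salt (level b)
```

Comparing the sample means against the formulas for µ given in each subsection is a quick consistency check on the parameterizations.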
DEGRADATION MODEL
The degradation process can be modeled as an operator or system H, which together with
an additive noise term η(x,y) operates on an input image f(x,y) to produce a degraded image
g(x,y). Image restoration may be viewed as the process of obtaining an approximation to f(x,y),
given g(x,y) and a knowledge of the degradation in the form of the operator H.
The input-output relation of this model (the image f(x,y) passes through H and is then summed with the noise η(x,y) to give g(x,y)) is expressed as
g(x,y) = H[f(x,y)]+ η(x,y)…(1)
For a moment, let us assume that η(x,y)=0, so that g(x,y) = H[f(x,y)]
The operator H is said to be linear if
H[k1f1(x,y) + k2f2(x,y)] = k1H[f1(x,y)] + k2H[f2(x,y)] … (2)
where k1 and k2 are constants and f1(x,y) and f2(x,y) are two input images.
If k1 = k2 = 1, then equation (2) becomes
H[f1(x,y) + f2(x,y)] = H[f1(x,y)] + H[f2(x,y)] … (3)
The above equation is called the property of additivity; this property simply says that, if H is a linear operator, the response to a sum of two inputs is equal to the sum of the two responses.
When f2(x,y) = 0, equation (2) becomes
H[k1f1(x,y)] = k1H[f1(x,y)] … (4)
The above equation is called the property of homogeneity. It says that the response to a constant multiple of any input is equal to the response to that input multiplied by the same constant. Thus a linear operator possesses both the property of additivity and the property of homogeneity.
An operator having the input-output relation g(x,y) = H[f(x,y)] is said to be position (or space)
invariant if
H[f(x-α,y-β)] = g(x-α,y-β)…(5)
This definition indicates that the response at any point in the image depends only on the value of
the input at that point and not on the position of the point.
Degradation model for Continuous case
f(x,y) can be expressed in impulse form as

f(x,y) = ∫∫ f(α,β) δ(x-α, y-β) dα dβ … (6)

where both integrals run from -∞ to ∞ (as they do throughout this section). Then, if η(x,y) = 0, substituting equation (6) in (1) gives

g(x,y) = H[f(x,y)] = H[ ∫∫ f(α,β) δ(x-α, y-β) dα dβ ] … (7)

If H is a linear operator, extending the additivity property to integrals changes the above equation to

g(x,y) = ∫∫ H[f(α,β) δ(x-α, y-β)] dα dβ … (8)

Since f(α,β) is independent of x and y, using the homogeneity property,

g(x,y) = ∫∫ f(α,β) H[δ(x-α, y-β)] dα dβ … (9)

The term H[δ(x-α, y-β)] is called the impulse response of H and is denoted

h(x, α, y, β) = H[δ(x-α, y-β)] … (10)

From equations (9) and (10) we can write

g(x,y) = ∫∫ f(α,β) h(x, α, y, β) dα dβ … (11)

The above equation is called the superposition (or Fredholm) integral of the first kind. It states that if the response of H to an impulse is known, the response to any input f(α,β) can be calculated by means of equation (11).

If H is position invariant, then from equation (5),

H[δ(x-α, y-β)] = h(x-α, y-β) … (12)

Now, from equations (10), (11), and (12),

g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ … (13)

which is nothing but the convolution integral. In the presence of additive noise, the expression describing the linear degradation model becomes

g(x,y) = ∫∫ f(α,β) h(x-α, y-β) dα dβ + η(x,y) … (14)
Many types of degradations can be approximated by linear, position invariant processes. The
advantage of this approach is that the extensive tools of linear system theory then become
available for the solution of image restoration problems.
Degradation model for Discrete case
Suppose that f(x) and h(x) are sampled uniformly to form arrays of dimensions A and B, respectively. In this case x is a discrete variable in the range 0, 1, 2, …, A-1 for f(x) and 0, 1, 2, …, B-1 for h(x).

The discrete convolution is based on the assumption that the sampled functions are periodic, with a period M. Overlap in the individual periods of the resulting convolution is avoided by choosing M ≥ A+B-1 and extending the functions with zeroes so that their length is equal to M.

Let fe(x) and he(x) represent the extended functions. Their convolution is given by

ge(x) = Σ_{m=0}^{M-1} fe(m) he(x-m) … (1)

for x = 0, 1, 2, …, M-1. As both fe(x) and he(x) are assumed to have period equal to M, ge(x) also has the same period.
The above equation can be represented in matrix form as
g = Hf …(2)
where f and g are M-dimensional column vectors:

f = [fe(0)  fe(1)  …  fe(M-1)]ᵀ … (3)

g = [ge(0)  ge(1)  …  ge(M-1)]ᵀ … (4)
and H is an MxM matrix
      | he(0)     he(-1)    he(-2)    …  he(-M+1) |
      | he(1)     he(0)     he(-1)    …  he(-M+2) |
H =   | he(2)     he(1)     he(0)     …  he(-M+3) |
      |   ⋮          ⋮          ⋮               ⋮     |
      | he(M-1)   he(M-2)   he(M-3)   …  he(0)    |
Because of the periodicity assumption on he(x), it follows that he(x) = he (M+x). Using this
property the above matrix can be changed as
      | he(0)     he(M-1)   he(M-2)   …  he(1) |
      | he(1)     he(0)     he(M-1)   …  he(2) |
H =   | he(2)     he(1)     he(0)     …  he(3) |
      |   ⋮          ⋮          ⋮             ⋮   |
      | he(M-1)   he(M-2)   he(M-3)   …  he(0) |
In the above matrix, the rows are related by a circular shift to the right; that is the right-most
element in one row is equal to the left-most element in the row immediately below. The shift is
called circular because an element shifted off the right end of row reappears at the left end of the
next row. Moreover, the circularity of H is complete in the sense that it extends from the last row back to the first row. A square matrix in which each row is a circular shift of the preceding row, and the first row is a circular shift of the last row, is called a circulant matrix.
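The circulant structure can be verified numerically: building H from a zero-extended he(x) with M ≥ A+B-1 makes the matrix product Hf reproduce ordinary (linear) convolution. A small illustrative sketch with made-up sequences:

```python
import numpy as np

def circulant(h):
    # M x M circulant matrix whose (i, j) entry is he((i - j) mod M)
    M = len(h)
    return np.array([[h[(i - j) % M] for j in range(M)] for i in range(M)])

f = np.array([1.0, 2.0, 3.0])        # A = 3 samples of f(x)
h = np.array([1.0, -1.0])            # B = 2 samples of h(x)
M = len(f) + len(h) - 1              # M >= A + B - 1 avoids wraparound overlap
fe = np.pad(f, (0, M - len(f)))      # zero-extended functions
he = np.pad(h, (0, M - len(h)))

H = circulant(he)
g = H @ fe                           # g = Hf

# the same result from the convolution sum ge(x) = sum_m fe(m) he(x - m)
g_sum = np.array([sum(fe[m] * he[(x - m) % M] for m in range(M)) for x in range(M)])
```

Because the zero padding prevents the periods from overlapping, the circular result also equals the ordinary linear convolution of f and h.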
Extension of the discussion to a 2D, discrete degradation model is straightforward. For two
digitized images f(x,y) and h(x,y) of sizes AxB and CxD respectively, extended sizes of MxN
may be formed by padding the above functions with zeroes. That is
fe(x,y) = f(x,y) 0 ≤ x ≤ A-1 and 0 ≤ y ≤ B-1
= 0 A ≤ x ≤ M-1 or B ≤ y ≤ N-1
and
he(x,y) = h(x,y) 0 ≤ x ≤ C-1 and 0 ≤ y ≤ D-1
= 0 C ≤ x ≤ M-1 or D ≤ y ≤ N-1
Treating the extended functions fe(x,y) and he(x,y) as periodic in two dimensions, with periods M and N in the x and y directions, respectively, their 2-D convolution is

ge(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} fe(m,n) he(x-m, y-n)

for x = 0, 1, 2, …, M-1 and y = 0, 1, 2, …, N-1.

The convolution function ge(x,y) is periodic with the same period as fe(x,y) and he(x,y). Overlap of the individual convolution periods is avoided by choosing M ≥ A+C-1 and N ≥ B+D-1.
Now, the complete discrete degradation model can be given by adding an MxN extended discrete
noise term ηe(x,y) to the above equation
ge(x,y) = Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} fe(m,n) he(x-m, y-n) + ηe(x,y)

for x = 0, 1, 2, …, M-1 and y = 0, 1, 2, …, N-1.

The above equation can be represented in matrix form as

g = Hf + n

where f, g, and n are MN-dimensional column vectors formed by stacking the rows of the M×N functions fe(x,y), ge(x,y), and ηe(x,y). The first N elements of f, for example, are the elements in the first row of fe(x,y), the next N elements are from the second row, and so on for all M rows of fe(x,y). So, f, g, and n are of dimension MN×1 and H is of dimension MN×MN. This matrix consists of M² partitions, each partition being of size N×N and ordered according to
      | H0     HM-1   HM-2   …  H1 |
      | H1     H0     HM-1   …  H2 |
H =   | H2     H1     H0     …  H3 |
      |  ⋮       ⋮       ⋮           ⋮  |
      | HM-1   HM-2   HM-3   …  H0 |
Each partition Hj is constructed from the jth row of the extended function he(x,y) as follows
       | he(j,0)     he(j,N-1)   he(j,N-2)   …  he(j,1) |
       | he(j,1)     he(j,0)     he(j,N-1)   …  he(j,2) |
Hj =   | he(j,2)     he(j,1)     he(j,0)     …  he(j,3) |
       |    ⋮            ⋮            ⋮                ⋮    |
       | he(j,N-1)   he(j,N-2)   he(j,N-3)   …  he(j,0) |
Here, Hj is a circulant matrix, and the blocks of H are subscripted in a circular manner. For these
reasons, the matrix H is called a Block-Circulant Matrix.
ALGEBRAIC APPROACH TO RESTORATION
The objective of image restoration is to estimate an original image f from a degraded image g
and some knowledge or assumption about H and n. Central to the algebraic approach is the
concept of seeking an estimate of f, denoted f̂ , that minimizes a predefined criterion of
performance. Because of its simplicity, the least squares method is used here.
Unconstrained Restoration
From g=Hf+n, the noise term in the degradation model is
n=g-Hf … (1)
In the absence of any knowledge of n, a meaningful criterion is to seek an f̂ such that Hf̂ approximates g in a least squares sense, i.e. such that the norm of the noise term is as small as possible. In other words, we want to find an f̂ such that

‖n‖² = ‖g - Hf̂‖² … (2)

is minimum, where

‖n‖² = nᵀn   and   ‖g - Hf̂‖² = (g - Hf̂)ᵀ(g - Hf̂)

are the squared norms of n and (g - Hf̂), respectively.

Equation (2) allows the equivalent view of this problem as one of minimizing the criterion function

J(f̂) = ‖g - Hf̂‖² … (3)

with respect to f̂. Aside from the requirement that it should minimize equation (3), f̂ is not constrained in any other way.

To find the f̂ that minimizes J(f̂), we simply differentiate J with respect to f̂ and set the result equal to the zero vector:

∂J(f̂)/∂f̂ = 0 = -2Hᵀ(g - Hf̂)

Solving the above equation for f̂,

-2Hᵀg + 2HᵀHf̂ = 0
=> HᵀHf̂ = Hᵀg
=> f̂ = (HᵀH)⁻¹Hᵀg

Letting M = N so that H is a square matrix and assuming that H⁻¹ exists, the above equation reduces to

f̂ = H⁻¹(Hᵀ)⁻¹Hᵀg
=> f̂ = H⁻¹g
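With no noise (n = 0) and an invertible H, the pseudo-inverse estimate and the direct inverse agree and recover f exactly; a small numerical check with an arbitrary, made-up H:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 8
f = rng.random(M)                          # "original image", stacked as a vector
H = np.eye(M) + 0.1 * rng.random((M, M))   # made-up, diagonally dominant (hence invertible) H
g = H @ f                                  # noise-free degradation: n = 0

f_hat = np.linalg.inv(H.T @ H) @ H.T @ g   # f_hat = (H^T H)^-1 H^T g
f_inv = np.linalg.inv(H) @ g               # f_hat = H^-1 g (square, invertible H)
```

In practice one would not invert H explicitly (a linear solve is preferred numerically), but the explicit inverses mirror the derivation above.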
Constrained Restoration
In this section, we consider the least squares restoration problem as one of minimizing functions of the form ‖Qf̂‖², where Q is a linear operator on f̂, subject to the constraint

‖g - Hf̂‖² = ‖n‖²

This approach introduces considerable flexibility in the restoration process because it yields different solutions for different choices of Q.

The addition of an equality constraint to the minimization problem can be handled without difficulty by the method of Lagrange multipliers. The procedure calls for expressing the constraint in the form α(‖g - Hf̂‖² - ‖n‖²) and appending it to the function ‖Qf̂‖². In other words, we seek an f̂ that minimizes the criterion function

J(f̂) = ‖Qf̂‖² + α(‖g - Hf̂‖² - ‖n‖²)

where α is a constant called the Lagrange multiplier. After the constraint has been appended, minimization is carried out in the usual way.

Differentiating the above equation with respect to f̂ and setting the result equal to the zero vector yields

∂J(f̂)/∂f̂ = 0 = 2QᵀQf̂ - 2αHᵀ(g - Hf̂)

Now, solving for f̂,

f̂ = (HᵀH + γQᵀQ)⁻¹Hᵀg

where γ = 1/α. The quantity γ must be adjusted so that the constraint is satisfied.
INVERSE FILTERING
The simplest approach to restoration is direct inverse filtering, where we compute an estimate F̂(u,v) of the transform of the original image simply by dividing the transform of the degraded image, G(u,v), by the degradation function:

F̂(u,v) = G(u,v) / H(u,v)

But we know that G(u,v) = F(u,v)H(u,v) + N(u,v). Substituting this in the above equation gives

F̂(u,v) = F(u,v) + N(u,v) / H(u,v)
The image restoration approach in the above equations is commonly referred to as the inverse filtering method. This terminology arises from considering H(u,v) as a filter function that multiplies F(u,v) to produce the transform of the degraded image g(x,y).

The second equation tells us that even if we know the degradation function, we cannot recover the undegraded image exactly, because N(u,v) is a random function whose Fourier transform is not known.

If the degradation function has zero or very small values, the ratio N(u,v)/H(u,v) can easily dominate the estimate F̂(u,v). One approach to get around this zero or small-value problem is to limit the filter frequencies to values near the origin; by limiting the analysis to frequencies near the origin, we reduce the probability of encountering zero values.
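The dominance of N(u,v)/H(u,v) where H is small, and the benefit of keeping only frequencies near the origin, can be seen in a small synthetic experiment (the Gaussian-shaped H, the noise level, and the cutoff radius are all arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 32, 32
f = rng.random((M, N))

# Gaussian-shaped degradation H(u,v): near zero at high frequencies (illustrative)
u = np.arange(M)[:, None] - M / 2
v = np.arange(N)[None, :] - N / 2
H = np.fft.ifftshift(np.exp(-(u**2 + v**2) / (2 * 4.0**2)))   # origin moved to (0,0)
G = H * np.fft.fft2(f) + np.fft.fft2(0.001 * rng.standard_normal((M, N)))

naive = G / H                              # F + N/H: N/H blows up where H is tiny
D = np.fft.ifftshift(np.sqrt(u**2 + v**2))
limited = np.where(D < 8, G / H, 0.0)      # keep only frequencies near the origin
f_naive = np.real(np.fft.ifft2(naive))
f_limited = np.real(np.fft.ifft2(limited))

err_naive = np.abs(f_naive - f).max()
err_limited = np.abs(f_limited - f).max()
```

Even a tiny amount of noise makes the unrestricted division useless, while the frequency-limited estimate stays close to f at the cost of discarding high-frequency detail.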
LEAST MEAN SQUARE FILTER/
MINIMUM MEAN SQUARE ERROR (WIENER) FILTERING
Inverse filtering makes no explicit provision for handling noise. The Wiener filtering method incorporates both the degradation function and the statistical characteristics of noise, treating images and noise as random processes; the objective is to find an estimate f̂ of the uncorrupted image f such that the mean square error between them is minimized. This error measure is given by

e² = E{(f - f̂)²} … (1)

where E{·} is the expected value of the argument. It is assumed that the noise and the image are uncorrelated; that one or the other has zero mean; and that the gray levels in the estimate are a linear function of the levels in the degraded image. Based on these conditions, the minimum of the error function in the above equation is given in the frequency domain by the expression

F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / (|H(u,v)|² + Sη(u,v)/Sf(u,v)) ] G(u,v) … (2)

The terms in the above equation are as follows:
H(u,v) = degradation function
H*(u,v) = complex conjugate of H(u,v)
|H(u,v)|² = H*(u,v)H(u,v)
Sη(u,v) = |N(u,v)|² = power spectrum of the noise
Sf(u,v) = |F(u,v)|² = power spectrum of the undegraded image
The result in equation (2) is known as the Wiener filter. It is also referred to as the minimum mean square error filter or the least square error filter. It does not have the same problem as the inverse filter with zeroes in the degradation function, unless both H(u,v) and Sη(u,v) are zero for the same values of u and v.
If the noise is zero, then the noise power spectrum vanishes and the Wiener filter reduces to the inverse filter. When we are dealing with spectrally white noise, the spectrum |N(u,v)|² is a constant, which simplifies things considerably. However, the power spectrum of the undegraded image is seldom known, so the ratio Sη/Sf is often approximated by a constant, and the above equation can be written as

F̂(u,v) = [ (1/H(u,v)) · |H(u,v)|² / (|H(u,v)|² + K) ] G(u,v)

where K is a specified constant.

In general, the Wiener filter works better than inverse filtering in the presence of both noise and a degradation function.
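The constant-K form is easy to state in code. The toy degradation below is noise-free precisely so that the K = 0 case can be checked against the inverse filter (all parameters are illustrative):

```python
import numpy as np

def wiener(G, H, K):
    # F_hat(u,v) = [ (1/H) * |H|^2 / (|H|^2 + K) ] G(u,v)
    H2 = np.abs(H) ** 2
    return (H2 / (H * (H2 + K))) * G

rng = np.random.default_rng(3)
M = 32
f = rng.random((M, M))
u = np.arange(M)[:, None] - M / 2
v = np.arange(M)[None, :] - M / 2
H = np.fft.ifftshift(np.exp(-(u**2 + v**2) / (2 * 6.0**2)))   # toy degradation, H > 0
G = H * np.fft.fft2(f)                                        # noise-free: N(u,v) = 0

f_hat = np.real(np.fft.ifft2(wiener(G, H, K=0.0)))            # K = 0 -> inverse filter
```

For K > 0 the factor |H|²/(|H|² + K) is strictly less than 1, so the filter attenuates exactly those frequencies where |H| is small and amplified noise would otherwise dominate.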
CONSTRAINED LEAST SQUARES FILTERING
The difficulty with Wiener filtering is that the power spectra of the undegraded image and of the noise must be known. The constrained least squares filtering method requires knowledge of only the mean and variance of the noise. These parameters can be calculated from a given degraded image, so this is an important advantage. Another difference is that the Wiener filter is based on minimizing a statistical criterion and, as such, is optimal only in an average sense, whereas this method has the notable feature that it yields an optimal result for each image to which it is applied.
The degraded image can be represented in matrix form as

g = Hf + η … (1)

The problem here is that H is highly sensitive to noise. One way to alleviate the noise sensitivity problem is to base optimality of restoration on a measure of smoothness, such as the second derivative of an image. To be meaningful, the restoration must be constrained by the parameters of the problem. Thus, what is desired is to find the minimum of a criterion function C, defined as

C = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [∇²f(x,y)]² … (2)

subject to the constraint

‖g - Hf̂‖² = ‖η‖² … (3)

where ‖·‖ is the vector norm and f̂ is the estimate of the undegraded image.

The frequency domain solution to this optimization problem is given by the expression

F̂(u,v) = [ H*(u,v) / (|H(u,v)|² + γ|P(u,v)|²) ] G(u,v)

where γ is a parameter that must be adjusted so that the constraint in equation (3) is satisfied, and P(u,v) is the Fourier transform of the function

p(x,y) = |  0  -1   0 |
         | -1   4  -1 |
         |  0  -1   0 |

which we recognize as the Laplacian operator.
By comparing the constrained least squares and Wiener results, it is noted that the former
yielded slightly better results for the high and medium noise cases. It is not unexpected that the
constrained least squares filter would outperform the Wiener filter when selecting the parameters
manually for better visual results. The parameter γ is a scalar, while the value of K in Wiener filtering is an approximation to the ratio of two unknown frequency domain functions, whose ratio seldom is constant. Thus, it stands to reason that a result based on manually selecting γ would be a more accurate estimate of the undegraded image. The differences between Wiener filtering and the constrained least squares restoration method are:
1. The Wiener filter is designed to optimize the restoration in an average statistical sense over a
large ensemble of similar images. The constrained matrix inversion deals with one image only
and imposes constraints on the solution sought.
2. The Wiener filter is based on the assumption that the random fields involved are homogeneous
with known spectral densities. In the constrained matrix inversion it is assumed that we know
only some statistical property of the noise.
In the constrained matrix restoration approach, various filters may be constructed from the same formulation by simply changing the smoothing criterion.
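A frequency-domain sketch of the constrained least squares filter; H here is an arbitrary synthetic (strictly nonzero) degradation function, and γ = 0 is used only to sanity-check the formula against inverse filtering:

```python
import numpy as np

def cls_filter(G, H, P, gamma):
    # F_hat(u,v) = [ H*(u,v) / (|H(u,v)|^2 + gamma |P(u,v)|^2) ] G(u,v)
    return (np.conj(H) / (np.abs(H) ** 2 + gamma * np.abs(P) ** 2)) * G

rng = np.random.default_rng(4)
M = 16
f = rng.random((M, M))
H = 0.5 + 0.5 * rng.random((M, M))   # synthetic, strictly nonzero degradation (assumed)
G = H * np.fft.fft2(f)               # noise-free degraded spectrum

# P(u,v): DFT of the 3x3 Laplacian mask p(x,y), zero-padded to M x M
p = np.zeros((M, M))
p[:3, :3] = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]
P = np.fft.fft2(p)

f0 = np.real(np.fft.ifft2(cls_filter(G, H, P, gamma=0.0)))   # gamma = 0: inverse filter
```

For γ > 0 the |P(u,v)|² term penalizes frequencies where the Laplacian response is large, which is how the smoothness criterion enters the solution.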
RESTORATION IN THE PRESENCE OF NOISE ONLY - SPATIAL FILTERING
We know that the general equations for the degradation process in the spatial and frequency domains are given by

g(x,y) = h(x,y) * f(x,y) + η(x,y)
G(u,v) = H(u,v)F(u,v) + N(u,v)

When the only degradation present in an image is noise, the above equations become

g(x,y) = f(x,y) + η(x,y)
G(u,v) = F(u,v) + N(u,v)

The noise terms are unknown, so subtracting them from g(x,y) or G(u,v) is not a realistic option. Spatial filtering is the method of choice in situations where only additive noise is present.
MEAN FILTERS
Arithmetic Mean Filter
This is the simplest of the mean filters. Let Sxy represent the set of coordinates in a rectangular subimage window of size m×n, centered at point (x,y). The arithmetic mean filtering process computes the average value of the corrupted image g(x,y) in the area defined by Sxy. The value of the restored image f̂ at any point (x,y) is simply the arithmetic mean computed using the pixels in the region defined by Sxy:

f̂(x,y) = (1/mn) Σ_{(s,t)∈Sxy} g(s,t)

This operation can be implemented using a convolution mask in which all coefficients have value 1/mn. A mean filter simply smoothes local variations in an image, and noise is reduced as a result of blurring.
Geometric Mean Filter
An image restored using a geometric mean filter is given by the expression

f̂(x,y) = [ Π_{(s,t)∈Sxy} g(s,t) ]^(1/mn)

Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it tends to lose less image detail in the process.
Harmonic Mean Filter
The harmonic mean filtering operation is given by the expression

f̂(x,y) = mn / Σ_{(s,t)∈Sxy} [1/g(s,t)]

The harmonic mean filter works well for salt noise, but fails for pepper noise. It also does well with other types of noise, like Gaussian noise.
Contraharmonic Mean Filter
The contraharmonic mean filtering operation yields a restored image based on the expression

f̂(x,y) = Σ_{(s,t)∈Sxy} g(s,t)^(Q+1) / Σ_{(s,t)∈Sxy} g(s,t)^Q

where Q is called the order of the filter. This filter is well suited for reducing or virtually eliminating the effects of salt-and-pepper noise.
For positive values of Q, the filter eliminates pepper noise.
For negative values of Q, the filter eliminates salt noise.
For Q = 0, this filter reduces to the arithmetic mean filter.
For Q = −1, this filter reduces to the harmonic mean filter.
In general, the arithmetic mean and geometric mean filters are well suited for random
noise such as Gaussian or uniform noise. The contraharmonic filter is well suited for impulse noise,
but it has the disadvantage that it must be known whether the noise is dark or light in order to
select the proper sign for Q. Choosing the wrong sign for Q can have disastrous results.
ORDER-STATISTICS FILTERS
Order-statistics filters are spatial filters whose response is based on ordering the pixels
contained in the image area encompassed by the filter. The response of the filter at any point is
determined by the ranking result.
Median Filter
It replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel:
f̂(x,y) = median(s,t)∈Sxy { g(s,t) }
For certain types of noise, median filters provide excellent noise-reduction capabilities, with
considerably less blurring than linear smoothing filters of similar size. Median filters are
particularly effective in the presence of both bipolar and unipolar impulse noise.
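A minimal median-filter sketch (NumPy; the edge padding is an illustrative choice):

```python
import numpy as np

def median_filter(g, m=3, n=3):
    """Replace each pixel by the median of its m x n neighborhood."""
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = np.median(padded[x:x + m, y:y + n])
    return out
```

An isolated salt pixel (255) and an isolated pepper pixel (0) in a flat region are both replaced by the surrounding value, with no blurring of the rest of the region.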
Max and Min Filters
The median filter represents the 50th percentile of a ranked set of numbers. The 100th
percentile result is represented by the Max filter, given by
f̂(x,y) = max(s,t)∈Sxy { g(s,t) }
The Max filter is useful for finding the brightest points in an image. It can be used to reduce
pepper noise. However, it removes (sets to a lighter gray level) some dark pixels from the
borders of dark objects.
The 0th percentile result is represented by the Min filter, given by
f̂(x,y) = min(s,t)∈Sxy { g(s,t) }
The Min filter is useful for finding the darkest points in an image. It can be used to reduce salt
noise. However, it removes white points around the borders of light objects.
Midpoint Filter
The midpoint filter simply computes the midpoint between the maximum and minimum
values in the area encompassed by the filter:
f̂(x,y) = (1/2) [ max(s,t)∈Sxy { g(s,t) } + min(s,t)∈Sxy { g(s,t) } ]
This filter combines order statistics and averaging. It works best for randomly distributed
noise such as Gaussian noise.
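The Max, Min, and midpoint filters above can be sketched with one helper (NumPy; the `kind` argument and the padding are illustrative choices, not from the text):

```python
import numpy as np

def order_stat_filter(g, kind, m=3, n=3):
    """Max, Min, or midpoint filter over an m x n window."""
    ops = {'max': np.max,
           'min': np.min,
           'midpoint': lambda w: 0.5 * (w.max() + w.min())}
    op = ops[kind]
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            out[x, y] = op(padded[x:x + m, y:y + n])
    return out
```

Note how the max filter removes a pepper pixel, the min filter removes a salt pixel, and the midpoint of a window containing both 0 and 100 is 50.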
Alpha-Trimmed mean Filter
Suppose that we delete the d/2 lowest and the d/2 highest gray-level values of g(s,t) in the
neighborhood Sxy. Let gr(s,t) represent the remaining mn−d pixels. A filter formed by averaging
these remaining pixels is called the alpha-trimmed mean filter:
f̂(x,y) = [1/(mn−d)] Σ(s,t)∈Sxy gr(s,t)
where the value of d can range from 0 to mn−1.
When d = 0, this filter reduces to the arithmetic mean filter.
When d = mn−1, this filter becomes a median filter.
For other values of d, the alpha-trimmed filter is useful in situations involving multiple types of
noise, such as a combination of salt-and-pepper and Gaussian noise.
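A sketch of the alpha-trimmed mean (NumPy; assumes d is even, as the d/2 trimming implies):

```python
import numpy as np

def alpha_trimmed_mean(g, d, m=3, n=3):
    """Average each m x n window after deleting its d/2 lowest and d/2
    highest values. d = 0 gives the arithmetic mean; d = mn - 1 leaves
    only the middle value, i.e. the median."""
    half = d // 2
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = np.sort(padded[x:x + m, y:y + n].ravel())
            out[x, y] = w[half:len(w) - half].mean()
    return out
```

With one salt outlier and one pepper outlier in the same 3x3 window, d = 4 trims both extremes and returns the clean background value, which the plain arithmetic mean (d = 0) would not.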
ADAPTIVE FILTERS
Once selected, the mean filters and order-statistics filters are applied to an image without
regard for how image characteristics vary from one point to another. Adaptive filters are filters
whose behavior changes based on the statistical characteristics of the image inside the filter region
defined by the m x n rectangular window Sxy. Adaptive filters are capable of performance superior
to that of the other filters, but at the cost of increased filter complexity.
Adaptive Local Noise Reduction Filter
The simplest statistical measures of a random variable are its mean and variance. These
are reasonable parameters on which to base an adaptive filter because they are quantities closely
related to the appearance of an image. The mean gives a measure of average gray level in the
region over which the mean is computed, and the variance gives a measure of average contrast in
that region.
Our filter is to operate in a local region Sxy. The response of the filter at any point (x,y) on which
the region is centered is to be based on four quantities :
i) g(x,y), the value of the noisy image at (x,y);
ii) ση², the variance of the noise corrupting f(x,y) to form g(x,y);
iii) mL, the local mean of the pixels in Sxy; and
iv) σL², the local variance of the pixels in Sxy.
The behavior of the filter is to be as follows:
1. If ση² is zero, the filter should return simply the value of g(x,y). This is the trivial, zero-noise case.
2. If the local variance σL² is high relative to ση², the filter should return a value close to g(x,y). A high local variance is typically associated with edges, which should be preserved.
3. If the two variances are equal, the filter returns the arithmetic mean of the pixels in Sxy. This occurs when the local area has the same properties as the overall image, and local noise is reduced by averaging.
An adaptive expression for obtaining f̂(x,y) based on the above assumptions may be written as
f̂(x,y) = g(x,y) − (ση²/σL²) [ g(x,y) − mL ]
The only quantity that needs to be known or estimated is the variance of the overall noise, ση².
The other parameters are computed from the pixels in Sxy at each location (x,y) on which the
filter window is centered. An implicit assumption in the above expression is that ση² ≤ σL², because
the noise in our model is additive and position independent.
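A sketch of this adaptive filter (NumPy; clipping the variance ratio at 1 is a common way to enforce the assumption ση² ≤ σL² when the noise-variance estimate is too large):

```python
import numpy as np

def adaptive_local_noise_filter(g, noise_var, m=7, n=7):
    """f_hat = g - (noise_var / local_var) * (g - local_mean), with the
    ratio clipped to 1 so the correction never overshoots the local mean."""
    g = g.astype(float)
    padded = np.pad(g, ((m // 2, m // 2), (n // 2, n // 2)), mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            w = padded[x:x + m, y:y + n]
            mL, vL = w.mean(), w.var()
            ratio = 1.0 if vL == 0 else min(noise_var / vL, 1.0)
            out[x, y] = g[x, y] - ratio * (g[x, y] - mL)
    return out
```

With noise_var = 0 the image is returned unchanged (case 1 above); in a perfectly flat region every pixel becomes the local mean (case 3).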
Adaptive Median Filter
The median filter performs well as long as the spatial density of the impulse noise is not
large. Adaptive median filtering can handle impulse noise even with large probabilities. An
additional advantage of the adaptive median filter is that it seeks to preserve detail while
smoothing non-impulse noise, something that the traditional median filter does not do. The
adaptive filter also works in a rectangular window area Sxy. Unlike the other filters, the adaptive
median filter changes (increases) the size of Sxy during filter operation, depending on certain
conditions.
Consider the following notation:
zmin = minimum gray level in Sxy
zmax = maximum gray level in Sxy
zmed = median of the gray levels in Sxy
zxy = gray level at coordinates (x,y)
Smax = maximum allowed size of Sxy
The adaptive median filtering algorithm works in two levels, denoted level A and level B, as
follows:
Level A: A1 = zmed − zmin and A2 = zmed − zmax.
If A1 > 0 AND A2 < 0, go to level B;
else increase the window size. If the window size ≤ Smax, repeat level A; else output zxy.
Level B: B1 = zxy − zmin and B2 = zxy − zmax.
If B1 > 0 AND B2 < 0, output zxy; else output zmed.
Adaptive median filtering has three main purposes:
1. To remove salt-and-pepper (impulse) noise,
2. To provide smoothing of other noise that may not be impulsive, and
3. To reduce distortion, such as excessive thinning or thickening of object boundaries.
Every time the algorithm outputs a value, the window Sxy is moved to the next location in the
image. The algorithm is then reinitialized and applied to the pixels in the new location.
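A sketch of levels A and B in code (NumPy; growing the window from 3x3 and using edge padding are implementation choices, not from the text):

```python
import numpy as np

def adaptive_median_filter(g, s_max=7):
    """Adaptive median filter: the window grows from 3x3 up to s_max x s_max."""
    g = g.astype(float)
    pad = s_max // 2
    padded = np.pad(g, pad, mode='edge')
    out = np.zeros(g.shape)
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            size = 3
            cx, cy = x + pad, y + pad          # center in the padded image
            zxy = padded[cx, cy]
            while True:
                h = size // 2
                w = padded[cx - h:cx + h + 1, cy - h:cy + h + 1]
                zmin, zmax, zmed = w.min(), w.max(), np.median(w)
                if zmin < zmed < zmax:         # level A: median is not an impulse
                    # level B: keep zxy unless it is itself an impulse
                    out[x, y] = zxy if zmin < zxy < zmax else zmed
                    break
                size += 2
                if size > s_max:               # window limit reached
                    out[x, y] = zxy
                    break
    return out
```

On a gradient background, a salt impulse is replaced by the local median while a normal pixel passes through unchanged, which is exactly the detail-preserving behavior described above.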
PERIODIC NOISE REDUCTION BY FREQUENCY DOMAIN FILTERING
Periodic Noise
Periodic noise in an image typically arises from electrical or electromechanical
interference during image acquisition. It is the only type of spatially dependent noise considered here.
Periodic noise can be reduced significantly with frequency domain filtering.
Band Reject Filters
Band Pass Filters
Notch Filters
Optimum Notch Filtering/ Interactive Restoration
Clearly defined interference patterns are not common. Images derived from electro-optical
scanners, such as those used in space and aerial imaging, are sometimes corrupted by coupling
and amplification of low-level signals in the scanners' electronic circuitry. The resulting images
tend to contain pronounced 2-D periodic structures superimposed on the scene data, with more
complex patterns.
When several interference components are present, methods such as band-reject and band-pass
filtering are not always acceptable because they may remove too much image information in the
filtering process. The method discussed here is optimum, in the sense that it minimizes local
variances of the restored image f̂(x,y).
The procedure consists of first isolating the principal contributions of the interference
pattern and then subtracting a variable, weighted portion of the pattern from the corrupted image.
QUESTION AND ANSWERS
1. What is image restoration?
Image restoration is the improvement of an image using objective criteria and prior knowledge
as to what the image should look like.
2. What is the difference between image enhancement and image restoration?
In image enhancement we try to improve the image using subjective criteria, while in image
restoration we try to reverse specific damage suffered by the image, using objective
criteria.
3. Why may an image require restoration?
An image may be degraded because the grey values of individual pixels may be altered, or it may
be distorted because the position of individual pixels may be shifted away from their correct
position. The second case is the subject of geometric restoration.
Geometric restoration is also called image registration because it helps in finding corresponding
points between two images of the same region taken from different viewing angles. Image
registration is very important in remote sensing when aerial photographs have to be registered
against the map, or two aerial photographs of the same region have to be registered with each
other.
4. What is the problem of image restoration?
The problem of image restoration is: given the degraded image g, recover the original
undegraded image f .
5. How can the problem of image restoration be solved?
The problem of image restoration can be solved if we have prior knowledge of the point spread
function or its Fourier transform (the transfer function) of the degradation process.
6. The white bars in the test pattern shown in figure are 7 pixels wide and 210 pixels high.
The separation between bars is 17 pixels. What would this image look like after application
of different filters of different sizes?
Solution:
The matrix representation of a portion of the given image at any end of a vertical bar is
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
a) A 3x3 Min Filter:
b) A 5x5 Min Filter:
c) A 7x7 Min Filter:
d) A 9x9 Min Filter:
Explanation:
The 0th percentile result is represented by the Min filter, given by
f̂(x,y) = min(s,t)∈Sxy { g(s,t) }
The Min filter is useful for finding the darkest points in an image. It can be used to reduce salt
noise, but it removes white points around the borders of light objects. For the given image, the
effect of the Min filter is a decrease in the width and height of the white vertical bars. As the
filter size increases, the width and height of the bars decrease.
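The shrinking can be checked numerically on a single row of the test pattern; this 1-D sketch (illustrative, not from the text) reproduces the bar widths claimed in (a)-(d):

```python
import numpy as np

def min_filter_1d(row, k):
    """1-D min filter of width k; enough to measure bar width along one row."""
    h = k // 2
    padded = np.pad(row, h, mode='edge')
    return np.array([padded[i:i + k].min() for i in range(len(row))])

# One image row through a 7-pixel-wide white bar on a black background.
row = np.zeros(17)
row[5:12] = 255
# Remaining white width for each filter size: 7 - (k - 1), floored at 0.
widths = {k: int((min_filter_1d(row, k) == 255).sum()) for k in (3, 5, 7, 9)}
```

The computed widths are 5, 3, 1, and 0 pixels for the 3x3, 5x5, 7x7, and 9x9 filters, matching the matrices shown below.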
[Figure: filtered images for cases (a), (c) and (d)]
a) The resulting image consists of vertical bars 5 pixels wide and 208 pixels high. There will
be no deformation of the corners. The matrix after the application of 3x3 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
b) The resulting image consists of vertical bars 3 pixels wide and 206 pixels high. There will
be no deformation of the corners. The matrix after the application of 5x5 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
c) The resulting image consists of vertical bars 1 pixel wide and 204 pixels high. There will
be no deformation of the corners. The matrix after the application of 7x7 Min filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
d) The white bars completely disappear from the image, since the 9x9 window is wider than the
7-pixel bars (the resulting bar width would be 0 pixels). The matrix after the application of the
9x9 Min filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
e) A 3x3 Max Filter:
f) A 5x5 Max Filter:
g) A 7x7 Max Filter:
h) A 9x9 Max Filter:
Explanation
The Max filter is useful for finding the brightest points in an image. It can be used to reduce
pepper noise. However, it removes (sets to a lighter gray level) some dark pixels from the
borders of dark objects. For the given image, the effect of the Max filter is an increase in the
width and height of the white vertical bars. As the filter size increases, the width and height
of the bars also increase.
[Figure: filtered images for cases (e), (f) and (g)]
e) The resulting image consists of vertical bars 9 pixels wide and 212 pixels high. There will
be no deformation of the corners. The matrix after the application of 3x3 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
0 0 0 0 255 255 255 255 255 255 255 255 255 0 0 0 0
f) The resulting image consists of vertical bars 11 pixels wide and 214 pixels high. There
will be no deformation of the corners. The matrix after the application of 5x5 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
0 0 0 255 255 255 255 255 255 255 255 255 255 255 0 0 0
g) The resulting image consists of vertical bars 13 pixels wide and 216 pixels high. There
will be no deformation of the corners. The matrix after the application of 7x7 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
0 0 255 255 255 255 255 255 255 255 255 255 255 255 255 0 0
h) The resulting image consists of vertical bars 15 pixels wide and 218 pixels high. There
will be no deformation of the corners. The matrix after the application of 9x9 Max filter is shown
below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
0 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
i) A 3x3 Arithmetic Mean Filter:
j) A 5x5 Arithmetic Mean Filter:
k) A 7x7 Arithmetic Mean Filter:
l) A 9x9 Arithmetic Mean Filter:
[Figure: filtered images for cases (i), (j) and (k)]
Explanation:
The arithmetic mean filter causes blurring, and the blurring increases with the size of the mask.
i) Since each vertical bar is 7 pixels wide, a 3x3 arithmetic mean filter slightly distorts the edges
of the bars; as a result, the edges become a bit darker. There will be some deformation at the
corners of the bars: they become rounded.
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 28 113 170 170 170 170 113 28 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
0 0 0 0 85 170 255 255 255 255 255 85 0 0 0 0 0
j) As the size of the mask increases, the vertical bars distort more and blurring increases. Since
the mask here is 5x5, after the application of the filter only the 3 centre columns of each vertical
bar remain white. As we move from the centre of a bar toward either edge, the pixels become
darker. There will be some deformation at the corners of the bars: they become rounded.
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 255 255 255 255 255 255 255 0 0 0 0 0
0 0 0 0 0 122 163 163 163 163 163 122 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
0 0 0 0 0 191 204 255 255 255 204 191 0 0 0 0 0
k) As the size of the mask increases, the vertical bars distort more and blurring increases. Since
the mask here is 7x7, after the application of the filter only the centre column of each vertical
bar remains white. As we move from the centre of a bar toward either edge, the pixels become
darker. There will be some deformation at the corners of the bars: they become rounded.
l) As the size of the mask is larger than the width of the bars, the vertical bars are completely
distorted. The blurring also increases compared to the previous case. The corners become
more rounded and deformed.
m) A 3x3 Geometric Mean Filter
n) A 5x5 Geometric Mean Filter
o) A 7x7 Geometric Mean Filter
p) A 9x9 Geometric Mean Filter
Explanation
An image restored using a geometric mean filter is given by the expression
f̂(x,y) = [ Π(s,t)∈Sxy g(s,t) ]^(1/mn)
Here, each restored pixel is given by the product of the pixels in the subimage window, raised to
the power 1/mn. A geometric mean filter achieves smoothing comparable to the arithmetic mean
filter, but it tends to lose less image detail in the process. For the given image, however, the
geometric mean filter behaves like the Min filter: any zero-valued (black) pixel in the window
drives the product, and hence the restored pixel, to zero. The width and height of the white
vertical bars therefore decrease, and they decrease further as the filter size increases.
[Figure: filtered images for cases (n), (o) and (p)]
m) The resulting image consists of vertical bars 5 pixels wide and 208 pixels high. There
will be no deformation of the corners. The matrix after the application of 3x3 Geometric Mean
filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
0 0 0 0 0 0 255 255 255 255 255 0 0 0 0 0 0
n) The resulting image consists of vertical bars 3 pixels wide and 206 pixels high. There will
be no deformation of the corners. The matrix after the application of 5x5 Geometric Mean filter
is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
0 0 0 0 0 0 0 255 255 255 0 0 0 0 0 0 0
o) The resulting image consists of vertical bars 1 pixel wide and 204 pixels high. There will
be no deformation of the corners. The matrix after the application of 7x7 Geometric Mean Filter
is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0
p) The white bars completely disappear from the image, since the 9x9 window is wider than the
7-pixel bars. The matrix after the application of the 9x9 Geometric Mean filter is shown below:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
UNIT-VI
IMAGE SEGMENTATION
DETECTION OF DISCONTINUITIES
There are three types of gray-level discontinuities in an image: points, lines, and edges.
Point Detection
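Isolated points are usually detected with a Laplacian-type mask whose absolute response is compared against a threshold T; the mask and threshold test below are the common textbook choices, sketched here since the original figure is not reproduced:

```python
import numpy as np

# Laplacian-type point-detection mask: strong response where a pixel
# differs from its 8 neighbours, zero response in flat regions.
MASK = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]], dtype=float)

def detect_points(img, T):
    """Mark pixels where |mask response| >= T."""
    padded = np.pad(img.astype(float), 1, mode='edge')
    out = np.zeros(img.shape, dtype=bool)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            r = (padded[x:x + 3, y:y + 3] * MASK).sum()
            out[x, y] = abs(r) >= T
    return out
```

A single bright pixel of value 100 in a dark background gives a response of 800 at the point itself and only ±100 at its neighbours, so a threshold between those values isolates the point.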
Line Detection
Edge Detection
Gradient Operators
The Laplacian
THRESHOLDING
Global Thresholding
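A common iterative scheme for choosing a single global threshold (a sketch under the usual textbook formulation: split the pixels at T, recompute T as the midpoint of the two group means, and stop when T stabilizes; `eps` is an illustrative stopping parameter):

```python
import numpy as np

def global_threshold(img, eps=0.5):
    """Iteratively refine a global threshold until it changes by < eps."""
    t = img.mean()                       # initial estimate
    while True:
        g1 = img[img > t]                # pixels above the threshold
        g2 = img[img <= t]               # pixels at or below the threshold
        new_t = 0.5 * (g1.mean() + g2.mean())
        if abs(new_t - t) < eps:
            return new_t
        t = new_t
```

For a clearly bimodal gray-level distribution (e.g. object pixels near 200 and background near 20), the procedure converges to a threshold midway between the two modes.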
Adaptive/Local Thresholding
REGION BASED SEGMENTATION
Basic Formulation
Region Growing
Region Splitting and Merging
QUESTION AND ANSWERS
1. Suppose that an image has the intensity distributions shown in Figure 7, where p1(z)
corresponds to the intensity of the objects and p2(z) corresponds to the intensity of the
background. Assume that P1 = P2 and find the optimal threshold between object and
background pixels. [16]
Solution
2. A binary image contains straight lines oriented horizontally, vertically, at 45° and at −45°. Give a set of 3×3 masks that can be used to detect 1-pixel-long breaks in these lines. Assume that the gray level of the lines is 1 and that the gray level of the background is 0.
3. What exactly is the purpose of image segmentation and edge detection?
The purpose of image segmentation and edge detection is to extract the outlines of different
regions in the image, i.e. to divide the image into regions made up of pixels that have
something in common. For example, they may have similar brightness or colour, which may
indicate that they belong to the same object or facet of an object.
4. Are there any segmentation methods that take into consideration the spatial proximity of
pixels?
Yes, they are called region growing methods. In general, one starts from some seed pixels
and attaches neighbouring pixels to them provided the attributes of the pixels in the region
created in this way vary within a predefined range. So, each seed grows gradually by
accumulating more and more neighbouring pixels until all pixels in the image have been
assigned to a region.
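The seed-growing procedure described above can be sketched as a breadth-first search (Python; the 4-connectivity and the fixed tolerance around the seed value are illustrative choices, not the only possible criterion):

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10):
    """Grow a region from a seed pixel, attaching 4-connected neighbours
    whose gray level is within tol of the seed value."""
    rows, cols = img.shape
    seed_val = float(img[seed])
    mask = np.zeros_like(img, dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if (0 <= nx < rows and 0 <= ny < cols and not mask[nx, ny]
                    and abs(float(img[nx, ny]) - seed_val) <= tol):
                mask[nx, ny] = True
                queue.append((nx, ny))
    return mask
```

Starting from a seed inside a bright 3x3 square on a dark background, the region grows to exactly the 9 pixels of the square and stops at its boundary, since the background falls outside the tolerance.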
5. How can one choose the seed pixels?
There is no clear answer to this question, and this is the most important drawback of this
type of method. In some applications the choice of seeds is easy. For example, in target
tracking in infrared images, the target will appear bright, and one can use as seeds the few
brightest pixels. A method which does not need a predetermined number of regions or
seeds is that of split and merge.
6. Is it possible to segment an image by considering the dissimilarities between regions, as
opposed to considering the similarities between pixels?
Yes, in such an approach we examine the differences between neighbouring pixels and say
that pixels with different attribute values belong to different regions and therefore we
postulate a boundary separating them. Such a boundary is called an edge and the process is
called edge detection.
7. Are Sobel masks appropriate for all images?
Sobel masks are appropriate for images with low levels of noise. They are inadequate for
noisy images.
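As a companion sketch, the Sobel gradient magnitude can be computed directly from the two 3x3 masks (|gx| + |gy| is the common absolute-value approximation to the true magnitude):

```python
import numpy as np

def sobel_gradient(img):
    """Gradient magnitude via the 3x3 Sobel masks (|gx| + |gy|)."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # responds to vertical edges
    ky = kx.T                                   # responds to horizontal edges
    padded = np.pad(img.astype(float), 1, mode='edge')
    gx = np.zeros(img.shape)
    gy = np.zeros(img.shape)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            w = padded[x:x + 3, y:y + 3]
            gx[x, y] = (w * kx).sum()
            gy[x, y] = (w * ky).sum()
    return np.abs(gx) + np.abs(gy)
```

On a clean vertical step edge the response is large along the edge and zero in the flat regions; on a noisy image, every noise pixel also produces a response, which is why Sobel masks are said to be inadequate for noisy images.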