
AUTOMATIC MAIL SORTING MACHINE (AMSM)

By

Sami ur Rehman 2006-NUST-BEE-146

Sarosh Khan 2006-NUST-BEE-147

Waqas Siddique 2006-NUST-BEE-162

Project Report in fulfillment of the requirements for the award of

Bachelor of Engineering (Electronics Engineering) degree

In

School of Electrical Engineering and Computer Science (SEECS)

National University of Sciences and Technology (NUST) H-12 Islamabad, Pakistan


Final Year Project Report, Spring '09

CERTIFICATE

This is to certify that the work contained in the report entitled “Automatic Mail Sorting Machine” was carried out by Mr. Waqas Siddique, Mr. Sami ur Rehman, and Mr. Sarosh Khan of the Faculty of Electrical Engineering, under my supervision, and that in my opinion it is fully adequate, in scope and quality, for the degree of B.S. in their respective faculties.

Advisor: ___________________ Mr. Raheel Querashi

Co-Advisor: _________________ Dr. Rehan Hafiz


Acknowledgment

All praise to Almighty Allah, Who bestowed us with knowledge and enabled us to complete this project work. We present our humble respect to the last and final Prophet Muhammad (peace be upon him), whose life is a perfect model for the whole of mankind.

We are greatly thankful to our advisor Mr. Raheel Querashi and to Dr. Rehan Hafiz for their effort in the completion of this project work. Their inspiring guidance, dynamic supervision and constructive criticism helped us accomplish the task. We would also like to thank all our teachers, whose valuable knowledge, assistance, cooperation and guidance enabled us to take initiative and to develop and furnish our academic careers.


ABSTRACT

Our project aims to develop the “Automatic Mail Sorting Machine” (AMSM), a promising replacement for the labor-intensive and time-consuming job of manual mail sorting, thereby bringing efficiency to the mail service. On the software side, we implemented Address Block Location (ABL) and OCR technologies to identify the city name written in the address block, since this is the parameter on which the automated sorting of mail is performed. Using feature extraction, template matching and edge detection algorithms as the underlying concepts, we locate the address block on the letter, extract the city name written on it, apply the OCR system and pass the output to the SIMULINK block. The SIMULINK block, using the SIMULINK card installed in the CPU, then drives the plunger mechanism mounted on the conveyor belt to throw the letters into the required destination bins. All the software modules are written and integrated in MATLAB. A speed-controlled DC motor continuously drives the conveyor belt. A webcam inputs the image to the software, which processes it and generates output through SIMULINK for the plunger motors to throw the letters on the belt into the destination bins.


CHAPTER # 1

INTRODUCTION

HISTORY OF AUTOMATED MAIL SORTING

During most of the 20th century, mail was sorted by hand using what is called the “pigeon-hole message box” method: addresses were read and letters manually slotted into specific compartments. While early forms of a mechanical mail sorter were developed and tested in the 1920s, the first sorting machine was put into operation in the 1950s.

In 1965, the Postal Service put into operation the first high-speed optical character reader (OCR), which could handle a preliminary sort automatically. And in 1982, the first computer-driven single-line optical character reader was employed, which reads the mailpiece destination address and then prints a barcode on the envelope that can be used to automate mail sorting from start to finish.

Such automated mail services are available in the post offices of advanced countries, but the concept is quite new in a country like Pakistan. We designed and developed this system to present a model of an automated mail sorting machine which could replace the tedious and time-consuming job of manual letter sorting.

BASIC MODULES OF OUR ARCHITECTURE

Our system is equipped with a computer database system, input peripheral devices, user input devices, a webcam and a plunger mechanism. The computer database system processes the data generated from the input peripheral device and generates sorted database output according to the user-selected sorting option. The mail or package is delivered to the appropriate destination following the sorted database output.

Using feature extraction, template matching and edge detection algorithms as the underlying concepts, we locate the address block on the letter, extract the city name written on it, apply the OCR system and pass the output to the SIMULINK block. The SIMULINK block, using the SIMULINK card installed in the CPU, then drives the plunger mechanism mounted on the conveyor belt to throw the letters into the required destination bins. All the software modules are written and integrated in MATLAB. A speed-controlled DC motor continuously drives the conveyor belt. A webcam inputs the image to the software, which processes it and generates output through SIMULINK for the plunger motors to throw the letters on the belt into the destination bins.

Block Diagram of our System

IMAGE ACQUISITION

The data processing part of the system starts with image capture, which has to be done by a fast and efficient camera; the camera most suitable for OCR was selected and finalized. The camera ships with its own software for reading images from camera memory and loading them into the processing unit's memory (in our case, a PC), but we perform this job in MATLAB instead. The output of this stage is an image to be processed by the OCR part.

PREPROCESSING STAGE

The image may contain irrelevant data, e.g. various advertisements and unnecessary handwritten information. Even within the address itself, the complete address of the recipient is written, but we only want to know the city to which the letter is to be delivered. To separate the irrelevant data from the relevant, we need a preprocessing stage, which we call the Address Block Locator (ABL). This is the second piece of software in our system, and it has to communicate with the camera software.

OPTICAL CHARACTER RECOGNITION

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish text on a website. OCR makes it possible to edit the text, search for a word or phrase, store it more compactly, display or print a copy free of scanning artifacts, and apply techniques such as machine translation and text-to-speech. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

In our case we wrote the OCR code in MATLAB, and it successfully translated the text written on the image into editable text for further processing.


Hardware Infrastructure

Our hardware consists primarily of a 1.5-meter conveyor belt, upon which are mounted the plungers that throw the letters into their destination bins, and a webcam which acquires the image of the letter to be sorted and sends it to the computer for further processing. Our aim was to learn new things during this project, and we incorporated various new ideas into it. The webcam takes images of the still letter and then passes them to the computer for further processing.


CHAPTER # 2

LITERATURE REVIEW


RGB Image:

An RGB image has three channels: red, green, and blue. RGB channels roughly follow the color receptors in

the human eye, and are used in computer displays and image scanners.

If the RGB image is 24-bit (the industry standard as of 2005), each channel has 8 bits for red, green, and blue; in other words, the image is composed of three images (one for each channel), where each image can store discrete pixels with conventional brightness intensities between 0 and 255. If the RGB image is 48-bit (a very high color depth), each channel is made of 16-bit images.

RGB image from the perspective of MATLAB:

An RGB image, sometimes referred to as a true color image, is stored in MATLAB as an m-by-n-by-3

data array that defines red, green, and blue color components for each individual pixel. RGB images

do not use a palette. The color of each pixel is determined by the combination of the red, green, and

blue intensities stored in each color plane at the pixel's location. Graphics file formats store RGB

images as 24-bit images, where the red, green, and blue components are 8 bits each. This yields a

potential of 16 million colors. The precision with which a real-life image can be replicated has led to

the commonly used term truecolor image.

An RGB array can be of class double, uint8, or uint16. In an RGB array of class double, each color

component is a value between 0 and 1. A pixel whose color components are (0,0,0) is displayed as

black, and a pixel whose color components are (1,1,1) is displayed as white. The three color

components for each pixel are stored along the third dimension of the data array. For example, the

red, green, and blue color components of the pixel (10,5) are stored in RGB(10,5,1), RGB(10,5,2),

and RGB(10,5,3), respectively.

The following figure depicts an RGB image of class double:


Consider the figure above. To determine the color of the pixel at (2,3), you would look at the RGB triplet stored in (2,3,1:3). Suppose (2,3,1) contains the value 0.5176, (2,3,2) contains 0.1608, and (2,3,3) contains 0.0627. The color of the pixel at (2,3) is then (0.5176, 0.1608, 0.0627).
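As a minimal illustration of this indexing scheme (a Python sketch using nested lists rather than the report's MATLAB arrays; the pixel values come from the example above):

```python
# Sketch: an m-by-n-by-3 RGB image of "class double" as nested lists,
# mirroring MATLAB's RGB(i,j,k) indexing. Values lie in [0, 1].
m, n = 4, 4
# Initialise a 4x4x3 image to black (0, 0, 0).
img = [[[0.0, 0.0, 0.0] for _ in range(n)] for _ in range(m)]

# Store the triplet from the text at pixel (2,3). MATLAB indexing is
# 1-based, so this is img[1][2] in Python's 0-based scheme.
img[1][2] = [0.5176, 0.1608, 0.0627]

r, g, b = img[1][2]
print(r, g, b)  # the red, green, blue components of pixel (2,3)
```

The three color components sit along the third dimension, exactly as the MATLAB description above states.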

Illustration of RGB in MATLAB

To further illustrate the concept of the three separate color planes used in an RGB image, the code

sample below creates a simple RGB image containing uninterrupted areas of red, green, and blue,

and then creates one image for each of its separate color planes (red, green, and blue). It displays

each color plane image separately, and also displays the original image.

% Build a 64x64 truecolor image whose rows all share the jet colormap.
RGB = reshape(ones(64,1) * reshape(jet(64),1,192), [64,64,3]);

R = RGB(:,:,1);      % red color plane
G = RGB(:,:,2);      % green color plane
B = RGB(:,:,3);      % blue color plane

imshow(R)            % display each plane as a grayscale image
figure, imshow(G)
figure, imshow(B)
figure, imshow(RGB)  % display the original truecolor image


Grayscale Image

A grayscale digital image is an image in which the value of each pixel is a single sample; that is, it carries only intensity information. Images of this sort, also known as black-and-white, are composed exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest.

Grayscale images are distinct from one-bit black-and-white images, which in the context of computer imaging are images with only two colors, black and white; grayscale images have many shades of gray in between. Grayscale images are also called monochromatic, denoting the absence of any chromatic variation.

Numerical representations:

The intensity of a pixel is expressed within a given range between a minimum and a maximum, inclusive. This range is represented in an abstract way as a range from 0 (total absence, black) to 1 (total presence, white), with any fractional values in between.

Another convention is to employ percentages, so the scale runs from 0% to 100%. This is more intuitive, but if only integer values are used, the range encompasses a total of only 101 intensities, which is insufficient to represent a broad gradient of grays. The percentile notation is also used in printing to denote how much ink is employed in halftoning, but there the scale is reversed, with 0% being paper white (no ink) and 100% solid black (full ink).

Converting color (RGB) to grayscale:

Conversion of a color image to grayscale is not unique; a different weighting of the color channels effectively represents the effect of shooting black-and-white film with different-colored photographic filters on the camera. A common strategy is to match the luminance of the grayscale image to the luminance of the color image.

To convert any color to a grayscale representation of its luminance, first one must obtain the values

of its red, green, and blue (RGB) primaries in linear intensity encoding, by gamma expansion. Then,

add together 30% of the red value, 59% of the green value, and 11% of the blue value (these

weights depend on the exact choice of the RGB primaries, but are typical). The formula (11*R +

16*G + 5*B) /32 is also popular since it can be efficiently implemented using only integer operations.

Regardless of the scale employed (0.0 to 1.0, 0 to 255, 0% to 100%, etc.), the resultant number is the

desired linear luminance value; it typically needs to be gamma compressed to get back to a

conventional grayscale representation.
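As a hedged sketch (in Python, for illustration only; the project's own code is in MATLAB), the two weightings described above can be written as:

```python
# Sketch: RGB-to-grayscale conversion with 8-bit channels (0-255).
def rgb_to_gray(r, g, b):
    """Weighted luminance using the common 30% / 59% / 11% weights."""
    return round(0.30 * r + 0.59 * g + 0.11 * b)

def rgb_to_gray_int(r, g, b):
    """Integer-only variant from the text: (11*R + 16*G + 5*B) / 32."""
    return (11 * r + 16 * g + 5 * b) // 32

print(rgb_to_gray(255, 255, 255))      # white stays white: 255
print(rgb_to_gray_int(255, 255, 255))  # integer variant agrees: 255
print(rgb_to_gray(255, 0, 0))          # pure red maps to a dark gray
```

Note that gamma expansion and compression, described above, are omitted here to keep the sketch minimal.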

Here is an example of color channel splitting of a full RGB color image. The column at left shows the

isolated color channels in natural colors, while at right there are their grayscale equivalences:


Binary Image:

A binary image is a digital image that has only two possible values for each pixel. Typically the two colors used for a binary image are black and white, though any two colors can be used. The color used for the object(s) in the image is the foreground color, while the rest of the image is the background color.

Binary images are also called bi-level or two-level, meaning that each pixel is stored as a single bit (0 or 1). The names black-and-white, B&W, monochrome or monochromatic are often used for this concept, but may also designate any image that has only one sample per pixel, such as a grayscale image.

YUV Image:

YUV is a color space typically used as part of a color image pipeline. It encodes a color image or video taking human perception into account, allowing reduced bandwidth for the chrominance components and thereby typically enabling transmission errors or compression artifacts to be masked more efficiently by human perception than with a "direct" RGB representation.

The term YUV is commonly used in the computer industry to describe file formats that are encoded using YCbCr. The Y'UV model defines a color space in terms of one luma (Y') and two chrominance (UV) components.


Converting between YUV and RGB:
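The conversion matrices here did not survive extraction. As an illustrative sketch only (assuming the standard analog BT.601 Y'UV definition, which may differ in detail from the matrices the original figure showed):

```python
# Sketch: RGB -> Y'UV with BT.601 luma weights and the analog U/V
# scale factors. Inputs r, g, b are normalised to [0, 1].
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v

y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)  # white: full luma, zero chroma
print(y, u, v)
```

The inverse transform simply solves these three equations for r, g and b.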

Edges in Image Processing:

Edges are significant local changes of intensity in an image. Edges typically occur on the boundary between two different regions in an image.

What is Edge Detection?

Edge detection is a terminology in image processing and computer vision, particularly in the areas

of feature detection and feature extraction, to refer to algorithms which aim at identifying points in

a digital image at which the image brightness changes sharply or more formally has discontinuities.

The purpose of detecting sharp changes in image brightness is to capture important events and

changes in properties of the world. It can be shown that under rather general assumptions for an

image formation model, discontinuities in image brightness are likely to correspond to:

• discontinuities in depth
• discontinuities in surface orientation
• changes in material properties
• variations in scene illumination

In the ideal case, the result of applying an edge detector to an image is a set of connected curves that indicate the boundaries of objects, the boundaries of surface markings, as well as curves that correspond to discontinuities in surface orientation. Thus, applying an edge detector to an image may significantly reduce the amount of data to be processed and may therefore filter out information that may be regarded as less relevant, while preserving the important structural properties of the image. If the edge detection step is successful, the subsequent task of interpreting the information content of the original image may be substantially simplified.

Unfortunately, however, it is not always possible to obtain such ideal edges from real-life images of moderate complexity. Edges extracted from non-trivial images are often hampered by fragmentation (meaning that the edge curves are not connected), by missing edge segments, and by false edges not corresponding to interesting phenomena in the image, thus complicating the subsequent task of interpreting the image data.


Edge properties:

The edges extracted from a two-dimensional image of a three-dimensional scene can be classified as

either viewpoint dependent or viewpoint independent. A viewpoint independent edge typically

reflects inherent properties of the three-dimensional objects, such as surface markings and surface

shape. A viewpoint dependent edge may change as the viewpoint changes, and typically reflects the

geometry of the scene, such as objects occluding one another.

A typical edge might, for instance, be the border between a block of red color and a block of yellow. In contrast, a line can be a small number of pixels of a different color on an otherwise unchanging background; for a line, there will therefore usually be one edge on each side of the line.

Edges play quite an important role in many applications of image processing, in particular

for machine vision systems that analyze scenes of man-made objects under controlled illumination

conditions.

A simple edge model:

Although certain literature has considered the detection of ideal step edges, the edges obtained

from natural images are usually not at all ideal step edges. Instead they are normally affected by

one or several of the following effects:

Focal blur caused by a finite depth-of-field and finite point spread function.

Penumbral blur caused by shadows created by light sources of non-zero radius.

Shading at a smooth object

A one-dimensional image f which has exactly one edge placed at x = 0 may be modeled as:

f(x) = (I_r − I_l)/2 · (erf(x / (√2 σ)) + 1) + I_l

At the left side of the edge, the intensity is I_l = lim_{x→−∞} f(x), and right of the edge it is I_r = lim_{x→∞} f(x). The scale parameter σ is called the blur scale of the edge.


Why edge detection is a non-trivial task?

To illustrate why edge detection is not a trivial task, let us consider the problem of detecting edges

in the following one-dimensional signal. Here, we may intuitively say that there should be an edge

between the 4th and 5th pixels.

5 7 6 4 152 148 149

If the intensity difference were smaller between the 4th and the 5th pixels and if the intensity

differences between the adjacent neighboring pixels were higher, it would not be as easy to say that

there should be an edge in the corresponding region. Moreover, one could argue that this case is

one in which there are several edges.

5 7 6 41 113 148 149

Hence, firmly stating a specific threshold on how large the intensity change between two neighboring pixels must be before we say there is an edge between them is not always simple. Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled.
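The intuition above can be sketched with simple first differences (illustrative Python; the threshold value is an arbitrary choice, and choosing it well is exactly the hard part just described):

```python
# Sketch: flag an edge wherever the jump between neighbouring pixels
# of the 1-D signal from the text exceeds a threshold.
signal = [5, 7, 6, 4, 152, 148, 149]
diffs = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

threshold = 50  # illustrative value only
edges = [i for i, d in enumerate(diffs) if abs(d) > threshold]
print(edges)  # one gap flagged: between the 4th and 5th pixels
```

On the second signal above (5 7 6 41 113 148 149) the same threshold would fire at two gaps, matching the "several edges" reading.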

Goal of edge detection:

• Produce a line drawing of a scene from an image of that scene.
• Important features can be extracted from the edges of an image (e.g., corners, lines, curves).
• These features are used by higher-level computer vision algorithms (e.g., recognition).

What causes intensity changes?

Various physical events cause intensity changes.

Geometric events:
• object boundary (discontinuity in depth and/or surface color and texture)
• surface boundary (discontinuity in surface orientation and/or surface color and texture)

Non-geometric events:
• specularity (direct reflection of light, such as a mirror)
• shadows (from other objects or from the same object)
• inter-reflections

Edge descriptors

• Edge normal: unit vector in the direction of maximum intensity change.
• Edge direction: unit vector perpendicular to the edge normal.
• Edge position or center: the image position at which the edge is located.
• Edge strength: related to the local image contrast along the normal.

Modeling intensity changes:

Edges can be modeled according to their intensity profiles:

• Step edge: the image intensity abruptly changes from one value on one side of the discontinuity to a different value on the opposite side.
• Ramp edge: a step edge where the intensity change is not instantaneous but occurs over a finite distance.
• Ridge edge: the image intensity abruptly changes value but then returns to the starting value within some short distance (usually generated by lines).
• Roof edge: a ridge edge where the intensity change is not instantaneous but occurs over a finite distance (usually generated by the intersection of surfaces).


The four steps of edge detection:

(1) Smoothing: suppress as much noise as possible without destroying the true edges.

(2) Enhancement: apply a filter to enhance the quality of the edges in the image (sharpening).

(3) Detection: determine which edge pixels should be discarded as noise and which should be retained (usually, thresholding provides the criterion used for detection).

(4) Localization: determine the exact location of an edge (sub-pixel resolution might be required for some applications; that is, estimating the location of an edge to better than the spacing between pixels). Edge thinning and linking are usually required in this step.

Edge detection using derivatives:

Calculus describes changes of continuous functions using derivatives. An image is a 2-D function, so operators describing edges are expressed using partial derivatives. Points which lie on an edge can be detected by:

(1) detecting local maxima or minima of the first derivative, or
(2) detecting the zero-crossings of the second derivative.

Definition of the gradient

The gradient is a vector with a certain magnitude and direction:

∇f = [∂f/∂x, ∂f/∂y],  |∇f| = sqrt((∂f/∂x)² + (∂f/∂y)²)

To save computation, the magnitude of the gradient is usually approximated by:

|∇f| ≈ |∂f/∂x| + |∂f/∂y|

Properties of the gradient

The magnitude of the gradient provides information about the strength of the edge. The direction of the gradient is always perpendicular to the direction of the edge (the edge direction is rotated with respect to the gradient direction by −90 degrees).

Estimating the gradient with finite differences

The gradient can be approximated by finite differences:

∂f/∂x ≈ f(x+1, y) − f(x, y),  ∂f/∂y ≈ f(x, y+1) − f(x, y)

Using pixel-coordinate notation (remember: j corresponds to the x direction and i to the negative y direction):

fx ≈ f(i, j+1) − f(i, j),  fy ≈ f(i, j) − f(i+1, j)
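A small sketch of these approximations (illustrative Python on a toy image, not the project's MATLAB code):

```python
# Sketch: forward-difference gradient and the |fx| + |fy| magnitude
# approximation, on a tiny image with one vertical edge.
img = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]

def gradient_at(img, i, j):
    fx = img[i][j + 1] - img[i][j]  # difference along columns (x)
    fy = img[i + 1][j] - img[i][j]  # difference along rows
    return fx, fy

fx, fy = gradient_at(img, 1, 1)  # pixel just left of the edge
magnitude = abs(fx) + abs(fy)    # cheap stand-in for sqrt(fx^2 + fy^2)
print(magnitude)
```

The gradient responds only across the vertical edge (fx), not along it (fy), which is the perpendicularity property described above.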

Standard deviation

The standard deviation of a statistical population, a data set, or a probability distribution is the square root of its variance. The standard deviation is a widely used measure of variability or dispersion, being algebraically more tractable, though practically less robust, than the average absolute deviation.

It shows how much variation there is from the "average" (mean, or expected value). A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data are spread out over a large range of values.

For example, the average height for adult men in Pakistan is about 70 inches (178 cm), with a standard deviation of around 3 in (8 cm). This means that most men (about 68 percent, assuming a normal distribution) have a height within 3 in (8 cm) of the mean (67–73 in, 170–185 cm) – one standard deviation – whereas almost all men (about 95%) have a height within 6 in (15 cm) of the mean (64–76 in, 163–193 cm) – two standard deviations. If the standard deviation were zero, then all men would be exactly 70 in (178 cm) tall. If the standard deviation were 20 in (51 cm), then men would have much more variable heights, with a typical range of about 50 to 90 in (127 to 229 cm). Three standard deviations account for 99.7% of the sample population being studied, assuming the distribution is normal (bell-shaped).
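A quick illustration using Python's statistics module (the sample values are invented for illustration only):

```python
# Sketch: mean and population standard deviation of a small height
# sample (inches), plus the one-standard-deviation band.
import statistics

heights = [67, 68, 70, 70, 71, 72, 73, 69]
mean = statistics.mean(heights)
sd = statistics.pstdev(heights)  # population standard deviation

# Count how many points fall within one standard deviation of the mean.
within_one_sd = [h for h in heights if abs(h - mean) <= sd]
print(mean, round(sd, 2), len(within_one_sd))
```

Half of this tiny sample lands inside the one-sigma band; with a genuinely normal population the proportion tends toward the 68% figure quoted above.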


OPTICAL CHARACTER RECOGNITION MODULE

WHAT IS OCR?

OCR stands for Optical Character Recognition. It is a computer program designed to convert scanned or digital images of handwritten or typewritten text into machine-editable text, or to translate pictures of characters into a standard encoding scheme representing them (e.g. ASCII or Unicode). OCR began as a field of research in pattern recognition, artificial intelligence and machine vision. Though academic research in the field continues, the focus of OCR work has shifted to the implementation of proven techniques.

OCR BACKGROUND

Developing a proprietary OCR system is a complicated task and requires a lot of effort. Such systems are usually very complicated and can hide a lot of logic behind the code. The use of an artificial neural network in OCR applications can dramatically simplify the code and improve the quality of recognition while achieving good performance. Another benefit of using a neural network in OCR is the extensibility of the system: the ability to recognize more character sets than initially defined. Most traditional OCR systems are not extensible enough; a task such as working with tens of thousands of Chinese characters, for example, is not as easy as working with the 68-character English typed character set and can easily bring a traditional system to its knees.

MODULES OF OCR SYSTEM

OCR systems consist of five major stages:

1. Pre-processing
2. Segmentation
3. Feature Extraction
4. Classification
5. Post-processing

1-Pre-processing

The raw data is subjected to a number of preliminary processing steps to make it usable in the descriptive stages of character analysis. Pre-processing aims to produce data that is easy for the OCR system to operate on accurately. The main objectives of pre-processing are:

• Binarization
• Noise reduction
• Stroke width normalization
• Skew correction
• Slant removal

Binarization

Document image binarization (thresholding) refers to the conversion of a grayscale image into a binary image. There are two categories of thresholding:

• Global, which uses a single threshold value for the entire image
• Adaptive (local), which uses different values for each pixel according to the local area information
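A sketch of the two categories (illustrative Python on a toy one-dimensional "image"; the window size and offset C are arbitrary choices, not values from the report):

```python
# Sketch: global vs adaptive (local) thresholding on a toy pixel row.
row = [30, 40, 200, 210, 35, 220]

# Global: one threshold value for every pixel.
T = 128
binary_global = [1 if p > T else 0 for p in row]

# Adaptive (local): threshold each pixel against the mean of a
# 3-pixel neighbourhood (clamped at the ends), minus an offset C.
C = 5
binary_local = []
for i, p in enumerate(row):
    window = row[max(0, i - 1):i + 2]
    binary_local.append(1 if p > sum(window) / len(window) - C else 0)

print(binary_global)
```

On this easy example both methods agree; adaptive thresholding earns its keep on unevenly lit envelopes, where no single global T works everywhere.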

Noise Reduction

Noise reduction improves the quality of the document. There are two main approaches:

• Filtering (masks)
• Morphological operations (erosion, dilation, etc.)

Normalization and Thinning

Normalization provides a tremendous reduction in data size, while thinning extracts the shape information of the characters.

Skew Correction

Skew correction methods are used to align the paper document with the coordinate system of the scanner. Main approaches for skew detection include correlation, projection profiles, and the Hough transform.


Slant Removal

The slant of handwritten text varies from writer to writer. Slant removal methods are used to normalize all characters to a standard form. Popular deslanting techniques are:

• The Bozinovic–Shrihari Method (BSM)
• Calculation of the average angle of near-vertical elements

2-Segmentation

Segmentation means isolating the individual characters within the text:


Two approaches are commonly used for this purpose:

Explicit Segmentation

In explicit approaches one tries to identify the smallest possible word segments (primitive segments) that may be smaller than letters but surely cannot be segmented further. Later in the recognition process these primitive segments are assembled into letters based on input from the character recognizer. The advantage of this strategy is that it is robust and quite straightforward, but it is not very flexible.

Implicit Segmentation

In implicit approaches the words are recognized entirely without segmenting them into letters. This is most effective and viable only when the set of possible words is small and known in advance, such as in the recognition of bank checks and postal addresses.
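A very small sketch of the explicit idea, cutting at blank columns (illustrative Python; real primitive segmentation is considerably more involved, especially for connected handwriting):

```python
# Sketch: cut a binary text-line image at columns with no foreground
# pixels, yielding one (start, end) column span per candidate glyph.
img = [
    [1, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1, 0],
]

w = len(img[0])
col_has_ink = [any(img[r][c] for r in range(len(img))) for c in range(w)]

segments, start = [], None
for c, ink in enumerate(col_has_ink + [False]):  # sentinel flushes last run
    if ink and start is None:
        start = c
    elif not ink and start is not None:
        segments.append((start, c - 1))
        start = None
print(segments)
```

Each span can then be fed to the recognizer, which is the "input from the character recognizer" feedback loop mentioned above.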

3-Feature Extraction

In feature extraction stage each character is represented as a feature vector, which becomes its identity. The major goal of feature extraction is to extract a set of features, which maximizes the recognition rate with the least amount of elements.

Due to the nature of handwriting, with its high degree of variability and imprecision, obtaining these features is a difficult task. Feature extraction methods are based on three types of features:

• Statistical
• Structural
• Global transformations and moments


Statistical Features

Representation of a character image by statistical distribution of points takes care of style variations to some extent.

The major statistical features used for character representation are:

• Zoning
• Projections and profiles
• Crossings and distances

Zoning

The character image is divided into N×M zones, and from each zone features are extracted to form the feature vector. The goal of zoning is to obtain local characteristics instead of global ones.

Zoning – Density Features

The number of foreground pixels, or the normalized number of foreground pixels, in each cell is considered a feature.
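A sketch of zoning density features (illustrative Python; the 2×2 zone grid and the tiny image are arbitrary choices):

```python
# Sketch: split a binary character image into zones and use the
# normalised foreground-pixel count of each zone as a feature.
img = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

def zone_densities(img, zn, zm):
    h, w = len(img), len(img[0])
    zh, zw = h // zn, w // zm  # zone height and width
    feats = []
    for zi in range(zn):
        for zj in range(zm):
            total = sum(img[r][c]
                        for r in range(zi * zh, (zi + 1) * zh)
                        for c in range(zj * zw, (zj + 1) * zw))
            feats.append(total / (zh * zw))  # normalised density
    return feats

print(zone_densities(img, 2, 2))  # one feature per zone
```

The resulting vector captures where the ink sits in the character, which is exactly the "local characteristics" zoning is after.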

Projection Histograms

The basic idea behind using projections is that character images, which are 2-D signals, can be represented as 1-D signals. These features, although relatively insensitive to noise and deformation, depend on rotation.


Projection histograms count the number of pixels in each column and row of a character image. Projection histograms can separate characters such as “m” and “n”.
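A sketch of row and column projections (illustrative Python on a toy binary glyph):

```python
# Sketch: projection histograms reduce a 2-D binary character image
# to two 1-D signals: foreground-pixel counts per row and per column.
img = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

row_proj = [sum(row) for row in img]                             # per row
col_proj = [sum(row[c] for row in img) for c in range(len(img[0]))]
print(row_proj, col_proj)
```

Two glyphs like "m" and "n" produce visibly different column projections (three vertical strokes versus two), which is how the separation described above works.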

Profiles

The profile counts the number of pixels (the distance) between the bounding box of the character image and the edge of the character. Profiles describe the external shape of characters well and allow one to distinguish between a great number of letters, such as “p” and “q”.
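A sketch of a left profile (illustrative Python; the right, upper and lower profiles are computed analogously from the other three sides of the bounding box):

```python
# Sketch: for each row, the left profile is the distance from the left
# edge of the bounding box to the first foreground pixel.
img = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]

def left_profile(img):
    prof = []
    for row in img:
        # Index of the first foreground pixel, or the row width if none.
        d = next((i for i, p in enumerate(row) if p == 1), len(row))
        prof.append(d)
    return prof

print(left_profile(img))
```

For a "p" versus a "q", the left profile of the descender region differs sharply, which is the discriminative power mentioned above.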

Structural Features

Three types of features are used:

• Horizontal and vertical projection histograms
• Radial histograms
• Radial out-in and radial in-out profiles

Feature Extraction

Two types of features are used:

Features based on zones: the character image is divided into horizontal and vertical zones, and the density of character pixels is calculated for each zone.

Features based on character projection profiles: the centre of mass of the image is first found. Upper/lower profiles are computed by considering, for each image column, the distance between the horizontal line and the pixel closest to the upper/lower boundary of the character image. This yields two zones. Both zones are then divided into vertical blocks, and for each block we calculate the area of the upper/lower character profiles. Similarly, we extract features based on left/right profiles.

MOTION BLUR REMOVAL

A blurred image has an associated point spread function (PSF), the mathematical function responsible for the distortion of the image. Various algorithms and filters exist for removing blur, but they all assume that the PSF is already known; they simply deconvolve the blurred image with the PSF to recover the original image.

DEBLURRING FUNCTIONS

Wiener Filter (deconvwnr)

Implements a least squares solution. You should provide some information about the noise to reduce possible noise amplification during deblurring.

Regularized Filter (deconvreg)

Implements a constrained least squares solution, where you can place constraints on the output image (a smoothness requirement is the default). You should provide some information about the noise to reduce possible noise amplification during deblurring.

Lucy-Richardson Algorithm (deconvlucy)

Implements an accelerated, damped Lucy-Richardson algorithm. The function performs multiple iterations, using optimization techniques and Poisson statistics, and you do not need to provide information about the additive noise in the corrupted image.

Blind Deconvolution Algorithm (deconvblind)

Implements the blind deconvolution algorithm, which performs deblurring without knowledge of the PSF; you pass your initial guess at the PSF as an argument. The deconvblind function returns a restored PSF in addition to the restored image, and the implementation uses the same damping and iterative model as the deconvlucy function.
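To illustrate the idea behind Wiener deconvolution (a sketch only, not the toolbox implementation; the constant K standing in for the noise-to-signal power ratio is an assumed parameter), a minimal frequency-domain version in Python:

```python
import numpy as np

def wiener_deconv(blurred, psf, K=1e-4):
    """Frequency-domain Wiener filter: conj(H) / (|H|^2 + K) applied to
    the blurred image's spectrum; K plays the noise-to-signal ratio."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + K) * G
    return np.real(np.fft.ifft2(F_hat))

# Synthetic test: blur a random image with a 3x3 box PSF, then restore
rng = np.random.default_rng(0)
f = rng.random((32, 32))
psf = np.ones((3, 3)) / 9.0
g = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(psf, s=f.shape)))
restored = wiener_deconv(g, psf)
```

With no noise and a small K this approaches simple inverse filtering; in practice K must be raised to keep noise from being amplified.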

SOBEL’S EDGE DETECTION ALGORITHM

There are many ways to perform edge detection. However, the majority of different methods may be grouped into two categories, gradient and Laplacian. The gradient method detects the edges by looking for the maximum and minimum in the first derivative of the image. The Laplacian method searches for zero crossings in the second derivative of the image to find edges. An edge has the one-dimensional shape of a ramp and calculating the derivative of the image can highlight its location. Suppose we have the following signal, with an edge shown by the jump in intensity below:


If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t) we get the following:

Based on this one-dimensional analysis, the theory can be carried over to two dimensions as long as there is an accurate approximation to calculate the derivative of a two-dimensional image. The Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image. The Sobel edge detector uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the gradient in the y-direction (rows). A convolution mask is usually much smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at a time. The Sobel masks are:

Gx = [-1 0 +1; -2 0 +2; -1 0 +1]        Gy = [+1 +2 +1; 0 0 0; -1 -2 -1]

The magnitude of the gradient is then calculated using the formula:

|G| = sqrt(Gx^2 + Gy^2)

An approximate magnitude can be calculated using:

|G| = |Gx| + |Gy|
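The computation can be sketched as follows (an illustrative Python version; the project code itself is MATLAB). Border pixels are left at zero, since a 3x3 mask cannot be centred on them:

```python
import numpy as np

Sx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]])      # gradient estimate in x (columns)
Sy = Sx.T                        # gradient estimate in y (rows)

def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| at interior pixels."""
    h, w = img.shape
    mag = np.zeros((h, w))
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            win = img[r - 1:r + 2, c - 1:c + 2]
            mag[r, c] = abs(np.sum(Sx * win)) + abs(np.sum(Sy * win))
    return mag

# A vertical step edge responds on the two columns straddling it
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag = sobel_magnitude(img)
```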


SOBEL EXPLANATION

The mask is slid over an area of the input image, changes that pixel's value, and then shifts one pixel to the right, continuing until it reaches the end of a row; it then starts at the beginning of the next row. The example below shows the mask being slid over the top left portion of the input image, represented by the green outline. The formula shows how a particular pixel in the output image would be calculated. The center of the mask is placed over the pixel being manipulated in the image, and the I and J values are used to move the file pointer so that, for example, pixel (a22) can be multiplied by the corresponding mask value (m22). It is important to notice that pixels in the first and last rows, as well as the first and last columns, cannot be manipulated by a 3x3 mask: when the center of the mask is placed over a pixel in the first row (for example), the mask extends outside the image boundaries.


CHAPTER # 3

SYSTEM DESIGN AND

ARCHITECTURE

SOFTWARE PART


Image Acquisition

We are using a DANY-made 1.3-megapixel webcam to acquire the image of the letter moving along the conveyer belt for further processing.

Specifications of the camera:

Sensor: 1/6" CMOS (OV7670)

Hardware Resolution: 300K(640H*480V)

Software Resolution: 1.3 Megapixels

Pixel Point: 3.6um*3.6um

Image Format: VGA

Data Format: YUV, RGB

Operating Port: USB 2.0 and down to USB1.1

Max. Frame Rate: 30fps (VGA)

Min. Low Photo Light: 6Lux

S/N Ratio: >46 dB

Definition Level: >300TV Line(middle level)

Lens: 5P F1.1.8/f2.95

Focus Range: 2cm to infinity

Visual Angle: 60°

Photo Control: Saturation, Compare, Sharp

White Balance: Automatic, Manual

Automatic White

Balance:

2600°K~5000°K

Exposure: Automatic, Manual

Image/Video: Automatic, Manual

Storage Temperature: -40° to 95°

Operating Temperature: 30° to 70°


Interfacing Camera with PC:

Interfacing the webcam with MATLAB is the first step in the software module of the system. The main commands used for this purpose are as follows:

The imaqhwinfo command lists the installed adaptors; we are using 'winvideo' for this purpose.

When 'winvideo' is fed to this command, we get the following information.

When the command below is used, the following interface appears:


Collecting the best image frame:

We have made an algorithm based on the pixel sum: it calculates the sum of all the pixels of the 640 x 480 binary image, which is easily done in MATLAB.

Frames are taken continuously by the webcam, and for each one the sum of all the pixels (white = 1 and black = 0) is calculated. This is done by summing twice: summing the image once returns a vector, and summing that vector gives a single value. The value obtained is compared with the threshold for the best image; if it is close to that value, we assume the frame is the best image.

The threshold was between 280,000 and 290,000.
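The selection rule can be sketched as follows (a Python illustration of the MATLAB logic; the 280,000-290,000 band is the empirically chosen threshold from above):

```python
import numpy as np

LOW, HIGH = 280_000, 290_000   # empirical threshold band for a "best" frame

def is_best_frame(binary_frame, low=LOW, high=HIGH):
    """Accept a 640 x 480 binary frame when its total number of white
    pixels (white = 1, black = 0) lies inside the threshold band."""
    total = int(binary_frame.sum())   # sum over all pixels
    return low <= total <= high

# Frame with exactly 285,000 white pixels is accepted
frame = np.zeros((480, 640), dtype=np.uint8)
frame.ravel()[:285_000] = 1
accepted = is_best_frame(frame)       # True
```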

Interaction with the Best frame selector:

In this case the pixel sum was greater than 290000 so this frame was not selected


In this case the pixel sum was determined to be between 280,000 and 290,000, so the frame was selected.

In the next frame (on the next page) the pixel sum again exceeded 290,000, so that frame was not selected.


Locating Address Block

We have developed an algorithm based on zero crossings; it detects a change from black pixels to white pixels, as we are dealing with a binary image.

Initially we have an RGB image obtained from the camera. Consider the above part of the algorithm: applying it to our best image produced the transformation shown.

Initially we had a YUV image from the webcam.


By applying the conversion mentioned above in the literature survey, this image was transformed into an RGB image. The RGB image was then converted to a grayscale image.


That image was then converted into a binary image. Negating it gave a negated binary image, to which we applied our basic algorithm.


In the algorithm below, the address block is located by calculating the x and y coordinates of the address, working on the negated binary image obtained after the manipulations of the original image described above.

As noted in the literature survey, white pixels have a value of 1 and black pixels a value of 0, so the algorithm uses a summing approach to identify the address horizontally. If we let C1 and C2 be the starting and ending columns of the address block in the binary image matrix above, they can easily be determined with the algorithm below.

Having found C1 with the above algorithm, we now find C2.
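The run-based column scan can be sketched like this (a hypothetical Python rendering of the same idea; min_run corresponds to the run counters in the MATLAB code, which additionally waits for a run of empty columns before declaring C2):

```python
import numpy as np

def column_bounds(binary_img, min_run=25):
    """Find C1 (first column of a sufficiently long run of non-empty
    columns) and C2 (the first empty column after that run)."""
    col_sums = binary_img.sum(axis=0)
    c1 = c2 = None
    run = 0
    for m, s in enumerate(col_sums):
        if s > 1:                     # column contains white pixels
            run += 1
            if run >= min_run and c1 is None:
                c1 = m - run + 1
        elif c1 is not None:          # first empty column after the block
            c2 = m
            break
        else:
            run = 0
    return c1, c2

# Toy image with text occupying columns 10..59
img = np.zeros((20, 100), dtype=int)
img[5:15, 10:60] = 1
c1, c2 = column_bounds(img)           # (10, 60)
```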


While the sum approach was the more useful one for finding the address block horizontally, experiments showed the standard-deviation approach to be better for finding it vertically. Standard deviation is explained in the literature survey.

Since the city name will be fed to the OCR, the algorithm is designed so that R1 (the start of the address block vertically) identifies the city name.


After this algorithm R1, the start of the address block, is identified. R2 can be estimated roughly by adding 50 to R1: since the image is going to be cropped, we only need the height of the rectangle, which is not as critical as R1, so it is set by estimation.

When these values are fed to imcrop (a MATLAB command), the image is cropped as shown below.


Output of ABL:

So this city name is passed to the OCR for further processing.

REMOVING MOTION BLUR

The blurring, or degradation, of an image can be caused by many factors. In our case the blur is caused by movement relative to the camera during the image capture process.

A blurred image can be simply represented by the following mathematical equation:

g = Hf + n

Here “g” represents the blurred image.

“H” represents the distortion factor, also called the point spread function (PSF). In the spatial domain, the PSF describes the degree to which an optical system blurs (spreads) a point of light. In the frequency domain, the optical transfer function (OTF) describes the response of a linear, position-invariant system to an impulse; the OTF is the Fourier transform of the PSF, and the PSF is the inverse Fourier transform of the OTF. The distortion operator, when convolved with the image, creates the distortion; distortion caused by a point spread function is just one type of distortion.

“f” is the original image.

“n” is any additive noise.

The point spread function describes the response of the system to a point source, much as an impulse response describes a linear system's response to an impulse. Convolving a PSF with an image blurs, or distorts, it. In our case we receive a blurred image with an unknown PSF and additive noise. We therefore first convolve the image with the Sobel operator to work out the edges of the blurred image, and then subtract the result from the original image to obtain the deblurred image.

There are various filters and algorithms for deblurring an image, but in most cases either the PSF or the additive noise has to be known, whereas in our case the PSF had to be worked out first. That left us with two options: calculate the PSF, or stop the conveyer and then take the image.

OPTICAL CHARACTER RECOGNITION

Our OCR system aims to read the text written on the letter and convert it into a machine-readable format. Although an OCR system may consist of various image processing techniques such as image segmentation, image classification, pattern detection and edge detection, the line of action we have followed is quite simple: we aim to detect only computer-printed text, not handwritten text. The OCR system we established performs the following two operations:

Character recognition through edge detection

Template matching with stored character templates

CHARACTER RECOGNITION THROUGH EDGE DETECTION


We used the Sobel operator and convolved it with the image to detect its edges. The operator detects the intensity variations in the image through convolution and produces a resultant image indicating only the points where the intensity varies.

The gradient at a point in the image is calculated as |G| = sqrt(Gx^2 + Gy^2), using the Sobel kernels given earlier.

The Sx kernel is used to determine the derivatives in the horizontal direction and Sy in the vertical direction. The two kernels are convolved with the 320x240-pixel image while an intensity-variation threshold is maintained: points that exceed this threshold are assigned certain values and appear in the resultant image, while points within the threshold are treated as having zero derivative and are neglected.



TEMPLATE MATCHING

The resultant image from the above process is then segmented using character segmentation techniques and finally matched against the set of characters maintained in a templates folder.

The maintained database of detectable characters
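The matching step itself reduces to scoring the segmented glyph against every stored template and keeping the best; a toy Python sketch using normalized correlation (the project's MATLAB code compares against its templates file in the same spirit):

```python
import numpy as np

def match_template(glyph, templates):
    """Return the label of the stored template most similar to the
    glyph (all images assumed resized to one common shape)."""
    best_label, best_score = None, -1.0
    g = glyph.astype(float).ravel()
    for label, tpl in templates.items():
        t = tpl.astype(float).ravel()
        # normalized correlation as the similarity score
        score = float(g @ t) / (np.linalg.norm(g) * np.linalg.norm(t) + 1e-9)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Two toy 3x3 "templates": a vertical bar and a horizontal bar
templates = {"I": np.array([[0, 1, 0]] * 3),
             "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]])}
letter = match_template(np.array([[0, 1, 0]] * 3), templates)   # "I"
```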

OUTPUT OF THE OCR SYSTEM

Running the OCR system correctly yields the following output in text format:


INTERFACING WITH THE SIMULINK

After the city is detected by the OCR, the software sends the signal for that city to the SIMULINK block in the form of a vector. For each city there exists a unique vector, as follows:

For Wah Cantt: [0 0 1]'

For Multan: [0 1 0]'

For Lahore: [1 0 0]'
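In other words, each detected city maps to a one-hot vector; a Python sketch of the mapping (illustrative only; the actual signal is produced in MATLAB/Simulink):

```python
# One-hot signal vectors sent to the Simulink block, one per city
CITY_VECTORS = {
    "WAHCANTT": [0, 0, 1],
    "MULTAN":   [0, 1, 0],
    "LAHORE":   [1, 0, 0],
}

def city_signal(detected_city):
    """Map the OCR output to its signal vector; unknown cities -> None."""
    key = detected_city.upper().replace(" ", "")
    return CITY_VECTORS.get(key)
```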

Detecting Wah Cantt:


Detecting Multan


Detecting Lahore


HARDWARE PART


DC POWER SUPPLY:

We needed a variable DC power supply for increasing and decreasing the speed of the conveyer belt, and we designed one for this purpose.

Circuitry and Explanation:

This power supply is based on the LM317 variable regulator, an adjustable three-terminal regulator used here (with a pass transistor) to supply a current of up to 5 A over a variable output voltage of 2 V to 25 V DC. It comes in handy for powering many electronic circuits when assembling or building electronic devices. The schematic and parts list are designed for a power supply input of 220 V AC.

Components used in the DC power Supply:

Transformer (to convert 220 V AC to 12 V AC)

Capacitor, 2200 microfarad

LM317

TIP142 (Darlington BJT)

Variable resistor (linear potentiometer), 5 kilohm

Resistor, 220 ohm

This power supply converts 220 V AC to 2-20 V DC, and that voltage is fed to the DC motor.
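The adjustable output follows the standard LM317 relation Vout = Vref(1 + R2/R1) + Iadj*R2, with Vref about 1.25 V and Iadj about 50 uA. With the 220-ohm fixed resistor and 5-kilohm potentiometer listed above, the formula spans roughly 1.25 V up to about 30 V; the usable maximum in this supply is of course limited by the rectified output of the 12 V transformer. A quick check in Python:

```python
def lm317_vout(r1_ohms, r2_ohms, v_ref=1.25, i_adj=50e-6):
    """Standard LM317 output-voltage relation:
    Vout = Vref * (1 + R2/R1) + Iadj * R2."""
    return v_ref * (1.0 + r2_ohms / r1_ohms) + i_adj * r2_ohms

v_min = lm317_vout(220, 0)      # potentiometer at zero: 1.25 V
v_max = lm317_vout(220, 5000)   # potentiometer at full 5 kOhm: ~29.9 V
```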


Features of our dc power supply:

* Adjustable output down to 1.2V

* Guaranteed 1.5A output current

* Line regulation typically 0.01%/V

* Load regulation typically 0.1%

* 80 dB ripple rejection

DC motor:

A brushless DC (BLDC) motor, also known as an electronically commutated motor, is a synchronous electric motor powered by direct-current (DC) electricity and having an electronic commutation system rather than a mechanical commutator and brushes. In BLDC motors, current to torque and voltage to rpm are linear relationships.

We have used a DC motor instead of a stepper motor.

Why not a stepper motor for the conveyer belt?

Disadvantages of stepper motors:

• Resonances can occur if not properly controlled.

• They are not easy to operate at extremely high speeds.

Advantages of DC motor:

Brushless DC (BLDC) motors provide performance advantages over PSC and brushed DC (BDC) motors, including the following:

• The ratio of output power to frame size is higher in BLDC motors. This reduces the size and weight of the product, and also saves motor-mounting and shipping costs.

• BLDC motors operate at higher power efficiency than induction motors and BDC motors, because they have permanent magnets on the rotor and no brushes for commutation.

• Brush inspection is eliminated, making them suitable for limited-access areas like compressors and fans. This also increases the life of the motor and reduces service requirements.

• BLDC motors generate less electromagnetic interference (EMI). With BDC motors, the brushes tend to break and make contact while the motor is rotating, emitting electromagnetic noise into the surroundings.


• BLDC motors have a relatively flat speed-torque characteristic (see figure). This enables the motor to operate at lower speeds without compromising torque when the motor is loaded.

Further advantages include:

• Better speed versus torque characteristics

• High dynamic response

• High efficiency

• Long operating life

• Noiseless operation

• Higher speed ranges

DC motors provide excellent speed control for acceleration and deceleration with effective and simple torque control. Connecting the power supply directly to the field of a DC motor allows precise voltage control, which is necessary in speed- and torque-control applications.

Comparison of both the motors:

Type          Advantages                        Disadvantages           Typical Application             Typical Drive

Stepper       Precision positioning;            Requires a controller   Positioning in printers         DC
              high holding torque                                       and floppy drives

Brushless DC  Long lifespan; low maintenance;   High initial cost;      Hard drives, CD/DVD players,    DC
              high efficiency                   requires a controller   electric vehicles

AutoCAD Layout of the Hardware:


AutoCAD is a CAD (Computer Aided Design or Computer Aided Drafting) software application for 2D and 3D design and drafting. We have made our hardware design in AutoCAD.

PLUNGER MOTORS

We could have used two types of motors for plunging the letters into the destination bins:

Free-spinning electric motors

Stepper motors

Free-spinning electric motors - A free-spinning electric motor uses precisely timed opposing magnetic fields to cause an armature shaft to rotate. Free-spinning electric motors can be designed to run on AC or DC current, with brushes or brushless, depending on their application.

Stepper motors - The armature of a stepper motor can be rotated an exact number of turns or just a fraction of a turn. Stepper motors are controlled by a computer to position a mechanical device in an exact location. A typical stepper motor can be positioned to 256 different positions.

We are using a window power motor, which is a free-spinning electric motor. It requires a minimum current of 3-4 A for proper functionality. The motor rotates through 360 degrees and accurately plunges the letter on the conveyer belt into the destination bins using plungers mounted on it. We are using three such motors, as we are sorting letters into three cities.


Window motor used to plunge the letters

PLUNGER MOTOR DRIVER CIRCUITRY

To drive these high-current motors we used a battery supplying them with 3-4 A. We designed this circuit ourselves; it worked fine and employed a heavy battery.


Circuit made in Multisim

CIRCUIT COMPONENTS DESCRIPTION:

Optocouplers

Optocouplers are used to detect the 5 V signal at the output of the SIMULINK card. Optocouplers are great for tinkering: they enable you to control one circuit from another when there is no electrical connection between the two.

Op Amp


Op-amps are used to convert the 5 V signal into a 15 V signal. Because of the high impedance of the op-amp's input stage, combined with the "bootstrapping" effect caused by negative feedback, the input impedance of an op-amp is infinite for all practical purposes.

Relays

Relays serve as current switches. They are electromagnetically activated switches: there is literally an electromagnet inside the relay, and energizing that electromagnet changes the switch position by pulling the movable parts of the switch mechanism to a different position. To the greatest extent possible, the electromagnet is made to be electrically isolated from the signal path.

Battery

The motors draw current from the battery placed at the corner of the circuit board; the battery is capable of supplying the required 3-4 A.

SIMULINK CARD and SIMULINK MODEL

We are using an NI 6052E SIMULINK card. It takes the data from the OCR code and produces a five-volt signal with minimal current. The output from this card is fed into the current-amplifier circuitry discussed above. The SIMULINK model is shown below:


The SIMULINK card we are using is the National Instruments PCI-6052E:


Used SIMULINK card in our project

Our SIMULINK card yields three outputs, each with its own ground. The key specifications of the card we are using are given below:

Functionally equivalent to National Instruments' PCI-6052E

16 single-ended/8 differential 16-bit analog inputs

333 kS/s maximum sample rate

Two 16-bit analog outputs

Eight digital I/O lines and 2 counter/timers

Triggering of measurement and control via both analog and digital signals

Easy synchronization of multiple measurement boards

The card's output board generates 5 V for each signal sent to it as a city is detected by MATLAB. From there the signal passes through the current amplifier and drives the plunger motor.


CHAPTER # 4

DISCUSSION AND

CONCLUSION

This chapter discusses the merits and demerits of the Automatic Mail Sorting Machine. It captures the essence of the project and the techniques employed to complete it. The project is also compared with others of its kind in a more subjective way.

PRESENT STATUS OF THE PROJECT:

The goal of the project was to develop an automated mail sorting machine that would sort letters at high speed, at the very least faster than manual letter sorting. Due to the non-availability of resources and our limited financial capabilities, however, we could only build a machine that operates at about the same speed as manual letter sorting, and the project is only 70-80% operational. Although the image processing part has been achieved quite accurately and successfully in MATLAB, the interfacing presented some major loopholes in automation.

Due to the undetectable motion blur that accompanies the image acquired by the camera as the letter moves on the conveyer belt, image processing on a moving image could not be achieved, although we successfully implemented various image processing algorithms on still images.

UPS AND DOWNS DURING THE PROJECT:

Motion blur represented a major loophole in automation. The blur could have been reduced somewhat with a high-resolution camera, but our financial limitations ruled out buying such state-of-the-art hardware. The other way to reduce the blur was in software, using additional image processing techniques, and this is the route we adopted. However, every motion-deblurring technique requires either the PSF or the noise function to be known, and in our case both were simply unknown. Calculating the PSF is a tedious task, but we proceeded nevertheless and achieved roughly 30 percent success in this area.

The very first task was to determine the address block in the acquired image; the code we wrote for this purpose was named Address Block Location (ABL). This was followed by the task of locating the city name within the worked-out address block, another momentous task that took weeks to finish. The ABL code is followed by the OCR code, and it took yet more weeks to interface the two codes accurately. By now both codes work just fine and take at most 3 seconds to produce the output.

The plunger motor drive presented yet another problem. After a great deal of experimentation and thinking we were left with only one option: using window power motors. We designed their driving circuitry, as they require almost 3-4 A of current for proper functionality. With the output from SIMULINK they operate accurately, which is another achievement.

Throughout our experimentation on the AMSM we ended up with various achievements and got stuck in certain areas as well. We successfully located the address block in the acquired image and extracted the city name from the address. We then ran optical character recognition on the extracted city name and sorted the letters as required.

Image processing on a moving image could not be fully achieved, although we went a long way in this domain.

Overall we believe we have almost achieved the goal we set for ourselves at the beginning of the project.


APPENDIX

MATLAB CODE:

% OCR (Optical Character Recognition).

% Automatic Mail Sorting Machine (AMSM)

% This code is intended to receive the frames produced by the camera,
% perform optical character recognition and generate the corresponding
% signal for Simulink to drive the plunger mechanism.

% PRINCIPAL PROGRAM

%////////////////////////////////////////////////////////////////////

warning off %#ok<WNOFF>

% Clearing command window

clc

% Closing all the opened figures

close all

%//////////////////////////////////////////////////////////////////////

% Read image
% camera interfacing

vid1 = videoinput('winvideo',1,'YUY2_320x240') % for 320x240 video

set(vid1,'Returnedcolorspace','RGB')% for RGB image

preview(vid1)

snap=getsnapshot(vid1);

%imshow(snap)

imwrite(snap,'lahore.jpg')

imshow(snap)

%/////////////////////////////////////////////////////////////////////////

%IMAGE FILTERING...SMOOTHING.....

f=imread('wah.jpg');


f=rgb2gray(f);

[M,N]=size(f);

F=fft2(double(f));

u=0:(M-1);

v=0:(N-1);

idx=find(u>M/2);

u(idx)=u(idx)-M;

idy=find(v>N/2);

v(idy)=v(idy)-N;

[V,U]=meshgrid(v,u);

D=sqrt(U.^2+V.^2);

%H=double(D<=P);

G=1.*F;

g=real(ifft2(double(G)));

%/////////////////////////////////////////////////////////////////////////

%imshow(f),figure,

%a=imshow(g,[0 255])

%figure;

%READING THE IMAGE.........................

imagen=imread('wah.jpg');

% Show image

imshow(imagen);

title('IMAGE TRANSFORMED INTO MATLAB')

%////////////////////////////////////////////////////////////////////////

% Converting the RGB image to gray scale to reduce processing

if size(imagen,3)==3 %RGB image

imagen=rgb2gray(imagen);

end

%////////////////////////////////////////////////////////////////////////

% Convert to BW

% First we determine the level of threshold


threshold = graythresh(imagen);

% Now we convert the image to binary format

imagen =~im2bw(imagen,threshold);

%/////////////////////////////////////////////////////////////////////////

% Address Block Location Software

% Clear variables and functions from memory.

clear all

clc

%putting two variables=0 for the further use as the counters

k=0;

u=0;

e=0;

f=0;

% Now reading a grayscale or color image from the file specified by the string FILENAME

A=imread('addd.jpg');

% RGB2GRAY converts RGB images to grayscale by eliminating the hue and
% saturation information while retaining the luminance

A=rgb2gray(A);

% %computing a global threshold (LEVEL) that can be used to convert an

% %intensity image to a binary image with IM2BW.

threshold = graythresh(A);

% %image converted to binary and negated

A=~im2bw(A,threshold);

% X=[0 -.25 0;-.25 1 -.25;0 -.25 0]

% C = convn(A,X)

% C=~C;

% calculating the column-wise sum, giving a row vector

B=sum(A)

% loop for the start of the address block
for m=1:640
    % check for columns containing white pixels
    if (B(m)>1)
        e=e+1
        if(e>25)
            C1=m-e
            % start of the address block identified
            break
        end
    end
end

% loop for the end of the address block horizontally
for i=C1:640
    if(B(i)<1)
        k=k+1
        if(k>42)
            C2=i
            % end of the address block identified
            break
        end
    end
end

% now calculating the row-wise sum and the start of the address block vertically
Z=sum(A,2)
for l=35:480
    if(Z(l)>1)
        f=f+1
        if(f>15)
            r1=l-f
            % start of the address block identified vertically
            break
        end
    end
end

for s=r1:480
    if(Z(s)<1)
        u=u+1
        if(u>20)
            r2=s
            % end of the address block identified vertically
            break
        end
    end
end

r1=r1+55

r2=r2-1

C1=C1-1

C2=C2-1

% Image negated converted to normal form

A=~A;

% RECT is a 4-element vector with the form [XMIN YMIN WIDTH HEIGHT];

% these values are specified in spatial coordinates and are provided below

% after calculation above.

D = imcrop(A,[C1 r1 C2-C1 r2-r1]);

%showing the original image and adress block

imshow(A)

figure, imshow(D)

% Remove all object containing fewer than 30 pixels

imagen = bwareaopen(imagen,30);

%Storage matrix word from image

word=[ ];

re=imagen;


%Opens text.txt as file for write

fid = fopen('text.txt', 'wt');

% Load templates

load templates

global templates

% Compute the number of letters in template file

num_letras=size(templates,2);

while 1

%Fcn 'lines' separate lines in text

[fl re]=lines(re);

imgn=fl;

%Uncomment line below to see lines one by one

%imshow(fl);pause(0.5)

%-----------------------------------------------------------------

% Label and count connected components

[L Ne] = bwlabel(imgn);

for n=1:Ne

[r,c] = find(L==n);

% Extract letter

n1=imgn(min(r):max(r),min(c):max(c));

% Resize letter (same size of template)

img_r=imresize(n1,[42 24]);

%Uncomment line below to see letters one by one

%imshow(img_r);pause(0.5)

%-------------------------------------------------------------------

% Call fcn to convert image to text

letter=read_letter(img_r,num_letras);

% Letter concatenation

word=[word letter];

end

%fprintf(fid,'%s\n',lower(word));%Write 'word' in text file (lower)


fprintf(fid,'%s\n',word);%Write 'word' in text file (upper)

% Clear 'word' variable

word=[ ];

%*When the sentences finish, breaks the loop

if isempty(re) %See variable 're' in Fcn 'lines'

break

end

end
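The helper functions lines and read_letter called in the loop above are not reproduced in this listing. A minimal sketch of how such helpers could look is given below (as separate function files); the character ordering in chars and the correlation-based matching are assumptions for illustration, not the exact helpers used in the project:

```matlab
% lines.m -- split off the top text line of a binary image.
function [fl, re] = lines(im)
im = clip(im);
s = sum(im, 2);                    % row sums: 0 where a row is blank
e = find(s == 0, 1, 'first');      % first blank row below the text line
if isempty(e)
    fl = im; re = [];              % last line: nothing remains
else
    fl = clip(im(1:e-1, :));       % current line
    re = clip(im(e:end, :));       % remainder for the next iteration
end

function im = clip(im)
% Trim blank border rows/columns around the foreground pixels
[r, c] = find(im);
im = im(min(r):max(r), min(c):max(c));
```

```matlab
% read_letter.m -- match a resized 42x24 glyph against the templates
% by 2-D correlation and return the best-matching character.
function letter = read_letter(img_r, num_letras)
global templates
chars = ['A':'Z' '0':'9'];         % assumed template ordering
co = zeros(1, num_letras);
for k = 1:num_letras
    co(k) = corr2(templates{1, k}, img_r);
end
[~, pos] = max(co);
letter = chars(pos);
```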

%/////////////////////////////////////////////////////////////////////////

fclose(fid);

%Open 'text.txt' file

winopen('text.txt')

save text

load ('text')

fid = fopen('text.txt')

C = textscan(fid, '%s',2)

city=C{:}

sami=cell2mat(city);

fclose(fid);

city1= 'LAHORE'

city2='MULTAN'

city3='WAHCANTT'

% Compare the first recognized word against each city name (first 3 chars).
% Note: the original listing had garbled brackets (*1 0 0+') and used the
% misspelled variable 'matcah4lahore' in all three tests; fixed below.

match4lahore = strncmp(city{1},city1,3)

if match4lahore==1

simulink=[1 0 0]';

end

match4multan = strncmp(city{1},city2,3)

if match4multan==1

simulink=[0 1 0]';

end

match4wahcantt = strncmp(city{1},city3,3)

if match4wahcantt==1

simulink=[0 0 1]';

end

REFERENCES

BOOKS:

Digital Image Processing (2nd Edition) by Rafael C. Gonzalez and Richard E. Woods

Practical Algorithms for Image Analysis by Lawrence O'Gorman and Michael Seul

The Image Processing Handbook (Fifth Edition) by John C. Russ

Machine Vision (Third Edition): Theory, Algorithms, Practicalities (Signal Processing and its Applications) by E. R. Davies

RESEARCH PAPERS:

John Canny, "A computational approach to edge detection." IEEE Transactions on PAMI, 8(6):679–698, 1986.

James Elder and Richard Goldberg, "Image editing in the contour domain." IEEE Transactions on PAMI, 23(3):291–296, 2001.

Scott Konishi, Alan Yuille, James Coughlin, and Song Chun Zhu, "Statistical edge detection: Learning and evaluating edge cues." IEEE Transactions on PAMI, 25(1):57–74, 2003.

William Freeman and Edward Adelson, "The design and use of steerable filters." IEEE Transactions on PAMI, 13:891–906, 1991.

David Martin, Charless Fowlkes, and Jitendra Malik, "Learning to detect natural image boundaries using local brightness, color, and texture cues." IEEE Transactions on PAMI, 26(5):530–549, 2004.

INTERNET RESOURCES:

www.wikipedia.org

www.owlnet.rice.edu

www.pages.drexel.edu

homepages.inf.ed.ac.uk/rbf/HIPR2/sobel.htm

www.trcelectronics.com

powerelectronics.com

www.electronics-lab.com/projects

OTHER SOURCES:

PowerPoint presentation by Giorgos Vamvakas titled "Optical Character Recognition for Handwritten Characters", National Center for Scientific Research "Demokritos", Athens, Greece.

"Introduction to Optical Character Recognition (OCR)", Workshop on international standards, contemporary technologies and regional cooperation, Noumea, New Caledonia, 4–8 February 2008.

Castleman, K.R., Digital Image Processing, Prentice Hall, 1995.
