The Final Report

1. INTRODUCTION

1.1 ABSTRACT

Large documents often cannot be captured by a given camera in a single shot because of the camera's inherent limitations. The document is then captured as a set of split component images, which must be mosaiced back together to recover the original document image. The proposed system is a simple approach for mosaicing two split images of a large document based on SURF feature matching.

The aim of image mosaicing is to stitch images that share an overlapping area into a single higher-resolution or wide-angle image. Images at different scales can also be stitched this way.

Fig 1.1.1 Mosaiced image of Mill City Museum and surrounding area

1.2 INTRODUCTION AND MOTIVATION

Image mosaicing is a technique for creating vast scene images and panoramic images. It is becoming increasingly popular in image processing, computer graphics, computer vision and multimedia. It is well known that the human visual system has a wide field of view of around 135 × 50 degrees, while a typical camera has a field of view of only around 35 × 50 degrees, so a single picture taken by the camera is often not satisfying. Panoramic image mosaics address this problem and give a wider view of the environment: a series of pictures is taken around one place and then stitched together to form a composite image. The resulting panoramic image lets the viewer see up to 360 degrees of the environment. Such images can also be converted into environment maps.

Image mosaicing not only allows a large field of view to be created with a normal camera; the resulting image can also be used for texture mapping of a 3D environment, so that users can view the surrounding scene with real images. It has wide application value in cultural heritage protection and virtual reality [4].

1.2.1 Panoramic images

A panorama is a picture or series of pictures representing a continuous scene, often exhibited a part at a time by being unrolled and passed before the spectator. Panoramic images show a field of view approximately equal to or greater than that of the human eye, about 160 by 75 degrees. The resulting images take the form of a wide strip, which generally means that the image is at least twice as wide as it is high.

Fig 1.2.1.1 Shows a panorama of Sydney featuring (from left) the Sydney Opera

House, the central business district skyline, and the Sydney Harbour Bridge [3].

1.3 PROBLEM STATEMENT

This project aims to build an efficient method for creating image mosaics, mainly for cultural heritage protection. The method is also applicable to stitching pictures taken with a hand-held camera in daily life, to display whole scenes vividly, and to constructing virtual environments such as virtual travel on the internet or virtual environments in games. The user takes images at different angles with some overlapping area using a hand-held camera, then loads the images into the program to create the mosaic.

The project’s goal is to create an application that will mosaic two images together to

create one larger image. Image Mosaicing has wide uses in photo applications and has

become a required toolset for many photographers. These mosaiced images become

panoramic views which increase the visual aesthetics of a scene, and are widely

sought out for posters, postcards and other printed materials.

1.4 SCOPE OF THE PROJECT

Within the achievable scope, we propose a system that takes as input, through a GUI, two overlapping images of cultural heritage subjects. The images are captured from a fixed horizontal distance and vertical height and may have different intensities. The algorithm uses the SURF operator, which is robust and performs well, to extract features. The extracted features are matched by a fast bidirectional matching scheme. A RANSAC algorithm is then applied to eliminate outliers and obtain the transformation matrix between the images. Finally, the images are stitched by a multi-band blending algorithm.

Note, however, that pictures taken at different angles are not considered in this project.

Fig. 1.4.1 Fig.1.4.2

Fig 1.4.3

1.5 ORGANIZATION OF PROJECT REPORT

The project report is spread across the following sections:

Introduction to the overall understanding of need for Image Mosaicing.

Description of project in specific, along with its scope.

Listing the basic necessities and requirements for the project, in terms of

Hardware, Software and other constraints.

Overview of the existing algorithms and methods, along with the detailed

explanation of the algorithm used in the project.

Implementation details and various results obtained.

Various test cases during the course of the project.

2. REVIEW OF LITERATURE

2.1 DOMAIN SCOPE

Image mosaicing is in widespread use today, and its applications include the following.

1. PHOTO MOSAIC:

In photographic imaging, a photographic mosaic is a picture that has been divided into (usually equal-sized) sections, each of which is replaced with another photograph of appropriate average color. When viewed at low magnification, the individual pixels appear as the primary image, while close examination reveals that the image is in fact made up of many smaller images. Originally, the term photo mosaic referred to compound photographs created by stitching together a series of adjacent pictures of a scene.

2. PANORAMIC PHOTOGRAPHY:

Capturing a panoramic view in a single camera shot is often not feasible because of the large field of view involved. It is therefore useful to take multiple overlapping images of the panorama and stitch them into a single panoramic view, based on matching pixels and intensities. Panoramic photography thus helps in the protection of cultural heritage by recording a panoramic view of a site.

3. WALKTHROUGHS:

An architectural walkthrough uses computer software to provide a virtual tour of a building or structure prior to its real-life construction. The walkthrough is an important communication tool for demonstrating how the building will be seen by its pedestrian users; the software does not confine the viewer to this viewpoint and allows the building to be seen from all angles. A 3D view of a building can be created using image mosaicing after taking multiple pictures from different angles [4].

2.2 EXISTING SYSTEMS

In the various attempts to produce panoramic views using image mosaicing, there

have been several algorithms over the years, each having a unique technique.

However, each of the techniques deals with a common orderly procedure to mosaic

images. These include the following:

1. PHASE CORRELATION METHOD:

This method requires a large proportion of overlap, usually an overlap ratio over 50%. If the proportion of overlap is small, the rate of erroneous matches is high.

2. CORNER MATCHING METHODS:

Corner detectors such as Harris and SUSAN are not scale-invariant. For images at different scales, these methods cannot establish correspondences between feature points and thus fail to stitch.

3. SIFT FEATURE MATCHING:

SIFT is based on scale space and is stable in most situations except under rotation and illumination changes. Image mosaicing based on SIFT feature matching is robust, and images at different scales can be stitched with it. However, computing the SIFT operator is very time-consuming, which makes the mosaicing inefficient. SURF was later shown to have performance similar to SIFT while being much faster.

Fig. 2.2.1 Computing performance comparison for SIFT and SURF

The above figure shows that the SURF operator has better computing performance

than SIFT operator [1].

2.3 HARDWARE AND SOFTWARE REQUIREMENTS

HARDWARE REQUIREMENTS-

The minimum system hardware requirements for MATLAB are:

Processors: Pentium III, IV, Xeon, Pentium M

Memory: 256 MB (512 MB or more highly recommended)

At least 200 MB of free disk space for full installation.

5 TO 10 megapixel camera

1GB RAM, processing speed 2GHz

SOFTWARE REQUIREMENTS-

Windows XP/Vista/7

MATLAB 7.5 onwards

3. ANALYSIS AND DESIGN

3.1 FUNCTIONAL REQUIREMENTS

The structure of the image mosaicing system has to be designed so that it offers efficiency along with ease of use. Considerable effort is needed to improve the fault tolerance of the system with respect to possible hardware and software errors. The matching system should perform well enough to provide results to the user in reasonable time.

Fig 3.1.1 Use Case

3.2 NON-FUNCTIONAL REQUIREMENTS

Efficiency: The software should efficiently utilize scarce resources: CPU

cycles, disk space, memory, bandwidth etc.

Flexibility: If the organization intends to increase or extend the functionality

of the software after it is deployed, that should be planned from the beginning;

it influences choices made during the design, development, testing and

deployment of the system.

Integrity: Integrity requirements define the security attributes of the system,

restricting access to features or data to certain users and protecting the privacy

of data entered into the software.

Scalability: The system should be scalable and work well with software

upgrades.

3.3 PROPOSED SYSTEM

The proposed system mosaics two split cultural heritage images of a large document based on SURF feature matching. It proceeds in four steps:

I. Extract SURF features.

In image mosaicing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be redundant (much data, but not much information), it is transformed into a reduced representation, a set of features. This transformation is called feature extraction. If the features are carefully chosen, the feature set captures the relevant information from the input data, so that the desired task can be performed on the reduced representation instead of the full-size input. In this step, the features of interest are detected and described in two stages:

A. Fast-Hessian Detector

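The SURF detector is based on the determinant of the Hessian matrix. Given a point X = (x, y) in an image I, the Hessian matrix H(X, σ) in X at scale σ is defined as follows:

$$H(X, \sigma) = \begin{bmatrix} L_{xx}(X, \sigma) & L_{xy}(X, \sigma) \\ L_{xy}(X, \sigma) & L_{yy}(X, \sigma) \end{bmatrix}$$

where $L_{xx}(X, \sigma)$ is the convolution of the Gaussian second-order derivative $\frac{\partial^2}{\partial x^2} g(\sigma)$ with the image I in point X, and similarly for $L_{xy}(X, \sigma)$ and $L_{yy}(X, \sigma)$ [1].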
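SURF approximates the second-order derivatives in this matrix with box filters evaluated over an integral image. As a minimal sketch of why a box sum costs the same whatever the box size, the Python fragment below builds an integral image and reads any rectangular sum with four lookups. The names `integral_image` and `box_sum` and the small test image are illustrative assumptions, not part of the project's MATLAB code.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above and to the left of (y, x), inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 in four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x4 image used only for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
table = integral_image(image)
total = box_sum(table, 0, 0, 3, 4)   # whole image
centre = box_sum(table, 1, 1, 3, 3)  # rows 1..2, columns 1..2
```

Because every box sum needs the same four lookups, larger filters can be applied to the original image at no extra cost, which is what allows the filter to be scaled up instead of the image being scaled down.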

The scale space is constructed using box filters instead of the Gaussian filters used in SIFT. These box filters can be evaluated very fast using integral images, independently of their size. The scale space is therefore analyzed by up-scaling the filter size rather than iteratively reducing the image size, which greatly reduces the computing time. To localise interest points in the image and over scales, non-maximal suppression in a 3×3×3 neighbourhood is applied. The nearby data are then interpolated to find the location in both space and scale to sub-pixel accuracy. To do this, the determinant of the Hessian function, H(x, y, σ), is expressed as a Taylor expansion up to quadratic terms centered at the detected location [1]:

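$$H(\mathbf{x}) = H + \frac{\partial H}{\partial \mathbf{x}}^{T} \mathbf{x} + \frac{1}{2} \mathbf{x}^{T} \frac{\partial^2 H}{\partial \mathbf{x}^2} \mathbf{x}$$

The interpolated location of the extremum, $\hat{\mathbf{x}} = (x, y, \sigma)$, is found by taking the derivative of this function and setting it to zero, such that:

$$\hat{\mathbf{x}} = -\left(\frac{\partial^2 H}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial H}{\partial \mathbf{x}}$$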
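The interpolation step can be sketched in a few lines of Python. The fragment below is an illustration under stated assumptions: the derivatives are estimated by finite differences over the 3×3×3 block of responses around the detected extremum, and the helper names `solve3` and `subpixel_offset`, as well as the quadratic test response, are hypothetical.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (m[i][3] - s) / m[i][i]
    return x


def subpixel_offset(resp):
    """resp[s][y][x] is the 3x3x3 block of responses centred on the extremum.

    Returns the offset (dx, dy, ds) = -(d2H/dx2)^-1 (dH/dx).
    """
    c = resp[1][1][1]
    # First derivatives by central differences.
    grad = [(resp[1][1][2] - resp[1][1][0]) / 2.0,
            (resp[1][2][1] - resp[1][0][1]) / 2.0,
            (resp[2][1][1] - resp[0][1][1]) / 2.0]
    # Second derivatives by finite differences.
    dxx = resp[1][1][2] - 2 * c + resp[1][1][0]
    dyy = resp[1][2][1] - 2 * c + resp[1][0][1]
    dss = resp[2][1][1] - 2 * c + resp[0][1][1]
    dxy = (resp[1][2][2] - resp[1][2][0] - resp[1][0][2] + resp[1][0][0]) / 4.0
    dxs = (resp[2][1][2] - resp[2][1][0] - resp[0][1][2] + resp[0][1][0]) / 4.0
    dys = (resp[2][2][1] - resp[2][0][1] - resp[0][2][1] + resp[0][0][1]) / 4.0
    hess = [[dxx, dxy, dxs], [dxy, dyy, dys], [dxs, dys, dss]]
    return solve3(hess, [-g for g in grad])


def response(x, y, s):
    # Hypothetical quadratic response whose peak lies at (0.2, -0.1, 0.3).
    u, v, w = x - 0.2, y + 0.1, s - 0.3
    return 5.0 - (u * u + 2 * v * v + w * w) + 0.5 * u * v


block = [[[response(x, y, s) for x in (-1, 0, 1)]
          for y in (-1, 0, 1)] for s in (-1, 0, 1)]
offset = subpixel_offset(block)
```

For a response that is exactly quadratic, as in this test, the finite differences are exact and the offset lands on the true peak; on real image data the result is only an approximation.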

B. SURF Descriptor

An orientation is first assigned based on a circular region around the selected interest point. Haar-wavelet responses are calculated for all points within a radius of 6s of the interest point, where s is the scale at which the point was detected. A sector covering an angle of π/3 around the origin is then rotated around the circle. At each position, the x and y responses within the sector are summed to form a vector. The longest such vector is selected as the dominant vector, and its orientation is assigned to the interest point. A square region centered on the interest point and oriented along the selected orientation is then constructed. The region is split into 4×4 smaller square sub-regions. Within each sub-region, Haar wavelets of size 2s are calculated at 25 uniformly distributed sample points, and the responses dx and dy are collected as:

V(sub region) = [Σdx, Σdy, Σ|dx|, Σ|dy|]

Each sub-region therefore contributes four values to the descriptor vector, giving an overall vector of length 4×4×4 = 64, as Fig 3.5.3 shows. The resulting SURF descriptor is invariant to rotation, scale and brightness and, after reduction to unit length, to contrast.
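The composition of the descriptor can be illustrated with a short Python sketch. It assumes the Haar responses have already been computed and rotated to the dominant orientation, and that they arrive as two 20×20 grids (4×4 sub-regions of 5×5 = 25 samples each). The Gaussian weighting used by the full SURF descriptor is omitted, and the function name `surf_descriptor` and the synthetic responses are hypothetical.

```python
import math


def surf_descriptor(dx, dy):
    """Build a 64-value descriptor from 20x20 grids of Haar responses.

    The grid is read as 4x4 sub-regions of 5x5 = 25 samples each, and each
    sub-region contributes [sum dx, sum dy, sum |dx|, sum |dy|].
    """
    vec = []
    for by in range(4):
        for bx in range(4):
            s_dx = s_dy = s_adx = s_ady = 0.0
            for y in range(5 * by, 5 * by + 5):
                for x in range(5 * bx, 5 * bx + 5):
                    s_dx += dx[y][x]
                    s_dy += dy[y][x]
                    s_adx += abs(dx[y][x])
                    s_ady += abs(dy[y][x])
            vec.extend([s_dx, s_dy, s_adx, s_ady])
    # Reduction to unit length makes the vector invariant to contrast.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


# Hypothetical smooth responses standing in for real Haar-wavelet output.
dx = [[math.sin(0.3 * x + 0.1 * y) for x in range(20)] for y in range(20)]
dy = [[math.cos(0.2 * x - 0.4 * y) for x in range(20)] for y in range(20)]
descriptor = surf_descriptor(dx, dy)

# Tripling every response models a contrast change; the descriptor is unchanged.
higher_contrast = surf_descriptor([[3 * v for v in row] for row in dx],
                                  [[3 * v for v in row] for row in dy])
```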

Fig 3.3.1 Detected results

II. RANSAC

RANSAC is an abbreviation for "RANdom SAmple Consensus".

It is an iterative method to estimate parameters of a mathematical model from a set of

observed data which contains outliers. It is a non-deterministic algorithm in the sense

that it produces a reasonable result only with a certain probability, with this

probability increasing as more iterations are allowed.

A basic assumption is that the data consists of "inliers", i.e., data whose distribution

can be explained by some set of model parameters, and "outliers" which are data that

do not fit the model. In addition to this, the data can be subject to noise. The outliers

can come, e.g., from extreme values of the noise or from erroneous measurements or

incorrect hypotheses about the interpretation of data. RANSAC also assumes that,

given a (usually small) set of inliers, there exists a procedure which can estimate the

parameters of a model that optimally explains or fits this data.

The input to the RANSAC algorithm is a set of observed data values, a parameterized

model which can explain or be fitted to the observations, and some confidence

parameters.

RANSAC achieves its goal by iteratively selecting a random subset of the original

data. These data are hypothetical inliers and this hypothesis is then tested as follows:

A model is fitted to the hypothetical inliers, i.e. all free parameters of the model are

reconstructed from the data set. All other data are then tested against the fitted model

and, if a point fits well to the estimated model, also considered as a hypothetical

inlier. The estimated model is reasonably good if sufficiently many points have been

classified as hypothetical inliers. The model is re-estimated from all hypothetical

inliers, because it has only been estimated from the initial set of hypothetical inliers.

Finally, the model is evaluated by estimating the error of the inliers relative to the

model. This procedure is repeated a fixed number of times, each time producing either

a model which is rejected because too few points are classified as inliers or a refined

model together with a corresponding error measure. In the latter case, we keep the

refined model if its error is lower than the last saved model.
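The procedure above can be sketched for the simplest case, fitting a line y = a·x + b to points that contain outliers. This is an illustrative Python sketch, not the project's implementation: the project estimates a transformation matrix between images, whereas the model, the threshold values, the function names and the synthetic data here are all assumptions chosen for brevity.

```python
import random


def fit_line(points):
    """Least-squares fit of y = a*x + b; needs at least two distinct x values."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b


def ransac_line(points, iterations=200, threshold=0.5, min_inliers=8, seed=0):
    rng = random.Random(seed)
    best_model, best_error = None, float("inf")
    for _ in range(iterations):
        # 1. Random minimal subset: the hypothetical inliers.
        sample = rng.sample(points, 2)
        if sample[0][0] == sample[1][0]:
            continue  # a vertical pair cannot be written as y = a*x + b
        a, b = fit_line(sample)
        # 2. Test all data against the fitted model.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= threshold]
        # 3. Reject the model if too few points agree with it.
        if len(inliers) < min_inliers or len({x for x, _ in inliers}) < 2:
            continue
        # 4. Re-estimate from all hypothetical inliers and measure the error.
        a, b = fit_line(inliers)
        error = sum((y - (a * x + b)) ** 2 for x, y in inliers) / len(inliers)
        # 5. Keep the refined model if it beats the last saved one.
        if error < best_error:
            best_model, best_error = (a, b), error
    return best_model, best_error


# Hypothetical data: 20 points on y = 2x + 1 plus six gross outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
data += [(3.0, 40.0), (7.0, -25.0), (12.0, 60.0),
         (15.0, -10.0), (18.0, 90.0), (1.0, 30.0)]
model, error = ransac_line(data)
```

The fixed seed only makes the run repeatable; with a different seed the same line is recovered as long as at least one iteration draws two inliers.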

Advantages:

An advantage of RANSAC is its ability to do robust estimation of the model parameters, i.e., it can estimate the parameters with a high degree of accuracy even when a significant number of outliers is present in the data set.

Disadvantages:

A disadvantage of RANSAC is that there is no upper bound on the time it takes to compute these parameters. When an upper time bound is imposed (a maximum number of iterations), the solution obtained may not be the optimal one.

Another disadvantage of RANSAC is that it requires problem-specific thresholds to be set. RANSAC can also estimate only one model for a particular data set. As with any one-model approach, when two (or more) models exist, RANSAC may fail to find either one.
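The trade-off between the iteration limit and the chance of success can be quantified with the standard RANSAC estimate k = log(1 − p) / log(1 − wⁿ), where p is the desired probability of drawing at least one all-inlier sample, w is the inlier ratio and n is the sample size. The Python sketch below evaluates it; the function name is hypothetical, and the choice of n = 4 (the number of point pairs needed for a homography) is an assumption, since the report does not state which transformation model is fitted.

```python
import math


def ransac_iterations(p, w, n):
    """Iterations needed to draw at least one all-inlier sample with probability p.

    p: desired success probability (0 < p < 1)
    w: fraction of the data that are inliers (0 < w <= 1)
    n: number of points drawn per sample
    """
    if w >= 1.0:
        return 1  # every sample is outlier-free
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** n))


# Inlier ratio of about 85%, as in the first row of the analysis table.
needed = ransac_iterations(0.99, 0.85, 4)
# Line fitting (n = 2) with half the data being outliers.
needed_line = ransac_iterations(0.99, 0.5, 2)
```

A lower inlier ratio or a larger sample size raises the required number of iterations sharply, which is why a fixed iteration cap may stop before a good model is found.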

Fig 3.3.2 A data set with many outliers for which a line has to be fitted.

http://en.wikipedia.org/wiki/Image:Line_with_outliers.svg

Fig 3.3.3 Fitted line with RANSAC; outliers have no influence on the result.

http://en.wikipedia.org/wiki/Image:Fitted_line.svg

Fig 3.3.4 Filtered & Purified Matched Points

III. Matching SURF features

Features are extracted from the images to be mosaiced, a subset of these features is selected, and a matching algorithm is applied. The transformation matrix between the images is then formed and solved.

IV. Image Fusion

Images may differ in brightness and scale within the overlapping region, so a good blending strategy is needed to fuse them. The idea behind multi-band blending is to blend low frequencies over a large spatial range and high frequencies over a short range.
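The idea can be illustrated in one dimension with only two bands. The Python sketch below is a deliberate simplification and not the multi-band algorithm used in the project, which works on images and typically uses several frequency bands: here each signal is split into a blurred low band and a residual high band, the low bands are mixed with a wide, smooth mask, and the high bands are switched with a sharp mask at the seam. All names and parameter values are illustrative assumptions.

```python
def box_blur(signal, radius):
    """Moving average with clamped borders."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def two_band_blend(a, b, seam, radius=4):
    """Blend signals a and b, taking a left of the seam and b right of it."""
    n = len(a)
    hard = [1.0 if i < seam else 0.0 for i in range(n)]  # short-range mask
    soft = box_blur(hard, radius)                         # long-range mask
    low_a, low_b = box_blur(a, radius), box_blur(b, radius)
    out = []
    for i in range(n):
        high_a, high_b = a[i] - low_a[i], b[i] - low_b[i]
        low = soft[i] * low_a[i] + (1.0 - soft[i]) * low_b[i]  # wide transition
        high = hard[i] * high_a + (1.0 - hard[i]) * high_b      # sharp transition
        out.append(low + high)
    return out


# Hypothetical scan lines with different brightness on each side of the seam.
left = [10.0] * 20
right = [20.0] * 20
blended = two_band_blend(left, right, seam=10)
```

Far from the seam the output equals the corresponding input, while the brightness difference is spread over the width of the soft mask instead of appearing as a visible step.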

3.4 DESIGN CONSIDERATIONS

AREA OF APPLICATION

In our implementation, we apply image mosaicing to cultural heritage protection. Images of a cultural monument are taken at different instants of time, and consecutive images with an overlapping portion are mosaiced together to give a full and clear view of the monument. This makes it easier for tourists to get a better look at the monument and study it well.

PARAMETERS

The images to be stitched should have the same vertical height. The two images should overlap such that the rightmost part of one image overlaps the leftmost part of the other. The objects in the overlapping region of the two images should also be at the same angle with respect to the camera. The images can be at different scales. Both images should be in the same file format.

STITCHING METHOD

To stitch the two images, the overlapping region between the two images is

used to obtain the position where there is a seamless stitch. Feature points are

extracted from the two images. The extracted feature points are then matched

correctly. RANSAC algorithm is then applied to eliminate outliers to ensure

the effectiveness of matching. The two images are then blended accurately to

obtain a mosaiced image [4].

3.5 DESIGN DETAILS

The system is implemented in MATLAB. The front end consists of a window that prompts the user to supply the two cultural heritage images as input. Clicking an output button displays the mosaiced image if the two input images meet the design considerations. If they do not, a message box is displayed showing ‘Image Mosaicing Not Possible’. The flowchart for the method is:

INPUT IMAGES

FEATURE POINT EXTRACTION USING FAST HESSIAN DETECTOR

SURF DESCRIPTOR

MATCHING SURF FEATURES

ELIMINATE ERROR MATCHINGS USING RANSAC ALGORITHM

IMAGE FUSION

OUTPUT IMAGE MOSAIC

Fig 3.5.1 Project overview

To implement the design, we follow the steps below:

1. Interest points are found using the Hessian matrix.

2. Extract the SURF descriptor by following the steps shown in the figures below. Each sub-region contributes four values to the descriptor vector, giving an overall vector of length 4×4×4 = 64. The resulting SURF descriptor is invariant to rotation, scale and brightness and, after reduction to unit length, to contrast.

Fig 3.5.2 Orientation assignment

Fig 3.5.3 descriptor composition

3. Eliminate erroneous matches using the RANSAC algorithm.

4. Match the extracted features using the steps below. The method selects only part of the features to match. Consider the SURF feature sets of the left and right images, together with a subset of each, and let M be the number of SURF features in the left image. The concrete matching steps are as follows:

i. Check the number of SURF features in the left image. If M is greater than the threshold η (set to 150 in the experiment), jump to step ii for feature selection; otherwise the subset is taken to be the whole feature set, and jump to step iii.

ii. The SURF features extracted from images with rich texture are numerous and dense, as Fig 3.5.4 shows. Match SURF features in the first direction, from one image to the other, matching only the features in the selected subset instead of the whole feature set. This gives a first set of matched pairs.

iii. Then match SURF features in the opposite direction, again matching only the features in the selected subset instead of the whole feature set. This gives a second set of matched pairs.

iv. Finally, the matched pairs set H is the intersection of the two sets of matched pairs.
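The core of the bidirectional scheme, keeping only pairs that are found in both matching directions, can be sketched as follows. This Python fragment is an illustration only: it uses plain nearest-neighbour search on squared Euclidean distance, leaves out the feature-selection step and the threshold η, and uses short made-up descriptors instead of 64-value SURF vectors. The function names are hypothetical.

```python
def nearest(query, candidates):
    """Index of the candidate closest to query (squared Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda j: sum((q - c) ** 2 for q, c in zip(query, candidates[j])))


def bidirectional_matches(left, right):
    """Pairs (i, j) where right[j] is nearest to left[i] and vice versa."""
    left_to_right = {(i, nearest(d, right)) for i, d in enumerate(left)}
    right_to_left = {(nearest(d, left), j) for j, d in enumerate(right)}
    # The final match set is the intersection of the two directions.
    return sorted(left_to_right & right_to_left)


# Hypothetical 2-value descriptors for three features in each image.
left_desc = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
right_desc = [[5.1, 4.9], [0.1, 0.0], [9.0, 9.0]]
pairs = bidirectional_matches(left_desc, right_desc)
```

In this example the second left feature and the third right feature each have a nearest neighbour, but the relationship is not mutual, so both are dropped before RANSAC is applied.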

Fig 3.5.4 Detected results

Fig 3.5.5 Distance between 2 feature points

5. Fuse the images using a Multi-band blending algorithm.

SOFTWARE ARCHITECTURE DESIGN

Fig 3.5.6 Software Architecture

4. IMPLEMENTATION

4.1 IMPLEMENTATION RESULTS

This module is created using MATLAB's GUIDE tool. Its basic function is to connect all the other modules and to provide a simple interface to the user.

4.1.1 Main GUI Window

The main GUI window consists of a panel with five buttons:

About Image Mosaicing: It gives introduction to Image Mosaicing.

Implement Image Mosaicing: It shows the implementation module.

View Database: It shows the Database of Images.

About Us: It displays the project members’ details.

Exit: To exit from the GUI.

Fig 4.1.1.1 The Main GUI Window

4.1.2 About Image Mosaicing

Fig 4.1.2 GUI Window about Image Mosaicing

4.1.3 Implementation Module

Fig 4.1.3 The GUI Window showing implementation module

4.1.4 Existing Database

Fig 4.1.4 The GUI Window showing the Existing Database

4.1.5 Project Members’ Details

Fig 4.1.5 The GUI Window showing the details of the Members

4.1.6 Loading of the First Image

Fig 4.1.6 The GUI Window showing loading of image 1

4.1.7 Loading of the Second Image


4.1.8 System after Image Mosaicing


Given below are the results for a few more input images.

Result 1

Fig 4.1.8.2 Input Image 1 Fig 4.1.8.3 Input Image 2

Fig 4.1.8.4 Input Image 3

Fig 4.1.8.4 Mosaiced Image

Result 2



Result 3



4.2 Implementation Analysis

INPUT IMAGES INLIERS DETAILS

Image 1

Image 2

Number of accepted matches was

41

Number of inliers was 35 (85%)


39


Finding key points...

414 key points found.

Finding key points...

368 key points found.

Found 70 matches.

Image 1

Image 2


228



245


Finding keypoints...

1909 keypoints found.



Found 369 matches.

Image 1

Image 2


375



381






Found 106 matches.

Image 1

Image 2


306



296






Found 102 matches.

Fig 4.2.1 Analysis Table

5. TESTING

5.1 TEST CASES

The system was tested with the help of the following test cases:

5.1.1. Images having overlapping region

Fig 5.1.1.1 The main GUI

The above figure shows the main GUI on which testing of different images is

performed. It takes as input two images with a common overlapped region and returns

the resulting mosaiced image. The images are browsed using the ‘Browse’ button.

The ‘Mosaic’ button gives the resulting output if all the requirements are met.

The ‘Reset’ button resets the GUI for new input images to be mosaiced. The ‘Main

Menu’ button displays the main menu.

Fig.5.1.1.2

The above test case shows that if two input images have some portion of overlapped

region, then the mosaicing is possible. A pop-up is displayed for the same.

The resulting output is seen as follows.

Fig.5.1.1.3 Test case 1

The above figure shows the resulting output after image mosaicing. The mosaiced

image is displayed in the area provided for it.

5.1.2. Images not having overlapping region

Fig.5.1.2.1


The above test case shows that if two different images with no overlapped region between them are given as input, the result is an error displaying “Mosaicing not possible”. The pop-up window shows the same.

5.1.3. Inappropriate selection of matching points

Fig.5.1.3.1

The above test case shows that if two input images have some portion of overlapped

region, then the mosaicing is possible. A pop-up is displayed for the same.

The resulting output is seen as follows:


Here we can observe that the mosaiced image is slightly distorted. This is because the matching points are not selected accurately.

6. CONCLUSION AND FURTHER WORK

We have seen that image mosaicing based on feature point matching is capable of producing good-quality image mosaics and that it automatically stitches images acquired by a panning camera into a mosaic. The results of this short investigation are very promising. The result of mosaicing depends on the accuracy with which corresponding points are selected from the two images: the higher the accuracy, the better the quality of the mosaics generated. The algorithm can be extended to more than two images. The results demonstrate mosaics of panoramic images, and their quality is particularly encouraging.

Overall, the development of the program can be regarded as a success and forms a solid basis for further development.

The program has clear potential for extension, particularly for mosaicing a large number of images. The primary task for enhancement would be to reduce errors when stitching a large quantity of images.
