
Fast 2D to 3D Conversion Using Wavelet Analysis

CHAPTER 1

INTRODUCTION

There are several technologies for converting 2-D content for 3-D TV systems, for example, the Philips WOWvx system. In this system, a 3-D data representation that includes the traditional 2-D images and their associated per-pixel depth maps is adopted. The depth maps, together with the X-Y information, describe the spatial location of each point in the images. These data are processed by customized DSP and optical devices to emit rays into our eyes as stereoscopic images.

The key problem in the above system is how to obtain the depth information from the 2-D data. Recently, a new technology called Depth Image-Based Rendering (DIBR) has been applied to advanced 3-D TV systems. One method to obtain a relative depth map from a single image using wavelet analysis and edge defocus estimation based on Lipschitz exponents has been proposed in the literature. Images were handled as a series of 1-D row signals, resulting in horizontal stripes in the depth map.

The depth map is further optimized and smoothed based on color segmentation to obtain more accurate and reliable results. In this paper, a simpler approach is proposed to obtain the depth map of an image. In our approach, each image is first transformed to a grayscale image and then to the wavelet domain. Afterwards, the high-frequency area of the image can be obtained by analyzing its high-frequency wavelet sub-bands. Finally, binarization and smoothing techniques are applied to find the position of the focused objects in the image.


CHAPTER 2

LITERATURE SURVEY

Te-Wei Chiang

Department of Accounting Information Systems

Chihlee Institute of Technology

Banciao City, Taiwan, R.O.C.

[email protected]

2.1 ABSTRACT

This paper proposes a depth estimation method which converts two-dimensional

images of limited depth of field (DOF) into three-dimensional data. The goal is to

separate the focused foreground objects from the blurred background objects in an image.

Our approach is based on two observations: (1) the focused objects in an image of limited DOF correspond to the objects with high frequency; (2) the high-frequency area of an image shows high energy in its high-frequency wavelet sub-bands.

In our approach, each image is first converted to grayscale, and the Y component of the image is then transformed to the wavelet domain. Afterwards, the high-frequency area of the image can be obtained by analyzing its high-frequency wavelet sub-bands. Finally, binarization and smoothing techniques are applied to find the position of the focused objects in the image. The experimental result demonstrates the effectiveness of our approach. There are several technologies regarding the conversion of 2-D content for 3-D TV systems, for example, the Philips WOWvx™ system.

In this system, a 3-D data representation that includes the traditional 2-D images and their associated per-pixel depth maps is adopted. The depth maps, together with the X-Y information, describe the spatial location of each point in the images. These data are processed by customized DSP and optical devices to emit rays into our eyes as stereoscopic images.


1. Mathematics Discipline,

Khulna University, Khulna-9208, Bangladesh

2. Department of Mathematics,

Comilla University, Comilla-3500, Bangladesh

Email: [email protected]

2.2 ABSTRACT

Wavelet analysis is an exciting new method for solving difficult problems in mathematics, physics, and engineering, with modern applications as diverse as wave propagation, data compression, signal processing, image processing, pattern recognition, computer graphics, the detection of aircraft and submarines, and medical imaging technology. Wavelets allow complex information such as music, speech, images and patterns to be decomposed into elementary forms at different positions and scales and subsequently reconstructed with high precision. Signal transmission is based on the transmission of a series of numbers, and the series representation of a function is therefore important in all types of signal transmission. The wavelet representation of a function is a comparatively new technique, and the wavelet transform of a function can be regarded as an improvement over the Fourier transform.

2.3. INTRODUCTION

In 1982, Jean Morlet, a French geophysicist, introduced the concept of a 'wavelet'. The word wavelet means a small wave, and the study of the wavelet transform provided a new tool for seismic signal analysis. Soon afterwards, Alex Grossmann, a theoretical physicist, studied the inverse formula for the wavelet transform.

The joint collaboration of Morlet and Grossmann yielded a detailed mathematical study of the continuous wavelet transform and its various applications, of course without the realization that similar results had already been obtained in the 1950s by Calderon, Littlewood, Paley and Franklin. However, the rediscovery of these old concepts provided a new method for decomposing a function or a signal.

Wavelet analysis was originally introduced in order to improve seismic signal analysis by switching from short-time Fourier analysis to new, better algorithms for detecting and analyzing abrupt changes in signals (Daubechies [2], Mallat [3]). In the time-frequency analysis of a signal, the classical Fourier transform is inadequate because the Fourier transform of a signal does not contain any local information. This is the major drawback


of the Fourier transform. To overcome this drawback, Dennis Gabor, in 1946, first introduced the windowed Fourier transform, i.e. the short-time Fourier transform, later known as the Gabor transform. Meyer [7] built upon the existing literature on wavelets. Later, many eminent mathematicians, e.g. I. Daubechies, A. Grossmann, S. Mallat, Y. Meyer, R. A. DeVore, R. Coifman and V. Wickerhauser, made remarkable contributions to wavelet theory. The modern applications of wavelet theory are as diverse as wave propagation, data compression, signal processing, image processing, pattern recognition, computer graphics, the detection of aircraft and submarines, the improvement of CAT scans and other medical imaging technology. In this study, our main goal is to find out the advantages of the wavelet transform compared to the Fourier transform.


CHAPTER 3

IMAGE CLASSIFICATION AND DIGITIZATION

In general, an image can be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.

3.1 DIGITAL IMAGE

When x, y and the amplitude values of f are all finite, discrete quantities, we call

the image a digital image.

Fig 3.1 A Digital Image

3.1.1 PIXEL:

A pixel is a single point in a graphic image. Graphics monitors display pictures by

dividing the display screen into thousands (or millions) of pixels, arranged in rows and

columns. The pixels are so close together that they appear connected. The number of bits

used to represent each pixel determines how many colors or shades of gray can be

displayed. For example, in 8-bit color mode, the color monitor uses 8 bits for each pixel,

making it possible to display 2 to the 8th power (256) different colors or shades of gray.

3.2 IMAGE TYPES

The different types of images are

1. Binary Images

2. Indexed Images

3. Intensity Images


4. Multi-frame Images

5. RGB Images.

3.2.1 BINARY IMAGE:

A binary image contains only black and white pixels. In MATLAB, a binary image is

represented by a uint8 or double logical matrix containing 0's and 1's (which usually

represent black and white, respectively). A matrix is logical when its "logical flag" is

turned "on." We often use the variable name BW to represent a binary image in memory.

Fig 3.2 (a) Binary Image (b) Intensity image (c) RGB image

3.2.2 INDEXED IMAGE:

An indexed image's pixel values are direct indices into an RGB color map. In MATLAB, an

indexed image is represented by an array of class uint8, uint16, or double. The color map

is always an m-by-3 array of class double. We often use the variable name X to represent

an indexed image in memory, and map to represent the color map.

3.2.3 INTENSITY IMAGE:

An intensity image consists of intensity (grayscale) values. In MATLAB, intensity images

are represented by an array of class uint8, uint16, or double. While intensity images are

not stored with color maps, MATLAB uses a system color map to display them. We often

use the variable name I to represent an intensity image in memory. This term is

synonymous with the term "grayscale."

3.2.4 MULTI-FRAME IMAGE:

A multi-frame image file contains more than one image, or frame. When in MATLAB memory, a multi-frame image is a 4-D array where the fourth dimension specifies the


frame number. This term is synonymous with the term "multipage image."

3.2.5 RGB IMAGE:

In an RGB image, each pixel is specified by three values, one each for the red, green, and blue components of the pixel's color. In MATLAB, an RGB image is represented by

an m-by-n-by-3 array of class uint8, uint16, or double. We often use the variable name

RGB to represent an RGB image in memory.

3.3 IMAGE DIGITIZATION

An image captured by a sensor is expressed as a continuous function f(x, y) of two

coordinates in the plane. Image digitization means that the function f(x, y) is sampled into

a matrix with m rows and n columns. The image quantization assigns to each continuous

sample an integer value. The continuous range of image functions f(x, y) is split into k

intervals. The finer the sampling (i.e. the larger m and n) and quantization (larger k) the

better the approximation of the continuous image f(x, y).

3.4 SAMPLING AND QUANTIZATION

To be suitable for computer processing an image function must be digitized both

spatially and in amplitude. Digitization of spatial coordinates is called image sampling

and amplitude digitization is called gray level quantization.
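As a rough illustration of these two operations (not part of the original report), the following Python sketch samples a continuous image function f(x, y) on an m-by-n grid and quantizes the samples into k gray levels; the test function and all parameter values are arbitrary assumptions.

import numpy as np

# Hypothetical illustration: sample a continuous image function f(x, y)
# on an m-by-n grid (image sampling) and map each sample to one of k
# integer gray levels (gray level quantization).
def digitize(f, m, n, k):
    xs = np.linspace(0.0, 1.0, m)
    ys = np.linspace(0.0, 1.0, n)
    samples = np.array([[f(x, y) for y in ys] for x in xs])
    lo, hi = samples.min(), samples.max()
    levels = np.round((samples - lo) / (hi - lo) * (k - 1))  # k quantization intervals
    return levels.astype(np.uint8)

# A smooth test pattern digitized to an 8 x 8 image with k = 4 gray levels.
img = digitize(lambda x, y: np.sin(np.pi * x) * np.sin(np.pi * y), 8, 8, 4)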


CHAPTER 4

IMAGE PROCESSING

The field of digital image processing refers to the processing of digital images by means of a digital computer. A digital image is an image f(x, y) that has been discretized both in spatial coordinates and in brightness. A digital image can be considered as a matrix whose row and column indices identify a point in the image and whose corresponding matrix element value identifies the gray level at that point. The elements of such a digital array are called image elements, picture elements, pixels or pels, the last two being commonly used abbreviations of "picture elements".

The term digital image processing generally refers to the processing of a two-dimensional picture by a digital computer. In a broader context, it implies the digital processing of any two-dimensional data.

In the form in which they usually occur, images are not directly amenable to

computer analysis. Since computers work with numerical rather than pictorial data, an

image must be converted to numerical form before processing. This conversion process is

called “digitization”. The image is divided into small regions called picture elements or

“pixels”. At each pixel location the image brightness is sampled and quantized. This step

generates an integer at each pixel representing the brightness or darkness of the image at

that point.

When this has been done for all pixels, the image is represented by a rectangular array of integers. Each location has an address, and an integer value called the "gray level". This array of digital data is now a candidate for computer processing.

4.1 APPLICATIONS OF DIGITAL IMAGE PROCESSING

1. Office automation: optical character recognition; document processing cursive

script recognition; logo and icon recognition; etc.

2. Industrial automation: automated inspection systems; non-destructive testing;

automatic assembling; process related to VLSI manufacturing; PCB checking; etc.

3. Robotics: Oil and natural gas exploration; etc


4. Bio-medical: ECG, EEG, EMG analysis; cytological, histological and

stereological applications; automated radiology and pathology; x-ray image

analysis; etc

5. Remote sensing: natural resources survey and management; estimation related to

agriculture, hydrology, forestry, mineralogy; urban planning; environmental control

and pollution control; etc

6. Criminology: finger print identification; human face registration and matching;

forensic investigation; etc.

7. Astronomy and space applications: restoration of images suffering from

geometric and photometric distortions; etc.

8. Information technology: facsimiles image transmission, video text; Video

conferencing and video phones; etc.

9. Entertainment and consumer electronics: HDTV; multimedia and video

editing.

10. Military applications: missile guidance and detection; target identification;

navigation of pilotless vehicles; reconnaissance; range finding; etc.

11. Printing and graphics art: color fidelity in desktop publishing; art conservation

and dissemination; etc.


CHAPTER 5

WAVELET ANALYSIS

A wavelet is a waveform of effectively limited duration that has an average value

of zero. Compare wavelets with sine waves, which are the basis of Fourier analysis.

Sinusoids do not have limited duration; they extend from minus to plus infinity. And

where sinusoids are smooth and predictable, wavelets tend to be irregular and

asymmetric.

Fourier analysis consists of breaking up a signal into sine waves of various

frequencies. Similarly, wavelet analysis is the breaking up of a signal into shifted and

scaled versions of the original (or mother) wavelet.

Just looking at pictures of wavelets and sine waves, you can see intuitively that

signals with sharp changes might be better analyzed with an irregular wavelet than with a

smooth sinusoid, just as some foods are better handled with a fork than a spoon. It also

makes sense that local features can be described better with wavelets that have local

extent.

5.1. WHAT CAN WAVELET ANALYSIS DO?

One major advantage afforded by wavelets is the ability to perform local analysis, that is, to analyze a localized area of a larger signal.

Consider a sinusoidal signal with a small discontinuity, one so tiny as to be barely visible. Such a signal could easily be generated in the real world, perhaps by a power

fluctuation or a noisy switch.


Wavelet analysis is capable of revealing aspects of data that other signal analysis techniques miss: aspects like trends, breakdown points, discontinuities in higher

derivatives, and self-similarity. Furthermore, because it affords a different view of data

than those presented by traditional techniques, wavelet analysis can often compress or de-

noise a signal without appreciable degradation.

Indeed, in their brief history within the signal processing field, wavelets have

already proven themselves to be an indispensable addition to the analyst's collection of

tools and continue to enjoy a burgeoning popularity today.

5.2. CONTINUOUS WAVELET TRANSFORM

Like the Fourier transform, the continuous wavelet transform (CWT) uses inner

products to measure the similarity between a signal and an analyzing function. In the

Fourier transform, the analyzing functions are complex exponentials. The resulting

transform is a function of a single variable, ω. In the short-time Fourier transform, the

analyzing functions are windowed complex exponentials, and the result is a

function of two variables. The STFT coefficients represent the match between the

signal and a sinusoid with angular frequency ω in an interval of a specified

length centered at τ.

In the CWT, the analyzing function is a wavelet, ψ. The CWT compares the signal

to shifted and compressed or stretched versions of a wavelet. Stretching or compressing a

function is collectively referred to as dilation or scaling and corresponds to the physical

notion of scale. By comparing the signal to the wavelet at various scales and positions,

you obtain a function of two variables. The two-dimensional representation of a one-

dimensional signal is redundant. If the wavelet is complex-valued, the CWT is a

complex-valued function of scale and position. If the signal is real-valued, the CWT is a

real-valued function of scale and position. For a scale parameter, a>0, and position, b, the

CWT is:
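(The equation itself did not survive in this copy of the report; the following is the standard form of the CWT consistent with the surrounding description, where ψ is the analyzing wavelet, a the scale and b the position.)

C(a, b) = (1/√a) ∫ f(t) ψ*((t − b)/a) dt, with the integral taken over −∞ < t < +∞,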


where * denotes the complex conjugate. Not only do the values of scale and

position affect the CWT coefficients, the choice of wavelet also affects the values of the

coefficients.

By continuously varying the values of the scale parameter, a, and the position

parameter, b, you obtain the CWT coefficients C(a, b). Note that for convenience, the

dependence of the CWT coefficients on the function and analyzing wavelet has been

suppressed.

Multiplying each coefficient by the appropriately scaled and shifted wavelet yields

the constituent wavelets of the original signal.

There are many different admissible wavelets that can be used in the CWT. While

it may seem confusing that there are so many choices for the analyzing wavelet, it is

actually a strength of wavelet analysis. Depending on what signal features you are trying to detect, you are free to select a wavelet that facilitates your detection of that feature. For example, if you are trying to detect abrupt discontinuities in your signal, you may choose one wavelet. On the other hand, if you are interested in finding oscillations with smooth

onsets and offsets, you are free to choose a wavelet that more closely matches that

behavior.

5.3. DISCRETE WAVELET TRANSFORM

If we choose scales and positions based on powers of two (so-called dyadic scales and positions), then our analysis will be much more efficient and just as accurate. We

obtain such an analysis from the discrete wavelet transform (DWT).


An efficient way to implement this scheme using filters was developed in 1988 by

Mallat. The Mallat algorithm is in fact a classical scheme known in the signal processing

community as a two-channel subband coder. This very practical filtering algorithm yields

a fast wavelet transform which is a box into which a signal passes, and out of which

wavelet coefficients quickly emerge.

5.3.1. ONE-STAGE FILTERING: APPROXIMATIONS AND DETAILS

For many signals, the low-frequency content is the most important part. It is what

gives the signal its identity. The high-frequency content, on the other hand, imparts flavor

or nuance. Consider the human voice. If you remove the high-frequency components, the

voice sounds different, but you can still tell what's being said. However, if you remove

enough of the low-frequency components, you hear gibberish.

In wavelet analysis, the approximations are the high-scale, low-frequency

components of the signal and the details are the low-scale, high-frequency components.

The filtering process at its most basic level looks as shown in the figure below:

The original signal ‘S’ passes through two complementary filters and emerges as 2

signals.

5.3.2. MULTIPLE-LEVEL DECOMPOSITION

The decomposition process can be iterated, with successive approximations being

decomposed in turn, so that one signal is broken down into many lower resolution

components. This is called the wavelet decomposition tree.
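As an informal illustration of the decomposition tree (not from the report), the following Python sketch uses the PyWavelets library, an assumed dependency, to iterate the DWT three times on a test signal; the wavelet ('db4') and the signal are arbitrary choices.

import numpy as np
import pywt  # PyWavelets, assumed to be installed

# Three-level DWT: successive approximations are decomposed in turn,
# giving the list [cA3, cD3, cD2, cD1] (the wavelet decomposition tree).
t = np.linspace(0, 1, 1024)
signal = np.sin(2 * np.pi * 8 * t) + 0.1 * np.random.randn(t.size)

coeffs = pywt.wavedec(signal, 'db4', level=3)
cA3, cD3, cD2, cD1 = coeffs
print([len(c) for c in coeffs])  # coefficient counts roughly halve at each level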


5.3.3. NUMBER OF LEVELS

Since the analysis process is iterative, in theory it can be continued indefinitely.

In reality, the decomposition can proceed only until the individual details consist of a

single sample or pixel.

5.3.4. WAVELET RECONSTRUCTION

We have seen how the discrete wavelet transform can be used to analyze or

decompose signals and images. This process is called decomposition or analysis. The

other half of the story is how those components can be assembled back into the original

signal without loss of information. This process is called reconstruction or synthesis. The

mathematical manipulation that effects synthesis is called the inverse discrete wavelet

transform (IDWT).

To synthesize a signal using Wavelet Toolbox software, we reconstruct it from the

wavelet coefficients.


The downsampling of the signal components performed during the decomposition

phase introduces a distortion called aliasing. It turns out that by carefully choosing filters

for the decomposition and reconstruction phases that are closely related (but not

identical), we can "cancel out" the effects of aliasing.

The low- and high-pass decomposition filters (L and H), together with their

associated reconstruction filters (L' and H'), form a system of what is called quadrature

mirror filters:

The reconstructed details and approximations are true constituents of the original

signal. In fact, we find when we combine them that

A1 + D1 = S
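A small, hedged sketch of this one-stage analysis/synthesis step, again using PyWavelets as an assumed dependency: the signal is split into approximation and detail coefficients, each branch is reconstructed on its own, and their sum recovers the original signal, illustrating A1 + D1 = S and the aliasing cancellation of the quadrature mirror filters.

import numpy as np
import pywt  # PyWavelets, assumed to be installed

S = np.cos(np.linspace(0, 4 * np.pi, 256)) + 0.05 * np.random.randn(256)

cA, cD = pywt.dwt(S, 'db2')        # analysis: low-pass and high-pass branches
A1 = pywt.idwt(cA, None, 'db2')    # reconstruct the approximation only
D1 = pywt.idwt(None, cD, 'db2')    # reconstruct the detail only

print(np.allclose(A1 + D1, S))     # True: A1 + D1 reproduces the original signal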


5.3.5. MULTISTEP DECOMPOSITION AND RECONSTRUCTION

A multistep analysis-synthesis process can be represented as shown in figure

below:

This process involves two aspects: breaking up a signal to obtain the wavelet coefficients,

and reassembling the signal from the coefficients.


CHAPTER 6

BACKGROUND AND RELATED WORK

6.1. YUV COLOR SPACE

There are some existing color models to describe images, known as color spaces, such as RGB, HSV, HSI, YUV, etc. RGB is perhaps the simplest color space for people to understand because it corresponds to the three colors that the human eye can detect. However, the RGB color model is unsuitable for similarity comparison, because the luminance and saturation information are only implicitly contained in the R, G, and B values. The YUV (brightness, blue chrominance, and red chrominance) model defines a color space in terms of one luminance and two chrominance components, which are created from an original RGB (red, green and blue) source. The weighted values of R, G and B are added together to produce a single Y signal, representing the overall brightness, or luminance, of that spot. The U signal is then created by subtracting Y from the blue signal of the original RGB and then scaling; and V by subtracting Y from the red signal and then scaling by a different factor. There are many slightly different formulas to convert between YUV and RGB; the only major difference is a few decimal places. The equations used to convert from RGB to YUV space are:

Y(x, y) = 0.299 R(x, y) + 0.587 G(x, y) + 0.114 B(x, y),

U(x, y) = 0.492 (B(x, y) - Y(x, y)),

V(x, y) = 0.877 (R(x, y) - Y(x, y)).
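A short Python sketch of this conversion (the file name and the use of NumPy/Pillow are assumptions for illustration); only the Y plane is needed by the subsequent wavelet analysis.

import numpy as np
from PIL import Image  # Pillow, assumed to be installed

def rgb_to_yuv(rgb):
    # Apply the three equations above channel-wise.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

rgb = np.asarray(Image.open('input.jpg'), dtype=np.float64)  # hypothetical file name
Y, U, V = rgb_to_yuv(rgb)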

6.2 WAVELET ANALYSIS

The multi-resolution wavelet transform has been shown to be an effective technique and has achieved very good performance for texture analysis. An image can be decomposed into its wavelet coefficients by using Mallat's pyramid algorithm. After wavelet decomposition, the image energy is distributed over different sub-bands, each of which keeps a specific frequency component. In other words, each sub-band image contains one directional feature. The wavelet decomposition is illustrated in Fig. 6.1. Given an image (see Fig. 6.1(a)), four sub-images (see Fig. 6.1(b)), i.e. the DC-component (upper left), H-component (upper right), V-component (lower left), and D-component (lower right), of the image can be obtained after the wavelet decomposition. Here H, V


and D are used to indicate horizontal, vertical and diagonal, respectively. From Fig. 6.1(b), it can be seen that the horizontal, vertical and diagonal edges of the image can be obtained from its wavelet decomposition.

Fig 6.1(a) A test image and (b) its wavelet decomposed image (or sub images)

Fig 6.2 Illustration of the proposed wavelet-based edge detection method: (a) the horizontal component, (b) the vertical component, (c) the diagonal component, and (d) the combined result for the image given in Fig 6.1.
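For reference, a minimal single-level 2-D decomposition with PyWavelets (an assumed dependency; the Haar wavelet and the random test image are illustrative choices) yields exactly the four sub-bands described above.

import numpy as np
import pywt  # PyWavelets, assumed to be installed

gray = np.random.rand(256, 256)                # stand-in for a grayscale test image
cA, (cH, cV, cD) = pywt.dwt2(gray, 'haar')     # DC, H, V and D components
print(cA.shape, cH.shape, cV.shape, cD.shape)  # each sub-band is 128 x 128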

6.3 RELATED WORK

For images of limited depth of field (DOF), the main foreground objects are

focused with sharp edges and the objects in the background are blurred. In other words,

the high frequencies are retained in the focused foreground, but greatly attenuated in the


background. This suggests that the spatial frequency is directly related to the degree of blurring, and thus to the relative distance of the object from the camera. The high frequencies can be described by the coefficients of the wavelet transform of the image. If there is larger energy in the high-frequency wavelet bands, it suggests that there are more details and less blurring in this region, and hence that the 3-D location is nearer. The elementary relative depth can therefore be estimated from the values of the wavelet coefficients in the high-frequency bands. Based on this, one approach divides the image into macroblocks of 16 by 16 pixels, performs a wavelet transform of each macroblock, which generates 256 wavelet coefficients, and estimates the relative depth by counting the number of non-zero wavelet coefficients. A method to obtain a relative depth map from a single image using wavelet analysis and edge defocus estimation based on Lipschitz exponents has also been proposed; images were handled as a series of 1-D row signals, resulting in horizontal stripes in the depth map.

To overcome this issue, an incremental algorithm based on the wavelet transform and edge focus analysis in two dimensions was proposed, taking into account the direction of edges and the two-dimensional characteristics of images. The depth map is further optimized and smoothed based on color segmentation to obtain more accurate and reliable results.
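A hedged sketch of the macroblock-based estimate mentioned above: the image is split into 16 x 16 blocks, each block is fully wavelet-decomposed (256 coefficients), and the count of non-zero coefficients serves as the relative depth. The wavelet, the zero threshold and the use of PyWavelets are assumptions; the report does not fix them.

import numpy as np
import pywt  # PyWavelets, assumed to be installed

def block_depth(gray, block=16, eps=1e-3):
    h, w = gray.shape
    depth = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            patch = gray[i * block:(i + 1) * block, j * block:(j + 1) * block]
            arr, _ = pywt.coeffs_to_array(pywt.wavedec2(patch, 'haar'))
            depth[i, j] = np.count_nonzero(np.abs(arr) > eps)  # more detail -> nearer
    return depth

gray = np.random.rand(128, 128)   # stand-in for the luminance plane
print(block_depth(gray).shape)    # one relative depth value per 16 x 16 macroblock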


CHAPTER 7

METHODOLOGY

The proposed depth map estimation algorithm is introduced in this section, which

can be summarized as the following steps.

Fig 7: Steps involved in depth map estimation: 2D input image → Y component extraction → wavelet-based edge detection → smoothing for edge defocusing → first binarization for edge enhancement → second binarization for noise removal → 3D depth map.

7.1 Y COMPONENT EXTRACTIONS

Since the focused object in an image is the object with high-frequency (fine) texture, the simplest way to distinguish the focused object from the others is to analyze the texture of the image. Moreover, because the Y component of an image represents its overall brightness (or luminance), the texture of the Y component is similar to that of the original color image. Therefore, in our approach, each image is first


transformed from the standard RGB color space to the YUV space; the Y component of the image is then further transformed to the wavelet domain.

7.2 WAVELET-BASED EDGE DETECTION

As discussed before, the depth of limited-DOF images can be measured from their frequency content. In this step, we analyze the frequency energy based on the wavelet transform. Basically, the edges of the focused object exhibit high-frequency energy. Each pixel of a wavelet sub-band image corresponds to a wavelet coefficient: the larger the value of a wavelet coefficient, the larger the energy within the corresponding pixel. The values of the coefficients in the high-frequency wavelet sub-bands (the H-component, V-component, and D-component) show how little the details are blurred, and therefore give a relative depth value. The range of depth is adjusted from 0 to 255 (0 denotes black and 255 denotes white in the depth map); a larger depth value indicates a nearer distance.

Since the wavelet analysis can extract the directional edges of an image easily, we

can obtain the overall edges of the image by merging its directional edges.

Given the test image shown in Fig. 6.1(a), three sub-images with different

directional edges, i.e. H-component (see Fig. 6.2(a)), V-component (see Fig. 6.2(b)) and

D-component (see Fig. 6.2(c)), of the image can be obtained after the wavelet

decomposition. By merging the three sub-images, we can obtain the overall edges of the

original image, which results in our initial depth map, as shown in Fig. 6.2(d).
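A minimal sketch of this step (PyWavelets assumed; the Haar wavelet, the merge-by-sum of absolute values, and the rescaling to 0..255 are illustrative assumptions rather than choices stated in the report):

import numpy as np
import pywt  # PyWavelets, assumed to be installed

def initial_depth_map(y_plane):
    # Single-level 2-D DWT of the Y component, then merge the three
    # high-frequency sub-bands into one edge image scaled to 0..255.
    _, (cH, cV, cD) = pywt.dwt2(y_plane, 'haar')
    edges = np.abs(cH) + np.abs(cV) + np.abs(cD)
    return 255.0 * edges / max(edges.max(), 1e-12)

y_plane = np.random.rand(256, 256) * 255   # stand-in for the Y component
depth0 = initial_depth_map(y_plane)        # half-resolution initial depth map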

Fig 7.1 (a) Input Image (b) its Y component (c) edges detected by wavelet-based approach

7.3 EDGE ENHANCEMENT BY BINARIZATION

Image binarization converts an image of up to 256 gray levels to a black and white

image. Frequently, binarization is used as a pre-processor before optical character


recognition (OCR). The simplest way to use image binarization is to choose a threshold

value, and classify all pixels with values above this threshold as white, and all other

pixels as black.

In our study, each pixel of the wavelet sub-band image corresponds to a wavelet coefficient: the larger the value of the corresponding wavelet coefficient, the larger the energy within the pixel. After the previous steps, the edges of the focused object exhibit high-frequency energy, and the values of the corresponding wavelet coefficients range from 0 to 255. For the purpose of enhancing the important edges, we re-assign the value of a wavelet coefficient to 255 if its original value is larger than a particular threshold, and re-assign it to 0 otherwise. Therefore, the pixels with high-frequency energy over the threshold are enhanced.
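A one-line thresholding routine captures this first binarization; the stand-in data and the sample thresholds (those of Fig. 7.2) are for illustration only.

import numpy as np

def binarize(depth, T):
    # Pixels above the threshold become 255 (strong edges), the rest become 0.
    return np.where(depth > T, 255, 0).astype(np.uint8)

depth0 = np.random.rand(128, 128) * 255                 # stand-in for the initial depth map
maps = {T: binarize(depth0, T) for T in (10, 15, 20)}   # thresholds explored in Fig. 7.2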

Fig 7.2 The initial depth map after binarization using varying threshold values: (a) T=10, (b) T=15 and (c) T=20.

7.4 NOISE DEFOCUSING BY SMOOTHING

Smoothing algorithms are often applied in order to reduce noise and/or to prepare

images for further processing such as segmentation. They can be broadly categorized into

linear and non-linear algorithms, where the former are amenable to analysis in the Fourier

domain and the latter are not. For the implementation of the linear algorithm, the filter

can be based on a rectangular support or a circular support.

In order to remove the noise in the initial depth map, smoothing techniques are used to defocus it. Here, noise refers to a high-energy pixel whose neighboring pixels have low energy. In our study, a uniform rectangular filter is adopted for smoothing, where the output image is based on a local averaging of the input and all of the values within the filter support have the same weight. To do so, for each pixel of the depth map, its value is re-assigned to the average of the values of the G×G pixels whose


center is the pixel under consideration. Note that the noise remaining in the depth map can be further removed by the subsequent binarization step.
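The uniform rectangular filter can be sketched as below; SciPy's uniform_filter is used here for brevity and is an assumption, as is the stand-in input.

import numpy as np
from scipy.ndimage import uniform_filter  # SciPy, assumed to be installed

def smooth(depth, G):
    # Replace every pixel by the equal-weight average of the G x G window
    # centred on it (a uniform rectangular support).
    return uniform_filter(depth.astype(np.float64), size=G, mode='nearest')

bw = (np.random.rand(128, 128) > 0.5) * 255.0  # stand-in for a binarized depth map
smoothed = smooth(bw, G=3)                      # compare G = 1, 2, 3 as in Fig. 7.3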

Fig 7.3 The depth map after smoothing: (a) G=1, (b) G=2 and (c) G=3.

7.5 NOISE REMOVING BY BINARIZATION

After applying smoothing techniques to the depth map, the energy of the noisy pixels is reduced significantly. Therefore, the noise can be removed by the binarization method, i.e. setting a threshold and removing the pixels whose values fall below it. The problem is how to select the correct threshold. In many cases, finding one threshold suitable for the entire image is very difficult, and sometimes even impossible. In our study, an optimal threshold is determined through a series of experiments on the illustrative image.
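Putting Sections 7.2 to 7.5 together, a hedged end-to-end sketch of the whole pipeline looks as follows (the Haar wavelet, the uniform filter, and the default parameter values are assumptions; the defaults echo the best combination reported in Chapter 9):

import numpy as np
import pywt                                # PyWavelets, assumed to be installed
from scipy.ndimage import uniform_filter   # SciPy, assumed to be installed

def depth_map(y_plane, T1=15, G=3, T2=50):
    # Wavelet-based edge detection (Sec. 7.2)
    _, (cH, cV, cD) = pywt.dwt2(y_plane, 'haar')
    d = np.abs(cH) + np.abs(cV) + np.abs(cD)
    d = 255.0 * d / max(d.max(), 1e-12)
    # First binarization for edge enhancement (Sec. 7.3)
    d = np.where(d > T1, 255.0, 0.0)
    # Uniform G x G smoothing for noise defocusing (Sec. 7.4)
    d = uniform_filter(d, size=G, mode='nearest')
    # Second binarization for noise removal (Sec. 7.5)
    return np.where(d > T2, 255, 0).astype(np.uint8)

y_plane = np.random.rand(256, 256) * 255   # stand-in for the Y component
final_map = depth_map(y_plane)             # resulting relative depth map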

Fig 7.4 The depth map after smoothing and binarization: (a) G=1, T=0 (best), (b) G=2, T=20 (best) and (c) G=3, T=50 (best).


CHAPTER 8

ADVANTAGES AND DISADVANTAGES

8.1 ADVANTAGES OF DWT OVER DCT

1. There is no need to divide the input image into non-overlapping 2-D blocks, so higher compression ratios can be achieved and blocking artifacts are avoided.

2. It allows good localization in both the time and spatial frequency domains.

3. Transformation of the whole image introduces an inherent scaling.

4. It gives better identification of which data are relevant to human perception, and hence higher compression ratios (64:1 vs. 500:1).

5. Higher flexibility: the wavelet function can be freely chosen.

8.2 DISADVANTAGES OF DWT

1. The cost of computing the DWT may be higher than that of the DCT.

2. The use of larger DWT basis functions or wavelet filters produces blurring and ringing noise near edge regions in images or video frames.

3. Longer compression time.

4. Lower quality than JPEG at low compression rates.

8.3 FUTURE ENHANCEMENT OF WAVELET ANALYSIS

1. The combined use of wavelet transforms and singular value decomposition (SVD) gives promising applications in fingerprint reading, mine detection, etc.


2. The application of the wavelet transform to determine the type of fault, with its automation incorporating a PNN, could achieve an accuracy of 100% for all types of faults, whereas a back-propagation algorithm could not distinguish all of the phase-to-ground and double-line-to-ground faults.

3. Wavelets can be applied as a possible vehicle for investigating the issue of market efficiency in futures markets for oil.

4. The application of wavelet theory to modeling and analyzing economic data (and phenomena) is still in its infancy, and many properties of these models are not yet explored in the economics and finance literature.

8.4 APPLICATIONS OF WAVELET ANALYSIS

Wavelets are a powerful statistical tool which can be used for a wide range of

applications, namely

1. Signal processing

2. Data compression

3. Smoothing and image denoising

4. Fingerprint verification

5. Biology: cell membrane recognition, to distinguish normal from pathological membranes

6. DNA analysis and protein analysis

7. Blood-pressure, heart-rate and ECG analyses

8. Finance (perhaps more surprisingly), for detecting the properties of quick variations of values

9. Internet traffic description, for designing the service size

10. Industrial supervision of gear-wheels

11. Speech recognition

12. Computer graphics and multifractal analysis

13. Many areas of physics have also seen this paradigm shift, including molecular dynamics, astrophysics, optics, turbulence and quantum mechanics.


CHAPTER 9

EXPERIMENTAL RESULT

In this preliminary experiment, a focused red flower against a blurred background is used as the illustrative example (see Fig. 7.1(a)). Fig. 7.1(b) shows the Y component of the image. Since the Y component is the luminance of the image, it looks like a gray-level image. The edges of the flower can then be detected using wavelet analysis, as shown in Fig. 7.1(c). The sensitivity to the threshold value used for binarization is investigated next. Fig. 7.2 shows the initial depth map for varying threshold values T. It can be seen that the best result occurs when T=15 (see Fig. 7.2(b)). Although the focused flower is separated from the background, some noise appears on the contour of the flower. In what follows, we examine the impact of smoothing on the depth map. Fig. 7.3 gives the depth map after smoothing; it can be seen that the noise in the depth map is defocused. Fig. 7.4(a) gives the depth map after smoothing and binarization using G=1 and threshold value T=0, Fig. 7.4(b) using G=2 and T=20, and Fig. 7.4(c) using G=3 and T=50. Fig. 8.1 summarizes the best result for each depth map, i.e. for each value of G (in our case, G = 1, 2, or 3) with its best threshold T. It is observed that Fig. 8.1(c) gives the best performance. In other words, the optimal parameter combination for obtaining the depth map of the image in Fig. 7.1(a) is G=3 and T=50.

Fig 8.1 The resulting depth maps: (a) G=1 and T=0, (b) G=2 and T=20, and (c) G=3 and T=50 (best).


CONCLUSION

This paper proposes a depth estimation method which converts two-dimensional

images of limited depth of field (DOF) into three-dimensional data. The experimental

result shows that we can get simple depth maps easily through the wavelet analysis,

binarization and smoothing techniques. However, our approach has difficulty with limited-DOF images in which the focused object itself is smooth; in such a situation, high-frequency energy lies only along the edges of the focused object. To overcome this drawback, our future work is to incorporate color features. In addition, a user-assisted workflow associated with visual cues might be used to solve such problems.


REFERENCES

[1] A. Redert, R. P. Berretty, C. Varekamp, O. Willemsen, J. Swillens, and H. Driessen, "Philips 3D Solutions from Content Creation to Visualization," The 3rd Int. Symposium on 3D Data Processing, Visualization, and Transmission, pp. 429-431, June 2006.

[2] I. Daubechies, "The Wavelet Transform, Time-Frequency Localization and Signal Analysis," IEEE Trans. on Information Theory, vol. 36, pp. 961-1005, 1990.

[3] S. Mallat, "A Wavelet Tour of Signal Processing", Academic Press, New York, 1999.

[4] D. F. Walnut, "An Introduction to Wavelet Analysis", Birkhäuser, Boston, 2001.

[5] C. Fehn, R. D. L. Barre, and S. Pastoor, "Interactive 3-D TV – Concepts and Key Technologies," Proc. IEEE, vol. 94, no. 3, March 2006.

[6] S. A. Valencia and R. M. Rodriguez-Dagnino, "Synthesizing Stereo 3D Views from Focus Cues in Monoscopic 2D Images," Proc. SPIE, vol. 5006, pp. 377-388, 2003.

[7] G. Gou, N. Zhang, L. Hou and W. Gao, "2D to 3D Conversion Based on Edge Defocus and Segmentation", Proc. ICASSP, pp. 2181-2184, 2008.

[8] A. Del Bimbo, "Visual Information Retrieval", Morgan Kaufmann, San Francisco, 1999.
