Computer Vision and Graphics (ee2031): Digital Image Processing I. Dr John Collomosse ([email protected]), Centre for Vision, Speech and Signal Processing, University of Surrey.



Page 1:

Computer Vision and Graphics (ee2031) Digital Image Processing I

Dr John Collomosse [email protected]

Centre for Vision, Speech and Signal Processing University of Surrey

Page 2:

Learning Outcomes

After attending this lecture, and doing the reading and labwork, you should be able to:

•  Describe the basic framework for performing linear filtering on a digital image (convolution)

•  Implement image blurring and sharpening operations.

•  Compare and contrast several low-pass filters and describe their operation in the context of image processing.

•  Define the Fourier transform in both continuous and discrete terms for the 1D and 2D cases.

•  Describe the convolution theorem, its links to the Fourier transform, and its implications for digital image processing.

Credit: some images in these slides are from Noah Snavely (Cornell), David Lowe (UBC), Steve Seitz (Washington), and various Creative Commons sources.

Page 3:

Further reading:

Page 4:

What is an Image?

An image is a rectangular grid (raster) of picture elements (= pixels)

=   255 255 255 255 255 255 255 255 255 255 255 255
    255 255 255 255 255 255 255 255 255 255 255 255
    255 255 255  20   0 255 255 255 255 255 255 255
    255 255 255  75  75  75 255 255 255 255 255 255
    255 255  75  95  95  75 255 255 255 255 255 255
    255 255  96 127 145 175 255 255 255 255 255 255
    255 255 127 145 175 175 175 255 255 255 255 255
    255 255 127 145 200 200 175 175  95 255 255 255
    255 255 127 145 200 200 175 175  95  47 255 255
    255 255 127 145 145 175 127 127  95  47 255 255
    255 255  74 127 127 127  95  95  95  47 255 255
    255 255 255  74  74  74  74  74  74 255 255 255
    255 255 255 255 255 255 255 255 255 255 255 255
    255 255 255 255 255 255 255 255 255 255 255 255

Typically 1 byte (8 bits) per pixel. 0=black, 255=white.

Greyscale (Y)

Page 5:

Colour Images

An image is a rectangular grid (raster) of picture elements (= pixels). Colour images use 3 rasters (Red, Green, Blue).

Y = 0.30 R + 0.59 G + 0.11 B

For this introduction we process a colour image simply by processing the R, G and B rasters independently, as you would a greyscale image.
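As a rough sketch of how this channel-wise processing might look in NumPy (the function names here are illustrative, not from the course lab code):

    import numpy as np

    def to_greyscale(rgb):
        # Luma: Y = 0.30 R + 0.59 G + 0.11 B, for an (H, W, 3) array
        return 0.30 * rgb[..., 0] + 0.59 * rgb[..., 1] + 0.11 * rgb[..., 2]

    def process_colour(rgb, greyscale_filter):
        # Apply the same greyscale operation independently to the R, G and B rasters
        return np.stack([greyscale_filter(rgb[..., c]) for c in range(3)], axis=-1)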

Page 6:

Image as a function

We can think of a greyscale image as a function R² → R

=   (the same 14 x 12 grid of pixel values as shown on Page 4)

f(x,y) = the intensity of pixel (x,y)

A digital image f(x,y) only has compact support (it is defined over a finite grid of pixels, not over the whole plane).

Page 7:

Image as a function

When we “do image processing” we can think of transforming the function f(x,y) to form a new function g(x,y).

These lectures will focus on a class of transform called linear transforms, because they:

1) are useful (e.g. noise reduction, finding edges, sharpening detail)

2) can be performed efficiently via convolution

e.g. g(x,y) = f(x,y) + 20 (brightens the image);  g(x,y) = f(−x,y) (mirrors the image)

Page 8:

Noise reduction

Suppose we take a photo of a stationary scene.

f(x,y) = I(x,y) + N(0,σ)

If the noise is independent and zero-mean from photo to photo, then we can take many photos and average them: the noise averages towards zero (the central limit theorem argument), giving a less noisy result.

(Results shown for averages of 1, 10, 100 and 1000 photos.)

image = signal + noise
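A minimal sketch of this frame-averaging idea (assuming the photos are already registered to one another; NumPy used purely for illustration):

    import numpy as np

    def average_photos(photos):
        # photos: list of (H, W) arrays of the same stationary scene.
        # Zero-mean noise tends to cancel as more photos are averaged.
        return np.mean(np.stack(photos, axis=0), axis=0)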

Page 9:

Gaussian

In that example we used a Gaussian distribution to model noise

N(µ,σ)

A Gaussian distribution is the normal distribution generalised to any mean (here µ = 0) and standard deviation (here σ = 10).

N(0,1) is the standard Normal distribution.
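To make the noise model f(x,y) = I(x,y) + N(0,σ) concrete, a small sketch that simulates it with σ = 10 (the flat test image I here is purely hypothetical):

    import numpy as np

    rng = np.random.default_rng(0)
    I = np.full((64, 64), 128.0)                            # hypothetical clean image I(x,y)
    noise = rng.normal(loc=0.0, scale=10.0, size=I.shape)   # samples from N(0, 10)
    f = np.clip(I + noise, 0, 255)                          # observed noisy image f(x,y)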

Page 10:

Noise reduction

What if we only have one photo (i.e. typical image processing)?

A pixel is usually very similar to its neighbours (spatial coherence), so we can average groups of neighbouring pixels together.

A window is centred on each pixel in turn and the mean value of the pixels beneath it is computed. The result is written to the corresponding pixel of a new image.

Input f(x,y):          Output g(x,y):
0 4 0 0 0 0            0 1 0 0
0 0 0 0 0 0            0 0 0 0
0 0 0 3 0 0            0 1 0 0
0 0 0 0 0 0            0 0 0 0
0 2 0 0 0 0
0 0 0 0 0 0

Page 11:

Another way of saying the same...

Consider a window containing a set of values.

For each pixel in the input:

1. Window values are multiplied with image beneath

2. The sum of these products is written to output image.

e.g. ((0 × 1) + (4 × 1) + (0 × 1) + ...) × 1/9 = 4/9

Window (1/9 ×):
1 1 1
1 1 1
1 1 1

Input f(x,y):          Output g(x,y):
0 4 0 0 0 0            1 1 0 0
0 0 0 0 0 0            0 0 0 0
0 0 0 3 0 0            0 1 0 0
0 0 0 0 0 0            0 0 0 0
0 2 0 0 0 0
0 0 0 0 0 0

Convolution

Page 12:

Terminology

Image f(x,y) was transformed into g(x,y) via convolution.

Each pixel was “replaced” by a linear combination of its neighbours. This is called linear filtering.

The weightings for each pixel were defined by the window

Input f(x,y)   *   1 1 1   =   Output g(x,y)
                   1 1 1
                   1 1 1

“Window” = “Template” = “Kernel” = “Filter” = “Mask”=...

Convolution operator, not multiplication!

i.e. the prescription for a linear filter is the values in the window
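As a sketch, the whole linear filtering procedure can be written as a double loop that slides the window over the image and writes each weighted sum to a new image. Zero padding at the borders is an assumption here, and the window is swept without flipping, which gives the same result as convolution for the symmetric filters used in this lecture:

    import numpy as np

    def linear_filter(f, h):
        # f: greyscale image, h: (2k+1) x (2k+1) window of weights
        k = h.shape[0] // 2
        padded = np.pad(f, k, mode="constant")          # zero padding at the borders
        g = np.zeros(f.shape, dtype=float)
        for y in range(f.shape[0]):
            for x in range(f.shape[1]):
                window = padded[y:y + 2 * k + 1, x:x + 2 * k + 1]
                g[y, x] = np.sum(window * h)            # sum of products
        return g

    box = np.ones((3, 3)) / 9.0                         # the mean / box filter
    # blurred = linear_filter(image, box)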

Page 13:

Closer look at the box filter

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

"Box filter" / "Box blur" / "Mean filter"

Any filter is itself a signal h(x,y)

Can be padded with zeros to match image size

Page 14:

Example of Box Filter

Blocky / square artifacts

More on this later...

Page 15:

Closer look at convolution process

Convolution expressed using the * operator.

Window of side 2k+1 (i.e. k = 1 for this 3x3 window):

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
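Written out, one standard way to express the convolution sum for a window h of side 2k+1 is:

    g(x, y) = sum over i = -k..k, j = -k..k of  h(i, j) · f(x − i, y − j)

For the symmetric windows used here (box, Gaussian) the sign convention on the offsets makes no difference to the result.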

Page 16:

Other choices of filter

We can put different values in the window to create different effects (i.e. produce different linear filters):

0 0 0
0 1 0
0 0 0

Original  *  (this filter)  =  identical image

Page 17:

Other choices of filter:

We can put different values in the window to create different effects (i.e. produce different linear filters):

0 0 0
0 0 1
0 0 0

Original  *  (this filter)  =  image shifted left by 1 pixel

Page 18:

Other choices of filter:

We can put different values in the window to create different effects (i.e. produce different linear filters):

0 0 0      1 1 1
0 2 0   -  1 1 1      Sharpening filter (accentuates edges)
0 0 0      1 1 1

Original  *  (sharpening filter)  =  sharpened image
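If the box term is taken to be the normalised mean filter (1/9 in each cell, an assumption carried over from the box filter above), the combined sharpening kernel works out as:

    2·impulse − (1/9)·box  =   -1/9   -1/9   -1/9
                               -1/9   17/9   -1/9
                               -1/9   -1/9   -1/9

Its entries sum to 1, so overall brightness is preserved, while each pixel's difference from its local mean is amplified; that is what accentuates edges.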

Page 19:

Example of Image Sharpening

Page 20:

More on blurring

We get better results blurring with a “Gaussian” filter vs. the “box” filter.

3x3 box filter:          3x3 Gaussian:
0.11 0.11 0.11           0.06 0.13 0.06
0.11 0.11 0.11           0.13 0.24 0.13
0.11 0.11 0.11           0.06 0.13 0.06

(Images: original, box-blurred, Gaussian-blurred.)

Page 21:

2D Gaussian

0.06 0.13 0.06
0.13 0.24 0.13
0.06 0.13 0.06

In this case x and y are measured as offsets from the centre of the template. The standard deviation is fixed. The window truncates the Gaussian function beyond a certain distance.
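A sketch of building such a truncated, normalised Gaussian window, sampling G(x,y) ∝ exp(−(x² + y²) / 2σ²) at integer offsets from the centre. The value σ ≈ 0.85 is an assumption, chosen because it reproduces values close to the 0.06 / 0.13 / 0.24 kernel above:

    import numpy as np

    def gaussian_kernel(k=1, sigma=0.85):
        # (2k+1) x (2k+1) window; x and y are offsets from the centre
        offsets = np.arange(-k, k + 1)
        xx, yy = np.meshgrid(offsets, offsets)
        g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
        return g / g.sum()                    # normalise so the weights sum to 1

    # print(np.round(gaussian_kernel(), 2))   # roughly the 3x3 Gaussian shown above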

Page 22:

Convolution - Topics

Convolution is a versatile filtering mechanism, but:-

1) As described, it is slow, O(nm) for an image of n pixels and a filter of m pixels, and will take ages to process modern (e.g. multi-megapixel) digital images.

2) We don’t yet understand why particular sets of values in the filters have the result they do....

To answer both we need to understand Fourier’s theorem.

Page 23:

Fourier’s Theorem

“Any periodic signal can be synthesised by summing (possibly infinitely) many sine and cosine waves of various amplitudes and frequencies”

(or equivalently: many cosine waves of various phase, amplitude and frequency)

Page 24:

Fourier Synthesis Example

Adding sine waves of increasing frequency (with decreasing amplitude) to make a square wave:

y = sin(t);
y = sin(t) + sin(3*t)/3;
y = sin(t) + sin(3*t)/3 + sin(5*t)/5 + sin(7*t)/7 + sin(9*t)/9;
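The same synthesis as a small runnable sketch (NumPy and Matplotlib, purely illustrative):

    import numpy as np
    import matplotlib.pyplot as plt

    t = np.linspace(0, 4 * np.pi, 2000)
    approximations = [
        np.sin(t),
        np.sin(t) + np.sin(3 * t) / 3,
        sum(np.sin(n * t) / n for n in (1, 3, 5, 7, 9)),
    ]
    for y in approximations:
        plt.plot(t, y)      # each curve is a closer approximation to a square wave
    plt.show()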

Page 25:

Fourier Transform (Terminology)

The Fourier Transform (FT) is a piece of mathematics that decomposes a real signal into its individual frequency components.

spatial domain  --- FT (analysis) --->  frequency domain
spatial domain  <-- IFT (synthesis) --  frequency domain

You can convert a signal in the spatial domain to the frequency domain, and back again, with no loss of information.

Page 26:

Fourier Transform (Continuous)

1D Fourier transform and 1D inverse Fourier transform:

f(x) is the signal.

F(u) is the "response" at frequency 'u'.

The response is a complex number, comprising a magnitude (r) and a phase (φ).

Page 27:

Fourier Transform (Continuous)

1D Fourier transform and 1D inverse Fourier transform:

f(x) is the signal.

F(u) is the "response" at frequency 'u' (a complex number).

Normalisation required because u is angular frequency (u=2πv)
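For reference, one common convention for the continuous 1D transform pair, consistent with the angular-frequency note above (the exact placement of the 1/2π factor varies between texts):

    F(u) = ∫ f(x) e^(−iux) dx
    f(x) = (1/2π) ∫ F(u) e^(+iux) du

F(u) = r e^(iφ) is complex: r = |F(u)| is the magnitude of the response and φ its phase.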

Page 28:

Discrete Fourier Transform

1D Discrete Fourier Transform: 1D Inverse DFT:

Because a digital signal is sampled and has compact support, the DFT is used on digital signals. It is near-identical in form to the continuous FT.

The Fast Fourier Transform (FFT) is a fast way of computing the DFT; the classic radix-2 algorithm works only when N is a power of 2 (Cooley and Tukey, 1965).

Demo
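A sketch of the 1D DFT written directly from its usual definition, F(u) = Σₓ f(x) e^(−i2πux/N), checked against NumPy's FFT (the inverse has the opposite sign in the exponent and a 1/N factor):

    import numpy as np

    def dft(f):
        # O(N^2) DFT straight from the definition
        N = len(f)
        x = np.arange(N)
        u = x.reshape(-1, 1)
        return np.exp(-2j * np.pi * u * x / N) @ f

    signal = np.random.rand(8)                          # N = 8, a power of 2
    assert np.allclose(dft(signal), np.fft.fft(signal))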

Page 29:

2D DFT

The FT and DFT also work over 2D (or any-dimensional) signals. The 2D case is very important, because images are 2D signals; recall f(x,y).

2D DFT:

2D IDFT:

Recall that converting to/from the frequency domain is lossless.

2D DFT / IDFT allows us to manipulate image in frequency domain.

Image  --FT-->  frequency domain  -->  manipulate frequencies  --IFT-->  result image

Page 30:

2D DFT – Implementation

The 2D DFT is a "separable" transform: it is computable by running a 1D DFT over each image row, and then running each column of the result through its own 1D DFT.

Separability makes the 2D DFT fast to compute.

If the image has side lengths that are powers of 2, the FFT can be used instead of the plain DFT to speed things up even further (equivalently, you can pad the image with a border of zeros until its sides are powers of 2).
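A quick numerical check of the separability claim: a 1D FFT over every row, followed by a 1D FFT over every column of the result, matches the 2D FFT:

    import numpy as np

    img = np.random.rand(64, 64)                  # side length is a power of 2
    rows = np.fft.fft(img, axis=1)                # 1D DFT of each row
    rows_then_cols = np.fft.fft(rows, axis=0)     # then 1D DFT of each column
    assert np.allclose(rows_then_cols, np.fft.fft2(img))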

Page 31:

2D DFT

What does an image "look" like in the frequency domain?

Visualising |F(u,v)| (i.e. the magnitude of each frequency component)

Demo

(Figure: f(x,y) and its spectrum |F(u,v)|; annotations mark the origin (the d.c. component), lower frequencies, and higher frequencies.)

Page 32:

2D DFT: simpler examples

What will |F(u,v)| look like?

(Figure: f(x,y) and F(u,v).)

Page 33:

2D DFT: simpler examples

Although the result is predominantly what you would expect, there are additional high frequencies introduced.

This is because the signal isn't periodic (most images aren't).

What will |F(u,v)| look like?

Page 34:

2D DFT

Image processing by manipulating the frequency domain F(u,v):

“Ideal” Low-pass filter “Ideal” High-pass filter
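A sketch of this kind of frequency-domain manipulation: keep (or remove) all frequencies inside a circle around the d.c. component. The cut-off radius used here is arbitrary:

    import numpy as np

    def ideal_filter(img, radius=20, low_pass=True):
        F = np.fft.fftshift(np.fft.fft2(img))     # spectrum with the d.c. term at the centre
        h, w = img.shape
        v, u = np.ogrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
        keep = (u**2 + v**2) <= radius**2         # "ideal" low-pass mask
        if not low_pass:
            keep = ~keep                          # "ideal" high-pass mask
        return np.real(np.fft.ifft2(np.fft.ifftshift(F * keep)))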

Page 35:

Recall: Convolution

Convolution expressed using the * operator.

Window of side 2k+1 (i.e. k = 1 for this 3x3 window):

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

Slow O(nm)

Page 36:

Convolution Theorem

Convolution can be performed faster by converting both the image and the filter into the frequency domain (2D DFT), multiplying them together, and converting the result back (2D IDFT).

f(x,y) image   --FT-->   F[f]
h(x,y) filter  --FT-->   F[h]
F[f] × F[h]    --IFT-->  g(x,y)

F[.] indicates the FT of a function.

By considering convolution in this way, we can also understand why filters behave the way they do.
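A sketch of the theorem in use: zero-pad both signals to the full output size, multiply their 2D DFTs, and invert. Under those assumptions it reproduces direct (full) convolution, up to floating-point error:

    import numpy as np

    def conv2d_direct(f, h):
        # full linear convolution by accumulating shifted, scaled copies of f
        out = np.zeros((f.shape[0] + h.shape[0] - 1, f.shape[1] + h.shape[1] - 1))
        for i in range(h.shape[0]):
            for j in range(h.shape[1]):
                out[i:i + f.shape[0], j:j + f.shape[1]] += h[i, j] * f
        return out

    def conv2d_fft(f, h):
        # F[f * h] = F[f] . F[h]: multiply the spectra, then inverse-transform
        shape = (f.shape[0] + h.shape[0] - 1, f.shape[1] + h.shape[1] - 1)
        return np.real(np.fft.ifft2(np.fft.fft2(f, shape) * np.fft.fft2(h, shape)))

    f = np.random.rand(32, 32)
    h = np.ones((3, 3)) / 9.0
    assert np.allclose(conv2d_direct(f, h), conv2d_fft(f, h))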

Page 37:

Recall: Why is Gaussian better?

We get better results blurring with a “Gaussian” filter vs. the “box” filter.

(3x3 box filter and 3x3 Gaussian kernels, and the original / box-blurred / Gaussian-blurred results, as shown on Page 20.)

Page 38:

Fourier Analysis of Common Filters

Visualisations of 1D box and Gaussian filters and their Fourier transforms.

Page 39:

Fourier Analysis of Common Filters

Visualisations of 2D box and Gaussian filters and their Fourier transforms: the box transforms to a sinc, and the Gaussian transforms to a Gaussian.

Page 40:

Observations

The "ideal" low-pass filter corresponds to a "sinc" in the spatial domain: sinc(x) = sin(x)/x

The FT of a box is a sinc scaled according to the size of the box.

The FT of a Gaussian of σ is a Gaussian of 1/σ

The opposite holds too (i.e. FT of a sinc is a box, etc.).

Page 41:

Recall: Box vs. Gaussian blur

Can you explain the artifacts in the box filtered image?

(Figures: original image, box-filtered and Gaussian-filtered results.)

Page 42:

Question

How do we produce this “ideal” low-pass filtering scenario:

f(x,y) image   --FT-->   F[f]
h(x,y) filter  --FT-->   F[h] = ?
F[f] × F[h]    --IFT-->  ideal low-pass filtered result

Answer: use a 2D sinc filter (the "ideal low-pass filter"). But sinc has infinite extent and so cannot be represented exactly in a digital image, because digital images have compact support.

Page 43:

Ideal low-pass filter

Sinc is an oscillating function of infinite extent, and so is unsuitable as a filter for digital images, which have compact support.

Truncating the sinc to fit a finite spatial filter creates artifacts in the frequency domain, and thus ringing artifacts in the image.


Page 44:

Gaussian low-pass filter

A Gaussian does not have this problem. Although it also has infinite extent, it does not oscillate and it decays smoothly, so it is "well behaved".

(The FT of a Gaussian of σ is a Gaussian of 1/σ.)

So 1/σ determines the bandwidth of the frequencies passed: a larger spatial σ keeps only a narrower band of low frequencies.


Page 45:

Back to sharpening:

0 0 0      1 1 1
0 2 0   -  1 1 1      Sharpening filter (accentuates edges)
0 0 0      1 1 1

Original (unfiltered)  *  (sharpening filter)  =  filtered result

Page 46:

Back to sharpening

What does the blurring take out?

detail  =  original  −  smoothed (5x5)

Boost the detail and add it back:

sharpened  =  original  +  α × detail

Source: S. Lazebnik
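A sketch of that recipe (often called unsharp masking); the 5x5 box blur and the value of α here are illustrative choices, not those used on the slide:

    import numpy as np

    def sharpen(image, alpha=1.0):
        # sharpened = original + alpha * (original - smoothed)
        k = np.ones((5, 5)) / 25.0                       # 5x5 smoothing window
        padded = np.pad(image.astype(float), 2, mode="edge")
        smoothed = np.zeros(image.shape, dtype=float)
        for y in range(image.shape[0]):
            for x in range(image.shape[1]):
                smoothed[y, x] = np.sum(padded[y:y + 5, x:x + 5] * k)
        detail = image - smoothed
        return np.clip(image + alpha * detail, 0, 255)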

Page 47:

Sharpening

(Figure labels: Gaussian, scaled impulse, Laplacian of Gaussian; image, blurred image.)

0 0 0      1 1 1
0 2 0   -  1 1 1
0 0 0      1 1 1

Page 48:

Fourier analysis - LoG

Laplacian of Gaussian (LoG)

(Figure: the LoG filter and its FT.)

Similar to the Gaussian, the FT of a LoG is a LoG. What will this do to the high frequencies?

Page 49:

Example of Image Sharpening

Page 50:

Summary

After attending this lecture, and doing the reading and labwork, you should be able to:

•  Describe the basic framework for performing linear filtering on a digital image (convolution)

•  Implement image blurring and sharpening operations.

•  Compare and contrast several low-pass filters and describe their operation in the context of image processing.

•  Define the Fourier transform in both continuous and discrete terms for the 1D and 2D cases.

•  Describe the convolution theorem, its links to the Fourier transform, and its implications for digital image processing.