Mobile Robotics: 6. Vision 1

Dr. Brian Mac Namee (www.comp.dit.ie/bmacnamee)

Acknowledgments

These notes are based (heavily) on those provided by the authors to accompany “Introduction to Autonomous Mobile Robots” by Roland Siegwart and Illah R. Nourbakhsh

More information about the book is available at: http://autonomousmobilerobots.epfl.ch/

The book can be bought from The MIT Press and Amazon.com

Today’s Lecture

Why is vision hard?

Brief historical overview
– From early cameras to digital cameras

Low-level robot vision
– Camera as sensor
– Color representation
– Object detection

Vision In General

Vision is our most powerful sense. It provides us with an enormous amount of information about our environment and enables us to interact intelligently with it

It is therefore not surprising that an enormous amount of effort has gone into giving machines a sense of vision

Vision is also our most complicated sense
– Whilst we can reproduce views with high resolution on photographic paper, our understanding of how the brain processes the information from our eyes is still in its infancy

Vision In General (cont…)

When an image is recorded through a camera, a 3-D scene is projected onto a 2-D plane

In order to try and recover some “useful information” from the scene, edge detectors are usually used to find the contours of the objects

Much research time has been spent attempting to produce foolproof algorithms that can take these edges or edge fragments and provide all the information required to reconstruct the 3-D scene which produced the 2-D image

The interpretation of 3-D scenes from 2-D images is not a trivial task
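A minimal sketch of why the 3-D to 2-D projection loses information, assuming an ideal pinhole camera with the optical axis along Z. The function name `project` and the unit focal length are illustrative assumptions, not something taken from the lecture. Two points on the same viewing ray land on the same image position, so depth cannot be read back from a single image.

```python
def project(point, focal_length=1.0):
    """Project a 3-D camera-frame point (X, Y, Z) onto the 2-D image plane.

    Assumes an ideal pinhole camera looking along +Z (hypothetical model).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 0.5, 2.0)
far = (2.0, 1.0, 4.0)  # same viewing ray, twice as far from the camera

print(project(near))  # (0.5, 0.25)
print(project(far))   # (0.5, 0.25)
```

Both points produce the same image coordinates, which is one reason reconstruction from a single 2-D image is ambiguous.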

Vision Is Hard! (Segmentation)

Vision Is Hard! (Classification)

Vision Is Hard! (Perspectives)

Vision Is Hard! (Brightness Adaptation)

For more great illusion examples take a look at: http://web.mit.edu/persci/gaz/

Vision Is Hard! (Illusions)

Our visual systems play lots of interesting tricks on us


Stare at the cross in the middle of the image and think circles

Camera Obscura

Mo Ti, Chinese philosopher, 5th century B.C.
– Described linear light paths, pinhole image formation

Leonardo da Vinci (1452-1519)
– Demonstrated camera obscura (lens added later)

[Figure: camera obscura illustration, Frisius (1544)]

[Photograph of camera obscura interior: Portmeirion Village, North Wales]

Toward Photography

People sought a way to “fix” the images at the back of the camera obscura

They pursued decades of experimentation with light-sensitive salts, acids, etc.

When was the first photograph produced?

First Photograph

Joseph Nicéphore Niépce, “View from the Window at Le Gras”, c. 1826

Pewter plate coated with light-sensitive material

More information on the first photograph is available at: http://www.hrc.utexas.edu/exhibitions/permanent/wfp/

[Figures: the photograph (Harry Ransom Center) and a Kodak reproduction; portrait of Joseph Nicéphore Niépce]

First Digital Cameras

Photoelectric effect (Hertz 1887; Einstein 1905)

Charge-coupled devices as storage (late 1960s)

Light sensing, pixel row readout (early 1970s)

First electronic CCD still-image camera (1975):
– Fairchild CCD element
– Resolution: 100 x 100 b&w
– Image capture time: 23 sec., mostly writing cassette tape
– Total weight: 8½ pounds

[Figure: Kodak, c. 1975]

Modern Digital Cameras

Today, fifty Euro buys a camera with:
– 640 x 480 pixel resolution at 30Hz
– 1280 x 960 still image resolution
– 24-bit RGB pixels (8 bits per channel)
– Automatic gain control, color balancing
– On-chip lossy compression algorithms
– Uncompressed images if desired
– Integrated microphone, USB interface
– Limitations
• Narrow dynamic range
• Narrow FOV, with fixed spatial resolution
• No motion / active vision capabilities
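As a rough worked example of what these figures imply, the sketch below computes the uncompressed data rate of a 640 x 480, 24-bit stream at 30Hz. The helper name `raw_data_rate` is a hypothetical choice for illustration. The result, about 27.6 million bytes per second, suggests why on-chip compression is attractive on a low-cost camera.

```python
def raw_data_rate(width, height, bits_per_pixel, frames_per_second):
    """Return (bytes per frame, bytes per second) for uncompressed video."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame, bytes_per_frame * frames_per_second


frame_bytes, bytes_per_second = raw_data_rate(640, 480, 24, 30)

print(frame_bytes)       # 921600
print(bytes_per_second)  # 27648000
```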

Vision-Based Sensors: Hardware

CCD (light-sensitive, discharging capacitors of 5 to 25 microns)

CMOS (Complementary Metal Oxide Semiconductor) technology

[Figures: 2048 x 2048 CCD array; Canon IXUS 300; Sony DFW-X700; Orangemicro iBOT Firewire]

What Is A Digital Image?

A digital image is a 2-D representation of a 3-D scene as a finite set of digital values, called picture elements or pixels

What Is A Digital Image? (cont…)

Pixel values typically represent gray levels, colours, heights, opacities etc.

Remember that digitization implies that a digital image is an approximation of a real scene

[Figure label: 1 pixel]
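The approximation can be made concrete with a small sketch: a continuous brightness function is sampled at a finite number of pixel centres, and each sample is rounded down to one of a finite number of gray levels. The names `quantize`, `sample_row` and the `ramp` scene are hypothetical and chosen only for illustration.

```python
def quantize(intensity, bits=8):
    """Map a continuous intensity in [0.0, 1.0] to an integer gray level."""
    levels = 2 ** bits
    clamped = min(max(intensity, 0.0), 1.0)
    return min(int(clamped * levels), levels - 1)


def sample_row(scene, n_pixels, bits=8):
    """Sample a continuous 1-D scene function at the pixel centres."""
    return [quantize(scene((i + 0.5) / n_pixels), bits) for i in range(n_pixels)]


def ramp(x):
    """Hypothetical scene: brightness rises linearly from left to right."""
    return x


print(sample_row(ramp, 4, bits=2))  # [0, 1, 2, 3]
print(quantize(0.5))                # 128
```

With only 4 pixels and 2 bits per pixel, the smooth ramp becomes four discrete steps; more pixels and more bits give a closer approximation, but it remains an approximation.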

Digital Image Contents

Why are pixels represented as “RGB”?
– Is the world made of red, green, and blue “stuff”?

… The answer requires a digression (or two) about human vision and cameras as sensors

[Figure: image coordinates (pixels), with axes u and v, image width and height]

Visible Light Spectrum

[Figures: visible light spectrum (Freedman & Kaufmann, Universe); solar spectrum (ECI, Oxford); incandescent spectrum (Wikipedia)]

Image As Measurement

What does the eye/camera actually observe?

… the product of the illumination spectrum with the absorption or reflection spectrum!

[Figure (IJVS): observed spectrum = illumination spectrum × reflection spectrum, at each image point]
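The product described above can be sketched with spectra sampled at a few wavelengths. The three sample wavelengths and all spectrum values below are made-up illustrative numbers, and `observed_spectrum` is a hypothetical helper name. The point is that the same surface returns a different spectrum under different lighting.

```python
# Coarse sample wavelengths in nm (assumed for illustration only)
WAVELENGTHS_NM = [450, 550, 650]


def observed_spectrum(illumination, reflectance):
    """Multiply illumination by reflectance at each sampled wavelength."""
    return [i * r for i, r in zip(illumination, reflectance)]


white_light = [1.0, 1.0, 1.0]
reddish_light = [0.25, 0.5, 1.0]   # hypothetical warm light source
green_surface = [0.1, 0.8, 0.1]    # hypothetical reflection spectrum

print(observed_spectrum(white_light, green_surface))    # [0.1, 0.8, 0.1]
print(observed_spectrum(reddish_light, green_surface))  # [0.025, 0.4, 0.1]
```

Under the reddish light the green surface reflects far less at 550 nm, so what the sensor measures depends on the light source as much as on the object.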

Eye Anatomy

Spectrum incident on light-sensitive retina

[Figure, after Isaka (2004): view of the right eye from above, showing the incident spectral distribution and the rods and cones]

Blind-Spot Experiment

Draw an image similar to that below on a piece of paper (the dot and cross are about 6 inches apart)

Close your right eye and focus on the cross with your left eye

Hold the image about 20 inches away from your face and move it slowly towards you

The dot should disappear!

Cone Sensitivities

Three cone types (S, M, and L) are roughly blue, green, and red sensors, respectively

Their peak sensitivities occur at roughly 420–440 nm (S), 530–545 nm (M), and 560–580 nm (L) for the “average” human

[Figures: rods & cones ~1.35 mm from the center of the fovea; rods & cones ~8 mm from the center of the fovea; cone sensitivities as a function of wavelength]

Color Perception

The cones form a spectral “basis” for visible light; the incident spectral distribution differentially excites the S, M, and L cones, leading to color vision

[Figure (IJVS): cone excitation as a product of spectra, at each cone site]
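A minimal sketch of how an incident spectrum could excite the three cone types differently. The cone sensitivity curves here are modelled as Gaussians, which is a simplifying assumption; the peak wavelengths and the curve width are rough illustrative values, not measured data, and the function names are hypothetical.

```python
import math

# Assumed Gaussian stand-ins for the real cone sensitivity curves
CONE_PEAKS_NM = {"S": 440.0, "M": 540.0, "L": 570.0}
CONE_WIDTH_NM = 40.0


def sensitivity(cone, wavelength_nm):
    """Relative sensitivity of one cone type at one wavelength (assumed model)."""
    peak = CONE_PEAKS_NM[cone]
    return math.exp(-0.5 * ((wavelength_nm - peak) / CONE_WIDTH_NM) ** 2)


def cone_responses(spectrum, wavelengths_nm):
    """Sum spectrum x sensitivity over wavelength for each cone type."""
    return {
        cone: sum(p * sensitivity(cone, w) for p, w in zip(spectrum, wavelengths_nm))
        for cone in CONE_PEAKS_NM
    }


wavelengths = list(range(400, 701, 10))
blue_light = [1.0 if w <= 480 else 0.0 for w in wavelengths]
red_light = [1.0 if w >= 600 else 0.0 for w in wavelengths]

blue = cone_responses(blue_light, wavelengths)
red = cone_responses(red_light, wavelengths)

print(max(blue, key=blue.get))  # S
print(max(red, key=red.get))    # L
```

Each spectrum is reduced to just three numbers, one per cone type, which is the sense in which the cones act as a “basis” for color.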

Origin Of RGB CCD Sensors

So, in a concrete sense, CCD chips are designed as RGB sensors in order to emulate the human visual system

[Figure credit: Vaytek]

RGB Colour Model

Think of R, G, B as an orthogonal colour basis

(0,1,0) – pure green

(0,0,1) – pure blue

(1,0,0) – pure red

(1,1,1) – white

(0,0,0) – black (hidden)
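The corner values listed above can be explored with a short sketch that treats colours as points in the unit RGB cube. The helper names `to_unit_rgb` and `add_colours` are hypothetical, and clamping the sum at 1.0 is an assumed way of keeping mixed colours inside the cube.

```python
def to_unit_rgb(r8, g8, b8):
    """Convert 8-bit channel values (0-255) to coordinates in the unit RGB cube."""
    return (r8 / 255, g8 / 255, b8 / 255)


def add_colours(a, b):
    """Additively mix two colours, clamping each channel at 1.0."""
    return tuple(min(x + y, 1.0) for x, y in zip(a, b))


RED, GREEN, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

print(to_unit_rgb(255, 0, 0))                      # (1.0, 0.0, 0.0)
print(add_colours(RED, GREEN))                     # (1.0, 1.0, 0.0)
print(add_colours(add_colours(RED, GREEN), BLUE))  # (1.0, 1.0, 1.0)
```

Red plus green gives the yellow corner of the cube, and adding blue reaches white.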

HSV Colour Model

More robust against illumination changes

Still must confront noise, specularity etc.
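A small sketch of the robustness claim, using the `colorsys` module from the Python standard library. The orange colour and the factor of one half are arbitrary example values. Halving the brightness changes all three RGB channels, but in HSV only the value component changes.

```python
import colorsys

orange = (1.0, 0.5, 0.0)
dim_orange = tuple(0.5 * c for c in orange)  # same surface, half the light

h1, s1, v1 = colorsys.rgb_to_hsv(*orange)
h2, s2, v2 = colorsys.rgb_to_hsv(*dim_orange)

print(round(h1 * 360), round(h2 * 360))  # 30 30
print(s1, s2)                            # 1.0 1.0
print(v1, v2)                            # 1.0 0.5
```

This only models a uniform change in brightness; noise and specular highlights still shift hue and saturation.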

Object Detection

Suppose we want to detect an object (e.g., a colored ball) in the field of view

We simply need to identify the pixels of some desired colour in the image … right?

[Figure: image coordinates (pixels), with axes u and v, image width and height]
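The naive approach suggested above can be sketched as follows: mark every pixel whose hue is close to a target hue, then report the centroid of the marked pixels. The function name `detect_colour`, the thresholds, and the convention that u is the column index and v the row index are all assumptions made for illustration. The tiny test image is ideal, with no noise, shadows or highlights.

```python
import colorsys


def detect_colour(image, target_hue, hue_tol=0.05, min_sat=0.5, min_val=0.2):
    """Return the (u, v) centroid of pixels near target_hue, or None.

    image is a list of rows; each pixel is an (r, g, b) tuple in [0, 1].
    u is the column index and v is the row index (assumed convention).
    """
    matches = []
    for v, row in enumerate(image):
        for u, (r, g, b) in enumerate(row):
            h, s, val = colorsys.rgb_to_hsv(r, g, b)
            # Hue is circular, so measure the distance around the colour wheel
            dh = min(abs(h - target_hue), 1.0 - abs(h - target_hue))
            if dh <= hue_tol and s >= min_sat and val >= min_val:
                matches.append((u, v))
    if not matches:
        return None
    n = len(matches)
    return (sum(u for u, _ in matches) / n, sum(v for _, v in matches) / n)


GREY, RED = (0.5, 0.5, 0.5), (0.9, 0.1, 0.1)
image = [
    [GREY, GREY, GREY, GREY],
    [GREY, RED, RED, GREY],
    [GREY, RED, RED, GREY],
]

print(detect_colour(image, target_hue=0.0))  # (1.5, 1.5)
```

On a clean synthetic image this works; the fixed thresholds are exactly what tends to fail once lighting and surface effects vary.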

Real-World Images

Occluded light source

Specular highlights

Mixed pixels

Complex surface geometry (self-shadowing)

Noise!

Summary

Vision is our most useful sense, but is also the most difficult to replicate

Digital cameras have evolved to be powerful, cheap and accurate

Artificial vision systems tend to be modelled after the human eye

How do we do it?

Questions?
