ME5286 – Lecture 3 (Theory)
Lecture 3: Digital Image Representation
and Color Fundamentals
Saad J [email protected]
Last Lecture
• Image Formation
  – Pinhole Camera
  – Lenses
• Human Visual System
• Digital Cameras Capture Components
Outline for this Lecture
• Digital Image Representation
  – Sampling, Quantization
• Color Fundamentals
• Digital Cameras
  – Digital Sensor
Image Sensing and Acquisition and Digital Image Representation: Sampling and Quantization
Digital Image Quality factors
Image digitization
• Sampling: measure the value of an image at a finite number of points.
• Quantization: represent the measured value (e.g., a voltage) at each sampled point by an integer.
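The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `digitize`, the unit interval used as the sampling domain, and the round-to-nearest rule are all assumptions made for the example.

```python
def digitize(signal, n_samples, n_bits, lo=0.0, hi=1.0):
    """Sample `signal` on [0, 1) and quantize each sample to an integer code.

    signal    : any function of one variable (a stand-in for a sensor voltage)
    n_samples : number of evenly spaced sampling points (sampling step)
    n_bits    : bits per sample, giving 2**n_bits integer levels (quantization step)
    lo, hi    : assumed input range; values outside it are clipped
    """
    levels = 2 ** n_bits
    codes = []
    for k in range(n_samples):
        value = signal(k / n_samples)            # sampling: measure at a finite set of points
        clipped = min(max(value, lo), hi)
        scaled = (clipped - lo) / (hi - lo) * (levels - 1)
        codes.append(int(scaled + 0.5))          # quantization: round to the nearest integer level
    return codes


# Hypothetical input: a linear ramp sampled at 4 points with 8 bits per sample.
ramp_codes = digitize(lambda t: t, n_samples=4, n_bits=8)
```

Raising `n_samples` improves spatial detail, while raising `n_bits` improves amplitude detail; the two choices are independent.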
Image digitization
World → Camera → Digitizer → Digital Image
  0  10  10  15  50  70  80
  0   0 100 120 125 130 130
  0  35 100 150 150  80  50
  0  15  70 100  10  20  20
  0  15  70   0   0   0  15
  5  15  50 120 110 130 110
  5  10  20  50  50  20 250
Each entry is a PIXEL (picture element). Typically: 0 = black, 255 = white.
Digital Images
Image Digitization
Digital Image
Grayscale Image:
– 2D Matrix
– 8 bits/pixel
– Values range from 0 to 255
Digital Image
An image is a 2D rectilinear array of pixels (picture elements) with a FIXED number of samples: NxM
[Figure panels: N=M=30; N=M=256]
Digital Image
No continuous values: quantization is represented by the number of bits per pixel
[Figure panels: L=1 (1 bit); L=3 (2 bits); L=15 (4 bits); L=255 (8 bits)]
Sampling and Quantization
Uniform sampling
• Digitized in spatial domain (image I of size M x N)
• M and N are usually integer powers of two
• Nyquist theorem and Aliasing
• Non-uniform sampling
  – communication
Original grid:
(0,0) (0,1) (0,2) (0,3)
(1,0) (1,1) (1,2) (1,3)
(2,0) (2,1) (2,2) (2,3)
(3,0) (3,1) (3,2) (3,3)
Sampled by 2:
(0,0) (0,0) (0,2) (0,2)
(0,0) (0,0) (0,2) (0,2)
(2,0) (2,0) (2,2) (2,2)
(2,0) (2,0) (2,2) (2,2)
Image Sampling
[Figure panels: original image; sampled by a factor of 2; sampled by a factor of 4; sampled by a factor of 8]
Image Decimation and Interpolation
• Decimation is the reduction in dimension or resolution of the image (subsampling)
  – Decimation by 2 results in an image half the size
  – The simplest method is skipping pixels
• Interpolation is the increase in dimension or resolution of the image
  – Interpolation by 2 results in an image double the size
  – The simplest method is duplicating pixels
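The two "simplest methods" named above can be sketched as follows. The function names `decimate` and `interpolate` and the list-of-rows image layout are assumptions for this example; practical resizing normally filters before decimating and uses smoother interpolation than duplication.

```python
def decimate(image, factor):
    """Reduce resolution by keeping every `factor`-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]


def interpolate(image, factor):
    """Increase resolution by duplicating each pixel `factor` times in each direction."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]   # duplicate along the row
        out.extend(list(wide) for _ in range(factor))            # duplicate the row itself
    return out


# A 4x4 test image whose "pixel values" are their own (row, column) coordinates.
grid = [[(r, c) for c in range(4)] for r in range(4)]
small = decimate(grid, 2)        # 2x2 image
restored = interpolate(small, 2) # back to 4x4, with each kept pixel repeated in a 2x2 block
```

Decimating the coordinate grid by 2 and then duplicating pixels gives a 4x4 grid in which every 2x2 block carries the coordinates of its top-left pixel, which shows that the skipped pixels are not recovered.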
Image Pyramids
• Known as a Gaussian Pyramid [Burt and Adelson, 1983]
• In computer graphics, a mip map [Williams, 1983]
Effect of Sampling
• Simple example: a sine wave
Undersampling
• What if we “missed” things between the samples?
• Simple example: undersampling a sine wave
  – unsurprising result: information is lost
  – surprising result: indistinguishable from a lower frequency
  – it was also always indistinguishable from higher frequencies
  – aliasing: signals “traveling in disguise” as other frequencies
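A small numerical check makes the "disguise" concrete. The sketch below is an illustration with assumed values (a 10 Hz sampling rate and a 9 Hz sine); the helper name `sample_sine` is hypothetical. It samples a sine above the Nyquist limit and compares it with the low-frequency sine it aliases to.

```python
import math


def sample_sine(freq_hz, rate_hz, n_samples):
    """Return samples of sin(2*pi*freq*t) taken at t = k / rate_hz, k = 0..n_samples-1."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n_samples)]


rate = 10.0                             # samples per second, so the Nyquist limit is 5 Hz
high = sample_sine(9.0, rate, 20)       # 9 Hz sine: undersampled
alias = sample_sine(9.0 - rate, rate, 20)  # -1 Hz, i.e. a 1 Hz sine with flipped sign

# Largest sample-by-sample difference between the two sequences.
max_diff = max(abs(a - b) for a, b in zip(high, alias))
```

`max_diff` is zero up to floating-point rounding: once sampled, the 9 Hz signal carries exactly the same numbers as a 1 Hz signal, and the same holds for 19 Hz, 29 Hz and so on.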
What’s happening?
Input signal:
x = 0:.05:5; imagesc(sin((2.^x).*x))
Plot as image:
Alias! Not enough samples
Antialiasing
• What can we do about aliasing?
• Sample more often
  – Join the Mega-Pixel craze of the photo industry
  – But this can’t go on forever
• Make the signal less “wiggly”
  – Get rid of some high frequencies
  – Will lose information
  – But it’s better than aliasing
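The second option can be illustrated with a one-dimensional row of pixels. This is a rough sketch under stated assumptions: a box (block-average) filter stands in for a proper low-pass filter, and the function names are made up for the example.

```python
def subsample(signal, factor):
    """Naive subsampling: keep every `factor`-th sample, no filtering."""
    return signal[::factor]


def box_prefilter_subsample(signal, factor):
    """Average each block of `factor` samples, then keep one value per block.

    The block average is a crude low-pass filter: it removes the fastest
    variations before the sampling rate is reduced.
    """
    blocks = [signal[i:i + factor] for i in range(0, len(signal) - factor + 1, factor)]
    return [sum(block) / len(block) for block in blocks]


stripes = [0, 255] * 4                            # finest possible black/white stripes
naive = subsample(stripes, 2)                     # picks only the black pixels
smoothed = box_prefilter_subsample(stripes, 2)    # mid-gray everywhere
```

The naive result claims the row is solid black, which is an alias. The filtered result loses the stripes too, but reports the correct average brightness, which is the sense in which losing information is "better than aliasing".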
Aliasing effect
Aliasing artifacts (the Moiré effect)
http://www.wfu.edu/~matthews/misc/DigPhotog/alias/
Uniform quantization
• Digitized in amplitude (or pixel value)
• PGM – 256 levels → 4 levels
• Compute the uniform step that represents one level
  – step = 256 / 4 = 64 in this case
[Figure: input range 0 to 255 divided at 64, 128 and 192; output levels 0, 1, 2 and 3]
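The 256-to-4 mapping above amounts to an integer division by the step. The sketch below assumes that reading; the function names and the mid-bin reconstruction value are illustrative choices, not something stated on the slide.

```python
def quantize_uniform(value, in_levels=256, out_levels=4):
    """Map an input value in [0, in_levels - 1] to a level index in [0, out_levels - 1]."""
    step = in_levels // out_levels       # 256 // 4 = 64
    return value // step


def reconstruct(level, in_levels=256, out_levels=4):
    """Pick a representative gray value for a level (assumed here: the middle of its bin)."""
    step = in_levels // out_levels
    return level * step + step // 2


levels = [quantize_uniform(v) for v in (0, 63, 64, 128, 192, 255)]
```

Values 0 to 63 map to level 0, 64 to 127 to level 1, 128 to 191 to level 2, and 192 to 255 to level 3.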
Image Quantization
• 256 gray levels (8 bits/pixel); 32 gray levels (5 bits/pixel); 16 gray levels (4 bits/pixel)
• 8 gray levels (3 bits/pixel); 4 gray levels (2 bits/pixel); 2 gray levels (1 bit/pixel)
Issues with Dynamic Range
[Figure: example scenes labeled 1; 1,500; 25,000; 400,000; 2,000,000,000]
- The real world has high dynamic range
- Uniform sampling is not optimal
- Wide Dynamic Range combines multiple captures
Color Image Processing
Color Image Processing
• Color
  – simplifies object extraction and identification
  – human vision: thousands of colors vs. at most 24 gray levels
• Color Spectrum
  – white light with a prism (1666, Newton)
Gray scale Image
Color Image
What Is Light?
• The visible portion of the electromagnetic (EM) spectrum.
• It occurs between wavelengths of approximately 400 and 700 nanometers.
Color Spectrum
The experiment of Sir Isaac Newton, in 1666.
Color spaces
• How can we represent color?
http://en.wikipedia.org/wiki/File:RGB_illumination.jpg
Human Eye
• Three different types of cones; each type has a special pigment that is sensitive to wavelengths of light in a certain range:
  – Short (S) corresponds to blue
  – Medium (M) corresponds to green
  – Long (L) corresponds to red
• Ratio of L to M to S cones:
  – approx. 10:5:1
• Almost no S cones in the center of the fovea
[Figure: relative absorbance (%) vs. wavelength (nm), 400 to 650 nm; peaks at 440 nm (S), 530 nm (M) and 560 nm (L)]
Color images
• Color representation is based on the theory of T. Young (1802), which states that any color can be produced by mixing three primary colors C1, C2, C3:
  C = aC1 + bC2 + cC3
• It is therefore possible to characterize a psycho-visual color by specifying the amounts of three primary colors: red, green and blue, mixed together.
• This led to the standard RGB space used in television, computer monitors, smart phones, etc.
Color Fundamentals
Standard wavelength values for the primary colors
Color Fundamentals
Tri-stimulus values: the amounts of Red, Green and Blue needed to form any particular color
Denoted by: X, Y and Z
Tri-chromatic coefficients:
$$x = \frac{X}{X+Y+Z}, \qquad y = \frac{Y}{X+Y+Z}, \qquad z = \frac{Z}{X+Y+Z}$$
$$x + y + z = 1$$
Color Fundamentals
Any patch of light can be completely described physically by its spectrum: the number of photons (per time unit) at each wavelength from 400 to 700 nm.
[Figure: number of photons (per ms) vs. wavelength (nm), 400 to 700 nm]
© Stephen E. Palmer, 2002
Color Examples
Some examples of the reflectance spectra of surfaces
[Figure: % photons reflected vs. wavelength (nm), 400 to 700 nm; panels: Red, Yellow, Blue, Purple]
© Stephen E. Palmer, 2002
Tetrachromatism
• Most birds, and many other animals, have cones for ultraviolet light.
• Some humans, mostly female, seem to have slight tetrachromatism.
Bird cone responses
More Spectra
metamers
Color Image Representation
• RGB Model
Color Image Representation
• Usually, we specify the levels of R, G and B in the range [0, 255] (8-bit integers).
• RGB cube from (0,0,0) to (255,255,255)
• Number of colors: 256^3 = 16,777,216
RGB Color Representation
Default color space
[Figure: RGB color cube with corners 1,0,0 (R), 0,1,0 (G) and 0,0,1 (B); channel panels R (G=0, B=0), G (R=0, B=0), B (R=0, G=0)]
Image from: http://en.wikipedia.org/wiki/File:RGB_color_solid_cube.png
Some drawbacks
• Strongly correlated channels
• Non-perceptual
Color Image
[Figure panels: R, G and B channels]
Alternate Color Spaces
• Various other color representations can be computed from RGB.
• This can be done for:
  – Decorrelating the color channels:
    • principal components.
  – Bringing color information to the fore:
    • Hue, saturation and brightness.
Most Common Color Spaces
The purpose of a color model (also called a color space) is to facilitate the specification of colors in some standard, generally accepted way.
• RGB (red, green, blue): monitor, video camera.
• HSI (HSL, HSV, YUV) model, which corresponds closely with the way humans describe and interpret color.
• CMY (cyan, magenta, yellow) and CMYK (CMY, black) models for color printing.
  Black (K) = minimum of C, M, Y
  Cyan_CMYK = (C - K)/(1 - K)
  Magenta_CMYK = (M - K)/(1 - K)
  Yellow_CMYK = (Y - K)/(1 - K)
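The CMYK formulas above can be turned into a short function. This sketch adds two things the slide does not state, both labeled as assumptions: CMY is taken as the complement of RGB (C = 1 - R and so on, with all values in [0, 1]), and pure black (K = 1) is handled as a special case to avoid dividing by zero.

```python
def rgb_to_cmyk(red, green, blue):
    """Convert RGB in [0, 1] to (C, M, Y, K) in [0, 1]."""
    # Assumption: CMY is the complement of RGB.
    cyan, magenta, yellow = 1.0 - red, 1.0 - green, 1.0 - blue
    black = min(cyan, magenta, yellow)           # K = minimum of C, M, Y
    if black == 1.0:
        return 0.0, 0.0, 0.0, 1.0                # pure black: (1 - K) would be zero
    scale = 1.0 - black
    return (
        (cyan - black) / scale,                  # Cyan_CMYK
        (magenta - black) / scale,               # Magenta_CMYK
        (yellow - black) / scale,                # Yellow_CMYK
        black,
    )


red_ink = rgb_to_cmyk(1.0, 0.0, 0.0)
gray_ink = rgb_to_cmyk(0.5, 0.5, 0.5)
```

Pure red needs full magenta and yellow with no black, while a mid-gray is printed with black ink only.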
Alternate Color Space
The characteristics generally used to distinguish one color from another are Brightness, Hue, and Saturation.
• Hue: represents the dominant color as perceived by an observer.
• Saturation: relative purity, or the amount of white light mixed with a hue.
Hue and saturation taken together are called Chromaticity, and therefore a color may be characterized by its Brightness and Chromaticity.
HSI model: hue and saturation
HSI Color Space
• Hue corresponds to color, saturation corresponds to the amount of white in the color, and intensity is related to brightness
• For example: a deep, bright orange color would have a large intensity (bright), a hue of “orange”, and a high value of saturation (“deep”)
• But in terms of RGB components, this color would have the values R = 245, G = 110, and B = 20
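To connect the two descriptions of the orange example, the sketch below converts RGB to HSI. The slides do not give the conversion, so the formulas used here are an assumption: they follow the common textbook definition (intensity as the mean of R, G and B, saturation as one minus the normalized minimum, hue from an arccosine expression). The function name is hypothetical.

```python
import math


def rgb_to_hsi(red, green, blue):
    """Convert 8-bit RGB to (hue in degrees, saturation in [0, 1], intensity in [0, 1])."""
    r, g, b = red / 255.0, green / 255.0, blue / 255.0
    total = r + g + b
    intensity = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0                     # black: hue and saturation are undefined
    saturation = 1.0 - 3.0 * min(r, g, b) / total
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        hue = 0.0                                # gray: hue is undefined, report 0
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))  # guard against rounding
        theta = math.degrees(math.acos(ratio))
        hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity


orange_hue, orange_sat, orange_int = rgb_to_hsi(245, 110, 20)
```

Under these formulas the orange from the slide comes out with a hue of roughly 23 degrees (between red at 0 and yellow at 60) and a saturation of about 0.84, which matches the verbal description of a "deep" orange.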
RGB vs HSL
Color spaces: HSV
Intuitive color space
[Figure panels: H (S=1, V=1); S (H=1, V=1); V (H=1, S=0)]
RGB to HSV
Color spaces: YCbCr
Fast to compute, good for compression, used by TV
[Figure panels: Y (Cb=0.5, Cr=0.5); Cb (Y=0.5, Cr=0.5); Cr (Y=0.5, Cb=0.5); Cb-Cr planes at Y=0, Y=0.5 and Y=1]
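"Fast to compute" can be seen from how little arithmetic the conversion needs. The slide does not list coefficients, so the sketch below assumes the full-range ITU-R BT.601 weights as used in JPEG/JFIF, with all channels normalized to [0, 1]; other standards use different weights and ranges.

```python
def rgb_to_ycbcr(red, green, blue):
    """Convert RGB in [0, 1] to (Y, Cb, Cr) in [0, 1] using assumed BT.601 full-range weights."""
    luma = 0.299 * red + 0.587 * green + 0.114 * blue   # weighted sum: green counts most
    cb = 0.5 + (blue - luma) / 1.772                     # blue-difference chroma, centered at 0.5
    cr = 0.5 + (red - luma) / 1.402                      # red-difference chroma, centered at 0.5
    return luma, cb, cr


gray_y, gray_cb, gray_cr = rgb_to_ycbcr(0.5, 0.5, 0.5)
red_y, red_cb, red_cr = rgb_to_ycbcr(1.0, 0.0, 0.0)
```

Any gray input lands at Cb = Cr = 0.5, so all of its information sits in Y. This separation is what lets compression schemes store the two chroma channels at lower resolution.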
RGB to Other Color Spaces
Other Color spaces
• RGB (CIE), RnGnBn (TV - National Television System Committee)
• XYZ (CIE)
• UVW (CIE UCS), U*V*W* (UCS modified by the CIE)
• YUV, YIQ, YCbCr
• YDbDr
• DSH, HSV, HLS, IHS
• Munsell color space (cylindrical representation)
• CIELuv
• CIELab
• SMPTE-C RGB
• YES (Xerox)
• Kodak Photo CD, YCC, YPbPr, ...
Color Image: Full Description
Original image
Intensity Image: Most information
Only intensity shown – constant color
Most information in intensity
Only color shown – constant intensity
Color Transformation - Examples
rg Chromaticity Coordinates
• Normalizes RGB values to the sum of all three
• Chromaticity coordinates are:
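The formulas did not survive in this copy of the slide. The usual definition, which is assumed here, is r = R/(R+G+B) and g = G/(R+G+B), with b = 1 - r - g implied. A minimal sketch follows; the function name and the value returned for a black pixel are illustrative choices.

```python
def rg_chromaticity(red, green, blue):
    """Return (r, g): each channel divided by the sum of all three."""
    total = red + green + blue
    if total == 0:
        # Black has no defined chromaticity; returning the neutral point is an assumption.
        return 1.0 / 3.0, 1.0 / 3.0
    return red / total, green / total


bright = rg_chromaticity(200, 100, 100)
dim = rg_chromaticity(100, 50, 50)      # same color at half the brightness
```

Both calls give the same (r, g) pair, which shows why this normalization removes the effect of overall brightness.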
Skin color
[Figure: skin pixels plotted in RGB space and in rg chromaticity space (axes r and g)]
Skin detection
M. Jones and J. Rehg, Statistical Color Models with Application to Skin Detection, International Journal of Computer Vision, 2002.
Common image file formats
• GIF (Graphics Interchange Format)
• PNG (Portable Network Graphics)
• JPEG (Joint Photographic Experts Group)
• TIFF (Tagged Image File Format)
• PGM (Portable Gray Map)
• FITS (Flexible Image Transport System)
PBM/PGM/PPM format
• PGM: a popular format for grayscale images (8 bits/pixel)
• Closely-related formats are:
  – PBM (Portable Bitmap), for binary images (1 bit/pixel)
  – PPM (Portable Pixelmap), for color images (24 bits/pixel)
    » ASCII or binary (raw) storage
[Figure panels: ASCII; Binary]
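The ASCII variant of PGM is simple enough to read and write by hand: the magic number `P2`, then width, height and the maximum gray value, then the pixel values as whitespace-separated text. The sketch below handles only that variant; the function names are hypothetical and the binary (`P5`) form is not covered.

```python
def write_pgm_ascii(pixels, maxval=255):
    """Serialize a list of pixel rows as ASCII PGM text (magic number P2)."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2", f"{width} {height}", str(maxval)]
    lines += [" ".join(str(value) for value in row) for row in pixels]
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text):
    """Parse ASCII PGM text and return (pixel rows, maxval)."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())   # drop '#' comments, split on whitespace
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM file")
    width, height, maxval = (int(token) for token in tokens[1:4])
    values = [int(token) for token in tokens[4:4 + width * height]]
    pixels = [values[r * width:(r + 1) * width] for r in range(height)]
    return pixels, maxval


tiny = [[0, 10], [50, 255]]
encoded = write_pgm_ascii(tiny)
decoded, decoded_max = read_pgm_ascii(encoded)
```

Note that the header stores width before height, the opposite of the row-then-column order used when indexing a matrix.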
Images in Matlab
• Images represented as a matrix
• Suppose we have a NxM RGB image called “im”
  – im(1,1,1) = top-left pixel value in R-channel
  – im(y, x, b) = y pixels down, x pixels to right in the bth channel
  – im(N, M, 3) = bottom-right pixel in B-channel
• imread(filename) returns a uint8 image (values 0 to 255)
  – Convert to double format (values 0 to 1) if you need to scale
[Figure: three stacked matrices, one per channel (R, G, B), indexed by row and column; each plane is shown with the same example values]
0.92 0.93 0.94 0.97 0.62 0.37 0.85 0.97 0.93 0.92 0.99
0.95 0.89 0.82 0.89 0.56 0.31 0.75 0.92 0.81 0.95 0.91
0.89 0.72 0.51 0.55 0.51 0.42 0.57 0.41 0.49 0.91 0.92
0.96 0.95 0.88 0.94 0.56 0.46 0.91 0.87 0.90 0.97 0.95
0.71 0.81 0.81 0.87 0.57 0.37 0.80 0.88 0.89 0.79 0.85
0.49 0.62 0.60 0.58 0.50 0.60 0.58 0.50 0.61 0.45 0.33
0.86 0.84 0.74 0.58 0.51 0.39 0.73 0.92 0.91 0.49 0.74
0.96 0.67 0.54 0.85 0.48 0.37 0.88 0.90 0.94 0.82 0.93
0.69 0.49 0.56 0.66 0.43 0.42 0.77 0.73 0.71 0.90 0.99
0.79 0.73 0.90 0.67 0.33 0.61 0.69 0.79 0.73 0.93 0.97
0.91 0.94 0.89 0.49 0.41 0.78 0.78 0.77 0.89 0.99 0.93
Image Representation
• Mathematically, an image can be represented by a 2-D matrix
  – Each entry (i,j) represents the value at the corresponding location, which is called a pixel
  – The value of a pixel can have different types, depending on the image type
    • Unsigned char (8 bits per pixel or 256 levels)
    • Int
    • Float
    • A vector (color image, for example)
Image File Formats
• Image file header: a set of parameters found at the start of the image file that contains information regarding:
  • Number of rows (height)
  • Number of columns (width)
  • Number of bands
  • Number of bits per pixel (bpp)
  • File type
Digital Cameras
Digital cameras
• A digital camera replaces film with a sensor array.
  – Each cell in the array is a light-sensitive diode that converts photons to electrons
  – Two common types
    • Charge Coupled Device (CCD)
    • Complementary metal oxide semiconductor (CMOS)
Digital Camera: Properties
• Focus – Shifts the depth that is in focus.
• Focal length – Adjusts the zoom, i.e., wide angle or telephoto lens.
• Aperture – Adjusts the depth of field and the amount of light let into the sensor.
• Exposure time – How long an image is exposed. The longer an image is exposed, the more light is collected, but this could result in motion blur.
• ISO – Adjusts the sensitivity of the “film”. Basically a gain function for digital cameras. Increasing ISO also increases noise.
Types of Camera Sensors
• CCDs move photogenerated charge from pixel to pixel and convert it to voltage at an output node.
• An analog-to-digital converter (ADC) then turns each pixel's value into a digital value.
http://www.dalsa.com/shared/content/pdfs/CCD_vs_CMOS_Litwiller_2005.pdf
CMOS Cameras
• CMOS sensors convert charge to voltage inside each element.
• They use several transistors at each pixel to amplify and move the charge using more traditional wires.
• The CMOS chip outputs a digital signal because the conversion is done on the chip, so it needs no separate ADC.
http://www.dalsa.com/shared/content/pdfs/CCD_vs_CMOS_Litwiller_2005.pdf
Basic Structure of CCD
CCD (Charge-Coupled Device) Cameras
• Small solid state cells convert light energy into electrical charge
• The image plane acts as a digital memory that can be read row by row by a computer
Causes of noise
• Shot noise – variation in the number of photons (low light situations).
• Readout noise – Noise added upon readout of a pixel. In some cases it can be subtracted out.
• Dark noise – Noise caused by thermally generated electrons. Depends on the temperature of the device.
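Why shot noise matters mostly in low light can be shown with one formula. The sketch assumes photon arrivals follow Poisson statistics, which is the standard model but is not stated on the slide: the standard deviation of the count is then the square root of its mean, so the signal-to-noise ratio is also the square root of the mean. The function name is hypothetical.

```python
import math


def shot_noise_snr(mean_photons):
    """Signal-to-noise ratio of a photon count under the assumed Poisson model."""
    noise_std = math.sqrt(mean_photons)      # Poisson: standard deviation = sqrt(mean)
    return mean_photons / noise_std          # equals sqrt(mean_photons)


dim_pixel_snr = shot_noise_snr(100)          # few photons collected
bright_pixel_snr = shot_noise_snr(10000)     # 100 times more photons
```

Collecting 100 times more light improves the ratio only by a factor of 10, and a pixel that collects few photons is noisy no matter how good the rest of the sensor is.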
Color sensing in camera: Prism
• Requires three chips and precise alignment.
[Figure: a prism splits the incoming light onto three sensors: CCD(R), CCD(G) and CCD(B)]
Color Sensing in Camera (RGB)
• 3-chip vs. 1-chip: quality vs. cost
• Why more green?
• Why 3 colors?
http://www.cooldictionary.com/words/Bayer-filter.wikipedia
Color Images: Bayer Grid
• Estimate RGB at ‘G’ cells from neighboring values
http://en.wikipedia.org/wiki/Bayer_filter
Color sensing in camera
[Figure: red, green and blue sample planes are combined into the output image by demosaicing (interpolation)]
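A minimal sketch of the interpolation step is given below for the green channel only. It assumes an RGGB layout (red at even row and even column, blue at odd row and odd column) and plain averaging of the four nearest neighbors; the function names are made up for the example, and the mosaic must be at least 2x2. Real cameras use more elaborate, edge-aware methods.

```python
def bayer_color(row, col):
    """Color measured at (row, col) in the assumed RGGB Bayer layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def interpolate_green(mosaic, row, col):
    """Estimate the green value at (row, col) of a raw Bayer mosaic (list of rows)."""
    if bayer_color(row, col) == "G":
        return float(mosaic[row][col])           # green was measured directly here
    height, width = len(mosaic), len(mosaic[0])
    # At red and blue cells, the up/down/left/right neighbors are all green cells.
    neighbors = [
        mosaic[r][c]
        for r, c in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1))
        if 0 <= r < height and 0 <= c < width
    ]
    return sum(neighbors) / len(neighbors)


raw = [
    [100, 20, 100, 40],    # R G R G
    [60, 200, 80, 200],    # G B G B
    [100, 30, 100, 50],    # R G R G
    [70, 200, 90, 200],    # G B G B
]
green_at_blue = interpolate_green(raw, 1, 1)     # average of 20, 30, 60 and 80
```

The same idea, with different neighbor sets, fills in the missing red and blue values at every cell.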
Review
• Image Representation
  – Effect of Sampling and Quantization
• Color Fundamentals
  – RGB versus Other Spaces
• Digital Cameras Processing