vivien-boone
What is Computer Vision?
The goal of Computer Vision (or Machine Vision) is:
to make useful decisions about real physical objects and scenes based on sensed images.
to construct scene descriptions from images.
Issues
Sensing: How do sensors obtain images from the world?
Encoded Information: How do images yield information for understanding the 3D world, including the geometry, texture, motion, and identity of objects in it?
Representations: How to represent objects, their parts, properties, and relationships, in a computer?
Algorithms: How to process image information and construct descriptions of the objects and the world?
Digital Images
A 2D image is a projection of a scene from a specific viewpoint; many 3D features are captured, some not.
A 2D arrangement of pixels (picture elements) with a fixed number of rows and columns.
In grey scale images, each pixel is a single value of grey usually in the range [0…255] where 0 is black and 255 is white.
In color images, each pixel has color information associated with it.
RGB color scheme: quantities of Red, Green and Blue.
True color uses 1 byte for Red, 1 for Green, and 1 for Blue.
For many computer vision applications color is not needed.
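As a minimal sketch of the pixel layout described above, a gray image can be held as rows of single values and a true-color image as rows of (R, G, B) byte triples. The function name `rgb_to_gray` and the weights 0.299, 0.587, 0.114 are assumptions for illustration (a commonly used luminance weighting); the slides do not say how color should be discarded.

```python
def rgb_to_gray(image):
    """Convert rows of (R, G, B) byte triples to rows of gray values in 0..255."""
    gray = []
    for row in image:
        gray_row = []
        for r, g, b in row:
            # Weighted sum of the three channels (assumed weights, see lead-in).
            value = 0.299 * r + 0.587 * g + 0.114 * b
            gray_row.append(min(255, int(round(value))))
        gray.append(gray_row)
    return gray


# A 2 x 2 true-color image: 1 byte each for Red, Green, and Blue.
color_image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
gray_image = rgb_to_gray(color_image)
```

Black stays 0 and white stays 255; pure red and pure blue map to different grays because the channels are weighted unequally.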
Digital Images cont…
Example digital image with an 8 x 8 block of pixels from the left eye.
Numerous Applications
Image Database Querying
Aerial Images and GIS
Medical Imaging
Processing Scanned Text
Understanding a Scene of Parts
Inspection applications
Automated navigation
Etc, etc, etc…
Understanding a Scene of Parts
A robot needs to classify (or inspect) parts and act accordingly.
Combining Multiple Images
Images with a constant background are subtracted to detect change over time.
Pixel differences at the boundary reveal the moving object and its shape.
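The subtraction idea can be sketched as follows, assuming two gray images of equal size stored as lists of rows. The names `subtract_images`, `changed_pixels`, and the `min_change` cutoff are hypothetical; the cutoff is one simple way to ignore small sensor noise.

```python
def subtract_images(current, background):
    """Absolute per-pixel difference of two equally sized gray images."""
    return [
        [abs(c - b) for c, b in zip(cur_row, bg_row)]
        for cur_row, bg_row in zip(current, background)
    ]


def changed_pixels(diff, min_change):
    """Return (row, col) positions whose difference is at least min_change."""
    return [
        (i, j)
        for i, row in enumerate(diff)
        for j, value in enumerate(row)
        if value >= min_change
    ]


background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
current = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 10, 12, 10],   # the 12 is small noise, not a moving object
]
diff = subtract_images(current, background)
moving = changed_pixels(diff, min_change=50)
```

Only the two bright pixels in the middle row are reported as changed; the small fluctuation in the last row falls below the cutoff.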
Combining Multiple Images
Images can also be added to blend them together.
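One possible way to add images is a weighted per-pixel sum, sketched below. The function name `blend_images` and the weight `alpha` are assumptions for illustration; the slides do not specify how the sum is scaled to stay within 0 to 255.

```python
def blend_images(image_a, image_b, alpha=0.5):
    """Weighted per-pixel sum: alpha * a + (1 - alpha) * b, kept within 0..255."""
    blended = []
    for row_a, row_b in zip(image_a, image_b):
        blended.append([
            max(0, min(255, int(round(alpha * a + (1 - alpha) * b))))
            for a, b in zip(row_a, row_b)
        ])
    return blended


image_a = [[0, 100], [200, 255]]
image_b = [[254, 100], [0, 255]]
blended = blend_images(image_a, image_b, alpha=0.5)
```

With `alpha=0.5` each output pixel is the average of the two inputs, so pixels that agree in both images are unchanged.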
Reality Check
Success is hard won. Some potential issues:
Lighting fluctuations or inadequacies
Object positioning and occlusion
Background noise or other unimportant image features
Good news: industrial robotics usually provides a very controlled environment.
Imaging Devices
CCD Camera
Charge-coupled Device (CCD)
Instead of chemical film that reacts to light (like 35mm film), tiny solid cells convert light energy into electrical charge.
Problem with Digital Images
Blooming
Difficult to insulate adjacent sensing elements.
Charge often leaks from hot cells to neighbors, making bright regions larger.
Problem with Digital Images
Clipping or Wraparound
Dark grid intersections at left were actually the brightest of the scene.
In A/D conversion the bright values were clipped to lower values.
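The difference between the two failure modes can be illustrated with 8-bit storage. This is only a sketch under assumed numbers: the readings are hypothetical, and the actual behavior of the converter in the slide's example is not known.

```python
def store_wraparound(value, bits=8):
    """Keep only the low bits, as an overflowing counter would."""
    return value % (1 << bits)


def store_clipped(value, bits=8):
    """Saturate at the largest representable value."""
    return max(0, min(value, (1 << bits) - 1))


readings = [100, 250, 260, 300]  # hypothetical sensor values before conversion
wrapped = [store_wraparound(v) for v in readings]
clipped = [store_clipped(v) for v in readings]
```

With wraparound, the two brightest readings end up as very dark values, which matches the symptom of bright spots appearing dark; with saturation they would all become 255 instead.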
Problem with Digital Images
Lens distortion distorts image
“Barrel distortion” of rectangular grid is common for cheap lenses ($50)
Precision lenses can cost $1000 or more.
Zoom lenses often show severe distortion.
Problem with Digital Images
CCD Variations
CCD sensor imperfections can cause different readings for the same light intensity.
Chromatic Distortion
Different light wavelengths are bent differently.
Quantization Effects
Mixing and rounding problems.
Spatial Quantization Problems
Both pixel size and relative position make a difference.
Mixed pixel represents a mixture of intensity values in a real scene.
Small features can be lost or blended.
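A rough way to see mixed pixels is to average blocks of a finely sampled scene into coarser pixels. The function `downsample` and the block averaging model are assumptions for illustration; a real sensor integrates light in a more complicated way.

```python
def downsample(image, block):
    """Average each block x block group of fine samples into one coarse pixel."""
    rows, cols = len(image), len(image[0])
    coarse = []
    for i in range(0, rows, block):
        coarse_row = []
        for j in range(0, cols, block):
            samples = [
                image[y][x]
                for y in range(i, min(i + block, rows))
                for x in range(j, min(j + block, cols))
            ]
            coarse_row.append(round(sum(samples) / len(samples)))
        coarse.append(coarse_row)
    return coarse


# Fine-grained scene: a bright region on the right and one small bright
# feature that covers only a quarter of the top-left coarse pixel.
scene = [
    [0, 0, 200, 200],
    [0, 200, 200, 200],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
coarse = downsample(scene, block=2)
```

The top-left coarse pixel becomes 50, a mixture of black background and the small bright feature, while the fully covered top-right pixel keeps its value of 200.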
Portable Gray Map Image (PGM)
P2 means ASCII gray
Comments (lines starting with #)
W=16; H=8
192 is max intensity
Can be made with editor
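Because the ASCII (P2) form is plain text, it can also be written and read with a few lines of code. The sketch below assumes a small hypothetical 4 x 2 image rather than the 16 x 8 one on the slide; the function names are illustrative.

```python
def write_pgm_ascii(pixels, max_value, comment=None):
    """Return the text of an ASCII (P2) PGM file for a list of pixel rows."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2"]
    if comment:
        lines.append("# " + comment)
    lines.append(f"{width} {height}")
    lines.append(str(max_value))
    for row in pixels:
        lines.append(" ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text):
    """Parse ASCII (P2) PGM text into (width, height, max_value, pixel rows)."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM file")
    width, height, max_value = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:]]
    if len(values) != width * height:
        raise ValueError("pixel count does not match header")
    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return width, height, max_value, rows


pixels = [
    [0, 64, 128, 192],
    [192, 128, 64, 0],
]
text = write_pgm_ascii(pixels, max_value=192, comment="hypothetical 4 x 2 example")
width, height, max_value, rows = read_pgm_ascii(text)
```

The header carries the magic number, the width and height, and the maximum intensity, in that order, followed by the pixel values row by row.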
Compression
Lossless – if a decompression method exists to precisely recover the original image.
Lossy – information is lost during compression and cannot be recovered during decompression (JPG; GIF loses color information when an image is reduced to its 256-color palette, although its LZW compression is itself lossless).
GIF – (Graphics Interchange Format) only 8 bits used for color; can contain transparency and animation.
JPG – (Joint Photographic Experts Group) for high quality images; considers human vision systems and uses discrete cosine transform and Huffman coding.
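The lossless versus lossy distinction can be checked on a few bytes. This sketch swaps in Python's `zlib` (DEFLATE) as the lossless method and a simple bit-depth reduction as the lossy step; neither is the GIF or JPG algorithm, they only illustrate whether the original can be recovered.

```python
import zlib

pixels = bytes([10, 10, 10, 11, 200, 200, 201, 255])

# Lossless: decompression recovers the original bytes exactly.
restored = zlib.decompress(zlib.compress(pixels))

# Lossy (illustrative only): keeping 4 of the 8 bits per pixel discards
# detail that cannot be recovered afterwards.
quantized = bytes((p >> 4) << 4 for p in pixels)
```

After quantization the values 10 and 11 are indistinguishable, so no decompression method could tell them apart again.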
Pixels and Neighborhoods
A binary image consists of only two intensities – 0 and 1 (or 0 and 255).
A binary image B can be obtained from a grayscale image I through an operation that selects a subset of image pixels as foreground pixels, the pixels of interest in an image. Everything else would be considered as background pixels.
000100100010000001111000100000010010001000
Thresholding and Segmentation
Gray level thresholding is the simplest segmentation process.
Many objects or image regions are characterized by constant reflectivity or light absorption of their surface.
Thresholding is computationally inexpensive and fast.
Thresholding can easily be done in real time using specialized hardware
Thresholding
Background is black
Healthy cherry is bright
Bruise is medium dark
This histogram shows two cherry regions (black background has been removed).
[Figure: histogram of pixel counts versus gray-tone values, 0 to 256]
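A gray-level histogram like the one described is simply a count of pixels per gray value. The sketch below assumes 8-bit values and a tiny made-up image; `gray_histogram` is an illustrative name, and zeroing the count for value 0 is one simple way to remove a black background.

```python
def gray_histogram(image, levels=256):
    """Count how many pixels take each gray value 0..levels-1."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


# Hypothetical image: black background (0), medium-dark region (90),
# bright region (200).
image = [
    [0, 0, 90, 90],
    [0, 200, 200, 200],
]
histogram = gray_histogram(image)

# Drop the background so only the object regions remain in the histogram.
histogram_no_background = list(histogram)
histogram_no_background[0] = 0
```

The remaining two peaks (at 90 and 200 here) correspond to the two regions, and a threshold placed between them separates one from the other.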
Thresholding - Example
Original Image and Thresholded Image (threshold 95)
Thresholding Example
Over-segmentation (threshold 225) and Under-segmentation (threshold 25)
Algorithm
Thresholding algorithm
Search all the pixels f(i,j) of the image f.
An image element g(i,j) of the segmented image is an object pixel if f(i,j) >= T, and is a background pixel otherwise.
Correct threshold selection is crucial for successful threshold segmentation
Threshold selection can be interactive or can be the result of some threshold detection method
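The thresholding rule above can be sketched directly, assuming the image f is a list of rows and that object pixels are marked 1 and background pixels 0 (the output encoding is an assumption; the slides only name the two classes).

```python
def threshold(f, T):
    """g(i,j) = 1 (object) if f(i,j) >= T, else 0 (background)."""
    return [[1 if value >= T else 0 for value in row] for row in f]


f = [
    [12, 40, 200],
    [95, 94, 180],
]
g = threshold(f, T=95)
```

The pixel with value 95 is classified as object and the one with value 94 as background, which shows how sensitive the result is to the choice of T.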
Region Properties
Once a binary image has been processed we could obtain properties about the regions in the processed image.
Some of those properties are:
Area, centroid
Measure of circularity and elongation
Area and Centroid
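The formulas from this slide did not survive, so the sketch below uses the usual definitions as an assumption: area is the number of 1-pixels in the region, and the centroid is the mean row and mean column of those pixels.

```python
def area_and_centroid(B):
    """Area = number of 1-pixels; centroid = mean (row, column) of the 1-pixels."""
    area = 0
    row_sum = 0
    col_sum = 0
    for i, row in enumerate(B):
        for j, value in enumerate(row):
            if value == 1:
                area += 1
                row_sum += i
                col_sum += j
    if area == 0:
        return 0, None  # empty region: the centroid is undefined
    return area, (row_sum / area, col_sum / area)


B = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
area, centroid = area_and_centroid(B)
```

For the 2 x 2 square the centroid falls between pixels, at row 1.5 and column 1.5, so it need not be an integer position.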
Connected Components
Components are objects that share at least one common neighbor (in 4- or 8- neighborhood).
Definition: A connected component labeling of binary image B is a labeled image LB in which the value of each pixel is the label of its connected component.
Recursive Labeling Algorithm
Given a binary image B:
Negate the image (change all 1-pixels to -1).
Search and find a -1 pixel, label it, and find its (4- or 8-) neighboring pixels with -1 and assign the same label.
Recursively apply to resolve (merge or split) components. Increment label each time…
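A sketch of the labeling steps follows. One deliberate change from the slide: it uses an explicit stack instead of recursive calls, to stay clear of Python's recursion limit on large regions; the visiting order differs but the labels produced are the same. The function name `label_components` and the `connectivity` parameter are illustrative.

```python
def label_components(B, connectivity=4):
    """Label the connected components of binary image B (rows of 0/1)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                   if (di, dj) != (0, 0)]
    rows, cols = len(B), len(B[0])
    # Negate: every 1-pixel becomes -1, meaning "foreground, not yet labeled".
    LB = [[-1 if v == 1 else 0 for v in row] for row in B]
    label = 0
    for i in range(rows):
        for j in range(cols):
            if LB[i][j] != -1:
                continue
            label += 1  # a new component starts here
            LB[i][j] = label
            stack = [(i, j)]
            while stack:
                ci, cj = stack.pop()
                for di, dj in offsets:
                    ni, nj = ci + di, cj + dj
                    if 0 <= ni < rows and 0 <= nj < cols and LB[ni][nj] == -1:
                        LB[ni][nj] = label
                        stack.append((ni, nj))
    return LB, label


B = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
LB4, count4 = label_components(B, connectivity=4)
LB8, count8 = label_components(B, connectivity=8)
```

The two pixels that touch only diagonally in the lower left are separate components under the 4-neighborhood and a single component under the 8-neighborhood.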
Vision Input Sensor
A camera can be used to provide visual input data to the robot.
Objects in the input images must be represented in the robot world coordinate system so that the robot can manipulate them.
Two Approaches:
Visual Servoing
Calibration
Visual Servoing
A target image is compared against the current image
An error vector is generated which indicates how the robot should be moved in order to minimize the error between the current and target images.
The machine is incrementally moved.
PRO: No need for a mm/pixel ratio to be calculated.
Very nice!
Visual Servoing Example: the camera is mounted to the end-effector, and the end-effector must move so that the circular piece is in the center of the image.
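The servoing loop can be sketched as a small simulation. Everything numeric here is an assumption: the 640 x 480 image size, the starting position of the piece, the gain, and the simplification that each commanded move shifts the piece in the image by exactly that amount. A real system would measure the piece position from a new image after every move, and the sign of the move depends on how the camera is mounted.

```python
def servo_step(current, target, gain=0.5):
    """Return the error vector and the incremental move for one iteration."""
    error = (target[0] - current[0], target[1] - current[1])
    move = (gain * error[0], gain * error[1])
    return error, move


def servo_to_target(start, target, gain=0.5, tolerance=0.5, max_steps=100):
    """Repeat small moves until the piece is within tolerance of the target."""
    position = start
    steps = 0
    while steps < max_steps:
        error, move = servo_step(position, target, gain)
        if abs(error[0]) <= tolerance and abs(error[1]) <= tolerance:
            break
        # Simulation assumption: the piece shifts in the image by exactly
        # the commanded amount (in pixels).
        position = (position[0] + move[0], position[1] + move[1])
        steps += 1
    return position, steps


image_center = (320.0, 240.0)   # target: piece centered in a 640 x 480 image
piece_seen_at = (100.0, 400.0)  # hypothetical centroid of the circular piece
final_position, steps = servo_to_target(piece_seen_at, image_center)
```

No mm/pixel ratio appears anywhere in the loop: the error is measured and reduced entirely in image coordinates.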
Camera Calibration
Common in industry.
CON: Usually must be done manually and can become un-calibrated over time.
Steps:
Must calculate mm/pixel ratio
Must train a reference frame that can be seen in the input image
Example Setup: Frame {f} has been trained in the robot work area. Parts can be given coordinates in this area. Consider Z to be fixed. Note: cannot recover depth this way…
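With a mm/pixel ratio and the image position of the frame origin, a pixel position converts to planar frame coordinates as sketched below. The numbers are hypothetical, and the sketch assumes the frame axes are aligned with the image columns and rows, no lens distortion, and a fixed Z; a rotated frame would need an extra rotation step.

```python
def mm_per_pixel_from_reference(length_mm, length_pixels):
    """Ratio from a reference object of known size measured in the image."""
    return length_mm / length_pixels


def pixel_to_frame(pixel, frame_origin_pixel, mm_per_pixel):
    """Convert an image point (column u, row v) to (x, y) in mm in the frame."""
    u, v = pixel
    u0, v0 = frame_origin_pixel
    x_mm = (u - u0) * mm_per_pixel
    y_mm = (v - v0) * mm_per_pixel
    return x_mm, y_mm


# Hypothetical calibration: a 100 mm reference spans 200 pixels.
ratio = mm_per_pixel_from_reference(length_mm=100.0, length_pixels=200.0)

# Hypothetical frame origin and part centroid, both in pixel coordinates.
part_xy = pixel_to_frame(pixel=(250, 140), frame_origin_pixel=(50, 40),
                         mm_per_pixel=ratio)
```

Only X and Y are produced; as the note says, depth cannot be recovered this way, so Z has to be known in advance.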
Same example as before, but this is what the camera is seeing: