COMPUTER VISION TOOLBOX (MATLAB)

Computer Vision System Toolbox

Design and simulate computer vision and video processing systems

Computer Vision System Toolbox™ provides algorithms and tools for the design and simulation of computer vision and video processing systems. The toolbox includes algorithms for feature extraction, motion detection, object detection, object tracking, stereo vision, video processing, and video analysis. Tools include video file I/O, video display, drawing graphics, and compositing. Capabilities are provided as MATLAB® functions, MATLAB System objects™, and Simulink® blocks. For rapid prototyping and embedded system design, the system toolbox supports fixed-point arithmetic and C code generation.

Key Features

▪ Feature detection, including FAST, Harris, Shi & Tomasi, SURF, and MSER detectors

▪ Feature extraction and putative feature matching

▪ Object detection and tracking, including Viola-Jones detection and CAMShift tracking

▪ Motion estimation, including block matching, optical flow, and template matching

▪ RANSAC-based estimation of geometric transformations or fundamental matrices

▪ Video processing, video file I/O, video display, graphic overlays, and compositing

▪ Block library for use in Simulink

Feature Detection and Extraction

A feature is an interesting part of an image, such as a corner, blob, edge, or line. Feature extraction enables you to derive a set of feature vectors, also called descriptors, from a set of detected features. Computer Vision System Toolbox offers capabilities for feature detection and extraction that include:

▪ Corner detection, including Shi & Tomasi, Harris, and FAST methods

▪ SURF and MSER detection for blobs and regions

▪ Extraction of simple pixel neighborhood and SURF descriptors

▪ Visualization of feature location, scale, and orientation

Additionally, the system toolbox provides functionality to match two sets of feature vectors and visualize the results. When combined into a single workflow, feature detection, extraction, and matching can be used to solve many computer vision design challenges, such as registration, stereo vision, object detection, and tracking.
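The idea of matching two sets of feature vectors can be sketched outside the toolbox. The Python example below is an illustrative assumption, not the toolbox's matching function: it pairs each descriptor with its nearest neighbor by sum of squared differences and rejects ambiguous pairs with a ratio test. The names `match_features` and `max_ratio` are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_features(desc1, desc2, max_ratio=0.6):
    """Return (i, j) index pairs of putative matches.

    For each descriptor in desc1, find the nearest and second-nearest
    descriptors in desc2. Keep the pair only when the nearest one is
    clearly better than the runner-up (ratio test on SSD distances).
    """
    matches = []
    for i, d1 in enumerate(desc1):
        ranked = sorted((ssd(d1, d2), j) for j, d2 in enumerate(desc2))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        if len(ranked) == 1 or best_dist <= max_ratio * ranked[1][0]:
            matches.append((i, best_j))
    return matches


# Toy descriptors: the third one in desc_a is equally close to two
# candidates in desc_b, so the ratio test rejects it as ambiguous.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
desc_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
putative = match_features(desc_a, desc_b)
print(putative)
```

Matches produced this way are only putative: they can still contain outliers, which is why a robust estimation step normally follows.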

SURF (left), MSER (center), and corner detection (right) with Computer Vision System Toolbox. Using the same image, the three different feature types are detected and results are plotted over the original image.

Registration and Stereo Vision

Computer Vision System Toolbox supports automatic image registration by providing algorithms that use features to estimate the geometric relationships between images or video frames. Typical uses include video mosaicking, video stabilization, image fusion, and stereo vision.

Feature-Based Registration

Feature detection, extraction, and matching are the first steps in the feature-based registration workflow. With a pair of images, you can detect and extract features in each image, using one of several feature types available in the system toolbox. You can then determine putative matches between the two sets of features and visualize the matches. Typically, this workflow produces many interest points with matches that include outliers. You can remove the outliers with statistically robust methods such as RANSAC or least median of squares to compute a similarity, affine, or projective transformation. You can then apply the geometric transformation to align the two images.
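The outlier-removal step can be illustrated with a minimal RANSAC sketch in Python. To stay short it estimates a pure 2D translation rather than the similarity, affine, or projective models named above. The function name, threshold, and sample data are assumptions for illustration and do not reflect the toolbox implementation.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=50, seed=0):
    """Estimate a 2D translation from putative matches that contain outliers.

    matches: non-empty list of ((x1, y1), (x2, y2)) point pairs.
    Returns ((tx, ty), inlier_indices).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        # Count the matches that agree with this hypothesis.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if (ax + tx - bx) ** 2 + (ay + ty - by) ** 2 <= threshold ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the inliers of the best hypothesis; for a translation the
    # least-squares solution is the mean displacement.
    n = len(best_inliers)
    tx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    ty = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (tx, ty), best_inliers


# Six correct matches shifted by (5, -3), followed by two gross outliers.
points = [(0, 0), (10, 2), (4, 7), (8, 8), (3, 1), (6, 5)]
matches = [((x, y), (x + 5, y - 3)) for x, y in points]
matches += [((1, 1), (20, 20)), ((9, 3), (0, 15))]

translation, inliers = ransac_translation(matches)
print(translation, inliers)
```

The same loop structure carries over to richer models: only the minimal sample size and the fitting step change.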

Feature-based registration, used for video stabilization. The system toolbox detects interest points in two sequential video frames using corner features (top); the putative matches are determined with numerous outliers (bottom left), and outliers are removed using the RANSAC method (bottom right).

Stereo Image Rectification

Stereo image rectification transforms a pair of stereo images so that a corresponding point in one image can be found in the corresponding row in the other image. You can rectify a pair of stereo images with the system toolbox by determining a set of matched interest points, estimating the fundamental matrix, and then deriving two projective transformations. This process reduces the 2D stereo correspondence problem to a 1D problem, which simplifies the process of determining the depth of each point in the scene from the camera.

Results from stereo image rectification. Non-overlapping areas are shown in red and cyan.

Stereo Vision

Stereo vision is the process of reconstructing a 3D scene from two or more views of the scene. Using the system toolbox, you can perform uncalibrated stereo image rectification on a pair of stereo images and match individual pixels along epipolar lines to compute a disparity map.
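Once the images are rectified, matching along an epipolar line reduces to a search along one image row. The Python sketch below illustrates that 1D search with a sum-of-absolute-differences window. It is a simplified assumption, not the toolbox's disparity algorithm; `row_disparity`, `max_disparity`, and `half_window` are hypothetical names.

```python
def row_disparity(left_row, right_row, max_disparity=4, half_window=1):
    """Estimate the disparity of every pixel in one rectified image row.

    A scene point at column x in the left row is assumed to appear at
    column x - d in the right row. For each x the disparity d with the
    smallest windowed sum of absolute differences is selected.
    """
    n = len(left_row)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-half_window, half_window + 1):
                xl = min(max(x + k, 0), n - 1)      # clamp at the borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# The textured patch (9, 5, 7) sits two pixels further right in the left row.
right_row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
disparity = row_disparity(left_row, right_row)
print(disparity[5:8])
```

Textureless regions give ambiguous costs, so only the values over the textured patch are meaningful in this toy example.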

Reconstructing a scene using a pair of stereo images. To visualize the disparity, the right channel is combined with the left channel to create a composite (top left); also shown are a disparity map of the scene (top right) and a 3D rendering of the scene (bottom).

Object Detection, Motion Estimation, and Tracking

Object detection is the identification of an object in an image or video. Computer Vision System Toolbox supports several approaches to object detection, including template matching, blob analysis, and the Viola-Jones algorithm. Template matching uses a small image, or template, to find matching regions in a larger image. Blob analysis uses segmentation and blob properties to identify objects of interest. The Viola-Jones algorithm uses Haar-like features and a cascade of classifiers to identify pretrained objects, including faces, noses, eyes, and other body parts.
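Blob analysis can be sketched in a few lines of Python: segment the image with a threshold, group connected foreground pixels, and filter the groups by a property such as area. The function name, the 4-connectivity choice, and the `(x, y, width, height)` bounding-box layout are assumptions made for this illustration.

```python
from collections import deque


def find_blobs(image, threshold, min_area=1):
    """Label 4-connected regions brighter than threshold.

    Returns one dict per blob with its pixel area and bounding box
    (x, y, width, height), in row-major order of discovery.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                blobs.append({
                    "area": len(pixels),
                    "bbox": (min(xs), min(ys),
                             max(xs) - min(xs) + 1, max(ys) - min(ys) + 1),
                })
    return blobs


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 8],
    [0, 0, 0, 0, 0, 8],
    [0, 0, 7, 0, 0, 0],
]
# min_area=2 discards the isolated bright pixel in the last row.
blobs = find_blobs(image, threshold=5, min_area=2)
print(blobs)
```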

Face detection using the Viola-Jones algorithm.

Motion estimation is the process of determining the movement of blocks between adjacent video frames. The system toolbox provides a variety of motion estimation algorithms, such as optical flow, block matching, template matching, and background estimation using Gaussian mixture models (GMMs). These algorithms create motion vectors, which relate to the whole image, blocks, arbitrary patches, or individual pixels. For block and template matching, the evaluation metrics for finding the best match include MSE, MAD, MaxAD, SAD, and SSD.
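The five metrics and an exhaustive block search can be sketched in Python as follows. The sketch assumes the usual expansions of the abbreviations (mean squared error, mean absolute difference, maximum absolute difference, sum of absolute differences, sum of squared differences). The helper names are hypothetical and the code is not the toolbox implementation.

```python
def block_metrics(a, b):
    """Compare two equal-size blocks (lists of rows) with five metrics."""
    diffs = [x - y for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    n = len(diffs)
    sad = sum(abs(d) for d in diffs)
    ssd = sum(d * d for d in diffs)
    return {"SAD": sad, "SSD": ssd, "MAD": sad / n, "MSE": ssd / n,
            "MaxAD": max(abs(d) for d in diffs)}


def crop(frame, top, left, size):
    """Extract a size-by-size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_motion_vector(prev, curr, top, left, size, search):
    """Exhaustive block matching: find the (dy, dx) with the minimum SAD."""
    rows, cols = len(curr), len(curr[0])
    reference = crop(prev, top, left, size)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > rows or l + size > cols:
                continue  # candidate block falls outside the frame
            cost = block_metrics(reference, crop(curr, t, l, size))["SAD"]
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


metrics = block_metrics([[1, 2], [3, 4]], [[1, 4], [0, 4]])
print(metrics)

# A 2-by-2 pattern moves one row down and two columns right between frames.
prev = [[0] * 8 for _ in range(6)]
curr = [[0] * 8 for _ in range(6)]
for r, row in enumerate([[1, 2], [3, 4]]):
    for c, value in enumerate(row):
        prev[1 + r][1 + c] = value
        curr[2 + r][3 + c] = value

vector, cost = best_motion_vector(prev, curr, top=1, left=1, size=2, search=2)
print(vector, cost)
```

The search is written against SAD; swapping in another key from `block_metrics` changes the matching criterion without touching the loop.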

Detecting moving objects using a stationary camera. In this series of video frames, optical flow is calculated and detected motion is shown by overlaying the flow field on top of each frame.

Computer vision often involves the tracking of moving objects in video. Computer Vision System Toolbox provides video tracking algorithms, such as continuously adaptive mean shift (CAMShift) and Kanade-Lucas-Tomasi (KLT). You can use these algorithms for tracking a single object or as building blocks in a more complex tracking system. The system toolbox also provides a framework for multiple object tracking that includes Kalman filtering and the Hungarian algorithm for assigning objects to tracks.
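The assignment step in multiple object tracking can be illustrated with a small Python sketch. Note the substitution: instead of the Hungarian algorithm, the sketch tries every possible assignment by brute force. This finds the same minimum-cost solution but is practical only for a handful of objects. The predicted positions stand in for Kalman filter predictions, and all names and values are hypothetical.

```python
from itertools import permutations


def assign_detections(predicted, detections):
    """Assign one detection to each track at minimum total squared distance.

    Brute-force stand-in for the Hungarian algorithm. Requires
    len(detections) >= len(predicted). Returns (assignment, total_cost),
    where assignment[i] is the detection index chosen for track i.
    """
    def cost(p, d):
        return (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2

    best, best_cost = None, float("inf")
    for perm in permutations(range(len(detections)), len(predicted)):
        total = sum(cost(predicted[i], detections[j])
                    for i, j in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total
    return list(best), best_cost


# Predicted track positions (for example, from a motion model) and the
# detections found in the current frame. The last detection is unmatched.
predicted = [(10, 10), (20, 20), (30, 5)]
detections = [(21, 19), (29, 6), (11, 10), (50, 50)]
assignment, total_cost = assign_detections(predicted, detections)
print(assignment, total_cost)
```

A detection left unassigned, such as the one at (50, 50) here, would typically start a new track in a full tracking system.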

Learn how to integrate OpenCV and MATLAB: http://www.mathworks.com/discovery/matlab-opencv.html

Video Processing, Display, and Graphics

Computer Vision System Toolbox provides algorithms and tools for video processing. You can read from and write to common video formats, apply common video processing algorithms such as deinterlacing and chroma resampling, and display results with text and graphics burned into the video. Video processing in MATLAB uses System objects, which avoid excessive memory use by streaming data for processing one frame at a time.

Video deinterlacing in MATLAB.

Video I/O

Computer Vision System Toolbox can read and write multimedia files in a wide range of formats, including AVI, MPEG, and WMV. You can stream video to and from MMS sources over the Internet or a local network. You can acquire video directly from Web cameras, frame grabbers, DCAM-compatible cameras, and other imaging devices using Image Acquisition Toolbox™. Simulink users can use the MATLAB workspace as a video source or sink.

Video Display

The system toolbox includes a video viewer that lets you:

▪ View video streams in-the-loop as the data is being processed

▪ View any video signal within your code or block diagram

▪ Use multiple video viewers at the same time

▪ Freeze the display and evaluate the current frame

▪ Display pixel information for a region in the frame

▪ Pan and zoom for closer inspection as the simulation is running

▪ Start, stop, pause, and step through Simulink simulations one frame at a time

Model with viewers for four videos: (from left) original, estimated background, foreground pixels, and results of tracking.

Graphics

Adding graphics to video helps with visualizing extracted information or debugging a system design. You can insert text to display the number of objects or to keep track of other key information. You can insert graphics, such as markers, lines, and polygons to mark found features, delineate objects, or highlight other key features. The system toolbox functionality fuses text and graphics into the image or video itself rather than maintaining a separate layer. You can combine two video sources in a composite that can highlight objects or a key region.
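Fusing graphics into the image means overwriting pixel values rather than drawing on a separate overlay. The Python sketch below shows that idea for a rectangle outline on a grayscale frame stored as a list of rows. `insert_rectangle` is a hypothetical helper written for this illustration, not a toolbox function.

```python
def insert_rectangle(image, x, y, width, height, value=255):
    """Return a copy of image with a rectangle outline written into its pixels.

    (x, y) is the top-left corner. Parts of the rectangle that fall outside
    the image are clipped.
    """
    out = [row[:] for row in image]
    rows, cols = len(out), len(out[0])

    def put(cx, cy):
        if 0 <= cy < rows and 0 <= cx < cols:
            out[cy][cx] = value

    for cx in range(x, x + width):      # top and bottom edges
        put(cx, y)
        put(cx, y + height - 1)
    for cy in range(y, y + height):     # left and right edges
        put(x, cy)
        put(x + width - 1, cy)
    return out


frame = [[0] * 6 for _ in range(5)]
marked = insert_rectangle(frame, x=1, y=1, width=4, height=3)
for row in marked:
    print(row)
```

Because the marker becomes part of the pixel data, any later processing or file output carries it along, which is the behavior described above.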

Images with text and graphics inserted. Adding these elements can help you visualize extracted information and debug your design.

Stream Processing in MATLAB and Simulink

Computer Vision System Toolbox supports a stream processing architecture in both MATLAB and Simulink. In a stream processing architecture, one or more video frames from a continuous stream are processed at a time. This type of processing is appropriate for analysis of large video files or systems with live video.

In MATLAB, stream processing is enabled by System objects, which use MATLAB objects to represent time-based and data-driven algorithms, sources, and sinks. System objects implicitly manage many details of stream processing, such as data indexing, buffering, and the management of algorithm state. You can mix System objects with standard MATLAB functions and operators. Most System objects have corresponding Simulink blocks that provide the same capabilities.
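The pattern of an object that holds algorithm state while frames are passed to it one at a time can be sketched in Python. This is an analogy under stated assumptions, not the System object API: the class `RunningBackground`, its `step` method, and the generator `frame_source` are hypothetical, and frames are flat lists of pixel values for brevity.

```python
class RunningBackground:
    """Stateful foreground detector that processes one frame per call."""

    def __init__(self, learning_rate=0.05, threshold=10):
        self.learning_rate = learning_rate
        self.threshold = threshold
        self.background = None  # algorithm state carried between frames

    def step(self, frame):
        """Return a foreground mask for frame, then update the background."""
        if self.background is None:
            self.background = [float(p) for p in frame]
        mask = [abs(p - b) > self.threshold
                for p, b in zip(frame, self.background)]
        a = self.learning_rate
        self.background = [(1 - a) * b + a * p
                           for p, b in zip(frame, self.background)]
        return mask


def frame_source():
    """Stand-in for a video reader: yields frames one at a time."""
    yield [10, 10, 10, 10]
    yield [10, 10, 200, 10]   # a bright object appears in one pixel
    yield [10, 10, 10, 10]


detector = RunningBackground()
counts = []
for frame in frame_source():
    mask = detector.step(frame)   # only the current frame is in memory
    counts.append(sum(mask))
print(counts)
```

The caller never indexes into a video array or passes the background model around; the object keeps that state, which mirrors the division of labor described above.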

Simulink handles stream processing implicitly by managing the flow of data through the blocks that make up a Simulink model. It includes a library of general-purpose, predefined blocks to represent algorithms, sources, sinks, and system hierarchy. Computer Vision System Toolbox provides a library of blocks specifically for the design of computer vision and video processing systems.

An abandoned object detection model (top). The three viewers (bottom) show the process of detecting and tracking an abandoned object in a live video stream from a camera in a train station.

System Design and Implementation

Computer Vision System Toolbox supports the creation of system-level test benches, fixed-point modeling, and code generation within both MATLAB and Simulink. This support lets you integrate algorithm development with rapid prototyping, implementation, and verification workflows.

Fixed-Point Modeling

Many real-time systems use hardware that requires fixed-point representation of your algorithm. Computer Vision System Toolbox supports fixed-point modeling in most blocks and System objects, with dialog boxes and object properties that help you with configuration.

System toolbox support for fixed point includes:

▪ Word sizes from 1 to 128 bits

▪ Arbitrary binary-point placement

▪ Overflow handling methods (wrap or saturation)

▪ Rounding methods, including ceiling, convergent, floor, nearest, round, simplest, and zero
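The interaction of word size, binary-point placement, overflow handling, and rounding can be sketched in Python. The function below is an illustrative assumption rather than the toolbox's fixed-point engine. It covers only four of the listed rounding methods, and `nearest` is assumed to round ties toward positive infinity.

```python
import math


def to_fixed(value, word_length=8, fraction_length=4, signed=True,
             overflow="saturate", rounding="nearest"):
    """Quantize a real value to fixed point.

    Returns (stored_integer, represented_real_value).
    """
    scale = 2.0 ** fraction_length      # position of the binary point
    scaled = value * scale
    if rounding == "floor":
        q = math.floor(scaled)
    elif rounding == "ceiling":
        q = math.ceil(scaled)
    elif rounding == "zero":
        q = math.trunc(scaled)
    elif rounding == "nearest":
        q = math.floor(scaled + 0.5)    # ties go toward +infinity
    else:
        raise ValueError("unsupported rounding method: " + rounding)

    lo = -(1 << (word_length - 1)) if signed else 0
    hi = (1 << (word_length - 1)) - 1 if signed else (1 << word_length) - 1
    if overflow == "saturate":
        q = max(lo, min(hi, q))         # clamp to the representable range
    elif overflow == "wrap":
        q = (q - lo) % (1 << word_length) + lo   # modular wrap-around
    else:
        raise ValueError("unsupported overflow method: " + overflow)
    return q, q / scale


# Signed 8-bit word with 4 fractional bits: range -8.0 to 7.9375, step 0.0625.
print(to_fixed(1.30))                        # rounds to the nearest step
print(to_fixed(10.0))                        # saturates at the maximum
print(to_fixed(10.0, overflow="wrap"))       # wraps around instead
print(to_fixed(-1.30, rounding="zero"))      # truncates toward zero
```

Comparing the saturate and wrap results for the same input shows why the overflow method matters: saturation stays near the true value, while wrapping can flip the sign.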

Code Generation Support

Most System objects, functions, and blocks in Computer Vision System Toolbox can generate ANSI/ISO C code using MATLAB Coder™, Simulink Coder™, or Embedded Coder™. You can select optimizations for specific processor architectures and integrate legacy C code with the generated code to leverage existing intellectual property. You can generate C code for both floating-point and fixed-point data types.

Simulink model designed to create code for a specific hardware target. This model generates C code for a video stabilization system and embeds the algorithm into a digital signal processor (DSP).

Image Processing Primitives

Computer Vision System Toolbox includes image processing primitives that support fixed-point data types and C code generation. These System objects and Simulink blocks include:

▪ 2D spatial and frequency filtering

▪ Image pre- and postprocessing algorithms

▪ Morphological operators

▪ Geometric transformations

▪ Color space conversions

Resources

Product Details, Examples, and System Requirements: www.mathworks.com/products/computer-vision

Trial Software: www.mathworks.com/trialrequest

Sales: www.mathworks.com/contactsales

Technical Support: www.mathworks.com/support

Online User Community: www.mathworks.com/matlabcentral

Training Services: www.mathworks.com/training

Third-Party Products and Services: www.mathworks.com/connections

Worldwide Contacts: www.mathworks.com/contact

© 2012 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.
