Lecture 10: Motion Features and Introduction to Content Based Image and Video Retrieval
Dr Jing Chen, NICTA & CSE UNSW
CS9519 Multimedia Systems, S2 2006
[email protected]




COMP9519 Multimedia Systems – Lecture 10 – Slide 2 – J Chen

Last lecture… Color features

- Color and color spaces
- Histograms and similarity metrics
- Color descriptors: Dominant, Scalable
- Texture features
- Edge features
- Shape features


Last lecture… (Color Feature)

- Color spaces: RGB, HSV, HMMD, YCbCr
- Color histograms: represented by a set of (bin, frequency) pairs
- Binning: fixed, cluster-based, adaptive


Last lecture… (Similarity Metrics)

                                 Lp      X2      KL      JD      QF      EMD
    Symmetrical                  yes     yes     no      yes     yes     yes
    Computational complexity     medium  medium  medium  medium  high    high
    Ground distance              no      no      no      no      yes     yes
    Adaptive binning support     no      no      no      no      yes     yes
    Partial matches              no      no      no      no      no      yes

Accuracy in image retrieval: depends on the application; X2 usually gives reasonably good results.


Last lecture… Color Descriptors in MPEG-7

Dominant Color, Scalable Color (HSV), Color Structure (HMMD), Color Layout (YCbCr)

Dominant Color Descriptor (DCD)

Extraction of dominant colors: minimizing distortion

Updating rule

Similarity Measurement of DCD

Feature representation:

    F = { (c_i, p_i, v_i), s },  i = 1, 2, ..., N

Distortion to be minimized:

    D = sum_i sum_k h(k) ||x(k) - c_i||^2,  x(k) in C_i,  i = 1, ..., N

Updating rule:

    c_i = sum_k h(k) x(k) / sum_k h(k),  x(k) in C_i


Last lecture… (Texture Feature)

Approaches to texture features:
- Angular features (directionality)
- Radial features (coarseness)

Texture Feature Descriptor:
- Partition of the frequency domain into 30 channels
- Energy and energy deviation of each channel
- Mean and standard deviation of frequency coefficients

Edge Histogram:
- Local histogram: 16 x 5 = 80 bins
- Global histogram: accumulation of local histograms
- Semi-global histogram


Last lecture… (Shape Feature)

- Region-based descriptor
- Contour-based descriptor


Outline

- Motion features
  - Camera motion
  - Motion activity
  - Motion trajectory
- Introduction to content based image and video retrieval


Motion estimation

Pixel based motion estimation (optical flow):
- Computes a velocity vector for each pixel in the frame
- Highly accurate motion estimation
- Problems: fails under variable lighting conditions or occlusion; vulnerable to noise; computationally complex

Block matching:
- Simple and effective
- Used in MPEG-1/2/4, H.261/2/3/4, etc.


MPEG-7 motion descriptors

Parametric Motion uses the same motion model and syntax as the Warping Parameters


Camera motion

Captures 3-D camera motion parameters:
- tracking (horizontal transverse movement, also called traveling in the film industry)
- booming (vertical transverse movement)
- dollying (translation along the optical axis)
- panning (horizontal rotation)
- tilting (vertical rotation)
- zooming (change of the focal length)
- rolling (rotation around the optical axis)

[Figure: camera motion types - pan left/right, tilt up/down, roll, track left/right, boom up/down, dolly forward/backward (* MPEG-7)]


Motion activity descriptor

Captures the "intensity of action" or "pace of action" in a video segment.

Examples of high activity include scenes such as "goal scoring in a soccer match", "scoring in a baseball game", or "a high speed car chase". On the other hand, scenes such as "a news reader shot", "an interview scene", or "a still shot" are perceived as low activity.

Attributes:
- Intensity of activity
- Direction of activity
- Spatial distribution of activity
- Temporal distribution of activity

Applications: content repurposing, surveillance, fast browsing, video abstracting, video editing, content based querying


Intensity of motion

A high value of intensity indicates high activity while a low value of intensity indicates low activity.

For example, a still shot has a low intensity of activity while a “fast break” basketball shot has a high intensity of activity.


Example: Motion_shot_00 (low motion)


Example: Motion_shot_17 (high motion)


Extraction of intensity of motion activity

Five intensity levels:
1) very low intensity; 2) low intensity; 3) medium intensity; 4) high intensity; 5) very high intensity.

The intensity of motion activity is computed as the quantized standard deviation of the motion-vector magnitudes in the video segment.

* Jeannin, S., and A. Divakaran. MPEG-7 Visual Motion Descriptors, CSVT, Vol 11, No. 6, pp. 720-724, June 2001.


Direction of Activity (optional)

While a video shot may have several objects with differing activity, we can often identify a dominant direction.

The direction parameter expresses the dominant direction of the activity if any.

It is expressed by a three-bit integer whose value corresponds to one of eight equally spaced directions.


Example: Motion_shot_013 (direction of motion)


Extraction of direction of activity

Angle of the dominant MV:

    /* Quantize the dominant motion-vector angle using uniform 3-bit
       quantization over 0-360 degrees, i.e. 0,45,90,135,180,225,270,315 */
    int quantize_angle(float f_angle)
    {
        int direction;

        if ((f_angle >= -22.5) && (f_angle < 22.5))        direction = 0; /* 000 */
        else if ((f_angle >= 22.5)  && (f_angle < 67.5))   direction = 1; /* 001 */
        else if ((f_angle >= 67.5)  && (f_angle < 112.5))  direction = 2; /* 010 */
        else if ((f_angle >= 112.5) && (f_angle < 157.5))  direction = 3; /* 011 */
        else if ((f_angle >= 157.5) && (f_angle < 202.5))  direction = 4; /* 100 */
        else if ((f_angle >= 202.5) && (f_angle < 247.5))  direction = 5; /* 101 */
        else if ((f_angle >= 247.5) && (f_angle < 292.5))  direction = 6; /* 110 */
        else if ((f_angle >= 292.5) && (f_angle < 337.5))  direction = 7; /* 111 */
        else                                               direction = 0; /* >= 337.5 wraps to 0 */

        return direction;
    }

[Figure: motion vector MV at angle θ in the x-y plane]


Spatial distribution of activity

Indicates whether the activity is spread across many regions or restricted to one large region; an indication of the number and size of "active" regions in a frame.

For example, a talking head sequence would have one large active region, while an aerial shot of a busy street would have many small active regions.

The spatial distribution parameter is expressed by three integers using a total of 16 bits.


Example: Motion_shot_26 (spatial distribution of activity)


Temporal distribution of activity

Expresses the variation of activity over a video's duration.

Represented by five 6-bit integers forming a histogram of 5 bins, where bins N0, N1, N2, N3 and N4 correspond to intensity values 1, 2, 3, 4 and 5 respectively.

The histogram expresses the relative frequency of the different activity levels in the sequence, as defined by the intensity element above: each value is the percentage of occurrences of the corresponding quantized intensity level, uniformly quantized to 6 bits.
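A sketch of building this descriptor, assuming the per-unit intensity levels (1-5) have already been computed; `activity_histogram` is a hypothetical helper name:

```c
#include <assert.h>

/* Temporal distribution of activity: a 5-bin histogram of the quantized
   intensity levels (1..5) over the sequence, each bin stored as the
   percentage of occurrences uniformly quantized to 6 bits (0..63). */
void activity_histogram(const int *levels, int n, int bins6[5])
{
    int count[5] = {0};
    for (int i = 0; i < n; i++)
        if (levels[i] >= 1 && levels[i] <= 5)
            count[levels[i] - 1]++;
    for (int b = 0; b < 5; b++)                 /* percentage -> 6-bit code */
        bins6[b] = (int)(63.0 * count[b] / n + 0.5);
}
```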


Example: Motion_shot_032 (temporal distribution of activity)


Motion trajectory

Describes the displacement of objects over time.

The trajectory model is a first- or second-order piecewise approximation along time, for each spatial dimension.

Key-points:
- Represent the successive spatio-temporal positions of the described object
- A set of (x, y, t) for a 2-D trajectory, or (x, y, z, t) for a 3-D trajectory

By default, linear (first-order) interpolation between key-points is used. Interpolating parameters can be added to specify nonlinear interpolation between key-points, using a second-order function of time.


Example – motion trajectory

(50, 120, 5/30) (52, 120, 15/30) (54, 120, 25/30)


Example: linear interpolation (example code in Matlab)
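The Matlab code shown on the original slide is not reproduced in this transcript; a minimal C sketch of first-order interpolation between key-points, using the (x, t) values from the example above for one spatial dimension:

```c
#include <assert.h>
#include <math.h>

/* First-order (linear) interpolation between trajectory key-points:
   given key-points (t[i], x[i]) sorted by time, recover the position
   at an arbitrary query time tq.  One spatial dimension shown; y (or z)
   is handled identically. */
double interp_linear(const double *t, const double *x, int n, double tq)
{
    for (int i = 0; i < n - 1; i++) {
        if (tq >= t[i] && tq <= t[i + 1]) {
            double a = (tq - t[i]) / (t[i + 1] - t[i]);
            return x[i] + a * (x[i + 1] - x[i]);
        }
    }
    return x[n - 1];   /* tq outside the covered range: clamp */
}
```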


Example: linear interpolation


Example: second order (polynomial) interpolation


First and second order interpolation

First order (linear) interpolation

Second order (polynomial) interpolation

Example of trajectory representation (one dimension)

* Jeannin, S., and A. Divakaran. MPEG-7 Visual Motion Descriptors, CSVT, Vol 11, No. 6, pp. 720-724, June 2001.


Extraction of motion trajectory descriptor

Assumes the positions of objects are known; they may be generated through segmentation/tracking (difficult though).

The selection of key-points and their interpolation functions is not defined by the MPEG-7 standard:
- Option 1: sample key-points at regular time intervals (simplest way)
- Option 2 (bottom up): start from many key-points and recursively remove points until the interpolation error exceeds a given threshold
- Option 3 (top down): start with one interval containing two points, and recursively split intervals in two at the position where the interpolation error is maximum


Option 1: regular time interval sampling (black: true trajectory; red: linear interpolation)


Option 2: bottom up (part 1) Black: true trajectory


Option 2: bottom up (part 1) Black: true trajectory


Option 2: bottom up (part 2) Black: true trajectory


Option 2: bottom up (part 2) Black: true trajectory


Option 2: bottom up (part 3) Black: true trajectory


Option 2: bottom up (part 4) (black: true trajectory; red: linear interpolation)


Option 3: top down (part 1) Black: true trajectory


Option 3: top down (part 2) Black: true trajectory


Option 3: top down (part 3) Black: true trajectory


Outline

- Motion features
  - Camera motion
  - Motion activity
  - Motion trajectory
- Introduction to content based image and video retrieval
  - Text-based retrieval
  - Content-based retrieval
  - Query formation
  - Feature extraction
  - Similarity comparison
  - Performance evaluation


Google video search


Google video search result part


A closer look


Text-based approach for image and video retrieval

Keyword annotation + text-based searching techniques from traditional database management systems.

Annotation methods:
- By human: labor intensive, subjective, content-sensitive and usually incomplete
- Extraction from speech transcripts (e.g. Google video search): low accuracy; may be improved with better machine understanding of natural language (difficult!)
- Automated machine understanding of images and videos: "semantic gap" between keywords and low level visual features; a challenging research topic


Bridging the semantic gap

Pattern recognition: develop a recognizer/classifier for each query concept
- E.g., a face detector
- A simple and typical approach: feature extraction from images/video + a classifier (e.g., Support Vector Machines)
- Hard to generalize; impractical to develop classifiers for every possible query concept

Ontology (e.g., broadcasting news)

[Figure: ontology from the MPEG-7 Video Annotation Tool, with concept categories such as Objects, Actions, Sites and concepts including Outdoors, Indoors, Person, People, Face, News Subject Anchor, Crowd, News Monolog, News Dialog, Studio]


IBM Video Annex Demo: http://www.alphaworks.ibm.com/tech/videoannex


Content based image and video retrieval

Emerged in early 1990s

Represent and index image/video with features (color, texture, shape, etc) extracted from the image/video content

Typical systems: QBIC, VisualSEEk, SimPlicity, etc. (one in this lecture; others in Lecture 13)


Image retrieval system diagram

[Figure: user -> query formation -> feature extraction -> feature vectors -> similarity comparison; image data -> feature extraction -> feature vectors -> image database -> indexing & retrieval -> feature vectors -> similarity comparison; similarity comparison -> retrieval results -> output, with relevance feedback from the user back to query formation]


Query specification

The process of connecting user input with feature extraction to obtain feature vectors searchable in the database.

Four major categories:
- Category browsing: images are classified into different categories based on their semantic or visual content
- Query by concept: user supplied keyword -> concept (annotation); i.e., text based
- Query by sketch: user drawn sketch -> feature vectors
- Query by example: user supplied example image -> feature vectors


Category browsing (1)

* A. Vailaya, A. K. Jain, and H. J. Zhang, “On image classification: City images vs. landscapes,” Pattern Recognit., vol. 31, no. 12, pp. 1921–1936, 1998.


Category browsing (2)

* A. Vailaya, M. A. T. Figueiredo, A. K. Jain, and H.-J. Zhang, "Image Classification for Content-Based Indexing," IEEE Trans. Image Processing, vol. 10, no. 1, pp. 117--130, 2001.


Image categorical pre-filtering may improve retrieval accuracy

(a) Query image

(b) top-ten retrieved images from 2145 city and landscape images

(c) top-ten retrieved images from 760 city images; filtering out landscape images prior to querying clearly improves the retrieval results.

* A. Vailaya, M. A. T. Figueiredo, A. K. Jain, and H.-J. Zhang, "Image Classification for Content-Based Indexing," IEEE Trans. Image Processing, vol. 10, no. 1, pp. 117--130, 2001.


Limitations of categorical browsing

- Ambiguity in categorizing images/videos
- The images/videos found depend on the browsing path
- Difficult to use if the number of categories is large
- The ability to search is preferred in many applications


Query by concept

* A. Natsev, A. Chadha, B. Soetarman, and J. S. Vitter, "CAMEL: Concept Annotated iMagE Libraries," Proc. SPIE Electronic Imaging 2001: Storage and Retrieval for Image and Video Databases, San Jose, CA, Jan 2001.


Query by sketch

VisualSEEK user interface

The user sketches regions, positions them on the query grid, and assigns them properties of color, size and absolute location; the user may also assign boundaries for location and size.

* John R. Smith , Shih-Fu Chang, VisualSEEk: a fully automated content-based image query system, Proc ACM Int Conf on Multimedia, p.87-98, Nov 18-22, 1996, USA


VisualSEEK examples

* John R. Smith , Shih-Fu Chang, VisualSEEk: a fully automated content-based image query system, Proc ACM Int Conf on Multimedia, p.87-98, Nov 18-22, 1996, USA


Query by example

The example above uses the shape feature.

* W. Y. Ma and B. S. Manjunath, "NeTra: a toolbox for navigating large image databases," Multimedia Systems, vol. 7, no. 3, pp. 184-198, May 1999.


Image retrieval system diagram

[Figure: the retrieval system diagram shown earlier, repeated]


Visual features - recap

Why visual features?
- Manual labeling is very time consuming
- Content is difficult to describe completely with text
- Machine understanding of images/video is far from mature

What visual features?
- Extractable from images/video
- Learn from the human visual system

Visual feature => feature vectors


Popular visual features

- Color: color histogram (HSV, YCbCr, …); color moments; dominant color
- Texture: structural and statistical; texture histogram; edge histogram
- Shape: boundaries of objects
- Motion: camera motion (PZT); object motion


Content based retrieval system diagram

[Figure: the retrieval system diagram shown earlier, repeated]


Similarity comparison

Given two feature vectors I and J, the distance is defined as D(I, J) = f(I, J).

Typical similarity metrics:
- Lp (Minkowski distance)
- X2 metric
- KL (Kullback-Leibler divergence)
- JD (Jeffrey divergence)
- QF (quadratic form)
- EMD (Earth mover's distance)


K-nearest neighbour search

Given a query vector vq, a brute-force k-nearest neighbour search method is (essentially):

    results = []; maxD = infinity
    for each obj in the database {
        dist = D(vobj, vq)
        if (#results < k or dist < maxD) {
            insert (obj, dist) into results   // results is sorted, length <= k
            maxD = largest dist in results
        }
    }

Cost = Topen + NP*TP + N*TD

Note: if q is an image from the database, we can use a pre-computed distance table to make this much faster.

* John Shepherd

    Name    Meaning                                                    Typically
    N       number of objects in the database                          10^3 .. 10^10
    NP      number of disk pages to hold stored objects                50 .. 10^10
    TP      time to read a page from disk into memory                  10 ms
    TD      time to compute distance between two objects (vectors)     100 us (?)
    Topen   time to open a database file                               10 ms


Content based retrieval system diagram

[Figure: the retrieval system diagram shown earlier, repeated]


Performance evaluation

We have three numbers: #system-correctly-retrieved-images, #system-retrieved-images, #relevant-images-in-DB.

    Precision = #system-correctly-retrieved-images / #system-retrieved-images
    Recall    = #system-correctly-retrieved-images / #relevant-images-in-DB
    F-number  = (2 x precision x recall) / (precision + recall)

[Figure: precision-recall curve]
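The three measures, written out as C helpers (the function names and the counts in the usage below are this sketch's own, not from the slides):

```c
#include <assert.h>
#include <math.h>

/* Precision, recall and F-number from the three counts defined above. */
double precision(int n_correct, int n_retrieved) { return (double)n_correct / n_retrieved; }
double recall(int n_correct, int n_relevant)     { return (double)n_correct / n_relevant; }
double f_number(double p, double r)              { return 2.0 * p * r / (p + r); }
```

For instance, 50 correct among 100 retrieved, with 200 relevant images in the database, gives precision 0.5 and recall 0.25.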


A tutorial question

Suppose we have 1000 images in the database and we want to retrieve images with the concept "car". There are 200 "car" images in the database. We retrieved 250 images, and 150 of these 250 are "car" images. Calculate the precision, recall and F-number.


A demo retrieval system

MARVeL – from IBM Research
Exe file; result: file:///c:/marvel/docs/html/main/0/index.html


IBM in TRECVID 2004

Visual features included color histograms, edge histograms, color moments, wavelet texture, co-occurrence texture, moment invariants etc.


Assignment 2

See http://www.cse.unsw.edu.au/~cs9519/assig-2/
Submission deadline: 4 Nov 2005.
Start early to avoid the late rush and possible conflicts with exams!


Some references

- S. Jeannin and A. Divakaran, "MPEG-7 visual motion descriptors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 720-724, Jun 2001.
- B. S. Manjunath, P. Salembier, and T. Sikora, Introduction to MPEG-7: Multimedia Content Description Interface, John Wiley & Sons, New York, NY, 2002 (book).
- F. Long, H.-J. Zhang, and D. Feng, "Fundamentals of content-based image retrieval," Chapter 1 in Multimedia Information Retrieval and Management, http://research.microsoft.com/asia/dload_files/group/mcomputing/2003P/ch01_Long_v40-proof.pdf