Video Retrieval


Page 1: Video Retrieval

Video Retrieval

Page 2: Video Retrieval

Topics

• Shot detection algorithm

• Video indexing
  • Key frame-based video indexing
  • Adaptive video indexing technique

• Automatic relevance feedback network for video retrieval

• Experiment on iARM: video search engine

Page 3: Video Retrieval

Video Data

• Video is a continuous medium, but for database storage and manipulation (such as random access) it is important to be able to deal with portions of a video object.

• Video segmentation: cutting a long video into portions: shot, scene, and clip

• A shot defines the low-level syntactic building block of a video sequence.

• A scene is a logical grouping of shots into a semantic unit.

• A clip is not clearly defined, so it can last from a few seconds to several hours.

Page 4: Video Retrieval

Video Segmentation

Page 5: Video Retrieval

Organization of Video Data

Page 6: Video Retrieval

Shot Boundaries

• Shot boundary detection can be easy or difficult, depending on the type of transition:

• cut: hard boundary, a complete change of shot between consecutive frames

• fade: fade-out or fade-in, a gradual fade to/from a completely black (white?) frame

• dissolve: simultaneous fade-out and fade-in

• … others

• Each of these post-production techniques makes the detection of shot boundaries more difficult.

Page 7: Video Retrieval

Cut

Fade in

Fade out

Dissolve

Page 8: Video Retrieval

Frame-to-frame comparison

Consecutive frame histograms $h_1, h_2, h_3, h_4$ are compared pairwise: $d(h_1, h_2)$, $d(h_2, h_3)$, $d(h_3, h_4)$.

$$d(h_m, h_n) = \sum_{i=1}^{N} |h_{mi} - h_{ni}|$$

where $h_j = [h_{j1}, h_{j2}, \ldots, h_{jN}]$ is the color histogram of the $j$-th frame.
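
The frame-to-frame comparison on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the distance is taken to be the sum of absolute bin differences, frames are given as lists of (r, g, b) pixels, and the bin count and the fixed `threshold` are hypothetical choices that do not come from the slides.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a frame given as (r, g, b) tuples in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_distance(h_m, h_n):
    """d(h_m, h_n): sum of absolute bin-wise differences (assumed form)."""
    return sum(abs(a - b) for a, b in zip(h_m, h_n))


def detect_cuts(frames, threshold=0.5):
    """Return every index j where frame j-1 and frame j differ by more than threshold."""
    hists = [color_histogram(f) for f in frames]
    return [
        j for j in range(1, len(hists))
        if histogram_distance(hists[j - 1], hists[j]) > threshold
    ]


# Toy sequence: three dark frames followed by three bright frames.
dark = [(10, 10, 10)] * 16
bright = [(240, 240, 240)] * 16
cuts = detect_cuts([dark, dark, dark, bright, bright, bright])
print(cuts)  # [3]
```

A single global threshold like this only catches hard cuts; fades and dissolves spread the change over many frames, so each consecutive difference stays small.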

Page 9: Video Retrieval

Shot Boundary Detection

Page 10: Video Retrieval

Key-Frame for Shot Representation

key-frame

key-frame

Page 11: Video Retrieval

Content Representation

• The content of a video shot is described by a low-level feature (e.g., color histogram) of the corresponding key-frame.

• The m-th video shot is indexed by $h_m = [h_{m1}, h_{m2}, \ldots, h_{mN}]$

Video Shot → Key-Frame → Content Descriptor $h_m$

Page 12: Video Retrieval

Querying Video Database

Database:

Video Shot 1: $h_1 = [h_{11}, h_{12}, \ldots, h_{1N}]$

Video Shot 2: $h_2 = [h_{21}, h_{22}, \ldots, h_{2N}]$

Video Shot 3: $h_3 = [h_{31}, h_{32}, \ldots, h_{3N}]$

Video Shot J: $h_J = [h_{J1}, h_{J2}, \ldots, h_{JN}]$

Query Shot: $h_q$

Matching: query shot descriptor $h_q$ against the database descriptors
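
A minimal sketch of the matching step, assuming the query descriptor is compared to each key-frame descriptor with an L1 histogram distance and the closest shots are returned first. The function names, the toy histograms, and the `top_k` parameter are hypothetical and only serve the illustration.

```python
def l1_distance(h_a, h_b):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h_a, h_b))


def rank_shots(query_hist, database, top_k=3):
    """Rank database shots by distance to the query descriptor.

    database maps a shot id to its key-frame histogram.
    Returns the top_k (shot_id, distance) pairs, closest first.
    """
    scored = [(shot_id, l1_distance(query_hist, hist))
              for shot_id, hist in database.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy database of three shots with 4-bin key-frame histograms.
database = {
    "shot_1": [0.7, 0.2, 0.1, 0.0],
    "shot_2": [0.1, 0.1, 0.4, 0.4],
    "shot_3": [0.5, 0.3, 0.2, 0.0],
}
h_q = [0.65, 0.25, 0.10, 0.00]
results = rank_shots(h_q, database, top_k=2)
print([shot_id for shot_id, _ in results])  # ['shot_1', 'shot_3']
```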

Page 13: Video Retrieval

GUI for Key Frame-Based Video Retrieval

Query Shot

Play shot

Page 14: Video Retrieval

Problems

• Compared to an image, video data contains both spatial and temporal information.

• The key frame-based video indexing (KFVI) method can deal with spatial content but does not take temporal information into account.

• Furthermore, KFVI is not well adapted for representing video at the scene and story levels.

Page 15: Video Retrieval

Adaptive Video Indexing (AVI) Technique

• A better technique for capturing temporal content as well as spatial content for effective video indexing

• AVI provides multiple access to the video database at three levels:
  • shot
  • group-of-shots
  • story

Page 16: Video Retrieval

Database Organization based on AVI

Multiple-level access to video database:

• Video shot database: $VD_{Shot} = \{(D_{I_i}, I_i) \mid I_i \in I_{Shot}(F)\}$

• Group of shots: $VD_{Group} = \{(D_{I_i}, I_i) \mid I_i \in I_{Group}(F)\}$

• Story: $VD_{Story} = \{(D_{I_i}, I_i) \mid I_i \in I_{Story}(F)\}$

where $D_{I_i}$ is the descriptor of the video interval $I_i$.

Page 17: Video Retrieval

Fundamentals of AVI

• A video sequence is a collection of visual templates (i.e., image frames)

• Similar video contents use similar visual templates

V1 V2 V3 V4 V5 V6 V7 V8 V9

Page 18: Video Retrieval

Fundamentals of AVI

V1 V2 V3 V4 V5 V6 V7 V8 V9

Descriptor of shot 1: [0 0 2 0 3 0 0 2 0 ...]

V1 V2 V3 V4 V5 V6 V7 V8 V9

Descriptor of shot 2: [0 0 0 0 3 2 0 0 0 ...]

Page 19: Video Retrieval

Template Generation

• Given a set of initial visual templates, $C = \{g_r \mid r = 1, \ldots, R\}$, and training vectors $\{x_j\}_{j=1}^{J}$, $J > R$

• The templates are optimized through the following steps:
  • Randomly choose the input vector $x_j$
  • If $g_{r^*}$ is the closest node to $x_j$ such that $\|x_j - g_{r^*}\| \le \|x_j - g_r\|$, $r = 1, \ldots, R$, $r \ne r^*$
  • Then, $g_{r^*}(n+1) = g_{r^*}(n) + \alpha(n)\,(x_j - g_{r^*}(n))$
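
The optimization steps on this slide resemble a standard competitive-learning update, which can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the linearly decaying step size, the number of steps, the fixed random seed, and the toy data are all hypothetical.

```python
import random


def squared_distance(x, g):
    """Squared Euclidean distance between an input vector and a template."""
    return sum((a - b) ** 2 for a, b in zip(x, g))


def train_templates(templates, training_vectors, steps=1000, alpha0=0.5, seed=0):
    """Move the closest template toward a randomly chosen training vector, repeatedly."""
    rng = random.Random(seed)
    templates = [list(g) for g in templates]  # work on a copy
    for n in range(steps):
        x = rng.choice(training_vectors)  # randomly choose the input vector
        # Index r* of the template closest to x.
        r_star = min(range(len(templates)),
                     key=lambda r: squared_distance(x, templates[r]))
        alpha = alpha0 * (1.0 - n / steps)  # decaying step size (assumption)
        g = templates[r_star]
        templates[r_star] = [gi + alpha * (xi - gi) for gi, xi in zip(g, x)]
    return templates


# Toy data: two clusters of 2-D vectors and two initial templates.
training = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
initial = [(0.2, 0.2), (0.8, 0.8)]
trained = train_templates(initial, training)
print(trained)
```

Only the winning template is updated at each step, so each template drifts toward the region of feature space where it is the best match.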

Page 20: Video Retrieval

Template-frequency modeling (TFM)

• Let $D_I = \{(x_1, f_1), \ldots, (x_m, f_m), \ldots, (x_M, f_M)\}$ be a set of descriptors for the video interval $I$, where $x_m \in \mathbb{R}^p$ is the histogram corresponding to the video frame $f_m$

• Each $x_m$ is mapped to a Voronoi space $C \subset \mathbb{R}^p$ through

$$x_m \mapsto \rho(x_m) = \{l_{r^*}, l_{r^*,1}, \ldots, l_{r^*,n}\}$$

where

$$l_{r^*} = \arg\min_r \|x_m - g_r\|$$

and $l_{r^*,n}$ is the label of the n-th cell neighboring the best match cell, $g_{r^*}$
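
A rough sketch of the template-frequency idea: each frame feature is assigned the label of its best-matching template plus a few neighboring labels, and the labels are counted over the whole interval. As a simplifying assumption, the "neighboring cells" are approximated here by the next-closest templates to the frame feature; the slides do not specify how neighbors are chosen, and all names below are hypothetical.

```python
def squared_distance(x, g):
    """Squared Euclidean distance between a frame feature and a template."""
    return sum((a - b) ** 2 for a, b in zip(x, g))


def map_frame_to_labels(x_m, templates, n_neighbors=1):
    """Return [best-match label, then the n_neighbors next-closest labels]."""
    order = sorted(range(len(templates)),
                   key=lambda r: squared_distance(x_m, templates[r]))
    return order[: n_neighbors + 1]


def template_frequencies(frame_features, templates, n_neighbors=1):
    """Count how often each template label is used across all frames of an interval."""
    freq = [0] * len(templates)
    for x_m in frame_features:
        for label in map_frame_to_labels(x_m, templates, n_neighbors):
            freq[label] += 1
    return freq


# Toy example: four 2-D templates and a three-frame video interval.
templates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
frames = [(0.1, 0.0), (0.9, 0.1), (0.2, 0.0)]
freq = template_frequencies(frames, templates)
print(freq)  # [3, 3, 0, 0]
```

Counting neighbor labels as well as the best match makes the descriptor less sensitive to small feature changes between frames.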

Page 21: Video Retrieval

TFM (cont.)

• The resulting $\rho(x_m)$, $m = 1, \ldots, M$, of all frames from the mapping of the entire video interval $I_j$ are used as a representation of the video through a weight scheme:

$$v_j = (w_{j1}, \ldots, w_{jr}, \ldots, w_{jR})$$

$$w_{jr} = \frac{freq_{jr}}{\max_r freq_{jr}} \times \log\frac{N}{n_r}$$

where $freq_{jr}$ is the number of times the template $g_r$ is mentioned in the content of the video $I_j$, $N$ denotes the total number of videos in the system, and $n_r$ denotes the number of videos in which the index template $g_r$ appears.
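
The weight scheme has the shape of a TF-IDF weighting over templates, and a small sketch may make it concrete. Assumptions: the logarithm is natural (the slide does not state a base), a template that appears in no video gets weight 0, and the two toy frequency vectors are the shot descriptors shown earlier in the deck with the trailing "..." dropped.

```python
import math


def tfm_weights(freq_by_video):
    """Turn per-video template counts freq_jr into weight vectors v_j."""
    n_videos = len(freq_by_video)          # N: total number of videos
    n_templates = len(freq_by_video[0])    # R: number of templates
    # n_r: number of videos in which template r appears at least once.
    n_r = [sum(1 for freq in freq_by_video if freq[r] > 0)
           for r in range(n_templates)]
    vectors = []
    for freq in freq_by_video:
        max_freq = max(freq) or 1  # guard against an all-zero descriptor
        vectors.append([
            (freq[r] / max_freq) * math.log(n_videos / n_r[r]) if n_r[r] else 0.0
            for r in range(n_templates)
        ])
    return vectors


shot_1 = [0, 0, 2, 0, 3, 0, 0, 2, 0]
shot_2 = [0, 0, 0, 0, 3, 2, 0, 0, 0]
vectors = tfm_weights([shot_1, shot_2])
print(vectors[0])
```

In this toy case the template used by both shots (index 4) gets weight 0, because a template that appears in every video does not help tell videos apart.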

Page 22: Video Retrieval

Test Data

Video sequences            # Sequences   # Cuts   # Frames   Length (min:sec)
Commercial                 20            844      98,733     54:52
Movie clip                 2
Headline and story news    46

Description of sequences in the database: CNN broadcast news (at 352 resolution and 30 frames/sec.)

Page 23: Video Retrieval

Retrieval Results

(a) (b)

A comparison of the retrieval performance at the shot level; (a) obtained by KFVI; and (b) obtained by the AVI

Page 24: Video Retrieval

Performance Comparison

Precision results averaged over 25 queries, comparing adaptive video indexing (AVI) with key frame-based video indexing (KFVI), using a video database containing 844 video shots

Page 25: Video Retrieval

Query-by-Video-Clip

(a) (b)

Precision and recall rates obtained by retrieval of: (a) video groups, employing two links: shot-to-group (STG) and group-to-group (GTG); (b) video stories, employing two links: shot-to-story (STS) and group-to-story (GTS)

Page 26: Video Retrieval

Query-by-Video-Clip (cont.)

(a) Query clip, <1.8 sec>

(b) Rank 1, <1.8 sec>

(c) Rank 2, <2.4 sec>

(d) Rank 3, <1.9 sec>

(e) Rank 4, <2.7 sec>

(f) Rank 5, <3.3 sec>

Page 27: Video Retrieval

Relevance Feedback for Video Retrieval: A client-server architecture

Search Engine with Relevance Feedback

Page 28: Video Retrieval

Problem with Relevance Feedback (RF)

• The user has to play each retrieved video in a feedback cycle

• Compared to an image, video files are usually very large:
  • time consuming
  • high bandwidth in the RF training process

Page 29: Video Retrieval

Automatic and Semi-Automatic RFs

Search Engine with Automatic Relevance Feedback Network

Page 30: Video Retrieval

Automatic Relevance Feedback Network (ARFN)

• Goal: implementation of an adaptive system to improve retrieval accuracy

• Strategy: incorporate a self-learning neural network in the relevance feedback module in order to avoid user interaction during the retrieval process

Page 31: Video Retrieval

ARFN Architecture

number of nodes in the second layer = number of visual templates

number of nodes in the third layer = number of videos in the database

Page 32: Video Retrieval

Signal Propagation

(a) (b) (c)

(a) Forward propagation; (b) Backward propagation; (c) New video template nodes in (b) introduce a new video node. This process results in the activation of new video nodes by expanding the original query templates, analogous to the traditional relevance feedback technique

Page 33: Video Retrieval

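Signal Propagation (cont.)

• The activation level at the video template nodes, $a_r^{(t)}$, can be calculated according to two criteria:

• Positive feedback

$$a_r^{(t)} = \sum_{j \in Pos} a_j^{(v)} \bar{w}_{jr}, \qquad \bar{w}_{jr} = \frac{w_{jr}}{\sqrt{\sum_{r=1}^{R} w_{jr}^2}}$$

• Positive and negative feedback

$$a_r^{(t)} = f(L_r), \qquad L_r = w_{qr} + \sum_{j \in Pos} a_j^{(v)} w_{jr} - \sum_{j \in Neg} a_j^{(v)} w_{jr}$$

where $a_j^{(v)}$ is the activation of the j-th video node, Pos is the set of positive video nodes, Neg is the set of negative video nodes, and $f$ is the activation function of the template node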
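
The two feedback criteria can be sketched as below. The formulas on this slide are badly garbled in the source, so everything here is an assumption: the positive-feedback criterion is read as a sum of positive video activations weighted by unit-normalized template weights, and the positive-and-negative criterion as the query weight plus positive contributions minus negative ones. The final activation function applied to that net input is not recoverable, so the sketch stops at the net input. All function and variable names are hypothetical.

```python
import math


def normalize(w_j):
    """Scale a video's template weights to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in w_j))
    return [w / norm for w in w_j] if norm else list(w_j)


def template_activation_positive(weights, video_activation, pos):
    """Positive feedback only: sum over j in Pos of a_j * normalized w_jr."""
    n_templates = len(next(iter(weights.values())))
    act = [0.0] * n_templates
    for j in pos:
        w_bar = normalize(weights[j])
        for r in range(n_templates):
            act[r] += video_activation[j] * w_bar[r]
    return act


def template_net_input(query_weights, weights, video_activation, pos, neg):
    """Positive and negative feedback: w_qr + sum_Pos a_j w_jr - sum_Neg a_j w_jr."""
    net = list(query_weights)
    for j in pos:
        for r, w in enumerate(weights[j]):
            net[r] += video_activation[j] * w
    for j in neg:
        for r, w in enumerate(weights[j]):
            net[r] -= video_activation[j] * w
    return net


# Toy network: two video nodes, three template nodes.
weights = {"v1": [3.0, 4.0, 0.0], "v2": [0.0, 0.0, 2.0]}
video_activation = {"v1": 1.0, "v2": 0.5}
positive_only = template_activation_positive(weights, video_activation, ["v1"])
net_input = template_net_input([1.0, 0.0, 0.0], weights, video_activation,
                               pos=["v1"], neg=["v2"])
print(positive_only, net_input)
```

Templates that gain activation from positive videos act as an expanded query for the next forward pass, which is how the network imitates a round of user feedback.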

Page 34: Video Retrieval

Results

# Return   Cosine measure   1 Iter.   3 Iter.   20 Iter.
1          100              0.0       0.0       0.0
2          100              0.0       0.0       0.0
3          98.67            +1.33     +1.33     -1.33
4          97.00            +1.00     +2.00     -2.00
5          96.00            +0.80     +1.60     -1.60
6          94.67            +0.67     +2.00     -1.33
7          90.29            +2.29     +3.43     +1.14
8          89.00            +3.00     +2.50     +1.50
9          86.67            +2.67     +3.11     +2.67
10         82.80            +5.60     +5.60     +4.80
11         80.36            +6.18     +6.18     +5.09
12         77.67            +7.33     +7.67     +7.00
13         74.77            +8.62     +10.15    +8.31
14         72.00            +9.43     +11.14    +9.72
15         69.33            +9.87     +11.20    +10.13
16         67.75            +9.50     +11.00    +10.00

Average Precision Rate, APR (%), obtained by retrieving 25 video shot queries. ARFN results are quoted relative to the APR observed with simple retrieval (cosine measure).

Page 35: Video Retrieval

Experiment: Video Search Engine

Page 36: Video Retrieval

Goals

• Setting up a video search engine at the shot level, using JSP and a J2EE server

• Implementing video indexing using AVI and comparing it with KFVI

• Implementing a simple user-controlled interactive retrieval method within the search engine

Page 37: Video Retrieval

GUI in iARM search engine

Query Shot

Selected method

Page 38: Video Retrieval

Step I

• Copy all the files in the folder “Experiments” to drive C:

• Feature Database
• key-frame Database

JSP file and Java Beans

Video Shot Database

Page 39: Video Retrieval

Step II: load feature vectors to database

>> java COM.cloudscape.tools.cview

Open the video feature database “C:\Experiments\database\video

Page 40: Video Retrieval

Step III: deploy application

deploytool

New Application

Page 41: Video Retrieval

Add Web components to the Application

• index.jsp
• autoFeedback.class
• CompType.class
• MyDateJose.class
• MyLocalRbf.class
• userData.class

Page 42: Video Retrieval

Deploy the Application

Page 43: Video Retrieval

Open the search engine: “http://localhost:8000/iARM/index.jsp”