Video retrieval using an inference network
A. Graves, M. Lalmas
In SIGIR '02
Outline
Background
MPEG-7
Inference network model
Experiment
Conclusion
Background
CBVR: find the desired video on demand
The semantic gap: between high-level concepts and low-level features
Traditional method: retrieval by example
Relevance feedback: tedious for video retrieval
Retrieval by semantics?
MPEG-7
Multimedia Content Description Interface: attaches metadata to multimedia content
Semantics: event, actor, place, …
Structure: shot, scene, group, …
Video as a structured document: the information contained in the MPEG-7 annotation can be exploited to perform semantic-based video retrieval
MPEG-7 does not specify how the annotations are to be extracted
MPEG-7
Description Definition Language
Descriptors
Description Schemes
MPEG-7
Structured document and description:
<TextAnnotation>
  <FreeTextAnnotation>
    Basil attempts to mend the car without success
  </FreeTextAnnotation>
  <StructuredAnnotation>
    <Who>Basil</Who>
    <WhatObject>Car</WhatObject>
    <WhatAction>Mend</WhatAction>
    <Where>Carpark</Where>
  </StructuredAnnotation>
</TextAnnotation>
<Video structures>……
<Video shot 13>… …</>
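Since the annotation is plain XML, it can be read with standard tooling. Below is a minimal sketch using Python's standard library; the element names follow the snippet above, not the full MPEG-7 schema.

```python
# Minimal sketch: reading the TextAnnotation snippet above with
# Python's standard library. Element names are taken from the slide,
# not from the full MPEG-7 schema.
import xml.etree.ElementTree as ET

XML = """
<TextAnnotation>
  <FreeTextAnnotation>Basil attempts to mend the car without success</FreeTextAnnotation>
  <StructuredAnnotation>
    <Who>Basil</Who>
    <WhatObject>Car</WhatObject>
    <WhatAction>Mend</WhatAction>
    <Where>Carpark</Where>
  </StructuredAnnotation>
</TextAnnotation>
"""

root = ET.fromstring(XML)
free_text = root.findtext("FreeTextAnnotation")
structured = {child.tag: child.text
              for child in root.find("StructuredAnnotation")}
print(free_text)     # Basil attempts to mend the car without success
print(structured["Who"], structured["Where"])  # Basil Carpark
```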
Inference network
Performs a ranking given many sources of evidence
Document network (DN): constructed from the document data; represents all the retrievable units
Query network (QN): constructed from the query; represents the information need
Inference network
The document network:
Document layer (the collection of retrievable units)
Contextual layer (represents the contextual information about the document-concept links)
Concept layer (represents all the concepts in the network)
Inference network
Document network
Inference network
Link weight calculation:
Structural (between document nodes): duration ratio
Contextual (between contextual nodes): sibling number, context size, frequency
Context-concept: tf, idf
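As a rough sketch, two of the weight families above could be computed as follows. The slide does not give the paper's exact normalisations, so these are standard textbook forms, not the authors' formulas.

```python
import math

# Context-concept link weight as a plain tf-idf score (standard form;
# the paper's exact normalisation is not given on the slide).
def context_concept_weight(tf, num_contexts, contexts_with_concept):
    idf = math.log(num_contexts / contexts_with_concept)
    return tf * idf

# Structural link weight between document nodes as a duration ratio,
# e.g. a 30 s shot inside a 120 s scene.
def structural_weight(child_duration, parent_duration):
    return child_duration / parent_duration

print(structural_weight(30.0, 120.0))   # 0.25
```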
Inference network
Query network: a framework of nodes that represents the information need
Concept nodes
Constraint operators (and, or, sum, not, …) and context-concept constraints
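The constraint operators can be given the usual probabilistic readings from the inference network literature (Turtle and Croft's model); whether this paper uses exactly these definitions is an assumption.

```python
# Usual probabilistic readings of inference-network constraint
# operators (Turtle/Croft style); the paper's exact definitions
# may differ.
from functools import reduce

def op_and(beliefs):   # all pieces of evidence must hold
    return reduce(lambda a, b: a * b, beliefs, 1.0)

def op_or(beliefs):    # at least one piece of evidence holds
    return 1.0 - reduce(lambda a, b: a * (1.0 - b), beliefs, 1.0)

def op_sum(beliefs):   # average the evidence
    return sum(beliefs) / len(beliefs)

def op_not(belief):
    return 1.0 - belief

print(op_and([0.5, 0.5]), op_or([0.5, 0.5]))   # 0.25 0.75
```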
Inference network
Inference network
Attachment and evaluation
Attachment: match the DN and QN, obtaining a set of links between their nodes, with all the constraints satisfied
Link inheritance: in the DN, a document node can share the context nodes of its parent node
Evaluation: calculate a quantized similarity for each document node in the DN
Inference network
Attachment: link the QN and DN such that:
Concept nodes that contain the same concept (a synonym dictionary is needed) are linked with weight 1 (firm)
For constrained queries, the weight is adjusted
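The matching step above might be sketched as follows; the synonym dictionary entries and concept names are illustrative, not from the paper.

```python
# Sketch of attachment: link each QN concept node to the DN concept
# nodes carrying the same concept, consulting a synonym dictionary;
# matched links get weight 1 ("firm"). Dictionary entries are
# illustrative only.
SYNONYMS = {
    "car":  {"car", "automobile", "vehicle"},
    "mend": {"mend", "repair", "fix"},
}

def same_concept(query_term, doc_term):
    q, d = query_term.lower(), doc_term.lower()
    return q == d or d in SYNONYMS.get(q, set())

def attach(query_concepts, doc_concepts):
    return [(q, d, 1.0)                    # firm link, weight 1
            for q in query_concepts
            for d in doc_concepts
            if same_concept(q, d)]

print(attach(["car"], ["automobile", "basil"]))
# [('car', 'automobile', 1.0)]
```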
Inference network
Attachment
Inference network
Evaluation can be performed over every document node; it is calculated according to the query network
Can be used in different applications: retrieve scenes or shots from one video, retrieve videos from a video collection, …
Inference network
Evaluation: back-propagate the QN-DN link weights to each document node
All nodes then have a value
A ranking can be derived at different granularity levels (video, scene, shot)
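One way to picture the granularity point: scores attached at shot level are aggregated up the shot → scene → video hierarchy, and a ranking is read off at whichever level is wanted. The duration-weighted mean below is an assumption (consistent with the duration-ratio link weights), not the paper's exact formula.

```python
# Illustrative propagation of node scores up a shot -> scene -> video
# hierarchy, using a duration-weighted mean (an assumption consistent
# with the duration-ratio weights; not the paper's exact formula).
def node(name, duration, score=0.0, children=()):
    return {"name": name, "duration": duration,
            "score": score, "children": list(children)}

def propagate(n):
    """Fill internal-node scores from their children; return the score."""
    if n["children"]:
        total = sum(c["duration"] for c in n["children"])
        n["score"] = sum(propagate(c) * c["duration"] / total
                         for c in n["children"])
    return n["score"]

def rank(nodes):
    """Rank any set of nodes (videos, scenes, or shots) by score."""
    return sorted(nodes, key=lambda n: n["score"], reverse=True)

shots = [node("shot1", 10, score=0.9), node("shot2", 30, score=0.1)]
scene = node("scene1", 40, children=shots)
propagate(scene)            # scene score = 0.9*10/40 + 0.1*30/40
print([n["name"] for n in rank(shots)])   # ['shot1', 'shot2']
```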
Experiment
3 manually annotated videos are used as test data, 329 shots in total
Annotations: <Abstract>, <StructuredAnnotation>, and <FreeTextAnnotation> for video shots and scenes
Avg. precision: 69.25
The similarity ordering is as expected
Experiment
Performance is quite dependent on the quality of the annotation data
Efficient annotation methods would therefore be very helpful
Conclusion
Proposes a semantic video retrieval model based on the inference network model that fully exploits the structural, conceptual, and contextual aspects of MPEG-7
Helps to sidestep the semantic gap problem