Discriminative Segment Annotation in Weakly Labeled Video Kevin Tang, Rahul Sukthankar Appeared in CVPR 2013 (Oral)


Discriminative Segment Annotation in Weakly Labeled Video

Kevin Tang, Rahul Sukthankar

Appeared in CVPR 2013 (Oral)

Research Problem

• Input: a weakly labeled video (e.g., “dog”).
• Output: identify the segments that correspond to the label to generate the semantic segmentation, i.e., classify each segment as either coming from the concept “dog” (called concept segments) or not (called background segments).
• Pipeline
– Perform unsupervised spatiotemporal segmentation.
– Propose an algorithm to identify the meaningful segments.

Contributions

• Present an interpretation framework for analyzing a broad class of existing weakly supervised learning algorithms for the segment annotation problem.

• Propose a discriminative algorithm CRANE (Concept Ranking According to Negative Exemplars) for segment annotation.

Interpretation framework

• Pairwise distance matrix between segments

Segment: a spatiotemporal volume (3D), represented as a point in feature space (such as an RGB histogram, a local binary pattern histogram, or a dense optical flow histogram).
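As a minimal sketch of the pairwise distance matrix, assuming each segment is a fixed-length histogram and that Euclidean distance is used (the slides do not name the metric), the matrix and its positive-to-negative block could be computed as follows. The toy histograms and the names `pairwise_distances`, `positives`, `negatives`, and `pos_to_neg` are hypothetical.

```python
import numpy as np

def pairwise_distances(features: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of rows in `features`."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical toy data: 4-bin histograms for 3 segments from positive
# videos and 2 segments from negative videos.
positives = np.array([[0.7, 0.1, 0.1, 0.1],
                      [0.1, 0.7, 0.1, 0.1],
                      [0.1, 0.1, 0.7, 0.1]])
negatives = np.array([[0.1, 0.6, 0.2, 0.1],
                      [0.1, 0.1, 0.6, 0.2]])

# Stack positives first so the matrix splits into readable blocks.
segments = np.vstack([positives, negatives])
D = pairwise_distances(segments)

# Block of distances from each positive segment to each negative segment.
n_pos = len(positives)
pos_to_neg = D[:n_pos, n_pos:]
```

Ordering the rows by segment type is only a convenience: it makes each block of the matrix a contiguous slice.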

• Positive segment
– Concept segment
– Background segment

• Negative segment

Goal: separate the concept segments from the background segments among the positive segments.

Rank the positive segments in decreasing order of a score, such that the top-ranked elements correspond to concept segments.

Interpretation framework

• Baseline algorithms for segment annotation.
– Kernel density estimation for negative segments
• Intuition: the distribution of background segments is similar to the distribution of negative segments.
• Construct a probability density, operating on block C.
• Rank the positive segments according to this density.
– Negative Mining (MIN)
• Intuition: the distance from a concept segment to the nearest negative segment > the distance from a background segment to its nearest negative segment.
• Operates on block D.
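The two baselines can be sketched as scoring functions over positive-to-negative distances. This is an illustrative sketch, not the authors' implementation: it assumes a Gaussian kernel with a hypothetical `bandwidth` parameter, assumes that positive segments with low density under the negatives are ranked first, and uses made-up distances. The names `kde_scores`, `min_scores`, and `rank` are hypothetical.

```python
import numpy as np

def kde_scores(pos_to_neg: np.ndarray, bandwidth: float = 1.0) -> np.ndarray:
    """Negated Gaussian kernel density of the negatives at each positive.

    The normalizing constant is omitted because it does not affect ranking.
    """
    density = np.exp(-(pos_to_neg ** 2) / (2.0 * bandwidth ** 2)).mean(axis=1)
    return -density  # low density under the negatives gives a high score

def min_scores(pos_to_neg: np.ndarray) -> np.ndarray:
    """Negative Mining (MIN): distance to the nearest negative segment."""
    return pos_to_neg.min(axis=1)

def rank(scores: np.ndarray) -> np.ndarray:
    """Indices of positive segments in decreasing order of score."""
    return np.argsort(-scores)

# Hypothetical distances: rows are positive segments, columns are negatives.
pos_to_neg = np.array([[0.9, 1.0],
                       [0.1, 0.8],
                       [0.7, 0.2]])

kde_ranking = rank(kde_scores(pos_to_neg, bandwidth=0.5))
min_ranking = rank(min_scores(pos_to_neg))
```

On this toy input both baselines put segment 0 first, since it is far from every negative, but they disagree on the order of the other two segments.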

CRANE

• Each negative segment penalizes nearby positive segments.
• Concept segments should be those far from negatives.

Penalty function
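The exact penalty function is not reproduced in these slides, so the following is only a sketch of the idea under stated assumptions: each negative segment penalizes just its single nearest positive segment, by a fixed amount of 1, and only if that distance is within a hypothetical `cutoff`. The function name `crane_scores` and the toy distances are likewise assumptions.

```python
import numpy as np

def crane_scores(pos_to_neg: np.ndarray, cutoff: float = np.inf) -> np.ndarray:
    """Score positives by accumulated penalties from negative segments.

    pos_to_neg[i, j] is the distance from positive segment i to negative
    segment j. Assumption: each negative penalizes only its nearest
    positive, and only when that distance is at most `cutoff`.
    """
    scores = np.zeros(pos_to_neg.shape[0])
    nearest_pos = pos_to_neg.argmin(axis=0)   # nearest positive per negative
    nearest_dist = pos_to_neg.min(axis=0)     # distance to that positive
    penalized = nearest_pos[nearest_dist <= cutoff]
    # ufunc.at accumulates correctly when an index appears more than once.
    np.subtract.at(scores, penalized, 1.0)
    return scores

# Hypothetical distances: 3 positive segments (rows), 4 negatives (columns).
pos_to_neg = np.array([[0.9, 1.0, 0.8, 0.9],
                       [0.1, 0.8, 0.2, 0.7],
                       [0.7, 0.2, 0.6, 0.3]])

scores_all = crane_scores(pos_to_neg)
scores_cut = crane_scores(pos_to_neg, cutoff=0.25)
ranking = np.argsort(-scores_cut)  # decreasing order of score
```

In this sketch each negative's vote is computed independently of the others, so the per-negative work could be split across workers and the penalties summed afterwards.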

CRANE

• Advantages

1. Robust to noise.
2. Parallelizable.

Experimental Results

YouTube-Objects dataset

Experimental Results

• (a): Successes (b): Failures