MediaEval 2016 - UNIFESP Predicting Media Interestingness Task

UNIFESP at MediaEval 2016: Predicting Media Interestingness Task

Jurandy Almeida

GIBIS Lab, Institute of Science and Technology, Federal University of Sao Paulo – UNIFESP


Introduction

• Developed for the MediaEval 2016 Predicting Media Interestingness Task, for its video subtask only.

• The goal is to automatically select the most interesting video segments according to a common viewer.

• The focus is on features derived from audio-visual content or associated textual information.

Proposed Approach

It relies on combining learning-to-rank algorithms and exploiting visual information:

1. A simple histogram of motion patterns is used for processing visual information.

2. A majority voting scheme is used for combining machine-learned rankers and predicting the interestingness of videos.

Visual Features

• Low-Level & Mid-Level Features: Not used

• Applying an algorithm to encode visual properties from video segments.

– “Comparison of Video Sequences with Histograms of Motion Patterns” [1].

• It relies on three steps:

1. partial decoding;

2. feature extraction;

3. signature generation.

[Figure: Histograms of Motion Patterns (HMP). 1: Partial Decoding (video frames, I-frames, macroblocks, pixel blocks, DC coefficients). 2: Feature Extraction (time series of macroblocks; example DC coefficients of a 2x2 macroblock in the previous, current, and next frames: [106 111; 100 88], [91 94; 95 90], [90 93; 96 91]; temporal codes [1 1; 2 1] and spatial codes [2 1; 0 3]; resulting motion pattern 0101100110010011). 3: Signature Generation (histogram distribution of motion patterns).]
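The signature-generation step can be sketched as follows. In the figure's example, the eight codes (temporal 1 1 2 1, then spatial 2 1 0 3), each written with two bits, concatenate to the motion pattern 0101100110010011. This minimal Python sketch only reproduces that packing and the histogram accumulation. It assumes the codes are already available, since the comparison rule that produces them from the DC coefficients is described in [1] and is not reproduced here. The function names `pack_codes` and `hmp_signature` are hypothetical, and normalizing the histogram is an assumption.

```python
from collections import Counter


def pack_codes(codes):
    """Concatenate 2-bit codes (each in 0..3) into one integer motion pattern."""
    pattern = 0
    for code in codes:
        if not 0 <= code <= 3:
            raise ValueError("each code must fit in 2 bits")
        pattern = (pattern << 2) | code
    return pattern


def hmp_signature(patterns):
    """Map each observed motion pattern to its relative frequency.

    Normalizing by the total count is an assumption of this sketch.
    """
    counts = Counter(patterns)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


# Codes taken from the figure's example (temporal first, then spatial).
temporal = [1, 1, 2, 1]
spatial = [2, 1, 0, 3]
pattern = pack_codes(temporal + spatial)
print(format(pattern, "016b"))  # 0101100110010011
```

With two bits per code and eight codes per macroblock, there are 2^16 possible patterns; the sketch stores only those that actually occur.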

Learning to Rank Strategies

• Ranking SVM [5]: Use the traditional SVM classifier to learn a ranking function.

• RankNet [2]: Probability distribution metrics as cost functions to be optimized.

• RankBoost [4]: Regression error on weighted distributions of pairwise rankings.

• ListNet [3]: Extension of RankNet that uses a ranked list instead of pairwise rankings.

• Majority Voting [6]: The label with the most votes is selected as the label for a given instance.
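To make the difference between a pairwise and a listwise cost concrete, the sketch below contrasts the pairwise cross-entropy cost used by RankNet [2] with the top-one-probability cross-entropy used by ListNet [3]. It is a minimal illustration in plain Python, not the implementation used for the runs; the function names and the toy scores are hypothetical.

```python
import math


def ranknet_pair_cost(s_i, s_j, target_prob):
    """Pairwise cost: cross-entropy between the target probability that
    item i ranks above item j and the modeled P_ij = sigmoid(s_i - s_j)."""
    diff = s_i - s_j
    # Algebraically equal to -t*log(P_ij) - (1-t)*log(1-P_ij).
    return (1.0 - target_prob) * diff + math.log1p(math.exp(-diff))


def softmax(values):
    """Turn scores into a top-one probability distribution."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top1_cost(scores, relevances):
    """Listwise cost: cross-entropy between the top-one distributions
    induced by the ground-truth relevances and by the predicted scores."""
    p_true = softmax(relevances)
    p_pred = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))


# Toy example: the first item is the only relevant one.
good = listnet_top1_cost([3.0, 2.0, 1.0], [1.0, 0.0, 0.0])
bad = listnet_top1_cost([1.0, 2.0, 3.0], [1.0, 0.0, 0.0])
print(good < bad)  # True
```

The pairwise cost looks at one pair of scores at a time, whereas the listwise cost is computed over the whole ranked list at once.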

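[Figure: Combining rankings. The input is given to the rankers R1, R2, ..., RN, which produce the outputs O1, O2, ..., ON; these are combined into a single output o.]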
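The fusion step can be sketched as a per-instance majority vote over the labels produced by the individual rankers. The sketch below assumes that each ranker emits one label per video (for example, 1 for interesting and 0 for not interesting) and that ties are broken in favor of the smallest label; both are assumptions of this illustration, and `majority_vote` is a hypothetical name.

```python
from collections import Counter


def majority_vote(labels_per_ranker):
    """Fuse per-ranker label lists by picking, for each instance,
    the label with the most votes (ties go to the smallest label)."""
    n_items = len(labels_per_ranker[0])
    fused = []
    for i in range(n_items):
        votes = Counter(labels[i] for labels in labels_per_ranker)
        top = max(votes.values())
        winners = sorted(label for label, count in votes.items() if count == top)
        fused.append(winners[0])
    return fused


# Four rankers, three videos.
outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
print(majority_vote(outputs))  # [1, 0, 0]
```

With an even number of rankers, ties can occur (as for the third video above), so the tie-breaking rule matters in practice.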

Experimental Protocol

• 4-fold cross validation

• Development data

– 5,054 videos from 52 movie trailers

• Test data

– 2,342 videos from 26 movie trailers

• Mean Average Precision (MAP)
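For reference, the evaluation measure can be sketched as follows. Average Precision (AP) is computed for the ranked list of segments of each trailer, and MAP is the mean of the AP values over trailers. This is a minimal Python sketch under the assumption of binary relevance labels listed in ranked order; it is not the official evaluation tool, and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """AP for one ranked list of binary labels (1 = interesting)."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """MAP: mean of AP over all ranked lists (one per trailer)."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Two toy trailers with segments already sorted by predicted score.
trailers = [[1, 0, 1, 0], [0, 1]]
print(round(mean_average_precision(trailers), 4))  # 0.6667
```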

Configurations of Runs

Run   Learning-to-Rank Strategy
1     Ranking SVM
2     RankNet
3     RankBoost
4     ListNet
5     Majority Voting

Experimental Results

[Figure: Results obtained on the development data. MAP (%) for Ranking SVM, RankNet, RankBoost, ListNet, and Majority Voting.]

[Figure: Results of the official submitted runs. MAP (%) for Ranking SVM, RankNet, RankBoost, ListNet, and Majority Voting. Value labels shown in the figure: 18.15, 16.17, 16.17, 16.56, 14.35.]

[Figure: AP per movie trailer achieved in each run. Average Precision (%) for the test trailers video-52 to video-77, for Ranking SVM, RankNet, RankBoost, ListNet, and Majority Voting.]

The learning-to-rank algorithms provide complementary information that can be combined by fusion techniques to produce better results.

Remarks

• The proposed approach has explored only visual properties. Different learning-to-rank strategies were considered, including a fusion of all of them.

• Results demonstrate that the proposed approach is promising. Combining learning-to-rank algorithms can contribute to better results.

Future Work

Future work includes investigating a smarter strategy for combining learning-to-rank algorithms and considering other information sources to include more features semantically related to visual content.

Acknowledgements

This research was supported by Brazilian agencies FAPESP, CAPES, and CNPq.

References

[1] J. Almeida, N. J. Leite, and R. S. Torres. Comparison of video sequences with Histograms of Motion Patterns. In ICIP, pages 3673–3676, 2011.

[2] C. J. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. N. Hullender. Learning to rank using gradient descent. In ICML, pages 89–96, 2005.

[3] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In ICML, pages 129–136, 2007.

[4] Y. Freund, R. D. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969, 2003.

[5] T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD, pages 217–226, 2006.

[6] L. Lam and C. Y. Suen. Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Trans. Systems, Man, and Cybernetics, Part A, 27(5):553–568, 1997.
