UNIFESP at MediaEval 2016: Predicting Media Interestingness Task
Jurandy Almeida
GIBIS Lab, Institute of Science and Technology, Federal University of Sao Paulo – UNIFESP
Introduction
• Developed for the MediaEval 2016 Predicting Media Interestingness Task, and for its video subtask only.
• The goal is to automatically select the most interesting video segments according to a common viewer.
• The focus is on features derived from audio-visual content or associated textual information.
Proposed Approach
It relies on combining learning-to-rank algorithms and exploiting visual information:
1. A simple histogram of motion patterns is used for processing visual information.
2. A majority voting scheme is used for combining machine-learned rankers and predicting the interestingness of videos.
Visual Features
• Low-Level & Mid-Level Features: Not used
• An algorithm is applied to encode visual properties from video segments.
– “Comparison of Video Sequences with Histograms of Motion Patterns” [1].
• It relies on three steps:
1. partial decoding;
2. feature extraction;
3. signature generation.
[Figure: Histograms of Motion Patterns (HMP). 1: Partial Decoding (video frames, I-frames, macroblock, pixel block, DC coefficient). 2: Feature Extraction (time series of macroblocks over the previous, current, and next frames; temporal and spatial comparisons yield a motion pattern, e.g. 0101100110010011). 3: Signature Generation (histogram distribution of motion patterns).]
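The three steps can be illustrated with a rough sketch. It is not the HMP algorithm of [1]: it assumes the DC coefficients have already been decoded into one small grid per I-frame, and the 4-bit comparison scheme, the function names `motion_pattern` and `hmp_signature`, and the toy values are all hypothetical choices made for illustration.

```python
from collections import Counter

def motion_pattern(prev, curr, nxt, r, c):
    """4-bit pattern for block (r, c): two temporal and two spatial comparisons.

    Assumption: this comparison scheme is illustrative only; the actual
    encoding used by HMP [1] is not reproduced here.
    """
    v = curr[r][c]
    bits = (
        v > prev[r][c],      # temporal: larger than the same block in the previous frame
        v > nxt[r][c],       # temporal: larger than the same block in the next frame
        v > curr[r][c + 1],  # spatial: larger than the right neighbour
        v > curr[r + 1][c],  # spatial: larger than the bottom neighbour
    )
    return sum(int(b) << i for i, b in enumerate(bits))

def hmp_signature(frames, n_bins=16):
    """Normalized histogram of patterns over a sequence of DC-coefficient grids."""
    counts = Counter()
    for prev, curr, nxt in zip(frames, frames[1:], frames[2:]):
        for r in range(len(curr) - 1):
            for c in range(len(curr[0]) - 1):
                counts[motion_pattern(prev, curr, nxt, r, c)] += 1
    total = sum(counts.values())
    return [counts[b] / total if total else 0.0 for b in range(n_bins)]

# Toy input: three consecutive I-frames, each a 2x2 grid of DC values.
frames = [
    [[106, 111], [100, 88]],
    [[91, 94], [95, 90]],
    [[90, 93], [96, 91]],
]
signature = hmp_signature(frames)
```

The resulting `signature` is a fixed-length vector, so video segments of different durations map to features of the same size, which is what a ranker needs as input.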
Learning to Rank Strategies
• Ranking SVM [5]: Uses the traditional SVM classifier to learn a ranking function.
• RankNet [2]: Probability distribution metrics as cost functions to be optimized.
• RankBoost [4]: Regression error on weighted distributions of pairwise rankings.
• ListNet [3]: Extension of RankNet that uses a ranked list instead of pairwise rankings.
• Majority Voting [6]: The label with the most votes is selected as the label for a given instance.
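To make the pairwise idea concrete, here is a minimal sketch of a RankNet-style cost [2]: the score difference of a pair is mapped to a probability with a logistic function and penalized with cross entropy. The function names and the toy scores are illustrative assumptions, not the configuration used in the submitted runs.

```python
import math

def ranknet_pair_cost(s_i, s_j, target=1.0):
    """Cross-entropy cost for one pair; target is the desired P(i ranked above j)."""
    p = 1.0 / (1.0 + math.exp(-(s_i - s_j)))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def pairwise_cost(scores, labels):
    """Average cost over all ordered pairs (i, j) where i is labeled above j."""
    costs = [
        ranknet_pair_cost(scores[i], scores[j])
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] > labels[j]
    ]
    return sum(costs) / len(costs) if costs else 0.0

# One interesting segment (label 1) and two non-interesting ones (label 0).
good = pairwise_cost([2.0, 0.5, -1.0], [1, 0, 0])   # interesting segment scored highest
bad = pairwise_cost([-1.0, 0.5, 2.0], [1, 0, 0])    # interesting segment scored lowest
```

A ranker that scores the interesting segment highest gets a lower cost (`good < bad`). ListNet replaces the sum over pairs with a single cost over the whole ranked list.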
[Figure: Combining rankings. The input is given to rankers R1, R2, ..., RN; their outputs O1, O2, ..., ON are combined into a single output o.]
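The fusion step can be sketched as a plain majority vote over the labels produced by each ranker. This is a simplified illustration: the function name `majority_vote`, the binary labels, the tie-breaking rule, and the use of the vote share as a confidence value are assumptions, not details confirmed by the poster.

```python
from collections import Counter

def majority_vote(labels_per_ranker):
    """Combine per-segment labels from N rankers.

    Returns one (label, vote_share) pair per segment. Ties go to the label
    that was seen first (an arbitrary choice made for this sketch).
    """
    n_rankers = len(labels_per_ranker)
    n_segments = len(labels_per_ranker[0])
    fused = []
    for k in range(n_segments):
        votes = Counter(labels[k] for labels in labels_per_ranker)
        label, count = votes.most_common(1)[0]
        fused.append((label, count / n_rankers))
    return fused

# Hypothetical outputs of four rankers for three video segments
# (1 = interesting, 0 = not interesting).
outputs = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
]
fused = majority_vote(outputs)
```

With an even number of rankers, ties such as the third segment above are possible, so a real system needs an explicit tie-breaking policy.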
Experimental Protocol
• 4-fold cross validation
• Development data
– 5,054 videos from 52 movie trailers
• Test data
– 2,342 videos from 26 movie trailers
• Mean Average Precision (MAP)
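For reference, MAP can be computed as sketched below, assuming AP is calculated per movie trailer from the predicted scores and binary ground-truth labels and then averaged over trailers. The function names and toy data are illustrative; the official evaluation tool may differ in details such as tie handling.

```python
def average_precision(scores, labels):
    """AP of one ranked list: mean of precision@k over the ranks k of relevant items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(per_trailer):
    """Mean of the per-trailer AP values; per_trailer maps name -> (scores, labels)."""
    aps = [average_precision(s, l) for s, l in per_trailer.values()]
    return sum(aps) / len(aps)

# Toy data: two hypothetical trailers with predicted scores and ground-truth labels.
data = {
    "trailer-a": ([0.9, 0.8, 0.1], [1, 0, 1]),
    "trailer-b": ([0.2, 0.7], [1, 0]),
}
map_value = mean_average_precision(data)
```

Averaging per trailer gives each trailer equal weight regardless of how many segments it contains.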
Configurations of Runs
Run   Learning-to-Rank Strategy
1     Ranking SVM
2     RankNet
3     RankBoost
4     ListNet
5     Majority Voting
Experimental Results
[Figure: two bar charts of MAP (%) for Ranking SVM, RankNet, RankBoost, ListNet, and Majority Voting. Left panel: results obtained on the development data. Right panel: results of the official submitted runs. Bar values shown: 18.15, 16.17, 16.17, 16.56, 14.35.]
[Figure: AP per movie trailer achieved in each run. x-axis: movie trailers video-52 to video-77; y-axis: Average Precision (%); series: Ranking SVM, RankNet, RankBoost, ListNet, Majority Voting.]
The learning-to-rank algorithms provide complementary information that can be combined by fusion techniques to produce better results.
Remarks
• The proposed approach explores only visual properties. Different learning-to-rank strategies were considered, including a fusion of all of them.
• Results indicate that the proposed approach is promising: combining learning-to-rank algorithms can contribute to better results.
Future Work
Future work includes investigating a smarter strategy for combining learning-to-rank algorithms and considering other information sources to include more features semantically related to visual content.
Acknowledgements
This research was supported by Brazilian agencies FAPESP, CAPES, and CNPq.
References
[1] J. Almeida, N. J. Leite, and R. S. Torres. Comparison of video sequences with Histograms of Motion Patterns. In ICIP, pages 3673–3676, 2011.
[2] C. J. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. N. Hullender. Learning to rank using gradient descent. In ICML, pages 89–96, 2005.
[3] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In ICML, pages 129–136, 2007.
[4] Y. Freund, R. D. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969, 2003.
[5] T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD, pages 217–226, 2006.
[6] L. Lam and C. Y. Suen. Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Trans. Systems, Man, and Cybernetics, Part A, 27(5):553–568, 1997.