These are the slides from my keynote talk about Video Browsing on June 18, 2014, at the International Workshop on Content-Based Multimedia Indexing (CBMI) 2014.
Video Browsing
Klaus Schoeffmann, Klagenfurt University, Austria
CBMI 2014
The need for interactive video search…
Video Content Search Scenarios
• Private collection of recorded videos
  Many long sequences…
  You know there are a few interesting (e.g., funny) clips, but don't know where
  Want to find them for editing/sharing
• Downloaded a suggested lecture video
  In a hurry for an exam…
  2 hours duration
  Want to quickly check for important information
• Recordings from several surveillance cameras
  Quickly look for suspicious activities (e.g., forensics expert)
  Disasters (e.g., Boston Marathon bombings 2013)
Use Video Retrieval Tool?
[Figure: video retrieval tool screenshot; labeled parts: Content-based Feature, Example Image, Text, Ranked list of shots, Temporal Context]
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November). Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
Video Search Scenarios
Same three scenarios as before, with the key terms highlighted: "interesting", "important information", "suspicious activities"
Shortcomings…
pic by [ sunface13 ]
Video Retrieval: Well-known issues
• Query by example: typically no perfect example is available.
• Query by text: how to describe a desired image by text?
Usability Gap
A picture tells a 1000 words.
by marfis75
How to describe a video clip by text???
Video Retrieval: Well-known issues
• Low performance in broad domain
• Database affinity of concept classifiers
P(k): precision at level k (after k results)
rel(k): defines whether the k-th retrieved document is relevant
TRECVID 2013 Semantic Indexing (SIN-500): median "inferred average precision" (infAP) < 0.13
Performance Gap
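With P(k) and rel(k) as defined on the Performance Gap slide, the standard (non-interpolated) average precision for a single query is usually written as below. The symbols n (length of the ranked result list) and R (number of relevant documents for the query) are introduced here for illustration and do not appear on the slide. infAP is an estimate of this quantity computed from sampled relevance judgments.

```latex
\mathrm{AP} \;=\; \frac{1}{R} \sum_{k=1}^{n} P(k)\,\mathrm{rel}(k),
\qquad
P(k) \;=\; \frac{1}{k} \sum_{j=1}^{k} \mathrm{rel}(j),
\qquad
\mathrm{rel}(k) \in \{0, 1\}
```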
TRECVID Known-item Search
TRECVID KIS (2010-2012) models the situation in which "someone knows of a video, has seen it before, believes it is contained in a collection, but doesn't know where to look"
• Automatic Search
  Text description of the video
  Return ranked list of 100 videos (out of 9000)
• Interactive Search
  Pre-processing based on text query
  Searcher browses through result list (e.g., keyframes of shots)
  Interactively find target video as fast as possible
  Within 5 minutes
TRECVID Known-item Search
The Performance of State-of-The-Art Video Retrieval Tools
Known items not found by any team:

Year   Interactive      Automatic          out of
2010   5 / 24  (21%)    69 / 300   (22%)   15 teams
2011   6 / 25  (24%)    142 / 391  (36%)   9 teams
2012   2 / 24  (17%)    108 / 361  (29%)   9 teams

From: [Alan Smeaton, Paul Over, “Known‐Item Search @ TRECVID 2012”, NIST, 2012]
What is Video Browsing?
Video Browsing
[ F. Arman, R. Depommier, A. Hsu, and M-Y. Chiu, Content-based Browsing of Video Sequences, in Proc. of ACM International Conference on Multimedia, 1994, pp. 97-103 ]
How do Users Browse Today?
In practice most users employ a…
VCRs in the 1970s already provided similar functionality!
Novice vs. Expert
• Mostly interactive search
• Simple-to-use
• Inflexible and tedious for archives
• Low performance

• Mostly automatic search
• Complicated to use
• Flexible and easier (?) for archives
• Still limited performance
Modern Video Browsing
• Combines automatic and interactive search
• Integrates the user in search process
  Instead of "query-and-browse-results"
  User controls search process
  Inspects and interacts
  Most meaningful feature for current need
• content navigation, abstract visualization, ad-hoc querying or content summarization, …
[ Klaus Schoeffmann, Frank Hopfgartner, Oge Marques, Laszlo Boeszoermenyi, and Joemon M. Jose, "Video browsing interfaces and applications: a review", in SPIE Reviews Journal, Vol. 1, No. 1, pp. 1-35 (018004), SPIE, Online, March 2010 ]
Exploratory Search
"Will know it when I see it!"
(instead of “telling the system what you want”)
Modern Video Browsing
• Interactive inspection/exploration of visual content in order to satisfy an information need
• Focuses on search and exploration in (i) single videos as well as (ii) video collections
• Directed Search (supported by Video Retrieval)
  Find a specific shot or segment in a video
  Find a specific video in an archive
• Undirected Search (not supported by Video Retrieval)
  Searching to discover information
  E.g., browse through a video in order to
  learn what the content looks like, or
  see if it is interesting
Content Navigation & Visualization
Improving Navigation
e.g., on YouTube default window:
640 pixels = frames(25 seconds)
Common seeker‐bar limits navigation granularity
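The granularity limit can be made concrete with a small calculation: every pixel column of the seeker bar has to stand for a whole block of frames. The sketch below is only an illustration; the function names, the 25 fps frame rate, and the 2-hour duration (borrowed from the lecture-video scenario) are assumptions, while the 640-pixel width is the figure quoted on the slide.

```python
def frames_per_pixel(duration_s: float, fps: float, bar_width_px: int) -> float:
    """Number of video frames that share one pixel column of the seeker bar."""
    return duration_s * fps / bar_width_px


def seconds_per_pixel(duration_s: float, bar_width_px: int) -> float:
    """Playback time covered by one pixel column of the seeker bar."""
    return duration_s / bar_width_px


if __name__ == "__main__":
    duration = 2 * 60 * 60   # assumed: a 2-hour video
    fps = 25                 # assumed frame rate
    width = 640              # seeker-bar width in pixels, as on the slide
    print(frames_per_pixel(duration, fps, width))
    print(seconds_per_pixel(duration, width))
```

Under these assumptions, moving the slider by a single pixel skips 281.25 frames (11.25 seconds), so shorter events cannot be targeted with the seeker bar alone.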
Improvements (selected):
• ZoomSlider [Huerst et al., ICME 2007]
• Direct Manipulation [Dragicevic et al., CHI 2008]
Improving Content Visualization, aka "Video Surrogates"
VideoTree [Jansen et al., CBMI 2008]
Similar concept proposed later [Girgensohn et al., ICMR 2011]
However, outperformed by simple "grid of keyframes" in terms of search time.
Improving Content Visualization, aka "Video Surrogates"
Squeeze / Fisheye
Rapid Serial Visual Presentation (RSVP)
[Wildemuth et al., 2003]
Table of Video Content (TOVC)
[Goeau et al., ICME 2007]
[Wittenburg et al., 2005]
Examples of Video Browsing Tools
Exploration…
pic by [NASA's Marshall Space Flight Center]
The Video Explorer
Download demo at: http://vidosearch.com/demos/VideoExplorerTrial.zip
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]
Visual Seeker Bar with 2 Levels
Allows a user to quickly identify similar/repeating scenes
[ Schoeffmann, K., & Boeszoermenyi, L. (2009, June). Video browsing using interactive navigation summaries. In Content‐Based Multimedia Indexing, 2009. CBMI'09. Seventh International Workshop on (pp. 243‐248). IEEE. ]
Example: Motion Direction + Intensity
Motion Vector (µ) classification into K=12 equidistant motion directions
Mapping to Hue channel
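A minimal sketch of this idea, not the authors' implementation: a motion vector is quantized into one of K=12 direction bins of 30 degrees each, and the bin index selects a hue. Several details are assumptions made for illustration: bins are centered on their direction (bin 0 is centered on "rightward"), the hue is simply bin/K, and the motion intensity is mapped to the value (brightness) channel with a hypothetical `max_mag` normalizer.

```python
import colorsys
import math

K = 12  # number of equidistant motion directions (from the slide)


def direction_bin(dx: float, dy: float, k: int = K) -> int:
    """Quantize a motion vector into one of k direction bins.

    Bins are centered on their direction (assumption), so bin 0 covers
    -15..+15 degrees around the positive x-axis when k = 12.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / k
    return int((angle + width / 2) // width) % k


def bin_to_hue(b: int, k: int = K) -> float:
    """Map a direction bin to a hue in [0, 1) (assumed linear mapping)."""
    return b / k


def motion_color(dx: float, dy: float, max_mag: float) -> tuple:
    """RGB color for a motion vector: hue = direction, value = intensity."""
    intensity = min(1.0, math.hypot(dx, dy) / max_mag)
    return colorsys.hsv_to_rgb(bin_to_hue(direction_bin(dx, dy)), 1.0, intensity)
```

With this mapping, frames dominated by the same motion direction get the same color, which is what makes repeating motion patterns visible on a visual seeker bar.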
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658-661). IEEE. ]
Ad-Hoc Query by Motion Pattern
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658-661). IEEE. ]
Ad-Hoc Query by Color Layout
Region-of-Interest (ROI) Search
• User selects spatial region-of-interest
• On search: compute Euclidean distance of frame F to every other frame f (acc. to selected region)
• Based on color layout descriptor
[Figure: frame F with the user-selected region (I), compared with frame 1 … frame k … frame n; example distances: d(F,1)=350, d(F,k)=8, d(F,n)=400]
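The ROI search can be sketched as follows. This is a simplified illustration, not the Video Explorer's code: each frame is represented as a small grid of representative colors (in the spirit of a color layout descriptor, but without the DCT step of the MPEG-7 descriptor), and the distance is computed only over the grid cells the user selected. The names `roi_distance`, `rank_frames`, and the grid representation are assumptions.

```python
import math


def roi_distance(grid_a, grid_b, roi_cells):
    """Euclidean distance between two frames, restricted to the ROI cells.

    grid_a, grid_b: 2D lists of color tuples (one representative color per cell).
    roi_cells: iterable of (row, col) cells covered by the user's selection.
    """
    total = 0.0
    for row, col in roi_cells:
        for ca, cb in zip(grid_a[row][col], grid_b[row][col]):
            total += (ca - cb) ** 2
    return math.sqrt(total)


def rank_frames(query_grid, frame_grids, roi_cells):
    """Return frame indices ordered from most to least similar to the query."""
    scored = [(roi_distance(query_grid, grid, roi_cells), index)
              for index, grid in enumerate(frame_grids)]
    return [index for _, index in sorted(scored)]
```

Frames with a small distance (such as frame k in the figure) end up at the front of the ranking, while cells outside the selected region do not influence the result at all.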
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]
Ad-Hoc Query by Color Layout
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247-258). ACM. ]
Digital Natives…
pic by [ angermann ]
Video Browser for the Digital Native
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127-132). IEEE. ]
Temporal Semantic Compression
• Compress the content of, e.g., a 1h video to 5 mins.
• Based on tempo and popularity (see next slide)
Compression on interestingness
• User defines a compression factor (f) that defines duration of compressed video
• Based on interest function, k shots are ranked in order of interestingness, satisfying …
• Shots are presented in their temporal order
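One way to read the compression steps is sketched below. The exact constraint that the selected shots must satisfy is not preserved in the slide text, so this sketch assumes a simple duration budget (total duration divided by f) filled greedily in order of interestingness; the function name and the shot tuple layout are hypothetical.

```python
def compress(shots, f):
    """Select the most interesting shots that fit into a reduced duration.

    shots: list of (start_s, duration_s, interest) tuples.
    f: compression factor; the target duration is total_duration / f (assumption).
    Returns the selected shots in their original temporal order.
    """
    budget = sum(duration for _, duration, _ in shots) / f
    chosen, used = [], 0.0
    # Rank shots by interestingness, most interesting first.
    for shot in sorted(shots, key=lambda s: s[2], reverse=True):
        if used + shot[1] <= budget:
            chosen.append(shot)
            used += shot[1]
    # Present the surviving shots in temporal order.
    return sorted(chosen, key=lambda s: s[0])
```

For a 1-hour video and a 5-minute target, f would be 12. The greedy fill is only one possible selection strategy; a different constraint in the original paper would change which shots survive.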
Video Browser for the Digital Native
Interestingness
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
• Tempo function derived from motion and audio features (originally; Greenhill et al.)
• Per-frame and per-shot popularity based on information like YouTube Insights and manual annotations
Video Browser for the Digital Native
User study with 8 participants
Test configuration elements by two tasks
1. Browse a familiar movie to find scenes you remember
2. Browse an unfamiliar movie to get a feel for its story or structure
Questionnaire with Likert-scale ratings
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Signatures…
pic by [ Wierts Sabastien ]
Signature-based Video Browser
• Color sketches mapped to feature signatures
• Matched to those of keyframes
[ Kruliš, M., Lokoč, J. and Skopal, T. (2013). Efficient Extraction of Feature Signatures Using Multi-GPU Architecture. Springer Berlin Heidelberg, LNCS 7733, pp. 446-456. ]
1. Sampling keypoints
2. Description through location (x,y), CIE Lab, contrast and entropy of surrounding pixels
3. K-means clustering
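The three steps can be illustrated with a small pure-Python sketch. It is not the multi-GPU extraction from the cited paper: it assumes the keypoints are already sampled and described as numeric feature vectors (for example x, y, L, a, b, contrast, entropy), uses a deterministic initialization for reproducibility, and represents a signature as a list of (centroid, weight) pairs. The function name is hypothetical.

```python
def kmeans_signature(points, k, iterations=10):
    """Cluster feature vectors and return a signature [(centroid, weight), ...].

    points: list of equal-length numeric tuples (one per sampled keypoint).
    k: requested number of clusters; weights are cluster shares of all points.
    """
    # Deterministic initialization with the first k points (assumption).
    centroids = [list(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point goes to its nearest centroid.
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return [(tuple(c), len(m) / len(points))
            for c, m in zip(centroids, clusters) if m]
```

A color sketch drawn by the user can be turned into the same kind of (centroid, weight) list, which is what allows it to be matched against the signatures of keyframes.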
Signature-based Video Browser
[ Lokoč, J., Blažek, A., & Skopal, T. (2014, January). Signature‐Based Video Browser. In MultiMedia Modeling (pp. 415‐418). Springer International Publishing. ]
Sketches (Color Signatures)
Player
Winner of VBS 2014
Download demo at: http://siret.ms.mff.cuni.cz/lokoc/vbs.zip
Signature-based Video Browser
Jakub Lokoč, Adam Blažek, and Tomáš Skopal. 2014. On Effective Known Item Video Search Using Feature Signatures. In Proceedings of International Conference on Multimedia Retrieval (ICMR '14). ACM, New York, NY, USA, 3 pages.
Performance Evaluation of Browsing Tools
Evaluation of Browsing Tools
• User Studies
  Reflect real benefit (+)
  Unexpected behaviors (+)
  Very tedious to do (-)
  Individual data sets (-)
• User Simulations
  Quick procedure (+)
  Approximation only (-)
• Campaigns/Competitions
  TRECVID Known-Item-Search
  Video Browser Showdown
  Combine advantages from above
Video Browser Showdown (VBS)
• Annual performance evaluation competition
  Live evaluation of search performance
  Special session at Int. Conference on MultiMedia Modeling (MMM)
• Focus: Known-item Search tasks
  Target clips are presented on site
  Teams search in shared data set
• Highly interactive search
  E.g., text queries are not allowed
  Should push research on interfaces and interaction/navigation
• Experts and Novices
  Easy-to-use tools and methods
2012: Klagenfurt, 11 teams
2013: Huangshan, 6 teams
2014: Dublin, 7 teams
VBS 2015: January 4, 2015, Sydney, Australia (MMM 2015)
http://www.videobrowsershowdown.org/
Video Browser Showdown (VBS)
• Scoring through VBS Server
• Score (s) [0-100] for task i and team k is based on
  Solve time (t)
  Penalty (p) based on number of submissions (m)
• Maximum solve time (Tmax) typically 3 minutes
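The scoring formula itself is not preserved in the slide text. The sketch below shows one form that is consistent with the bullets: the score decays linearly with solve time and is divided by a penalty that grows with the number of submissions. The constants (100 and 50), the penalty rule (the first wrong submission is free), and the function name are assumptions for illustration and should be checked against the cited VBS paper.

```python
def vbs_score(t: float, m: int, t_max: float = 180.0) -> float:
    """Illustrative score in [0, 100] for one task and one team.

    t: solve time in seconds of the correct submission.
    m: total number of submissions made for the task (including the correct one).
    t_max: maximum solve time (3 minutes, as stated on the slide).
    """
    if m < 1 or t > t_max:
        return 0.0  # task not solved within the time limit
    # Assumed penalty: one wrong submission is tolerated, further ones divide the score.
    penalty = 1 if m <= 1 else m - 1
    # Assumed time component: 100 points for an instant answer, 50 at the deadline.
    return (100.0 - 50.0 * t / t_max) / penalty
```

Under these assumptions a correct first submission after 90 seconds scores 75 points, while the same answer as the third submission scores 37.5, which rewards careful inspection over trial-and-error.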
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
VBS 2013 Evaluation
Baseline Study with Novices and a Video Player
• Additional user study (16 participants) for comparison with VBS tools
• Known Item Search Tasks as used for VBS 2013
[ Schoeffmann and Cobarzan, “An Evaluation of Interactive Search with Modern Video Players”, in Proc. of the 2013 IEEE International Symposium on Multimedia (ISM), Anaheim, CA, USA, 2013 ]
VBS 2013: Baseline vs. Experts
Score
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
Avg (Baseline) = 74.8 Avg (VBS) = 71.7
VBS 2013: Baseline vs. Experts
Submission Time
Avg (Baseline) = 57.9 s Avg (VBS) = 40.5 s
Conclusions and Open Issues…
HCI
Conclusions
• Need for interactive/exploratory search
• Video browsing tools
  Effective alternative to automatic search tools, support undirected search
  Provide reasonable performance, can help to bridge usability gap
  Many proposals for single browsing techniques
• But still improvable…
  How to even better integrate user into search process?
  User knowledge could help to circumvent shortcomings of content analysis
  How to better support search behavior of users?
  Stronger combination of automatic and interactive search techniques needed!
  More research on interface concepts, interaction models, demos, and user studies!
MM
Where is the User in Multimedia Retrieval?
IEEE Multimedia Magazine, Oct.-Dec. 2012, vol. 19, no. 4, pp. 6-10
Marcel Worring, Paul Sajda, Simone Santini, David Shamma, Alan Smeaton, Qiang Yang
• “In the multimedia retrieval community, the emphasis has moved toward quantitative results to such an extent that the user has moved into the background.”
• “It might be time to rethink what we are doing in the field.”
• “…users often don’t even know what they want from an automatic system….”
• “…user needs and characteristics are dynamic.”
• “It is so much easier to publish papers about improving a standard task than it is to describe a new insight about user intention or a new interface for browsing results.”
What About Novice Users?
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November). Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
Video Browser Showdown 2012
Two examples (of the 11 tools)
Xiangyu Chen, Jin Yuan, Liqiang Nie, Zheng-Jun Zha, Shuicheng Yan, and Tat-Seng Chua, "TRECVID 2010 Known-item Search by NUS", in Proceedings of TRECVID 2010 workshop, NIST, Gaithersburg, USA, 2011
Jin Yuan, Huanbo Luan, Dejun Hou, Han Zhang, Yan-Tao Zheng, Zheng-Jun Zha, and Tat-Seng Chua, "Video Browser Showdown by NUS", in Proceedings of the 18th International Conference on Multimedia Modeling (MMM) 2012, Klagenfurt, Austria, pp. 642-645
• Keyframe extraction (shots)
• ASR and OCR
• HLF (Concepts)
• RF with Related Samples

• Uniformly sampled keyframes (with flexible distance)
• Parallel playback + navigation
Manfred Del Fabro and Laszlo Böszörmenyi, "AAU Video Browser: Non-Sequential Hierarchical Video Browsing without Content Analysis", in Proceedings of the 18th International Conference on Multimedia Modeling (MMM) 2012, Klagenfurt, Austria, pp. 639-641
Winner of VBS 2012
[ Marco A. Hudelist, Claudiu Cobarzan and Klaus Schoeffmann, "OpenCV Performance Measurements on Mobile Devices", in Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR 2014), pp. 1-4, Glasgow, UK, 2014, pp. 479-482 ]
The Potential of Mobile Devices
• Intuitive to use
• Rich interaction capabilities
  multi-touch
  accelerometer, gyroscope, …
  front camera (tracking/feedback?)
• High computing power
  on-demand content analysis
  ad-hoc queries
  powerful graphics
Mobile Video Browsing
FilmStrip – Improve Visibility
[ Hudelist, M. A., Schoeffmann, K., & Boeszoermenyi, L. (2013, April). Mobile video browsing with a 3D filmstrip. In Proceedings of the 3rd ACM conference on International Conference on Multimedia Retrieval (pp. 299-300). ACM. ]