Joost de Wit (Media Distillery) - Technology Update - Watskeburt?!

Watskeburt?! Under the hood

Technology Update, 6 October 2016

Founded by Geert Vos and Joost de Wit in 2014

We still can’t search in video content!

Adding metadata used to be manual labour

We use advanced data mining to analyse video content

Photo courtesy of Google

Both broadcast & on-line sources

Teletext, Speech, Subtitles, Logos, Faces, Filmstrips

Ingest

Analysis

Storage

Presentation / interaction

API

well still cask foundation

Large Scale Video Storage

Online Video Service

StorageService

RSS feed, Video scraper

RabbitMQ

SpeechRecognizer

SubtitleExtractor, Filmstrip Creator, …

individual frames, chunked video, annotations

well still cask foundation

SpeechRecognizer

SubtitleExtractor, Filmstrip Creator, …

RabbitMQ, Indexer, Elasticsearch

ContentService

Notification Service

SearchService

AuthenticationService

FeedService

well still cask foundation

annotations, Large Scale Video Storage

Large Vocabulary Automatic Speech Recognition (ASR)

Photo courtesy of IBM

r  eh k ao g n ay  z       s  p  iy  ch

"recognize speech"

Speech / non-speech

Speaker diarisation

Phonemes (acoustic model)

Vocabulary (language model)
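To make the last two steps concrete, here is a minimal sketch of how a phoneme sequence such as the one above can be mapped onto vocabulary words. The lexicon entries are hypothetical, and the "fewest words wins" rule is only a stand-in for the probabilistic scoring that a real language model would perform; this is not the recogniser used in the platform.

```python
# Toy pronunciation lexicon (hypothetical entries, ARPAbet-like symbols).
LEXICON = {
    ("r", "eh", "k", "ao", "g", "n", "ay", "z"): "recognize",
    ("s", "p", "iy", "ch"): "speech",
    ("r", "eh", "k"): "wreck",
    ("b", "iy", "ch"): "beach",
}


def decode(phonemes):
    """Segment a phoneme sequence into lexicon words, or return None."""
    n = len(phonemes)
    # best[i] holds a word list covering phonemes[:i], or None if unreachable.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            word = LEXICON.get(tuple(phonemes[start:end]))
            if best[start] is not None and word is not None:
                candidate = best[start] + [word]
                # Assumption: prefer the segmentation with the fewest words.
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


words = decode("r eh k ao g n ay z s p iy ch".split())
print(words)
```

A sequence that cannot be covered by lexicon entries makes `decode` return `None`, which is the toy equivalent of an out-of-vocabulary word.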

Dialect

"de NS" > "Dennis"

"Sjoemelsoftware" ("cheating software")

Phone calls

Subtitle extraction

ROI selection

Text extraction

OCR

Utterance detection

Garbage detection
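The last two steps can be sketched as follows, assuming the OCR step yields one text string per sampled frame. The similarity threshold, the letter-ratio heuristic and all function names are assumptions made for illustration, not the platform's actual method.

```python
from difflib import SequenceMatcher


def is_garbage(text, min_alpha_ratio=0.6):
    """Heuristic (assumption): text with few letters is treated as OCR noise."""
    stripped = text.replace(" ", "")
    if not stripped:
        return True
    letters = sum(ch.isalpha() for ch in stripped)
    return letters / len(stripped) < min_alpha_ratio


def detect_utterances(frames, min_similarity=0.8):
    """Merge consecutive frames that show (nearly) the same subtitle text."""
    utterances = []
    for timestamp, text in frames:
        if is_garbage(text):
            continue
        last = utterances[-1] if utterances else None
        if last and SequenceMatcher(None, last["text"], text).ratio() >= min_similarity:
            last["stop"] = timestamp  # same subtitle is still on screen
        else:
            utterances.append({"start": timestamp, "stop": timestamp, "text": text})
    return utterances


# (timestamp in milliseconds, OCR output) per sampled frame
frames = [
    (0, "Good evening"),
    (500, "Good evening"),
    (1000, "|~ .; #"),
    (1500, "Welcome to the news"),
    (2000, "Welcome to the news"),
]
result = detect_utterances(frames)
print(result)
```

Comparing with a similarity ratio rather than strict equality tolerates small OCR differences between frames that show the same subtitle.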

Logo recognition

While browsing (before watching)

While watching

Research questions

• How to “visually summarise” a video clip as a filmstrip?
• Number of frames to show (fixed / variable)?
• Which frames to show?
• Size of the frames to show (fixed / variable)?
• Part of the frame to show?
• How to present the filmstrip?
• How should users interact with it?

Frame sampling

Shot detection

Frame selection

Merge

[Bar clipping]

Videoclip

Filmstrip & manifest
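A minimal sketch of the shot detection and frame selection steps, assuming each sampled frame has already been reduced to a normalised colour histogram. The histogram-difference criterion and the threshold value are assumptions; the deck does not say which shot detection method was used.

```python
def histogram_distance(a, b):
    """L1 distance between two normalised histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_shots(histograms, threshold=0.5):
    """Return (first_frame, last_frame) index pairs, one per shot."""
    if not histograms:
        return []
    shots = []
    start = 0
    for i in range(1, len(histograms)):
        # A large jump between consecutive frames is treated as a cut.
        if histogram_distance(histograms[i - 1], histograms[i]) > threshold:
            shots.append((start, i - 1))
            start = i
    shots.append((start, len(histograms) - 1))
    return shots


def select_frames(shots):
    """Pick the first frame of each shot, i.e. the one just after the cut."""
    return [first for first, _ in shots]


# Three-bin histograms for five sampled frames (made-up values).
sampled = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
    [0.0, 0.9, 0.1],
]
shots = detect_shots(sampled)
keyframes = select_frames(shots)
print(shots, keyframes)
```

Selecting the first frame of each shot mirrors the "frame just after shot changed" choice described for the filmstrip that was eventually tested.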

Features for frame/bar selection

• Faces (and their movement across the screen)
• Facial expression
• Open eyes
• Text
• Infographics
• Scene transitions
• Image sharpness
• Rule of thirds
• Presence of music / speech
• Subtitles
• Visual properties of the frame (sharpness, saturation, colour histogram)
• Programme-specific (studio, name plates, …)

Filmstrip as (eventually) tested

• Variable number of frames (one per shot)
• Selected frame just after shot changed
• Fixed width, no clipping
• Focused on the presentation & interaction

Examples

• Short clip
• One long shot taken from a helicopter
• No voice-over or text present

Examples

• The subtitling tells the story
• Selected images don’t contain much information

Examples

• Great example
• Reads like a comic book
• Lucky shot

Alternative filmstrip

• Variable number of frames
• “bar” width based on length of shot
• “bar” cut with respect to frame’s center
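The bar layout of this alternative can be sketched as simple arithmetic: the bar width grows with the shot duration and the crop window is centred in the frame. The pixels-per-second scale, the minimum width and the function name are assumptions chosen for illustration.

```python
def bar_layout(shot_durations_ms, frame_width, px_per_second=50, min_width=10):
    """Return a (left, right) crop window per shot, centred in the frame."""
    bars = []
    for duration in shot_durations_ms:
        width = round(duration / 1000 * px_per_second)
        # Never narrower than min_width, never wider than the frame itself.
        width = max(min_width, min(width, frame_width))
        left = (frame_width - width) // 2
        bars.append((left, left + width))
    return bars


bars = bar_layout([1000, 4000, 60000], frame_width=640)
print(bars)
```

With these assumed values a one-second shot yields a 50-pixel bar, while a very long shot is capped at the full frame width.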

Storing the filmstrip

• We store parts of the filmstrip up to a maximum length separately
• Parts are stored in Cassandra
• Efficient retrieval based on program ID and start time of the strip
• Scalability and redundancy are built into Cassandra

Part 1

Part 2
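The splitting into parts can be sketched as below. This assumes the maximum length is measured in pixels and uses an in-memory dictionary keyed by (program ID, start time) as a stand-in for the Cassandra table; the actual schema and the unit of the maximum length are not given in the deck.

```python
def split_into_parts(program_id, segments, max_width=15000):
    """Group segments into parts no wider than max_width pixels.

    segments: list of (width_px, start_ms, stop_ms) tuples in playback order.
    Returns a dict keyed by (program_id, start_ms of the part).
    """
    parts = {}
    current, current_width = [], 0
    for segment in segments:
        width = segment[0]
        if current and current_width + width > max_width:
            parts[(program_id, current[0][1])] = current
            current, current_width = [], 0
        current.append(segment)
        current_width += width
    if current:
        parts[(program_id, current[0][1])] = current
    return parts


def find_part(parts, program_id, time_ms):
    """Return the part whose start time is the latest one not after time_ms."""
    starts = sorted(s for (pid, s) in parts if pid == program_id and s <= time_ms)
    return parts[(program_id, starts[-1])] if starts else None


segments = [(6000, 0, 10000), (6000, 10000, 20000), (6000, 20000, 30000)]
parts = split_into_parts("p1", segments)
print(sorted(parts))
```

Keying each part by program ID and start time is what makes a lookup such as `find_part(parts, "p1", 25000)` cheap, which matches the retrieval pattern described above.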

The API

• REST API (created using Jersey, running in Apache Tomcat)
• JSON & JPG as output

{
  "height" : 480,                  // The height of the filmstrip
  "strips" : [
    {
      "program_id" : "b214753f-5e17-46ec-afeb-c1b3d9ee6565",
                                   // The id of the program from which
                                   // the filmstrip was created
      "start" : 0,                 // Start time of the strip (in milliseconds)
      "stop" : 12000,              // Stop time of the strip (in milliseconds)
      "url" : "<base>/strip_001.jpg",
                                   // The URL of this strip's image
      "width" : 6700,              // The width of this strip
      "segments" : [               // Segments are the tiles in the strip
        {
          "start_offset" : 0,      // The start offset of this segment (in pixels)
          "stop_offset" : 100,     // The stop offset of this segment (in pixels)
          "start_timestamp" : 0,   // The start time of this segment (in milliseconds)
          "stop_timestamp" : 1234  // The stop time of this segment (in milliseconds)
        },
        {
          "start_offset" : 101,
          "stop_offset" : 305,
          "start_timestamp" : 1235,
          "stop_timestamp" : 2345
        }
      ]
    }
  ],
  "previous" : false,              // A reference to the previous filmstrip file (when present)
  "next" : "<base>/next_strip.json"
                                   // A reference to the next filmstrip file (when present)
}

/api/{version}/filmstrip/{program_id}
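A client can use the segment offsets and timestamps in this manifest to keep the strip and the player in sync. The sketch below assumes linear interpolation within a segment and seeking to the start of a tapped segment; both helper functions are hypothetical and not part of the API.

```python
# One strip from the example manifest, reduced to the fields used here.
STRIP = {
    "start": 0,
    "stop": 12000,
    "width": 6700,
    "segments": [
        {"start_offset": 0, "stop_offset": 100,
         "start_timestamp": 0, "stop_timestamp": 1234},
        {"start_offset": 101, "stop_offset": 305,
         "start_timestamp": 1235, "stop_timestamp": 2345},
    ],
}


def offset_for_time(strip, time_ms):
    """Pixel offset in the strip for a playback time, or None if not covered."""
    for seg in strip["segments"]:
        if seg["start_timestamp"] <= time_ms <= seg["stop_timestamp"]:
            span = seg["stop_timestamp"] - seg["start_timestamp"]
            fraction = (time_ms - seg["start_timestamp"]) / span if span else 0.0
            width = seg["stop_offset"] - seg["start_offset"]
            return seg["start_offset"] + round(fraction * width)
    return None


def time_for_offset(strip, offset_px):
    """Start time of the segment under a tapped pixel, or None."""
    for seg in strip["segments"]:
        if seg["start_offset"] <= offset_px <= seg["stop_offset"]:
            return seg["start_timestamp"]
    return None


print(offset_for_time(STRIP, 617), time_for_offset(STRIP, 200))
```

`offset_for_time` covers moving the strip while the video plays; `time_for_offset` covers seeking when the user taps a tile.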

Flexibility of the API (for other applications)

Parameter    | Required | Description                                                                          | Default
program_id   | yes      | The id of the program for which the filmstrip should be returned                     | -
start        | no       | The start time (in milliseconds) in the video for which the filmstrip is constructed | 0
max_duration | no       | The maximum duration (from {start}) of the video covered by the filmstrip            | -1 (complete clip)
max_width    | no       | The maximum width (in pixels) of one single filmstrip                                | 15000
size         | no       | The size of the filmstrip, either small, medium or large                             | small

/api/{version}/filmstrip/{program_id}/strip
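As an illustration of how a client might call the manifest endpoint, the sketch below builds a request URL from the documented parameters. The host name, the version string "v1" and the choice to send the optional parameters as a query string are assumptions; the deck only gives the path template and the parameter names.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://example.invalid"  # hypothetical host
OPTIONAL_PARAMS = {"start", "max_duration", "max_width", "size"}
SIZES = {"small", "medium", "large"}


def filmstrip_url(program_id, version="v1", **params):
    """Build a URL for /api/{version}/filmstrip/{program_id}."""
    unknown = set(params) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if "size" in params and params["size"] not in SIZES:
        raise ValueError("size must be small, medium or large")
    path = f"/api/{quote(version)}/filmstrip/{quote(program_id)}"
    # Sorting keeps the generated URL stable regardless of argument order.
    query = urlencode(sorted(params.items()))
    return BASE_URL + path + (f"?{query}" if query else "")


url = filmstrip_url(
    "b214753f-5e17-46ec-afeb-c1b3d9ee6565", start=12000, size="medium"
)
print(url)
```

Parameters that are left out are simply omitted from the URL, so the server-side defaults from the table apply.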

UI challenges

• Cross-platform & responsive (is a hassle)
• We had to ‘fold’ the frames to shorten the strip
• Interaction between the strip and the player

• Move strip while watching?
• Interact with strip while watching?

• Equally sized frames represent different shot durations

• Implementing ‘the wiggle’ was challenging, but required to stimulate swiping

• iOS only plays video in fullscreen mode

The filmstrip UI

• Runs in the (mobile) browser
• Responsive design (using Bootstrap)
• Heavily using jQuery + libraries

• jQuery UI
• jQuery Mobile (Events)
• Kinetic
• SmoothDivScroll
• Handlebars

• Thin backend that’s basically a proxy to the Media Distillery platform

Any questions?

Media Distillery, John M. Keynesplein 12-46, 1066 EP Amsterdam

info@mediadistillery.tv, +31 (0)6 50 983 893