INSTITUT MINES-TÉLÉCOM Comments on carriage of timed text (and graphics) Cyril Concolato, Jean Le Feuvre M25978 July 2012, Stockholm, Sweden

Carriage of timed subtitles and graphics in MP4




Timed text has a long history …

■ So many formats
  • See http://wiki.videolan.org/Subtitles

■ MPEG has already looked at it, some time ago…
  • Analysis of the streaming text requirements, MPEG, Shanghai, China, October 2002, M8931
  • Existing MPEG technologies:
    − Scene description streams (BIFS, LASeR)
    − MPEG-4 Part 17

■ Now new formats (TTML, WebVTT, …)
  • Each format so far requires MPEG to standardize a new mechanism for its carriage
  • What will we do for the next formats (now or in 10 years)?

■ MPEG should design future-proof technology for the carriage of timed text in ISOBMFF!


Not only of timed text but also of timed graphics

■ Proposed new use case
  • Frame-based synchronized graphics overlay on top of a video
    − Ex: Graphics and video derived from Kinect devices
    − Ex: Recordings of Augmented Reality applications
    − Ex: SVG-based cartoons à la Flash

■ Same requirements as “timed text”
  • Selecting a graphics track
  • Playing while keeping synchronization
  • Accessing randomly in the graphics stream
  • Enabling progressive download and streaming or adaptive streaming
  • Positioning the track on top of the video …

■ Demo


Example of mis-synchronization


We need more generic requirements (1/2)

■ The ISOBMFF should be able to carry timed data, in a generic manner, for which the exact type or format can be identified.
  • Ex: to carry timed TTML, SVG, HTML, WebVTT...

■ The ISOBMFF should be able to carry samples of timed data composed of a main sample data referencing several individual pieces of data (sample resources), each of them carried efficiently, without requiring modifications to the main sample data.
  • Ex: Efficient carriage of JPEG images used by the timed text or graphics document


We need more generic requirements (2/2)

■ The ISOBMFF should be able to store sample resources together with or separately from the main sample data, possibly using movie fragments.
  • Ex: Share a JPEG across samples

■ The ISOBMFF should enable the storage of timed data in a fragmented manner across samples, for progressive loading by the application consuming sample data.
  • Ex: if an XML progressive loader can be used, use it!


Technical elements towards a solution

■ Situation
  • MPEG already has almost all the tools for timed text and graphics
    − Metadata tracks (part 12)
    − Scene description tracks (part 14)
  • Reuse existing tools as much as possible
  • Adapt them if needed

■ 2 proposals
  • Generic tool: Usage of ‘meta’ in movie fragments
  • Specific adaptations to carry timed text and/or graphics
    − Option 1: Usage of timed metadata samples
    − Option 2: Usage of ‘meta’ box as samples


‘meta’ in movie fragments

■ The ‘meta’ box provides
  • Carriage of un-timed metadata
  • A useful mapping between a URL and the location of the metadata in the file (ItemInfoBox and ItemLocationBox)
  • A way to protect the metadata

■ Current situation
  • Fragmenting a movie with ‘meta’, what happens?
    − Media data not allowed in initialization segments!
  • Why is ‘meta’ not allowed in movie fragments?

■ Proposal
  • Allow at most one ‘meta’ box (and possibly one ‘meco’ box to be consistent with the rest of the specification)
  • At the ‘traf’ level (not at the ‘moof’ level)
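
The URL-to-location mapping mentioned on this slide can be illustrated with a minimal sketch. It assumes a deliberately simplified model: `ItemInfo` and `ItemExtent` are hypothetical stand-ins for entries of the ItemInfoBox and ItemLocationBox, with a single extent per item and offsets relative to one byte buffer. No real box parsing is performed.

```python
from dataclasses import dataclass


@dataclass
class ItemInfo:
    """Hypothetical, simplified stand-in for one ItemInfoBox entry."""
    item_id: int
    item_name: str      # name or URL by which the document refers to the resource
    content_type: str


@dataclass
class ItemExtent:
    """Hypothetical stand-in for one ItemLocationBox entry (single extent only)."""
    item_id: int
    offset: int
    length: int


def resolve_item(url, infos, extents, data):
    """Return (content_type, bytes) for the item named `url`, or None if unknown."""
    by_name = {info.item_name: info for info in infos}
    by_id = {ext.item_id: ext for ext in extents}
    info = by_name.get(url)
    if info is None or info.item_id not in by_id:
        return None
    ext = by_id[info.item_id]
    return info.content_type, data[ext.offset:ext.offset + ext.length]


# Illustrative data: a fake payload with one "JPEG" resource at offset 4.
data = b"....JPEGDATA....SVGDATA"
infos = [ItemInfo(1, "logo.jpg", "image/jpeg")]
extents = [ItemExtent(1, 4, 8)]
result = resolve_item("logo.jpg", infos, extents, data)
```

With a ‘meta’ box allowed at the ‘traf’ level, the same lookup could be scoped to the items of one movie fragment, which is what makes the resources usable in progressive download.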


Option 1: Usage of “timed metadata” samples to carry timed text and graphics

■ Track handler: ‘meta’
■ Sample entry:
  • XMLMetaDataSampleEntry if XML (TTML, SVG, …)
  • TextMetaDataSampleEntry if textual (WebVTT, HTML, …)
  • URIMetaSampleEntry if needed

■ Sample
  • Use of given MIME type or namespace to identify the content of the sample
  • Complete XML document or text chunks or binary content
  • Storage of secondary resources (e.g. JPEG …) as items in a ‘meta’ box:
    − At ‘traf’ level, if fragmented
    − At the ‘trak’ level, at the ‘moov’ level, at the file level

■ Flattening
  • Merge ‘meta’
  • Or store ‘traf’-level ‘meta’ boxes at the ‘trak’ level with ‘meco’ boxes
    − Use a new sampleGroup type to associate ‘meta’ to sample
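
The sample entry choice in Option 1 can be sketched as a small dispatch on the MIME type. This is only an illustration: the function name `choose_sample_entry` is hypothetical, and falling back to URIMetaSampleEntry for anything that is neither XML nor text is an assumption made here, since the slide only says "if needed".

```python
def choose_sample_entry(mime_type: str) -> str:
    """Pick a sample entry name for a timed text/graphics format (sketch)."""
    mime = mime_type.split(";")[0].strip().lower()  # drop parameters such as charset
    # XML-based formats first, so that text/xml is not treated as plain text.
    if mime.endswith("+xml") or mime in ("application/xml", "text/xml"):
        return "XMLMetaDataSampleEntry"
    if mime.startswith("text/"):
        return "TextMetaDataSampleEntry"
    # Assumption of this sketch: everything else is identified by a URI.
    return "URIMetaSampleEntry"


entries = {
    name: choose_sample_entry(mime)
    for name, mime in [
        ("TTML", "application/ttml+xml"),
        ("SVG", "image/svg+xml"),
        ("WebVTT", "text/vtt"),
        ("HTML", "text/html; charset=utf-8"),
    ]
}
```

The result groups TTML and SVG under the XML entry and WebVTT and HTML under the text entry, matching the examples listed above.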


Option 2: Usage of ‘meta’ box as samples to carry timed text and graphics

■ Track handler: ‘metb’
■ Sample entry:
  • MetaBoxSampleEntry (merge of Text- and XMLMetaDataSampleEntry)

■ Sample format
  • A ‘meta’ box with text, XML or graphics document stored as primary item
  • Storage of secondary resources (e.g. JPEG …) as items in a ‘meta’ box:
    − This one, at the sample level (if any)
    − At ‘traf’ level, if fragmented
    − At the ‘trak’ level, at the ‘moov’ level, at the file level

■ Flattening
  • As usual with audio, video, …
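
To make "a ‘meta’ box as a sample" concrete, here is a rough sketch of the box framing only. It relies on the general ISOBMFF layout (32-bit big-endian size, four-character type, and for a FullBox an 8-bit version plus 24-bit flags). For brevity it is not a conformant ‘meta’ box: the mandatory ‘hdlr’ box and the item boxes are omitted, and the document is placed in an ‘xml ’ box instead of being stored as a primary item. The helper names are hypothetical.

```python
import struct


def box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a plain box: 32-bit big-endian size, 4-character type, payload."""
    assert len(box_type) == 4
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def full_box(box_type: bytes, version: int, flags: int, payload: bytes) -> bytes:
    """Serialize a FullBox: box header followed by 8-bit version and 24-bit flags."""
    assert 0 <= version < 256 and 0 <= flags < (1 << 24)
    return box(box_type, struct.pack(">I", (version << 24) | flags) + payload)


def meta_sample(document: str) -> bytes:
    """Wrap a document in an 'xml ' box inside a 'meta' box (simplified, see lead-in)."""
    xml_box = full_box(b"xml ", 0, 0, document.encode("utf-8") + b"\x00")
    return full_box(b"meta", 0, 0, xml_box)


sample = meta_sample("<svg/>")
```

Because the sample is self-describing in this option, import, export and flattening can treat it like any other opaque audio or video sample.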


Important points

■ Sample mapping
  • Empty samples
    − Option 1: an « empty » sample is a zero-length string
    − Option 2: an « empty » sample is a ‘meta’ box with an empty primary document
  • Overlapping samples
    − Use sample start time as CTS
    − Use delta CTS as duration (as usual)
    − Let the sample define « real » presentation duration and overlap
    − No « artificial » duplicate content

■ Simple processing
  • Import/export
  • Fragmentation/Un-fragmentation
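
The sample mapping rules above can be sketched as follows, under stated assumptions: `Cue` and `Sample` are hypothetical types, times are integers in timescale units, cue start times are distinct, each cue ends after it starts, and an empty sample is a zero-length string as in Option 1. Each cue becomes exactly one sample, so overlapping cues are never duplicated; the payload is expected to state its own display interval.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: int   # presentation start, in timescale units
    end: int     # presentation end, may overlap the next cue
    text: str


@dataclass
class Sample:
    cts: int        # composition time = cue start time
    duration: int   # delta to the next sample, not the display duration
    payload: str    # the document; it carries its own "real" duration


def cues_to_samples(cues):
    """Map cues to samples without duplicating overlapping content."""
    ordered = sorted(cues, key=lambda c: c.start)
    if not ordered:
        return []
    track_end = max(c.end for c in ordered)
    samples = []
    if ordered[0].start > 0:
        # Leading gap: an "empty" sample is a zero-length string (Option 1).
        samples.append(Sample(0, ordered[0].start, ""))
    for i, cue in enumerate(ordered):
        next_start = ordered[i + 1].start if i + 1 < len(ordered) else track_end
        samples.append(Sample(cue.start, next_start - cue.start, cue.text))
    return samples


samples = cues_to_samples([Cue(2, 7, "Hello"), Cue(5, 9, "World")])
```

In this example "Hello" is displayed until time 7 although its sample duration is only 3; the overlap with "World" is resolved by the renderer from the payloads, not by splitting or repeating samples.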


Misc points: Current WD problems

■ « Time mapping »
  • What is needed?
  • What is the timeBase for TTML?
  • « adjacent time ranges » not defined in SMPTE?

■ Spatial registration
  • We agree with m25859: we want to be able to store a text/graphics track in a separate file from the video track
  • We agree to reuse 3GPP-style positioning

■ Unnecessary restrictions
  • We agree with m25859: do not add restrictions to the ISOBMFF on timescale, …

■ Be careful not to restrict to complete XML documents only


Summary

■ Proposed one new use case: timed graphics
■ Reformulated requirements (more generic)
■ Proposed clarifications for ‘meta’ box in movie fragments
■ Proposed 2 options based on existing tools to carry timed text or graphics
■ Comments about current WD

Questions?