Supervised by Prof. LYU, Rung Tsong Michael

Department of Computer Science & Engineering

The Chinese University of Hong Kong

Prepared by: Chan Pik Wah, Pat

Ngai Cheuk Han, Table

LYU0102

XML for Interoperable Digital Video Library

Outline

• Project Overview
• Extraction Techniques
  • Video Optical Character Recognition (VOCR)
  • Scene Change Detection
• Storage
  • XML
  • Knowledge Enrichment
• Implementation
• Tasks in next semester

Motivations

• Rapid increase in the usage of multimedia information
• New approach: DIGITAL VIDEO LIBRARY

Project Outline

Motivations

• Little attention has been paid to video information extraction and storage
• Scalability of the system in terms of adding new extraction components
• Lack of a generic framework for presentation and visualization of video information


Targets

• Provide an open architecture that can integrate different digital video library functions
• Increase the reusability of the information extracted from videos
• Deliver and present the video to multiple computing platforms


Ways to achieve

• Modal concept of the digital video library functions
• Collaboration among the video information processing modules
• Using XML for storage
  • Universal format
  • Flexible, scalable
  • Can be presented in different ways
  • Easy to search based on particular tags
• Generic framework for presentation and visualization of video information


Overview of our project


Achievements

• Implemented two of the video information extraction techniques
  • Video Optical Character Detection
  • Scene Change Detection
• Stored the extracted information as XML
• Built an XML editor in the tool for editing
• Performed knowledge enrichment based on the information extracted


Extraction Techniques

Text Detection

Camera Motion

Face Detection

Scene Changes

Word Relevance

Audio Level

Extraction Techniques

Video OCR for Digital News

• Helps to locate topics by extracting the words in the captions
• News captions provide vital search information about the video
• Video OCR extracts the keywords shown on the frames
• The results can be used together with the words extracted from the transcript for indexing
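As a rough illustration of the indexing idea, the sketch below merges words from video OCR with words from the transcript into one keyword index. The function name `build_index` and the `(time_seconds, word)` input format are assumptions made for this example; the slides do not describe the actual index structure.

```python
from collections import defaultdict


def build_index(ocr_words, transcript_words):
    """Map each lowercase keyword to the sorted times (seconds) where it occurs.

    ASSUMPTION: each input is a list of (time_seconds, word) pairs; this
    format is hypothetical and not taken from the project.
    """
    index = defaultdict(set)
    for time_seconds, word in list(ocr_words) + list(transcript_words):
        index[word.lower()].add(time_seconds)
    return {word: sorted(times) for word, times in index.items()}
```

A query for a keyword then returns every time offset at which it was either shown in a caption or spoken in the transcript.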


Video OCR for Digital News


Scene Change

• Detection Technique
  • Effective method for segmenting a video sequence into significant components
• Existing Methods
  • Image difference method
  • Histogram difference method
  • Histogram difference method using DC coefficient image
• Our Algorithm & Implementation
  • Histogram difference method with dynamic threshold


• Build the histogram and compare it with that of the previous scene
• Calculate the histogram difference
• If (total difference) > threshold => scene change
• Use the first frame as the key frame

Our Algorithm & Implementation
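The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: frames are assumed to be flat lists of gray levels (0 to 255), consecutive frames are compared, and the dynamic threshold is assumed to be the mean plus `k` standard deviations of recent differences with a fixed lower bound. The slides do not say how the threshold is actually adapted, so `k`, `floor_ratio` and `window` are hypothetical parameters.

```python
from statistics import mean, pstdev


def histogram(frame, bins=16, max_value=256):
    """Count pixels per intensity bin; `frame` is a flat list of gray levels."""
    counts = [0] * bins
    width = max_value // bins
    for pixel in frame:
        counts[min(pixel // width, bins - 1)] += 1
    return counts


def histogram_difference(h1, h2):
    """Total absolute bin-by-bin difference between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_scene_changes(frames, k=3.0, floor_ratio=0.5, window=10):
    """Return indices of key frames (the first frame of each detected scene)."""
    if not frames:
        return []
    key_frames = [0]
    recent = []  # differences observed so far inside the current scene
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        diff = histogram_difference(previous, current)
        floor = floor_ratio * len(frames[index])
        # ASSUMPTION: dynamic threshold = mean + k * std of recent differences,
        # never lower than a fixed fraction of the pixel count.
        if recent:
            threshold = max(floor, mean(recent) + k * pstdev(recent))
        else:
            threshold = floor
        if diff > threshold:
            key_frames.append(index)  # first frame of the new scene
            recent = []
        else:
            recent = (recent + [diff])[-window:]
        previous = current
    return key_frames
```

Resetting `recent` after each detected change keeps one large jump from inflating the threshold for the following scene.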


XML

• Extensible Markup Language
• W3C
• Lets you create your own mark-up language for describing the contents

Storage

Advantages of using XML

• Platform and system independent
• Create your own tags
• Adopts Unicode
• Universal format
• Scalable


XML schema


XML Parser

• A parser is an interface between an XML document and the application program
• Document Object Model (DOM)
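To make the parser's role concrete, here is a small sketch that reads a document through a DOM interface using Python's standard `xml.dom.minidom`. The project itself was built with Visual C++, so this only illustrates the DOM idea; the sample document and its element names (`video`, `scene`, `text`) are invented for the example and are not the project's schema.

```python
from xml.dom.minidom import parseString

# Hypothetical sample; element and attribute names are illustrative only.
SAMPLE = """<video title="Evening News">
  <scene id="1" start="0"><text>Hong Kong</text></scene>
  <scene id="2" start="125"><text>Tokyo</text></scene>
</video>"""


def scene_texts(xml_text):
    """Return (scene id, caption text) pairs read through the DOM."""
    document = parseString(xml_text)
    pairs = []
    for scene in document.getElementsByTagName("scene"):
        text_nodes = scene.getElementsByTagName("text")
        caption = ""
        if text_nodes and text_nodes[0].firstChild is not None:
            caption = text_nodes[0].firstChild.data
        pairs.append((scene.getAttribute("id"), caption))
    return pairs
```

The application never touches the raw text of the file; it only asks the parser for elements, attributes and text nodes.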


How to present XML

• The tree model closely resembles the XML schema
• The XML is represented as nodes that show element/attribute names or the text content, and their relative places within the XML
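One possible way to picture this tree model is to walk the parsed XML and emit one indented line per node. The sketch below does so with Python's standard `xml.etree.ElementTree`; the helper name `tree_lines` and the output layout are assumptions for illustration, not the tool's actual tree view.

```python
import xml.etree.ElementTree as ET


def tree_lines(element, depth=0):
    """Return indented lines showing element names, attributes and text."""
    indent = "  " * depth
    attrs = "".join(f' {key}="{value}"' for key, value in element.attrib.items())
    lines = [f"{indent}{element.tag}{attrs}"]
    text = (element.text or "").strip()
    if text:
        # Text content is shown as a child of the element that holds it.
        lines.append(f'{indent}  "{text}"')
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines
```

Each line's indentation reflects the node's depth, so the relative place of every element within the XML is visible at a glance.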


Content creation in digital video library

• Combining different video information extraction techniques, mainly through
  • Knowledge Cross-referencing
  • Knowledge Enrichment
• Access to video by content
• Communicate information trends across time and space
• Provide fast and effective searching


Knowledge Enrichment

• Geographic information
  • Extract geographic names of countries and cities from text recognized by video OCR or speech recognition
  • Knowledge from a geographic naming database enriches the information
  • Allows querying or browsing for events at a particular location or within some “distance” of that location
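A query for events within some distance of a location could look roughly like the sketch below. It uses the haversine great-circle formula, which is one common choice; the slides do not say which distance measure the project uses. The event format (dictionaries with `lat` and `lon` keys) is also an assumption made for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees (haversine)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def events_near(events, lat, lon, radius_km):
    """Return the events located within `radius_km` of (lat, lon).

    ASSUMPTION: each event is a dict with "lat" and "lon" keys in degrees.
    """
    return [
        event
        for event in events
        if distance_km(lat, lon, event["lat"], event["lon"]) <= radius_km
    ]
```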


Our Implementation

• Use a known set of places along with their spatial coordinates and some additional information for knowledge enrichment
• Use the XML file as the source material to be processed
• Try to extract names of major cities by processing the text in the source
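The city extraction step might be approximated as below: gather all text content from the source XML and look for whole-word matches of known city names. This is only a sketch under assumptions; the function name `cities_in_xml`, the case-insensitive matching and the sample element names in the usage are not taken from the project.

```python
import re
import xml.etree.ElementTree as ET


def cities_in_xml(xml_text, city_names):
    """Return the known city names mentioned anywhere in the XML text content.

    Matching is case-insensitive and on whole words; results follow the
    order of `city_names`.
    """
    root = ET.fromstring(xml_text)
    text = " ".join(root.itertext()).lower()
    found = []
    for name in city_names:
        if re.search(r"\b" + re.escape(name.lower()) + r"\b", text):
            found.append(name)
    return found
```

Simple string matching like this cannot tell a city from another word with the same spelling, which is one reason to restrict the list to major cities.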


Geographic naming database

• An XML file with the following format is used
• For each city:
  • City ID
  • Name of city
  • Name of country
  • Longitude
  • Latitude
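The actual file layout appears only as a figure on the original slide, so the sketch below shows one plausible encoding of the five fields and how it could be loaded. The tag names (`cities`, `city`, `name`, `country`, `longitude`, `latitude`) are hypothetical, and the coordinates are approximate values included only as sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; tag names are illustrative, coordinates approximate.
GAZETTEER_XML = """<cities>
  <city id="1">
    <name>Hong Kong</name>
    <country>China</country>
    <longitude>114.17</longitude>
    <latitude>22.32</latitude>
  </city>
  <city id="2">
    <name>Tokyo</name>
    <country>Japan</country>
    <longitude>139.69</longitude>
    <latitude>35.69</latitude>
  </city>
</cities>"""


def load_gazetteer(xml_text):
    """Return a dict mapping city name to its ID, country and coordinates."""
    cities = {}
    for city in ET.fromstring(xml_text).findall("city"):
        cities[city.findtext("name")] = {
            "id": city.get("id"),
            "country": city.findtext("country"),
            "longitude": float(city.findtext("longitude")),
            "latitude": float(city.findtext("latitude")),
        }
    return cities
```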


The updated XML file


Knowledge enrichment component in our tool

• Extracts and lists all the cities mentioned in the video
• Allows the user to select any of them to look up further information about that city


Program Platform

• Microsoft Visual C++®
  • Object-Oriented
  • Faster MFC applications
  • Composite Controls
  • ActiveX
• Microsoft® DirectShow®
  • Component Object Model (COM)
  • High-quality capture and playback of multimedia streams

Implementation

Video Player


Control

• Filter graph manager
• Dialog box created with the class CFormView


Scene Change & VOCD

• CScrollView

• CMenu

• Add the extracted information to XML
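Adding extracted information to the XML could be sketched as follows, again in Python rather than the MFC code the tool actually uses. The element names (`video`, `scene`, `text`) and the `start` attribute are assumptions for illustration; the project's real schema is not given in the slides.

```python
import xml.etree.ElementTree as ET


def add_scene(root, start_seconds, captions):
    """Append a <scene> element with one <text> child per recognized caption.

    ASSUMPTION: element and attribute names are hypothetical.
    """
    scene = ET.SubElement(root, "scene", start=str(start_seconds))
    for caption in captions:
        ET.SubElement(scene, "text").text = caption
    return scene
```

For example, calling `add_scene(ET.Element("video"), 0, ["Hong Kong"])` attaches a scene that starts at 0 seconds and carries one caption; the tree can then be serialized with `ET.tostring`.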


XML Editor

• TreeView
• XML read by parser
• Each tag in the XML corresponds to a node in the tree


Knowledge Enrichment

• Dialog box created as class CFormView
• Read the database
• Compare with the XML generated


Problems & Solutions


Problems & Solutions

• Multi-modal tool: integrate all the components
• Docking window is used
• Flexible and efficient for adding new modals
• CSizingControlBar


Problems & Solutions


Demo

Tasks in Next Semester

• Focus on using XML to do multimedia presentation
• Style Sheet (XSLT) is not suitable for multimedia document generation
• New format for multimedia presentation
  • SMIL

Future

Tasks in Next Semester

• Time-based multimedia content
• Capable of synchronizing the playback of all multimedia elements
• Transform the generated XML into SMIL format for presentation
• Design a style sheet suitable for multimedia document generation
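As a very rough idea of what the planned transformation might produce, the sketch below turns a list of detected scenes into a minimal SMIL document that plays each clip in sequence. It uses the standard SMIL elements `smil`, `body`, `seq` and `video` with the SMIL 2.0 `clipBegin` and `clipEnd` attributes, but omits the namespace and layout a real presentation would need. The function name and the scene tuple format are assumptions, since this work was still planned when the slides were written.

```python
import xml.etree.ElementTree as ET


def scenes_to_smil(video_src, scenes):
    """Build a minimal SMIL string that plays each scene one after another.

    ASSUMPTION: `scenes` is a list of (begin_seconds, end_seconds, caption).
    """
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")  # children of <seq> play sequentially
    for begin, end, caption in scenes:
        ET.SubElement(
            seq,
            "video",
            {
                "src": video_src,
                "clipBegin": f"{begin}s",
                "clipEnd": f"{end}s",
                "title": caption,
            },
        )
    return ET.tostring(smil, encoding="unicode")
```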


Q & A
