Multimedia Semantic Web and MPEG-7 Ana B. Benitez ana @ ee.columbia.edu Image and Advanced Television Lab (ADVENT) Department of Electrical Engineering

Multimedia Semantic Web and MPEG-7

Ana B. Benitez
ana @ ee.columbia.edu

Image and Advanced Television Lab (ADVENT)
Department of Electrical Engineering

Columbia University

Semantic Web and Multimedia

Motivation:
  Rapid explosion of available multimedia on the Web
  Extracting semantics from multimedia is even harder than from text!

Multimedia Semantic Web:
  Describe multimedia documents
  Enable interoperable, scalable and intelligent applications
  Deal with multimedia and interact with users at human levels

MPEG-7 – Multimedia Content Description Interface:
  Describe different aspects of multimedia documents at different abstraction levels
  Enable multimedia applications in general

MPEG-7

Suite of tools for describing:
  Structure and semantics ~ XML and RDF for generic data
  Audio and visual features – color, texture, melody and timbre
  Content management information – creator, media format and rights
  Free and structured annotations – who, when and where
  Text and graph classification schemes ~ thesauri and ontologies
  Summaries and variations – key frame hierarchy and abstracts
  Multimedia collections and models – statistical models and classifiers

XML-Schema as description definition language

ISO standard in September 2001!

Still region SR1: Creation information, Text annotation

Still region SR2: Text annotation, Color structure

Still region SR3: Text annotation, Matching hint, Color structure

Spatial segment decomposition: No overlap, gap

Directional spatial segment relation: left

Content Structure

Agent object AO1: Label Person


Event EV1: Label Semantic time Semantic place

Concept C1: Label Property Property

Comradeship

Shake hands

Alex Ana

Object-event relation: hasAccompanierOf

Concept-semantic base relation: hasPropertyOf

Content Semantics

Object-event relation: hasAgentOf

MPEG-7 Description

Segment-semantic base relation: hasMediaPerceptionOf

Segment-semantic base relation: hasMediaSymbolOf

Photographer: Seungyup Place: Columbia University

Time: 19 September 1998

704x480 pixels True color RGB http://www.alex&ana.jpg

Columbia University, All rights reserved

Creation information: Creation, Creator, Creation coordinates, Creation location, Creation date

Media information: Media profile Media format Media instance

Usage information: Rights

Content Management




Abstraction Levels

Formal abstraction (AbstractionLevel dimension=1): Comradeship; Shake hands; Alex, Woman; Concept-semantic base relation: hasPropertyOf

Media abstraction: Comradeship; Shake hands; Alex, Ana; Concept-semantic base relation: hasPropertyOf

Structure and Semantic Tools

Framework comparable to ER Modeling and Semantic Networks:
  Entities, attributes and relationships

Structure descriptions:
  Segment entities: video, audio, multimedia segments, …
  Attributes: AV descriptors, spatio-temporal localization, …
  Relationships: temporal, spatial, …

Semantic descriptions:
  Semantic entities: objects, events, concepts, states, places, times
  Attributes: label, definition, abstraction level, media occurrences
  Relationships: object-event, concept-semantic entity, …
  Multiple abstraction levels: media and formal
  Encoding of rules based on alphabet and rule graphs in the works
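The entity/attribute/relationship framework above can be pictured as a small labeled graph. The following is only a sketch in Python, not MPEG-7 syntax: the names `Entity`, `Relation` and `related` are hypothetical, and the data are taken from the handshake example used in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A segment or semantic entity with free-form attributes."""
    id: str
    kind: str                                   # e.g. "Event", "AgentObject", "Concept"
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relation:
    """A named, directed relationship between two entity ids."""
    name: str
    source: str
    target: str


# Entities and relations of the "Alex shakes hands with Ana" example.
entities = {
    "EV1": Entity("EV1", "Event", {"label": "Shake hands"}),
    "AO1": Entity("AO1", "AgentObject", {"label": "Alex"}),
    "AO2": Entity("AO2", "AgentObject", {"label": "Ana"}),
    "C1": Entity("C1", "Concept", {"label": "Comradeship"}),
}

relations = [
    Relation("hasAgentOf", "EV1", "AO1"),
    Relation("hasAccompanierOf", "EV1", "AO2"),
    Relation("hasPropertyOf", "EV1", "C1"),
]


def related(source_id, name):
    """Return the labels of entities reached from source_id via the named relation."""
    return [entities[r.target].attributes["label"]
            for r in relations
            if r.source == source_id and r.name == name]


print(related("EV1", "hasAgentOf"))
```

Keeping relations as separate objects, rather than as fields of the entities, mirrors the way the descriptions in this talk attach relations to entities by id.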

RDF/DAML+OIL vs MPEG-7

Pros RDF/DAML+OIL:
  Cardinality of properties other than zero or one
  Arbitrary union and intersection combinations of classes

Pros MPEG-7:
  Richer and more flexible graphical structure:
    Total or partial graph morphisms to relate graphs
    Discover and reuse sub-descriptions but not as nodes, e.g., modern adaptation of Hamlet (poison → heart medicine; drown → step in front of car)
    Closer to how humans construct descriptions
  More types of entities (segments, objects, events, …), attributes (AV features, definition, abstraction, …), relations (composition, dependency, spatio-temporal, …), and abstraction levels (media, formal)
  Mechanism to encode rules to enable inference in progress
  Non-structure/semantic information (management, summaries, …)

Some Questions …

What is the Semantic Web doing about multimedia?

What is the position of the Semantic Web on MPEG-7?

Some Pointers

To know more about MPEG-7:
  MPEG: http://www.cselt.it/mpeg
  MPEG-7 Industry Forum: http://www.mpeg7.org

To know more about our work on MPEG-7:
  MPEG-7 Project at Columbia University: http://www.ctr.columbia.edu/~ana/MPEG7
  ADVENT: http://www.ee.columbia.edu/advent, Research

The End

Thanks for your attention!

Content Management Description

<CreationInformation>
  <Creation>
    <Creator>
      <Role><Name>Photographer</Name></Role>
      <Person>
        <Name>
          <GivenName>Seungyup</GivenName>
        </Name>
      </Person>
    </Creator>
    <CreationCoordinates>
      <CreationLocation>
        <Name xml:lang="en">Columbia University</Name>
      </CreationLocation>
      <CreationDate>
        <TimePoint>1998-09-19</TimePoint>
      </CreationDate>
    </CreationCoordinates>
  </Creation>
</CreationInformation>
<MediaInformation>
  <MediaProfile master="true">
    <MediaFormat>
      <Content>image</Content>
      <VisualCoding>
        <Format colorDomain="color" href="urn:mpeg:VisualCodingFormatCS:1">JPG</Format>
        <Frame height="480" width="704"/>
      </VisualCoding>
    </MediaFormat>
    <MediaInstance id="mastercopy">
      <MediaLocator>
        <MediaUri> http://www.alex&amp;ana.jpg </MediaUri>
      </MediaLocator>
    </MediaInstance>
  </MediaProfile>
</MediaInformation>
<UsageInformation>
  <Rights>
    <RightsId organization="Columbia University"> columbia:1919:alex&amp;ana_image </RightsId>
  </Rights>
</UsageInformation>
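As an illustration of how an application might read such a description, here is a minimal Python sketch that pulls the creator, place and date out of the creation information. It uses an abridged copy of the fragment, without a namespace as on the slide (a complete MPEG-7 document would normally declare one), and the dictionary keys in `summary` are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Abridged, well-formed copy of the creation information from the slide.
DESCRIPTION = """
<CreationInformation>
  <Creation>
    <Creator>
      <Role><Name>Photographer</Name></Role>
      <Person><Name><GivenName>Seungyup</GivenName></Name></Person>
    </Creator>
    <CreationCoordinates>
      <CreationLocation><Name xml:lang="en">Columbia University</Name></CreationLocation>
      <CreationDate><TimePoint>1998-09-19</TimePoint></CreationDate>
    </CreationCoordinates>
  </Creation>
</CreationInformation>
"""

root = ET.fromstring(DESCRIPTION)

# Each value is looked up by its path below the <CreationInformation> root.
summary = {
    "role": root.findtext("Creation/Creator/Role/Name"),
    "creator": root.findtext("Creation/Creator/Person/Name/GivenName"),
    "place": root.findtext("Creation/CreationCoordinates/CreationLocation/Name"),
    "date": root.findtext("Creation/CreationCoordinates/CreationDate/TimePoint"),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```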


Content Structure Description

<StillRegion id="SR1">
  <TextAnnotation>
    <FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation>
  </TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation>
        <FreeTextAnnotation> Alex </FreeTextAnnotation>
      </TextAnnotation>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation>
        <FreeTextAnnotation> Ana </FreeTextAnnotation>
      </TextAnnotation>
      <MatchingHint>
        <Hint value="0.455" xpath="../../VisualDescriptor"/>
      </MatchingHint>
      <Relation xsi:type="DirectionalSpatialSegmentRelationType" name="left" target="#SR2"/>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>
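The nesting of still regions inside a spatial decomposition forms a tree that can be traversed recursively. The sketch below, in Python, walks an abridged copy of the description (visual descriptors, matching hint and relation left out) and lists each region with its free-text annotation; the function name `walk` is a hypothetical helper, not part of MPEG-7.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the structure description: regions and annotations only.
STRUCTURE = """
<StillRegion id="SR1">
  <TextAnnotation><FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation></TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation><FreeTextAnnotation> Alex </FreeTextAnnotation></TextAnnotation>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation><FreeTextAnnotation> Ana </FreeTextAnnotation></TextAnnotation>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>
"""


def walk(region, depth=0):
    """Yield (depth, id, annotation) for a region and all of its sub-regions."""
    text = region.findtext("TextAnnotation/FreeTextAnnotation", default="")
    yield depth, region.get("id"), text.strip()
    for child in region.findall("SpatialDecomposition/StillRegion"):
        yield from walk(child, depth + 1)


regions = list(walk(ET.fromstring(STRUCTURE)))
for depth, region_id, text in regions:
    print("  " * depth + f"{region_id}: {text}")
```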


Locator Description

<SpatioTemporalLocator>
  <FigureTrajectory type="1">
    <MediaTime>
      <MediaTimePoint>T00:00:15</MediaTimePoint>
      <MediaDuration>PT1M15S</MediaDuration>
    </MediaTime>
  </FigureTrajectory>
  <Vertices>
    <Coordinates> 4.34 1.43 4.33 </Coordinates>
    <Coordinates> 10.3 5.03 .33 </Coordinates>
    <Coordinates> 5.34 .43 2.37 </Coordinates>
  </Vertices>
  <Vertices>
    <Coordinates> 4.34 1.43 4.33 </Coordinates>
    <Coordinates> 10.3 5.03 .33 </Coordinates>
    <Coordinates> 5.34 .43 2.37 </Coordinates>
  </Vertices>
  <Vertices>
    <Coordinates> 4.34 1.43 4.33 </Coordinates>
    <Coordinates> 10.3 5.03 .33 </Coordinates>
    <Coordinates> 5.34 .43 2.37 </Coordinates>
  </Vertices>
  …
</SpatioTemporalLocator>

Content Semantics Description

<Semantic>
  <Label><Name>Alex shakes hands with Ana </Name></Label>
  <SemanticBase xsi:type="EventType" id="EV1">
    <Label><Name>Shake hands</Name></Label>
    <Relation xsi:type="ObjectEventRelationType" name="hasAgentOf" target="#AO1"/>
    <Relation xsi:type="ObjectEventRelationType" name="hasAccompanierOf" target="#AO2"/>
    <Relation xsi:type="ConceptSemanticBaseRelationType" name="hasPropertyOf" target="#C1"/>
    <SemanticPlace>
      <Label><Name>Columbia University</Name></Label>
    </SemanticPlace>
    <SemanticTime>
      <Label><Name>September 19, 1998</Name></Label>
    </SemanticTime>
  </SemanticBase>
  <SemanticBase xsi:type="AgentObjectType" id="AO1">
    <Label><Name>Alex</Name></Label>
    <Agent xsi:type="PersonType">
      <Name><GivenName>Alex</GivenName></Name>
    </Agent>
  </SemanticBase>
  <SemanticBase xsi:type="AgentObjectType" id="AO2">
    <Label><Name>Ana</Name></Label>
    <Agent xsi:type="PersonType">
      <Name><GivenName>Ana</GivenName></Name>
    </Agent>
  </SemanticBase>
  <SemanticBase xsi:type="ConceptType" id="C1">
    <Label><Name>Comradeship</Name></Label>
    <Property>Associate</Property>
    <Property>Friend</Property>
  </SemanticBase>
</Semantic>
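One way to relate this description to RDF-style statements is to flatten its relations into (subject, relation, object) triples. The Python sketch below does this for an abridged copy of the semantic description. It is an illustration only, not a defined MPEG-7 to RDF mapping; the `xsi` namespace declaration is added so the fragment parses on its own, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Abridged copy of the semantic description; the xsi declaration is added here
# so that the fragment is well-formed when parsed by itself.
SEMANTICS = f"""
<Semantic xmlns:xsi="{XSI}">
  <SemanticBase xsi:type="EventType" id="EV1">
    <Label><Name>Shake hands</Name></Label>
    <Relation xsi:type="ObjectEventRelationType" name="hasAgentOf" target="#AO1"/>
    <Relation xsi:type="ObjectEventRelationType" name="hasAccompanierOf" target="#AO2"/>
    <Relation xsi:type="ConceptSemanticBaseRelationType" name="hasPropertyOf" target="#C1"/>
  </SemanticBase>
  <SemanticBase xsi:type="AgentObjectType" id="AO1"><Label><Name>Alex</Name></Label></SemanticBase>
  <SemanticBase xsi:type="AgentObjectType" id="AO2"><Label><Name>Ana</Name></Label></SemanticBase>
  <SemanticBase xsi:type="ConceptType" id="C1"><Label><Name>Comradeship</Name></Label></SemanticBase>
</Semantic>
"""

root = ET.fromstring(SEMANTICS)

# Map each entity id to its human-readable label.
labels = {base.get("id"): base.findtext("Label/Name")
          for base in root.findall("SemanticBase")}

# Resolve every relation's "#id" target to a label and emit one triple per relation.
triples = [
    (labels[base.get("id")], rel.get("name"), labels[rel.get("target").lstrip("#")])
    for base in root.findall("SemanticBase")
    for rel in base.findall("Relation")
]

for triple in triples:
    print(triple)
```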





