Multimedia Grand Challenge 2012 Mei-Chen Yeh 04/24/2012


Page 1: Multimedia Grand Challenge 2012

Multimedia Grand Challenge 2012

Mei-Chen Yeh, 04/24/2012

Page 2: Multimedia Grand Challenge 2012

Midterm Report

• Submission due date: May 8
  – Report
  – Short presentation (10 minutes)

• Maximum 4 pages, double column
  – Word template
  – LaTeX

• Come up with a solution to one of the grand challenges
  – http://www.acmmm12.org/call-for-multimedia-grand-challenge-solutions/

Page 3: Multimedia Grand Challenge 2012
Page 4: Multimedia Grand Challenge 2012

Why should I care about this?

• I want to pass this course.
• Look for ideas for your final project / thesis.
• Writing a report and doing a project always take time. Why not turn the report/project into something beneficial?

Page 5: Multimedia Grand Challenge 2012

Here comes the opportunity!

• 6 problems that Google, HP, NHK and other companies see in the future of multimedia

• Cash award
  – 3 prizes last year
  – For every finalist team this year

Page 6: Multimedia Grand Challenge 2012

[Figure: two sample resumes (Education: master, NTNU; Education: master, NXU), each with an Experiences section. One of them also lists a Publication: xxx, “A new approach for automatic music video generation”, ACM Multimedia Grand Challenges, 2012, and is marked as the winner.]

Make your resume stand out!

Page 7: Multimedia Grand Challenge 2012

Great experience and great location!

Scottsdale, 2011

Beijing, 2009

Nara, 2012

Florence, 2010

Page 8: Multimedia Grand Challenge 2012

2012 Challenges

• Google: Automatic Music Video Generation
• 3DLife / Huawei: Realistic Interaction in Online Virtual Environments
• HP: Understanding the Emotional Impact of Images and Videos
• NHK: “Where is beauty?” Video Segment Extraction Based on Aesthetic Quality Assessment
• NTT Docomo: Event Understanding through Social Media and its Text-Visual Summarization
• Technicolor: Audiovisual Recognition of Specific Events

Page 9: Multimedia Grand Challenge 2012

Google Challenge: Automatic Music Video Generation

Page 10: Multimedia Grand Challenge 2012

Google Challenge

• Music Video = Visual + Audio
• A befitting soundtrack makes a video compelling, and likewise Lady Gaga’s music videos greatly enhance her songs.

• Automatic Music Video Generation
  – How to auto-suggest a cool soundtrack for a user-generated video?
  – How to auto-generate interesting music videos?

Page 11: Multimedia Grand Challenge 2012

Use Case 1

• You have shot a few family videos on your smartphone, but you don’t want to upload them to YouTube because they look boring.

• What if you could find a matching soundtrack? Wouldn’t it improve the appeal of the video and make you want to upload it?

• Goal: make a video much more attractive for sharing by adding a matching soundtrack to it.

• Bonus point: the application runs on Android or iPhone.
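
Once a track has been chosen for Use Case 1, it still has to be attached to the clip. The following is a minimal sketch of that last step only, assuming the ffmpeg command-line tool is installed; the function names and file paths are hypothetical, and choosing the matching track (the actual research problem) is not addressed here.

```python
import subprocess


def build_mux_command(video_path, track_path, output_path):
    """Build an ffmpeg command that replaces a clip's audio with a music track."""
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the user-generated video
        "-i", track_path,   # input 1: the chosen soundtrack
        "-map", "0:v:0",    # keep the first video stream of input 0
        "-map", "1:a:0",    # take the first audio stream of input 1
        "-c:v", "copy",     # copy the video stream without re-encoding
        "-shortest",        # stop at the end of the shorter input
        output_path,
    ]


def add_soundtrack(video_path, track_path, output_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    command = build_mux_command(video_path, track_path, output_path)
    subprocess.run(command, check=True)
```

Calling add_soundtrack("family.mp4", "track.mp3", "family_with_music.mp4") would write a new file and leave the original clip untouched (the file names are placeholders).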

Page 12: Multimedia Grand Challenge 2012

Use Case 2

• Suppose you are hosting a home party. You have a playlist of party music, but you don’t have any matching music videos to show on your 50-inch TV.

• Goal: automatically generate entertaining music videos that match the songs.

• Bonus point: personalize the music videos to the people who are viewing them.

You may focus on either of the two use cases.

Page 13: Multimedia Grand Challenge 2012

Evaluation

• Novelty of the music video generation system
• Entertainment value of the produced music videos

Page 14: Multimedia Grand Challenge 2012

http://www.mtv.com/

Page 15: Multimedia Grand Challenge 2012

http://www.mtv.com.tw/

Page 16: Multimedia Grand Challenge 2012

HP Challenge: Understanding the Emotional Impact of Images and Videos

Page 17: Multimedia Grand Challenge 2012

HP Challenge

• Images and videos can serve as a powerful communications vehicle, conveying a wealth of information as well as emotional impact.

Page 18: Multimedia Grand Challenge 2012

HP Challenge

• Images and videos are used extensively by professionals on web sites, magazine covers and printed advertisements to draw attention, communicate a message and leave a lasting emotional impression.

Page 19: Multimedia Grand Challenge 2012
Page 20: Multimedia Grand Challenge 2012

HP Challenge

• Understanding the Emotional Impact of Images and Videos: 6 research problems
1. How do we characterize the response categories and levels of emotional impact?
2. What attributes of images and videos are associated with their emotional impact?
  • The color, composition, content, lighting, sharpness, and movement of an image or a video, …
3. What affective models can be used to predict the emotional impact of images?

Page 21: Multimedia Grand Challenge 2012

HP Challenge

4. How can we use the affective models to rank images and videos?

5. Can we use image and video transformations to change the emotional impact?

6. What are the applications of affective models?
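
As a starting point for problems 2 and 4, the sketch below computes two simple color attributes and ranks images with a scoring function supplied by the caller. It is only an illustration under stated assumptions: the function names, the choice of attributes, and the pixel format are hypothetical, and the scoring function stands in for an affective model that the challenge leaves open.

```python
import colorsys


def color_features(pixels):
    """Mean saturation and brightness of a non-empty list of (r, g, b)
    tuples with 0-255 channels."""
    total_saturation = 0.0
    total_brightness = 0.0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_saturation += s
        total_brightness += v
    count = len(pixels)
    return {
        "saturation": total_saturation / count,
        "brightness": total_brightness / count,
    }


def rank_images(images, score):
    """Rank image names, highest score first.

    images: dict mapping a name to its pixel list.
    score:  callable mapping a feature dict to a number (placeholder
            for whatever affective model is being studied).
    """
    features = {name: color_features(px) for name, px in images.items()}
    return sorted(features, key=lambda name: score(features[name]), reverse=True)
```

For example, rank_images(images, lambda f: f["saturation"]) orders images from most to least saturated; a real model would combine many more attributes (composition, content, sharpness, movement).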

Page 22: Multimedia Grand Challenge 2012

HP Challenge

• Evaluation
  – How well the deep understanding of emotional impact is used to create novel and compelling applications on the web, on mobile devices, and on social networks.

Page 23: Multimedia Grand Challenge 2012

NHK Challenge: “Where is beauty?” Video Segment Extraction Based on Aesthetic Quality Assessment

Aesthetic: concerned with beauty or its appreciation.

Page 24: Multimedia Grand Challenge 2012

NHK Challenge

• Goal:
  – “Where is beauty?”: automatic recognition of beautiful scenes in broadcast programs
• Two key questions:
  – How is beauty defined?
  – How to approach beauty?

• Dataset provided!

Page 25: Multimedia Grand Challenge 2012

NHK Challenge

• Input
  – Broadcast video program “Japan’s Scenic Beauty” (25 min x 10 programs)
    • Video format: MPEG-1 (704 x 480 pixels)
    • Audio: MPEG Audio 44.1 kHz stereo 224 kbps (English)
  – Shot boundary data (XML file)
• Output
  – List of extracted beautiful scenes that were ranked in the top 10%
    • The scenes should be described by the shot number that we provide, or by frame number and duration
  – Recommended video:
    • 1 to 2-minute short video composed of the extracted beautiful scenes
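
The output requirements can be sketched as a selection step that runs after every shot has been scored. This is a minimal illustration, not NHK's procedure: the Shot fields, the score itself, and the frame rate are all assumptions (the slides do not state a frame rate, so it is a parameter).

```python
from dataclasses import dataclass


@dataclass
class Shot:
    number: int       # shot number from the provided shot boundary data
    start_frame: int
    num_frames: int
    score: float      # hypothetical aesthetic score, higher is more beautiful


def top_percent(shots, percent=10):
    """Return the highest-scoring shots, best first (always at least one)."""
    keep = max(1, len(shots) * percent // 100)
    return sorted(shots, key=lambda s: s.score, reverse=True)[:keep]


def pick_for_short_video(ranked, fps, max_seconds=120.0):
    """Greedily take shots in ranked order while they fit the time budget,
    then return them in broadcast order with the total duration."""
    chosen = []
    total = 0.0
    for shot in ranked:
        seconds = shot.num_frames / fps
        if total + seconds <= max_seconds:
            chosen.append(shot)
            total += seconds
    return sorted(chosen, key=lambda s: s.start_frame), total
```

The greedy fill is one simple way to stay within the 2-minute limit; it does not guarantee the 1-minute minimum, which has to be checked separately.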

Page 26: Multimedia Grand Challenge 2012

NHK Challenge

• Evaluation
  – Originality and adequacy of the proposed algorithm
  – Reliability and variety of the submitted beautiful scenes
  – Quality of the submitted short video (if submitted)

Page 27: Multimedia Grand Challenge 2012

NTT Docomo Challenge: Event Understanding through Social Media and its Text-Visual Summarization

Page 28: Multimedia Grand Challenge 2012

NTT Docomo Challenge

• Goal:
  – Data mining on social media to retrieve, summarize, and visualize events for a selected topic

• Example
  – Topic: “local events for New York City”
  – Summarize Twitter/Flickr data into a magazine such as “New York of the Day.”

Page 29: Multimedia Grand Challenge 2012

NTT Docomo Challenge

• Input
  Researchers working on this challenge should collect the necessary data from Twitter or Flickr. At least three types of data are required:
  – Images: Twitter or Flickr, or both
  – Text: tweets from Twitter
  – Third-party content: news websites such as the New York Times, blogs, and others

• Output
  The output could take the form of a magazine in which each article represents an event and is associated with related images, texts, or both. These images and texts should be self-explanatory with respect to the article. The magazine could be summarized on a daily basis, an hourly basis, or even more frequently.

Page 30: Multimedia Grand Challenge 2012

NTT Docomo Challenge

• Research problems
  – Extract the local events from the Twitter data
  – Assign location information to the images
  – Create a text summary of each local event from tweets and other third-party content
  – Assign the most relevant images to each local event
  – Lay out the articles and design the magazine
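
A naive baseline for the first research problem is to group tweets by day and hashtag and treat the well-populated groups as candidate events. The sketch below assumes tweets have already been collected as (ISO timestamp, text) pairs; that input format, the function name, and the threshold are hypothetical, and a serious solution would also need location and burst detection.

```python
import re
from collections import defaultdict
from datetime import datetime

HASHTAG = re.compile(r"#(\w+)")


def candidate_events(tweets, min_tweets=3):
    """Group tweets by (day, hashtag) and keep groups with enough tweets.

    tweets: iterable of (iso_timestamp, text) pairs.
    Returns a list of ((day, hashtag), texts), most-discussed first.
    """
    groups = defaultdict(list)
    for stamp, text in tweets:
        day = datetime.fromisoformat(stamp).date()
        # set() so a tag repeated inside one tweet is counted once
        for tag in set(HASHTAG.findall(text.lower())):
            groups[(day, tag)].append(text)
    events = [(key, texts) for key, texts in groups.items() if len(texts) >= min_tweets]
    return sorted(events, key=lambda item: len(item[1]), reverse=True)
```

Each returned group already carries the tweet texts, which could feed the later summarization and image-assignment steps.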

Page 31: Multimedia Grand Challenge 2012

NTT Docomo Challenge

• Evaluation
  – Relevance of the summary/article to the actual topic
  – Relevance of the related images to the abstract text, or vice versa
  – Quality of the magazine design

Page 32: Multimedia Grand Challenge 2012

Technicolor Challenge: Audiovisual Recognition of Specific Events

Page 33: Multimedia Grand Challenge 2012

Technicolor Challenge

• Goal
  – Given a short video sequence, with audio, stemming from the coverage of a public event, the system should produce precise textual information on it.

Page 34: Multimedia Grand Challenge 2012

Technicolor Challenge

• A description at the event identity level:
  – Which event is it?
  – When and where did it take place?
  – What is its context?
  – What precisely is happening in the audio-visual scene?
  – In particular, who are the persons in the scene?
  – Where are they in the image?
  – What are they doing or saying?

Page 35: Multimedia Grand Challenge 2012

Example

Page 36: Multimedia Grand Challenge 2012

Key ideas

• Automatically extract as much information as possible from the audio-visual query and use it to search the intertwined textual, audio and visual data available online!
  – Extraction of compact low-level audio-visual signatures
  – Detection and recognition of text present in the images
  – Detection and recognition of speech present in the audio track
  – Semantic analysis of the audio-visual content

Page 37: Multimedia Grand Challenge 2012

Huawei/3DLife Challenge: Realistic Interaction in Online Virtual Environments

Page 38: Multimedia Grand Challenge 2012

Huawei/3DLife Challenge

• Goal
  – Support real-time realistic interaction between humans in online virtual environments
• Scenario
  – An online dance class where a dance teacher and a student perform a series of movements

Page 39: Multimedia Grand Challenge 2012

Huawei/3DLife Challenge

• Not limited to a certain capture technology
  – Visual sensing techniques: a single camera, a camera network, wearable inertial motion sensing
  – Gaming controllers: the Nintendo Wii, the Microsoft Kinect

Page 40: Multimedia Grand Challenge 2012

Huawei/3DLife Challenge

• Work with the provided data set to illustrate key technical components that would be required to realize this kind of online interaction and communication:
  – 3D data acquisition and processing from multiple sensor data sources
  – Realistic (optionally real-time) rendering of 3D data based on noisy or incomplete sources
  – Realistic and naturalistic marker-less motion capture
  – Human factors around interaction modalities in virtual worlds

http://perso.telecom-paristech.fr/~essid/3dlife-gc-12

Page 41: Multimedia Grand Challenge 2012

Huawei/3DLife Challenge

• A data set is provided, including:
  – Synchronization data between each of the multiple calibrated sources capturing the student’s movements;
  – Original music excerpts consisting of a few tracks at different tempos, varying from slow to fast;
  – Inertial (accelerometer + gyroscope + magnetometer) sensor data captured from multiple sensors on the student’s body;
  – Depth maps of the student’s performance captured using a Microsoft Kinect;
  – Ratings of the student performances by the teacher;
  – A form of annotation of the choreographies (mostly basic steps and movements for salsa beginners) performed.
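
One way to start exploring this data is to check how closely a student's steps follow the beat of the music. The sketch below is purely illustrative: it assumes step times have already been detected (for example from the inertial data) and that the track's tempo and first beat are known, none of which is specified on the slides; the function names are hypothetical.

```python
def beat_offsets(step_times, bpm, first_beat=0.0):
    """Signed distance in seconds from each step to its nearest beat.

    Negative values mean the step came before the beat.
    """
    period = 60.0 / bpm
    offsets = []
    for t in step_times:
        nearest = round((t - first_beat) / period)
        offsets.append(t - (first_beat + nearest * period))
    return offsets


def mean_abs_offset(step_times, bpm, first_beat=0.0):
    """Average absolute timing error in seconds (needs at least one step)."""
    offsets = beat_offsets(step_times, bpm, first_beat)
    return sum(abs(o) for o in offsets) / len(offsets)
```

A timing measure like this could then be compared against the teacher's ratings included in the data set.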

Page 42: Multimedia Grand Challenge 2012

Start Early!

• Upload your report on Moodle by 11:55pm, May 8, 2012

• At most 4 pages, using the ACM MM template

• Prepare a short presentation (under 10 minutes) to share your ideas on a challenge

• More information:
  – http://www.acmmm12.org/call-for-multimedia-grand-challenge-solutions/