1 S.-F. Chang, Columbia U. Recent Advances and Open Issues of Digital Image/Video Search Prof. Shih-Fu Chang Digital Video and Multimedia Lab Columbia University www.ee.columbia.edu/dvmm June 2007

Recent Advances and Open Issues of Digital Image/Video Search


Page 1: Recent Advances and Open Issues of Digital Image/Video Search

1S.-F. Chang, Columbia U.

Recent Advances and Open Issues of Digital Image/Video Search

Prof. Shih-Fu Chang

Digital Video and Multimedia Lab
Columbia University

www.ee.columbia.edu/dvmm
June 2007

Page 2: Recent Advances and Open Issues of Digital Image/Video Search

Acknowledgement

Columbia University: Winston Hsu, Wei Jiang, Lyndon Kennedy, Akira Yanagawa, Eric Zavesky, Dongqing Zhang

Kodak Research: Alex Loui, Jiebo Luo

IBM Research: Shahram Ebadollahi, Milind Naphade, Paul Natsev, John R. Smith, Lexing Xie

CMU: Alex Hauptmann

Page 3: Recent Advances and Open Issues of Digital Image/Video Search

Need for Image/Video Search

Explosive growth of online image/video data, personal media, broadcast news videos, etc.
5 billion images on the Web, 31 million hours of TV programs each year
Successful services like YouTube and Flickr
Image/video search is an exciting opportunity

Page 4: Recent Advances and Open Issues of Digital Image/Video Search

Seeking image search tools: an active research area for over a decade

Query by Sketch

[Figure: sketch queries and their results in IBM QBIC ’95 and Columbia VisualSEEk ’96]

Page 5: Recent Advances and Open Issues of Digital Image/Video Search

However… User Expectation in Practice

“…type in a few words at most, then expect the engine to bring back the perfect results. More than 95 percent of us never use the advanced search features most engines include, …” – The Search, J. Battelle, 2003

Keyword search is the primary search method.
User input is often limited, thus:
  Difficult to get detailed visual descriptions
  Difficult to get user feedback for refined search

Page 6: Recent Advances and Open Issues of Digital Image/Video Search


Google Zeitgeist publishes top keywords monthly

Page 7: Recent Advances and Open Issues of Digital Image/Video Search

Examples of Keyword Image Search

Query: “sunset”

[Figure: 1st and 2nd pages of keyword image search results]

Reasonable keyword search results
Content analysis may help correct mistakes…

Page 8: Recent Advances and Open Issues of Digital Image/Video Search

Example Search

Text query on Google: “Manhattan Cruise”

Image content analysis may help refine results

Page 9: Recent Advances and Open Issues of Digital Image/Video Search

How about Social-Net Tagging?

Yahoo-Flickr: millions of users, extensive labels

Uploaded by gdanny
Tags: outdoor, nyc, bridges, water, boat, cruise
Camera: Canon PowerShot SD 400
Date: Sept. 17 2006

Social tags are often subjective and incomplete.

Page 10: Recent Advances and Open Issues of Digital Image/Video Search

Insufficient Precision of Social Tags

Test: New York City landmark labels

Landmark                   Precision
Bronx-Whitestone Br.       1.00
Brooklyn Br.               0.38
Chrysler Building          0.65
Columbia University        0.30
Empire State Building      0.18
Flatiron Building          0.70
George Washington Br.      0.48
Grand Central              0.37
Guggenheim                 0.21
Met. Museum of Art         0.02
Queensboro Br.             0.38
Statue of Liberty          0.49
Times Square               0.56
Verrazano Narrows Br.      0.66
World Trade Center         0.13

Many tags from social networks are of low precision (due to batch uploading?)

Page 11: Recent Advances and Open Issues of Digital Image/Video Search

An Interesting Paradigm:Image Tagging via Game Playing

Used in Google Image Labeler
Uses competitive games to motivate users
Has attracted many participants for free!
  Some users spent hours in a day
Claims the potential of annotating the whole Web (5 billion images) in just a few months!

(Von Ahn & Dabbish, CHI 04)

Page 12: Recent Advances and Open Issues of Digital Image/Video Search

My Personal Experience

Affected by personal background and proficiency
Matched words tend to be obvious ones
Scoring system encourages guessing the partner’s thoughts, rather than indexing image content
Players sometimes just perform manual OCR

An interesting paradigm, but perhaps not a complete solution

Page 13: Recent Advances and Open Issues of Digital Image/Video Search

Opportunity for Content Analysis: Large-Scale Auto. Image Tagging Framework

Inputs: audio-visual features, surrounding text
Statistical models: SVM or graph models, context fusion
Semantic tagging (positive/negative decision per concept): Anchor, Snow, Soccer, Building, Outdoor, . . .

Rich semantic description based on content analysis

Page 14: Recent Advances and Open Issues of Digital Image/Video Search

Large-Scale Concept Detectors Publicly Available

Columbia374: 374 baseline detectors for the LSCOM multimedia ontology
MediaMill: 491 concept detectors for the LSCOM and MediaMill 101 lexicons
IBM MARVEL Search System
  Trials with BBC, CNN
  Real-time standalone detectors from IBM AlphaWorks
Others …

Page 15: Recent Advances and Open Issues of Digital Image/Video Search

What Concept to Detect?

One effort: Large Scale Concept Ontology for Multimedia (LSCOM)

Joint effort by news/intelligence analysts, librarians, researchers
Broadcast news domain
Selection criteria: useful, detectable, observable
834 concepts defined, 449 concepts annotated
Labeled over 61,000 shots of the TRECVID 2005 data set
33 million judgments collected, 100 person-months of labor
Downloaded by 170+ groups so far

http://www.ee.columbia.edu/dvmm/lscom/

Page 16: Recent Advances and Open Issues of Digital Image/Video Search

LSCOM data set downloaded by 170+ groups

Yahoo! Research
Intel
AT&T
FXPAL
University of Amsterdam
Oxford University
Nanyang Technological University, Singapore
National Taiwan University
Tsinghua University
KDDI, Japan
Dublin City University, Ireland
University of Central Florida
University of Texas, Austin
UC Berkeley
Others ...

Page 17: Recent Advances and Open Issues of Digital Image/Video Search

Example LSCOM Concepts (449)

Event/Activity (56 - 13%): Airplane taking off, car crash, explosion, etc.
People (113 - 25%): Person, male/female, firefighter, etc.
Location (89 - 20%): Cityscape, hospital, airfield, etc.
Object (135 - 30%): Vehicle, map, tank, power plant, etc.
Scene (49 - 10%): Vegetation, urban, interview, etc.
Program (7 - 2%): Entertainment, weather, finance, etc.

Page 18: Recent Advances and Open Issues of Digital Image/Video Search

Another Effort: Consumer Video Ontology (Kodak-Columbia, 2007)

Categories: Activity (6), Occasion (16), Scene (15), Object (25), People (11), Sound (14), Camera Motion (5), Object Motion (3), Social (4)

Activity: dancing, singing, sitting, walking, running, talking
Occasion: wedding, birthday, graduation, Christmas, ski, picnic, show, meeting, parade, sports, playground, theme-park, park, (back) yard, dining, museum
Scene: sunset, beach, waterscape/waterfront, mountain, field, desert, urban, suburban, night, home, kitchen, office, lab, public building
Object: people, animal, boat, and others
People: crowd, baby, youth, adult, and others
Sound: music, cheer, and others
Camera Motion: pan, tilt, zoom, fix, track
Object Motion: entity, speed, direction
Social: friend, family, classmate, colleague

Lexicon and annotation over 1300 consumer videos, planned to be released at ACM MIR 2007

Page 19: Recent Advances and Open Issues of Digital Image/Video Search

How to Obtain High-Quality Annotations?

Annotation tools: CMU annotation tool, IBM EVA tool

Label only one concept at a time
  Instead of open-text simultaneous labeling
  Found to be more accurate, and faster!

But hard to scale up
  Throughput: 1-3 sec/image
  A laborious process (100 person-months) for 33 M labels
  No user incentive

Page 20: Recent Advances and Open Issues of Digital Image/Video Search

Building Image Classifiers – starting from the basics

[Figure: a detector for each concept]

General for all concepts, easy to implement
Late fusion better than early fusion
374 baseline detectors (Columbia374) released

Page 21: Recent Advances and Open Issues of Digital Image/Video Search

Examples of Basic Image Features

Grid layout + color moments (μ, σ, γ per grid cell): 225 dimensions
Gabor texture: 48 dimensions
Edge direction histogram: 73 dimensions
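As a rough illustration of the grid color moment feature, the sketch below computes the mean, standard deviation and a skewness term for each color channel in each cell of a grid. The 5 x 5 grid with three channels and three moments is an assumption chosen because it yields 225 values; the function names and the synthetic image are hypothetical and this is not the Columbia374 extraction code.

```python
import math

def color_moments(values):
    """Return (mean, standard deviation, skewness term) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    std = math.sqrt(var)
    # Signed cube root of the third central moment (one common skewness form).
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

def grid_color_moments(image, grid=5):
    """image: H x W nested list of 3-channel pixel tuples.

    Returns grid * grid * 3 channels * 3 moments values (225 when grid == 5).
    """
    height, width = len(image), len(image[0])
    feature = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            for ch in range(3):
                cell = [image[y][x][ch]
                        for y in range(y0, y1) for x in range(x0, x1)]
                feature.extend(color_moments(cell))
    return feature

# Tiny synthetic 10 x 10 image (hypothetical data, for illustration only).
image = [[(x, y, x + y) for x in range(10)] for y in range(10)]
vec = grid_color_moments(image)
print(len(vec))
```

The image must be at least as large as the grid in both dimensions, otherwise some cells would be empty.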

Page 22: Recent Advances and Open Issues of Digital Image/Video Search


www.ee.columbia.edu/dvmm/columbia37

Page 23: Recent Advances and Open Issues of Digital Image/Video Search

Performance of baseline models in TRECVID 2006

[Figure: bar chart of mean average precision (axis 0 to 0.2) per run, highlighting runs A_CL1_1 through A_CL6_6 and Columbia374. Annotations: "Models based on local image features" and "Fusion of more features and models".]

Despite their simplicity, the baseline models provide satisfactory accuracy

Page 24: Recent Advances and Open Issues of Digital Image/Video Search

Performance Metric – Average Precision (AP)

Rank detection results
Compute precision/recall values at each hit point
Average precision at different recall values
Mean average precision (MAP)
  Mean of APs across concepts
  Top TRECVID system MAP ~0.3

Rank   Hit   Precision   Recall
I1     +     1.00        0.10
I2     -     -           -
I3     +     0.67        0.20
I4     +     0.75        0.30
I5     -     -           -
I6     -     -           -
I7     +     0.57        0.40
I8     -     -           -
I9     -     -           -
I10    +     0.50        0.50
I11    -     -           -

[Figure: precision (up to 1.0) plotted against recall (0 to 0.7); AP is obtained by averaging precisions at different recall levels]
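The ranked list on this slide can be turned into a small worked example. The sketch below uses one common non-interpolated definition of AP (precision at each hit, summed and divided by the number of relevant items); the total of 10 relevant items is an assumption inferred from the recall rising by 0.10 per hit, and the function names are hypothetical.

```python
def average_precision(hits, num_relevant):
    """Non-interpolated AP for a ranked list.

    hits: 1/0 (or True/False) per rank, best rank first.
    num_relevant: total number of relevant items in the collection.
    Returns (AP, list of precision values at each hit).
    """
    precisions = []
    found = 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            found += 1
            precisions.append(found / rank)
    ap = sum(precisions) / num_relevant if num_relevant else 0.0
    return ap, precisions

def mean_average_precision(ap_values):
    """MAP: mean of per-concept AP values."""
    return sum(ap_values) / len(ap_values)

# Hits at ranks 1, 3, 4, 7 and 10, as in the ranked list on the slide.
hits = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0]
ap, precisions = average_precision(hits, num_relevant=10)
print([round(p, 2) for p in precisions])  # [1.0, 0.67, 0.75, 0.57, 0.5]
print(round(ap, 3))                       # 0.349
```

Relevant items that are never retrieved contribute zero precision, which is why the sum is divided by the total number of relevant items rather than by the number of hits.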

Page 25: Recent Advances and Open Issues of Digital Image/Video Search

Comparison with Text-based Classifiers

Extract text features (feature vector) from closed captions, ASR, or translated ASR (for foreign videos)
Apply classifiers: Naïve Bayes, SVM, Max. Entropy …

Text classifiers usually perform 3-5 times worse than visual models.

Page 26: Recent Advances and Open Issues of Digital Image/Video Search

Comparing Semantic Classification (text vs. visual) (1)

Text search – “boat”
Visual concept search – “boat”

(images from TRECVID)

Page 27: Recent Advances and Open Issues of Digital Image/Video Search

Comparing Semantic Classification (text vs. visual) (2)

Text search results – “car”
Concept search results – “car”

Page 28: Recent Advances and Open Issues of Digital Image/Video Search

Example: good detectors for LSCOM concepts

waterfront, bridge, crowd, explosion fire, US flag, military personnel

Page 29: Recent Advances and Open Issues of Digital Image/Video Search

Complexity

Slow training, fast detection
Training time over 61K shots per SVM: 20 mins (color), 3.5 mins (texture), 1.2 mins (edge)
About 1 month over 20 PCs to train Columbia374
Feature extraction: 0.4 sec (color), 0.4 sec (texture), 2.8 sec (edge) per image
Testing: < 30 ms per image

Page 30: Recent Advances and Open Issues of Digital Image/Video Search

Advanced Features and Models

Part detection: maximum entropy regions, interest points, segmented regions
Local features: SIFT, Gabor filter, PCA projection, color histogram, moments …
Part-based representation: bag, structural graph, attributed relational graph

Page 31: Recent Advances and Open Issues of Digital Image/Video Search

Representation and Learning

Individual images: salient points, high-entropy regions

Attributed Relational Graph (ARG): graph representation of an image
  Attributes: size; color; texture
  Relations: spatial relation

Collection of training images (machine learning)

Random Attributed Relational Graph (RARG): statistical graph representation of the model
  Statistics of attributes and relations

demo (Zhang & Chang, CVPR 06)

Page 32: Recent Advances and Open Issues of Digital Image/Video Search

Learning Graph Object Model

[Figure: learning loop over patch image clusters: compute correspondence probability, then re-estimate the model]

BUT finding the correspondence of parts and computing the matching probability are NP-complete
Use advanced machine learning methods: Loopy Belief Propagation, and Gibbs Sampling plus Belief Optimization

demo

Page 33: Recent Advances and Open Issues of Digital Image/Video Search

Model Contextual Relations among Concepts

“Government-Leader” detector: hard/specific concept (different person, different view, large variance in appearance)
“Face”, “flag”, “podium” detectors: generic concepts

Context-based model: context information from generic concepts (flag, podium, face) supports or rejects (+/-) the detection of government-leader

Page 34: Recent Advances and Open Issues of Digital Image/Video Search

Boosted Conditional Random Fields

Independent detectors: face detector, Gov. Leader detector, music detector
Initial scores: P(face|image), P(gov. leader|image), P(flag|image)
Context-based model: conditional random field over the concepts (face, Gov. leader, flag), trained with boosting
Updated scores: P(face|image), P(gov. leader|image), P(flag|image)

(Jiang, Chang, Loui, ICASSP ’07)

Page 35: Recent Advances and Open Issues of Digital Image/Video Search

Predict When Context Will Help

Strong classifiers may suffer from fusion with inaccurate context
Complex inter-conceptual relationships are difficult to estimate from limited training samples
Concept Fusion (CF) does not always help

Use CF for concept $C_i$ only if $C_i$ is weak or has strong context:

$$E(C_i) > \lambda \quad \text{(weak self)}$$

or

$$\frac{\sum_{C_j,\, j \neq i} I(C_i; C_j)\, E(C_j)}{\sum_{C_j,\, j \neq i} I(C_i; C_j)} < \beta \quad \text{(strong context)}$$

$I(C_i; C_j)$ -- mutual information between $C_i$ and $C_j$
$E(C_i)$ -- error rate of the independent detector for $C_i$
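The selection rule can be sketched in a few lines. This is a minimal illustration under assumptions: the error rates, mutual information values and the thresholds `lam` and `beta` below are made-up numbers, and the function name is hypothetical rather than taken from the cited work.

```python
def use_context_fusion(i, error, mutual_info, lam, beta):
    """Decide whether to apply concept fusion (CF) to concept i.

    error[j]: error rate E(C_j) of the independent detector for concept j.
    mutual_info[i][j]: mutual information I(C_i; C_j).
    Returns True if concept i is weak (E(C_i) > lam) or has strong context
    (the MI-weighted average error of the other concepts is below beta).
    """
    if error[i] > lam:  # weak self
        return True
    others = [j for j in range(len(error)) if j != i]
    total_mi = sum(mutual_info[i][j] for j in others)
    if total_mi == 0:
        return False  # no informative context at all
    weighted_error = sum(mutual_info[i][j] * error[j] for j in others) / total_mi
    return weighted_error < beta  # strong context

# Hypothetical numbers for three concepts: 0 = road, 1 = car, 2 = boat.
error = [0.30, 0.10, 0.45]
mutual_info = [
    [0.0, 0.5, 0.1],
    [0.5, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
decisions = [use_context_fusion(i, error, mutual_info, lam=0.4, beta=0.2)
             for i in range(3)]
print(decisions)  # [True, False, True]
```

In this toy setting, road gets fusion because its most informative neighbour (car) is reliable, car does not because its context is dominated by weaker detectors, and boat does because its own detector is weak.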

Page 36: Recent Advances and Open Issues of Digital Image/Video Search

Fuse Context Only When It Is Predicted to Help

[Figure: MAP (0 to 1) for each of 39 concepts, comparing CBCF with the individual detector; concepts are marked as improved, degraded, or missed]

Apply context fusion smartly
16 concepts are predicted to improve
14 actually improve, 4 missed

Page 37: Recent Advances and Open Issues of Digital Image/Video Search

Example: Road Detector

[Figure: the Road concept linked to related concepts (Car, Boat-Ship, Natural-Disaster, . . .) with mutual information values on the links; values shown include 5.03 and 5.01]

Page 38: Recent Advances and Open Issues of Digital Image/Video Search


Combine Context-Fusion with User Input

automatic content recognition+context fusion

+user input

Page 39: Recent Advances and Open Issues of Digital Image/Video Search

Instead of Automatically Tagging 100% of Concepts
What if we can ask users to help with 1-2 labels?

Automatic tags: Person: 79%, Clinton: 75%, US Flag: 80%, Podium: 70%, Give speech: 75%, Press conference: 65%

Active Tagging
User labels: park, picnic
Automatically generated labels: people, tree, mountain, etc.

Page 40: Recent Advances and Open Issues of Digital Image/Video Search

Which Questions to Ask? (20 Questions Game)

ICIP 2006

Selection criterion: minimize the expected Bayes classification error

$$\mathcal{E} = \frac{1}{N} \sum_{i=1}^{N} \mathcal{E}(S_i)$$

which is bounded from above by an entropy term minus a mutual information term [Hellman et al., IEEE Trans. Info. Theory, 1970]:

$$\mathcal{E} \le \frac{1}{2N} \sum_{i=1}^{N} H(S_i) - \frac{1}{2N} \sum_{i=1}^{N} I(S_i; S^*_{1:M})$$

$S_1, \ldots, S_N$ -- all concepts of interest
$S^*_{1:M} = \{S^*_1, \ldots, S^*_M\}$ -- questions to ask the user

Best concepts for active tagging: high uncertainty or high mutual information

(Jiang, Chang, Loui, ICIP 06)
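For a single question (M = 1), the entropy term in the bound is fixed, so the bound is tightest for the concept that maximizes the summed mutual information with all concepts. The sketch below illustrates this with mutual information estimated from a small binary annotation table. It is a simplification under stated assumptions: it uses label statistics only and ignores detector outputs on the image being tagged, and the data, function names and concept set are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two aligned label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

def best_question(labels):
    """Pick the concept whose answer is most informative about all concepts.

    labels: dict mapping concept name -> list of 0/1 annotations over the
    same set of images. I(S; S) equals the entropy H(S), so a concept scores
    well if it is uncertain itself or informative about the others.
    """
    names = list(labels)

    def total_mi(question):
        return sum(mutual_information(labels[s], labels[question]) for s in names)

    return max(names, key=total_mi)

# Hypothetical annotations over 8 images.
labels = {
    "outdoor": [1, 1, 1, 1, 0, 0, 0, 0],
    "sky":     [1, 1, 1, 0, 0, 0, 0, 0],
    "office":  [0, 0, 0, 0, 1, 1, 1, 0],
    "map":     [0, 0, 0, 0, 0, 0, 0, 1],
}
print(best_question(labels))  # outdoor
```

Selecting several questions would require the joint term $I(S_i; S^*_{1:M})$ rather than a sum of pairwise values; a greedy extension is one option but is not shown here.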

Page 41: Recent Advances and Open Issues of Digital Image/Video Search

Examples of Best Questions

Active Tagging

Best Questions to Ask User?

Airplane, animal, boat, building, bus, car, chart, court , crowd, desert, entertainment, explosion_fire, face, flag_us, government_leader, map, meeting, military, mountain, natural_disaster, office, outdoor, people_marching, person, road, sky, snow, sports, urban, waterscape, etc

Example 1: 1. Outdoor? 2. Airplane?

Example 2: 1. Outdoor? 2. Court?

Example 3: 1. Outdoor? 2. Building?

Page 42: Recent Advances and Open Issues of Digital Image/Video Search

Active Tagging Consistently Helps

[Figure: MAP (0.31 to 0.66) versus number of labelled concepts (1 to 10), comparing MMIEE labelling (active QA) with random labelling (random questions); annotated "70% improvement"]

Performance gain increases with more user input
Magic number here is 4

Page 43: Recent Advances and Open Issues of Digital Image/Video Search


Power of Concept-based Representation

outdoor people

building

. . .

Large semantic index

New applications: Search, Filtering, Pattern Mining

Page 44: Recent Advances and Open Issues of Digital Image/Video Search

Power of Semantic Indexing: Text-to-Concept Search

Query side:
  Query text: “Find shots of a road with one or more cars”
  Part-of-speech tags give keywords: “road car”
  Map to concepts using WordNet semantic similarity and concept metadata (names and definitions)
  Query in concept space (confidence for each concept): (1.0) road, (0.1) fire, (0.2) sports, (1.0) car, …, (0.6) boat, (0.0) person

Video database side:
  Concept detectors (simple SVM, grid color moments, Gabor texture) applied to video shots
  Model reliability: expected AP for each concept
  Each shot in concept space, e.g. (0.9) road, (0.1) fire, (0.3) sports, (0.9) car, …, (0.2) boat, (0.1) person

Matching: semantic distance between the query and each shot in concept space

Map text queries to concepts
Use an ontology (WordNet) to find matched concepts
Videos represented by semantic descriptors
Measure semantic distance between query and concept
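Once query and shots live in the same concept space, ranking reduces to comparing vectors. The sketch below scores each shot with a weighted dot product, optionally scaled by a per-concept reliability such as expected AP. The dot product is an assumed stand-in for the semantic distance on the slide, the second shot and the reliability values are made up, and the function names are hypothetical.

```python
def score_shot(query_weights, shot_scores, reliability=None):
    """Similarity between a concept-space query and one shot.

    query_weights: concept -> relevance of the concept to the query.
    shot_scores: concept -> detector confidence for the shot.
    reliability: optional concept -> detector reliability (e.g. expected AP).
    """
    total = 0.0
    for concept, weight in query_weights.items():
        rel = 1.0 if reliability is None else reliability.get(concept, 0.0)
        total += weight * rel * shot_scores.get(concept, 0.0)
    return total

def rank_shots(query_weights, shots, reliability=None):
    """Return shot ids sorted from best to worst match."""
    return sorted(shots,
                  key=lambda sid: score_shot(query_weights, shots[sid], reliability),
                  reverse=True)

# Query vector for "Find shots of a road with one or more cars" (from the slide).
query = {"road": 1.0, "fire": 0.1, "sports": 0.2, "car": 1.0, "boat": 0.6, "person": 0.0}
shots = {
    # Concept scores as shown on the slide.
    "shot_a": {"road": 0.9, "fire": 0.1, "sports": 0.3, "car": 0.9, "boat": 0.2, "person": 0.1},
    # Hypothetical second shot for contrast.
    "shot_b": {"road": 0.1, "fire": 0.8, "sports": 0.1, "car": 0.2, "boat": 0.1, "person": 0.9},
}
ranking = rank_shots(query, shots)
print(ranking)  # ['shot_a', 'shot_b']
```

Weighting by reliability lets an unreliable detector contribute less even when the query maps strongly onto its concept.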

Page 45: Recent Advances and Open Issues of Digital Image/Video Search

DVMM Lab, Columbia University, Lyndon Kennedy

Mapping search topics to concepts

TRECVID search topics and their matched concepts:

Find shots with a view of one or more tall buildings (more than 4 stories) and the top story visible.
  Matched concepts: Building

Find shots with one or more emergency vehicles in motion (e.g., ambulance, police car, fire truck, etc.)
  Matched concepts: Emergency_Room, Vehicle

Find shots with one or more people leaving or entering a vehicle.
  Matched concepts: Person, Vehicle

Find shots with one or more soldiers, police, or guards escorting a prisoner.
  Matched concepts: Guard, Police_Security, Prisoner, Soldier

Research issues: which concepts to use? How to fuse multiple concepts?

Page 46: Recent Advances and Open Issues of Digital Image/Video Search

Concept Search Demo

Concept search case 1 (link)
Concept search case 2 (link)
Multimodal search (link)

Page 47: Recent Advances and Open Issues of Digital Image/Video Search

Other Applications: Semantic Mining

Top results from text query: “Manhattan Cruise”

What are the dominant semantic concepts in the initial result set?

Boat, waterfront, sky, buildings, …

Page 48: Recent Advances and Open Issues of Digital Image/Video Search


Decipher dominant visual concepts behind each search topic

Text search for “Hu Jintao”

374 Pre-trained concept detectors

Use discovered concepts to learn SVM models

Rerank by the models

Page 49: Recent Advances and Open Issues of Digital Image/Video Search

Related concepts have high mutual information with target labels.

Mutual information:

T: initial text search score
C: concept detection score (quantized)
Include both positive and negative correlations
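A small sketch of how related concepts could be discovered for a query: binarize the initial text search results into pseudo-labels T, quantize each concept score C, and compute the standard mutual information $I(T; C) = \sum_{t,c} p(t,c) \log \frac{p(t,c)}{p(t)p(c)}$, together with the direction of the correlation. The binarization, the four-bin quantizer, the scores and the function names are all assumptions for illustration.

```python
import math
from collections import Counter

def quantize(score, bins=4):
    """Map a score in [0, 1] to one of `bins` equal-width bins."""
    return min(int(score * bins), bins - 1)

def concept_relevance(text_labels, concept_scores, bins=4):
    """Mutual information (bits) between pseudo-labels T and quantized scores C.

    text_labels: 1 for shots ranked high by the text search, 0 otherwise
    (both values must occur).
    concept_scores: detector confidence in [0, 1] for the same shots.
    Returns (mi, sign) where sign says whether the concept is a positive or
    negative indicator for the query.
    """
    n = len(text_labels)
    c_bins = [quantize(s, bins) for s in concept_scores]
    joint = Counter(zip(text_labels, c_bins))
    p_t, p_c = Counter(text_labels), Counter(c_bins)
    mi = sum((k / n) * math.log2((k / n) / ((p_t[t] / n) * (p_c[c] / n)))
             for (t, c), k in joint.items())
    pos = [s for t, s in zip(text_labels, concept_scores) if t == 1]
    neg = [s for t, s in zip(text_labels, concept_scores) if t == 0]
    sign = "positive" if sum(pos) / len(pos) >= sum(neg) / len(neg) else "negative"
    return mi, sign

# Hypothetical scores for a "boats or ships" query over 8 shots.
text_labels = [1, 1, 1, 1, 0, 0, 0, 0]
waterscape = [0.9, 0.8, 0.85, 0.95, 0.1, 0.2, 0.15, 0.05]
face = [0.1, 0.2, 0.1, 0.15, 0.9, 0.8, 0.9, 0.85]
sky = [0.5, 0.6, 0.5, 0.6, 0.5, 0.6, 0.5, 0.6]

water_mi, water_sign = concept_relevance(text_labels, waterscape)
face_mi, face_sign = concept_relevance(text_labels, face)
sky_mi, _ = concept_relevance(text_labels, sky)
print(water_mi, water_sign)
print(face_mi, face_sign)
print(sky_mi)
```

Both strongly positive and strongly negative concepts receive high mutual information, which is why the sign is tracked separately.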

Page 50: Recent Advances and Open Issues of Digital Image/Video Search

Examples: discovered concepts per query

Query topic (158): Find shots of a helicopter in flight.
  Positive concepts: Weapons, Explosion Fire, Airplane, Smoke, Sky
  Negative concepts: Person, Civilian Person, Face, Talking, Male Person, Sitting

Query topic (179): Find shots of Saddam Hussein.
  Positive concepts: Court, Politics, Government Leader, Suits, Lawyer, Judge
  Negative concepts: Desert, Protest, Crowd, Mountain, Outdoor, Animal

Query topic (164): Find shots of boats or ships.
  Positive concepts: Waterscape, Outdoor, Sky, Fire, Exploding Ordnance
  Negative concepts: Person, Civilian Person, Face, Adult, Suits, Ties

Three main types of topic-concept relationships:

Generally Present: obvious, expected relationships. Concepts that are always together.
Mistaken: relationships due to detection errors. Rare, but still helpful.
News Story: relationships unique to a news story. Reranking uniquely powerful.

Page 51: Recent Advances and Open Issues of Digital Image/Video Search

Another Application: Event Detection

Typical approach to event detection: detect object of interest, track over time, model spatio-temporal dynamics

[Figure: pipeline of object detection/localization, tracking, and inference, ending in the event “Airplane Landing”]

Difficult for events without explicit objects and motions, e.g., “riot”, “street battle”, “flight combat”, “parade”

Page 52: Recent Advances and Open Issues of Digital Image/Video Search

New Approach: Concept-based Event Representation (1)

[Figure: large-scale concept detectors applied to video frames, giving a bag of concept features (airplane, crowd, sky, indoor)]

Map image frames to confidence scores in the semantic concept space
A video is associated with a bag of concept features

Page 53: Recent Advances and Open Issues of Digital Image/Video Search

Concept-based Event Representation (2)

[Figure: large-scale concept detectors (airplane, crowd, sky, indoor) applied over time, giving a trajectory in concept space with axes [Airplane], [Sky], [Vegetation]]

Trace the concept scores over time
A video corresponds to a trajectory in the concept space

Page 54: Recent Advances and Open Issues of Digital Image/Video Search

DVMM Lab, Columbia University Video Event Recognition

Video Events as Distinct Traces in the Concept Space

Red: video 1, Green: video 2, Yellow: other videos

Page 55: Recent Advances and Open Issues of Digital Image/Video Search

From Representation to Event Detection

Video representation
Compute pairwise event similarities
Learn classifier (positive vs. negative examples)

Page 56: Recent Advances and Open Issues of Digital Image/Video Search

Earth-Mover’s Distance (EMD) [Xu & Chang, CVPR 07]

[Figure: Video 1 and Video 2, each a bag of frame-level concept features; which features should be matched?]

EMD finds optimal alignment flows between features in the bags
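For intuition, EMD can be computed exactly in a restricted case. When both bags have the same number of frames and every frame carries equal weight, the optimal flow is a one-to-one matching, so the sketch below simply searches all matchings. These restrictions, the Euclidean ground distance, the data and the function names are assumptions for illustration; the general case needs a transportation solver, and the brute-force search is only practical for tiny bags.

```python
import math
from itertools import permutations

def euclidean(a, b):
    """Ground distance between two concept-score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def emd_uniform(bag1, bag2):
    """EMD between two equal-size bags whose items all have weight 1/n.

    With uniform weights and equal sizes, an optimal flow is a permutation,
    so EMD equals the minimum average matching cost.
    """
    n = len(bag1)
    if n != len(bag2):
        raise ValueError("this sketch only handles bags of equal size")
    best = min(
        sum(euclidean(bag1[i], bag2[perm[i]]) for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n

# Hypothetical frame-level scores for (airplane, sky).
video1 = [(0.9, 0.8), (0.7, 0.9), (0.1, 0.2)]
video2 = [(0.1, 0.2), (0.9, 0.8), (0.7, 0.9)]  # same frames, different order
video3 = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1)]

print(emd_uniform(video1, video2))  # 0.0
print(emd_uniform(video1, video3) > emd_uniform(video1, video2))  # True
```

Because the bags are unordered, two videos with the same frames in a different temporal order are at distance zero, which is the alignment-free behaviour the bag representation is meant to provide.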

Page 57: Recent Advances and Open Issues of Digital Image/Video Search

HMM+SVM to Model Event Traces in the Concept Space

Event trajectories in a high-dimensional concept space (Sky, Vegetation, Airplane, …)
Model dynamics in each dimension (HMM)
Compute model statistics (state histogram, Fisher score)
Train classifiers (SVM, positive vs. negative examples)

[Xie et al, ICME 06]

Page 58: Recent Advances and Open Issues of Digital Image/Video Search

Performance Testing Using TRECVID Data

[Figure: bar chart of MAP (axis 0 to 0.6) per event for the methods rand, cc, concept, hmm, and emd. Event labels as extracted (some truncated): Airplane_Flying, Beach, Celebration_Or_, Dancing, Demonstration_, Election_Campai (twice), Funeral, Ground_Combat, Handshaking, Helicopter_Hove, Interview_Sequ, Lakes, Natural_Disaster, Parade, People_Crying, Reporters, Riot, Singing, Street_Battle]

Data: news videos (ENG, CHN, ARB), 80 hours, TRECVID 2005
39 core visual concepts, 20 target events

Significant performance gain by applying temporal models (EMD and HMM)

Page 59: Recent Advances and Open Issues of Digital Image/Video Search

Landscape of Semantic Image/Video Indexing

[Figure: components of the landscape: large data repositories; annotation; concept ontology; new image features; statistical models; concept detection (outdoor, people, building, . . .); large semantic index; multimodal retrieval; semantics mining; new applications (search, topic tracking, filtering)]

Page 60: Recent Advances and Open Issues of Digital Image/Video Search

Open Issues and Opportunities

Systematic way of extending the ontology?
Cross-domain model development:
  General detectors for News, Consumer, Web?
  Sharing of data and models?
Complexity and speed
  Feature selection, fast training
Map user queries to visual concepts
  WordNet for visual search?

Page 61: Recent Advances and Open Issues of Digital Image/Video Search

More Information

Columbia DVMM Lab
  http://www.ee.columbia.edu/dvmm
  374 SVM-based concept detectors available soon
  Online video search demos
LSCOM lexicon and annotation
  http://www.ee.columbia.edu/lscom
Columbia374 concept detectors
  http://www.ee.columbia.edu/columbia374

Page 62: Recent Advances and Open Issues of Digital Image/Video Search

Columbia Video Search System
http://www.ee.columbia.edu/cuvidsearch

Search result folder
Beyond keywords: search by example image
Automatically detected story segments
Customizable multi-modal search tool suite
Automatic query expansions
XML output

Prototype includes 160 hours, 3 languages (English, Arabic, Chinese), 6 channels