Department of Signal Processing

Esin Guldogan/System Profiles in CBIR 3/31/2009

SYSTEM PROFILES IN CONTENT-BASED INDEXING AND RETRIEVAL

Esin Guldogan

[email protected]

Outline

Personal Media Management

Text-Based Retrieval

Metadata Retrieval

Content-Based Retrieval

System Profiling

User Surveys

Analysis of User Surveys

CBIR Parameters

System Profiles and Parameters

Experimental Cases

Experimental Results

Conclusions

Personal Media

Recent improvements in imaging technology have led to a huge amount of digital multimedia.

Flickr claimed to host 3 billion images as of November 2008.

5 billion videos were viewed on YouTube in July 2008.

Personal Media Management

Storing, Browsing, Indexing

Accessing, Searching, Retrieving

Sequential, Text-Based, Event-Based, Content-Based, etc.

Text-Based Search

Requires Annotation

Indexing Phase

Time Consuming

Subjective

Retrieval Phase

Very Fast

Supervised

Low Accuracy

Metadata

Location, Time, etc.

Fast Indexing

Fast Retrieval

Limited

Content

What is the “content” of an image?

Content-Based Indexing and Retrieval

Content-Based Image Retrieval (CBIR) is a technique for searching a database of images based on image content rather than keywords.

Content-Based Image Retrieval Features

CBIR systems analyze image content via features.

Describe image content using low-level features: color, shape, and texture.

High-level features: red bus, pigeon, rock, etc.
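As an illustration of a low-level color feature, the following is a minimal sketch of a normalized RGB color histogram. The function name `rgb_histogram`, the pixel representation (a list of 8-bit RGB tuples), and the choice of 4 bins per channel are assumptions made for this example and are not taken from the slides.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # assumed 8-bit (R, G, B) values


def rgb_histogram(pixels: List[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram (hypothetical helper)."""
    hist = [0.0] * (bins_per_channel ** 3)

    def bin_of(value: int) -> int:
        # Map 0..255 onto 0..bins_per_channel-1.
        return min(value * bins_per_channel // 256, bins_per_channel - 1)

    for r, g, b in pixels:
        idx = (bin_of(r) * bins_per_channel + bin_of(g)) * bins_per_channel + bin_of(b)
        hist[idx] += 1.0

    total = len(pixels)
    # Normalize so that images of different sizes are comparable.
    return [h / total for h in hist] if total else hist
```

Because the histogram is normalized, it does not depend on image size, which is one reason color histograms are a common baseline descriptor.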

Content-Based Indexing and Retrieval

[Diagram: CBIR pipeline. OFFLINE: Image Database, Images, Feature Extraction, Features. ONLINE: User, Query Image, Feature Extraction, Features, Similarity Measurement, Display Results.]
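The offline/online split in this pipeline can be sketched as follows. This is a toy illustration under stated assumptions: the feature (`mean_color`), the L1 distance, and all function names are hypothetical choices for the example and do not describe the MUVIS implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Feature = List[float]


def mean_color(pixels: Sequence[Pixel]) -> Feature:
    """Toy feature extractor (assumption): mean value of each RGB channel."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


def build_index(database: Dict[str, Sequence[Pixel]],
                extract: Callable[[Sequence[Pixel]], Feature] = mean_color
                ) -> Dict[str, Feature]:
    """OFFLINE phase: extract and store features for every database image."""
    return {name: extract(pixels) for name, pixels in database.items()}


def l1_distance(a: Feature, b: Feature) -> float:
    """Similarity measurement (assumption): smaller distance = more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))


def query(index: Dict[str, Feature],
          query_pixels: Sequence[Pixel],
          extract: Callable[[Sequence[Pixel]], Feature] = mean_color,
          top_k: int = 3) -> List[str]:
    """ONLINE phase: extract query features, rank the database, return results."""
    q = extract(query_pixels)
    ranked = sorted(index, key=lambda name: l1_distance(index[name], q))
    return ranked[:top_k]
```

The point of the split is that the expensive feature extraction over the whole database happens once, offline, while each online query only extracts features for a single image and compares them against stored vectors.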

MUVIS Framework

[Block diagram of the MUVIS framework, divided into an Indexing side and a Retrieval side. Labels recovered from the figure:

AVDatabase: AV database creation; real-time capturing, encoding, recording

DbsEditor: database management; HCT indexing; MM conversions; MM insertion and removal; FeX - AFeX management; SEG management; SBD management

Databases: image database, video database, hybrid database (still images, audio-video clips)

FeX & AFeX API: FeX modules, AFeX modules (inputs: an image, a video frame)

SBD API and SEG API: SBD modules, SEG modules (input: an audio-video clip)

MBrowser: query (PQ & NQ); HCT browsing; view-display; video summarization]

System Profiling

Complex User Interfaces

Complex Parameters

Hardware Dependencies

Computational Complexity

Efficiency

System Profiling

Tuning and adapting the parameters of the system to improve performance

Increase scalability

User Satisfaction

System Profiling in CBIR

[Diagram. Labels recovered from the figure: System Profile 1, 2, 3, 4, ...; Adaptability and Hardware Scalability; CBIR APPLICATION; Indexing Factors / Parameters (Parameter 1, 2, 3, 4, ...); Retrieval Factors / Parameters (Parameter 1, 2, 3, 4, ...).]

User Surveys in Multimedia

Jaimes studied human factors that influence automatic content-based retrieval systems, such as human memory, context and subjectivity.

Eakins, Briggs and Burford used an online questionnaire to improve the user interface of CBIR applications.

Halvey and Keane studied YouTube log statistics to analyze users' interaction with video search engines.

Frohlich et al. used interviews and observation to understand the strengths and weaknesses of past and present photo-sharing technology.

Rodden and Wood used interviews and questionnaires to find out how people organize and browse their digital photo collections.

Weiss et al. studied user-profile-based personalization to select and recommend content matching the user's interests for automated online video or TV services.

User Survey and Participants

Goal: identifying real-world problems and specifying system requirements and system limitations

122 people contributed to the online survey: 27 females and 95 males.

Participants were students, researchers, engineers and professors from computer science, software systems, electronics, telecommunication, and information technology.

Age distribution: 32% were 20-24 years old, 61% were 25-35 years old, and 7% were 36-50 years old.

Analysis of User Surveys

The analysis of the survey results can be classified into two categories:

Direct answers from the question results: definitions and specifications of indexing and retrieval parameters

Heuristic analysis of the relevant and associated survey questions: system profiling

a) How would you prefer to organize your multimedia files?
By events: 46%; by date: 34%; by people: 9%; by location: 6%; by multimedia source: 1%; other: 4%

b) What is the reasonable waiting time in your opinion to see the results of an image/video search on the web?
Instantaneous: 40%; approximately 30 seconds: 48%; between 30 sec and 1 min: 6%; 1-3 min: 5%; more than 3 min: 1%

c) Which of the following do you prefer to see for each multimedia item when browsing?
Full-size image: 30%; thumbnail: 64%; associated textual description (caption, date, file size, …); other: 2%

d) What is the reasonable waiting time in your opinion to see the results of an image/video search on the home computer?
Instantaneous: 56%; approximately 30 seconds: 31%; between 30 sec and 1 min: 7%; 1-3 min: 5%; more than 3 min: 2%

Survey Results

Provide distinct, informative knowledge about the users' hardware specifications and their preferences for digital image management

Help define the system profiles in terms of the hardware and technical specifications that affect CBIR parameters and adaptations

Help in the selection of factors, parameters and experimental case setup

Example: answers to the 16th question in the survey reveal that 93% of the participants prefer to use JPEG image compression

Define the requirements, capacities and conditions of the systems

System Profiles and CBIR Parameters

Baseline System Profile: general PC and laptop users

Powerful System Profile: powerful computer systems such as dedicated servers for professional use (TV broadcast and mass media companies, and libraries)

Limited System Profile: limited platforms such as mobile phones

Distributed System Profile: client-server architecture, such as web-based systems

Indexing factors/parameters:
Compression parameters
Image downscaling parameters
Feature type

Retrieval factors/parameters:
Dimension reduction of feature data parameters
Feature selection

Recommended CBIR Parameters

| Factor group | Factor/parameter | Limited Systems | Distributed Systems | Baseline Systems | Powerful Systems |
|---|---|---|---|---|---|
| Indexing | Compression parameters (JPEG quality factor) | Quality factor 50% | Quality factor 75-50% | Quality factor 75% | None, or quality factor 90% |
| Indexing | Image downscaling parameters | Image scaling factor (ISC) = 4 for color features; ISC = 2 for texture and shape features | ISC = 4 for color features; ISC = 2 for texture and shape features | ISC = 2 for color features; none for texture and shape features | ISC = 2 or none for color features; none for texture and shape features |
| Indexing | Feature factors | Use a feature selection method | Use a feature selection method | Optionally use a feature selection method | Optionally use a feature selection method |
| Retrieval | Dimension reduction of feature data parameters | Scaling factor = 4 or 8 | Scaling factor = 4 | Scaling factor = 2 | None, or scaling factor = 2 |
| Retrieval | Feature selection and combination parameters | Use a feature selection method | Use a feature selection method | Optionally use a feature selection method | Optionally use a feature selection method |
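One way an application might encode these recommendations is as a lookup from profile name to parameter set. The sketch below is illustrative only: the class and field names are hypothetical, and where the table gives a range or an alternative (for example "75-50%", "4 or 8", "None or 90%"), a single value has been picked arbitrarily for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CBIRParameters:
    """Hypothetical container for the recommended parameters (None = not applied)."""
    jpeg_quality: Optional[int]          # JPEG quality factor in percent
    color_scale: Optional[int]           # image scaling factor for color features
    texture_shape_scale: Optional[int]   # image scaling factor for texture/shape
    feature_reduction: Optional[int]     # scaling factor for feature-data reduction
    feature_selection_required: bool     # False means "optionally use"


# One concrete choice per profile; alternatives allowed by the table are noted.
PROFILES: Dict[str, CBIRParameters] = {
    "limited": CBIRParameters(50, 4, 2, 4, True),       # reduction may also be 8
    "distributed": CBIRParameters(75, 4, 2, 4, True),   # quality may go down to 50
    "baseline": CBIRParameters(75, 2, None, 2, False),
    "powerful": CBIRParameters(90, 2, None, 2, False),  # may also skip all reductions
}


def parameters_for(profile: str) -> CBIRParameters:
    """Look up the parameter set for a profile name (case-insensitive)."""
    return PROFILES[profile.strip().lower()]
```

For instance, `parameters_for("Limited").jpeg_quality` evaluates to `50`. Keeping the profile table as data rather than scattering the values through the code makes it easy to revise when the recommendations change.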

Experimental Cases

MUVIS Framework

Corel 10000 image database

14 types of low-level features:
YUV, HSV and RGB color histograms
Dominant color feature
Gray-level co-occurrence matrix texture feature
Gabor wavelet texture feature
Canny edge histogram

Objective evaluation measurement: ANMRR

Image compression: JPEG

Dimension reduction of feature data: Mapping Based Adaptive Threshold (MAT)

Image downscaling: DCT-based
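ANMRR (Average Normalized Modified Retrieval Rank) is the evaluation measure used for the results that follow; lower values mean better retrieval, with 0 for perfect retrieval and 1 when no relevant item is found within the rank window. The sketch below follows the definition commonly given for the MPEG-7 measure (window K = min(4·NG, 2·GTM), with a penalty rank of 1.25·K for items outside the window). The slides do not spell out the exact variant used in the experiments, so treat this as an assumption; the function names are hypothetical.

```python
from typing import Dict, Sequence


def nmrr(ranked: Sequence[str], relevant: Sequence[str], gtm: int) -> float:
    """Normalized Modified Retrieval Rank for one query.

    ranked   -- retrieved item ids, best match first
    relevant -- ground-truth item ids for this query (NG items)
    gtm      -- largest ground-truth set size over all queries
    """
    ng = len(relevant)
    k = min(4 * ng, 2 * gtm)           # rank window
    penalty = 1.25 * k                 # rank assigned to items outside the window
    position = {item: i + 1 for i, item in enumerate(ranked)}

    ranks = []
    for item in relevant:
        rank = position.get(item)
        ranks.append(rank if rank is not None and rank <= k else penalty)

    avr = sum(ranks) / ng              # average rank
    mrr = avr - 0.5 * (1 + ng)         # modified retrieval rank
    return mrr / (penalty - 0.5 * (1 + ng))


def anmrr(results: Dict[str, Sequence[str]],
          ground_truth: Dict[str, Sequence[str]]) -> float:
    """Average NMRR over all queries that have ground truth."""
    gtm = max(len(items) for items in ground_truth.values())
    scores = [nmrr(results[q], ground_truth[q], gtm) for q in ground_truth]
    return sum(scores) / len(scores)
```

With this definition, the ANMRR values of roughly 0.16 to 0.30 reported in the result tables all lie toward the good end of the scale, so the differences between parameter settings are small compared with the full range.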

Experimental Cases

System Profiles

| Attributes | Limited System | Distributed System | Baseline System | Powerful System |
|---|---|---|---|---|
| Connection bandwidth | 128/512 Kbit/s | 128 Kb/s – 1 Mb/s | 128 Kb/s – 2 Mb/s | 1 Mb/s – 100 Mb/s |
| Storage space | 1 GB | 120 GB client | 120 GB | 180 GB |
| CPU power | [Information not available] | Intel Pentium 4 2.8 GHz | Intel Pentium 4 2.8 GHz | 2x2.8 GHz |
| Display size | 320 x 240 | 1280 x 1024 | 1280 x 1024 | 1280 x 1024 |
| Multimedia codecs | MPEG-4, H.264/AVC, H.263/3GPP, MP3-, AAC-, eAAC- and eAAC | Generally all | Generally all | Generally all |

Experimental Results

Experimental Results of Image Compression Parameters

| Image compression parameters | ANMRR results | Size on disk | Recommended system profiles |
|---|---|---|---|
| Uncompressed | 0.20 | 2.6 GB | Powerful System Profile |
| JPEG compressed with quality factor 90% | 0.20 | 400 MB | Powerful System Profile |
| JPEG compressed with quality factor 75% | 0.23 | 310 MB | Baseline System Profile, Distributed System Profile |
| JPEG compressed with quality factor 50% | 0.23 | 190 MB | Limited System Profile |

Experimental Results of PSP

Compressed image database with JPEG quality factor 90%:

| Image downscaling parameters | ANMRR | Elapsed time for feature extraction on PSP |
|---|---|---|
| Color-based scaled by 2; texture and shape-based not scaled | 0.20 | 2.5 hours |
| Color, texture and shape-based scaled by 2 | 0.21 | 1 hour |
| Color-based scaled by 4; texture and shape-based scaled by 2 | 0.23 | 50 min |
| Color, texture and shape-based scaled by 4 | 0.27 | 18 min |

Compressed image database with JPEG quality factor 90%, and images downscaled for the feature extraction process:

| Dimension reduction of feature data parameters | ANMRR | Elapsed time for retrieval on PSP |
|---|---|---|
| None | 0.20 | 9 sec |
| Scaled by 2 | 0.16 | 5 sec |
| Scaled by 4 | 0.19 | 3 sec |
| Scaled by 8 | 0.20 | 1 sec |

Experimental Results of BSP

Compressed image database with JPEG quality factor 75%:

| Image downscaling parameters | ANMRR | Elapsed time for feature extraction on BSP |
|---|---|---|
| Color-based scaled by 2; texture and shape-based not scaled | 0.20 | 6 hours |
| Color, texture and shape-based scaled by 2 | 0.23 | 1.5 hours |
| Color-based scaled by 4; texture and shape-based scaled by 2 | 0.25 | 1.2 hours |
| Color, texture and shape-based scaled by 4 | 0.30 | 25 min |

Compressed image database with JPEG quality factor 75%, and images downscaled for the feature extraction process:

| Dimension reduction of feature data parameters | ANMRR | Elapsed time for retrieval on BSP |
|---|---|---|
| None | 0.23 | 12 sec |
| Scaled by 2 | 0.19 | 7 sec |
| Scaled by 4 | 0.19 | 4 sec |
| Scaled by 8 | 0.23 | 2 sec |

Experimental Results of DSP

Compressed image database with JPEG quality factor 75%:

| Image downscaling parameters | ANMRR | Elapsed time for feature extraction on DSP |
|---|---|---|
| Color-based scaled by 2; texture and shape-based not scaled | 0.20 | 6 hours |
| Color, texture and shape-based scaled by 2 | 0.23 | 1.5 hours |
| Color-based scaled by 4; texture and shape-based scaled by 2 | 0.25 | 1.2 hours |
| Color, texture and shape-based scaled by 4 | 0.30 | 25 min |

Compressed image database with JPEG quality factor 75%, and images downscaled for the feature extraction process:

| Dimension reduction of feature data parameters | ANMRR | Elapsed time for retrieval on DSP |
|---|---|---|
| None | 0.23 | 100 sec |
| Scaled by 2 | 0.21 | 50 sec |
| Scaled by 4 | 0.21 | 25 sec |
| Scaled by 8 | 0.25 | 13 sec |

Experimental Results of LSP

Compressed image database with JPEG quality factor 50%:

| Image downscaling parameters | ANMRR | Elapsed time for feature extraction on LSP |
|---|---|---|
| Color-based scaled by 2; texture and shape-based not scaled | 0.22 | ~65 hours |
| Color, texture and shape-based scaled by 2 | 0.24 | 24 hours |
| Color-based scaled by 4; texture and shape-based scaled by 2 | 0.26 | 13 hours |
| Color, texture and shape-based scaled by 4 | 0.30 | 4 hours |

Compressed image database with JPEG quality factor 50%, and images downscaled for the feature extraction process:

| Dimension reduction of feature data parameters | ANMRR | Elapsed time for retrieval on LSP |
|---|---|---|
| None | 0.23 | 140 sec |
| Scaled by 2 | 0.21 | 65 sec |
| Scaled by 4 | 0.22 | 32 sec |
| Scaled by 8 | 0.24 | 17 sec |
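The retrieval results for the four profiles show the trade-off between accuracy (ANMRR) and elapsed time. One simple way to read such tables is to pick, for each profile, the fastest setting whose ANMRR stays within a tolerance of the best value observed on that profile. The sketch below applies this rule to the reported retrieval numbers. The selection rule and the tolerance of 0.02 are assumptions made for illustration; they are not the procedure used in the study, and the picks need not match the recommended parameters.

```python
from typing import Dict, List, Optional, Tuple

# (feature-data scaling factor or None, ANMRR, retrieval time in seconds),
# copied from the retrieval result tables for each profile.
Row = Tuple[Optional[int], float, int]

RETRIEVAL_RESULTS: Dict[str, List[Row]] = {
    "PSP": [(None, 0.20, 9), (2, 0.16, 5), (4, 0.19, 3), (8, 0.20, 1)],
    "BSP": [(None, 0.23, 12), (2, 0.19, 7), (4, 0.19, 4), (8, 0.23, 2)],
    "DSP": [(None, 0.23, 100), (2, 0.21, 50), (4, 0.21, 25), (8, 0.25, 13)],
    "LSP": [(None, 0.23, 140), (2, 0.21, 65), (4, 0.22, 32), (8, 0.24, 17)],
}


def fastest_within(rows: List[Row], tolerance: float = 0.02) -> Row:
    """Return the fastest row whose ANMRR is within `tolerance` of the best one.

    Hypothetical selection rule; the small epsilon guards against
    floating-point rounding when comparing two-decimal values.
    """
    best = min(anmrr for _, anmrr, _ in rows)
    acceptable = [row for row in rows if row[1] <= best + tolerance + 1e-9]
    return min(acceptable, key=lambda row: row[2])


picks = {profile: fastest_within(rows) for profile, rows in RETRIEVAL_RESULTS.items()}
```

A stricter or looser tolerance shifts the picks toward accuracy or speed respectively, which is the same kind of judgment the profile recommendations encode.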

Conclusions and Future Work

Novel study for defining CBIR system profiles and determining suitable parameters for each profile

Substantial savings in time and computational complexity while maintaining semantic retrieval performance

Scalable and adaptable CBIR systems

The study may be extended and supplemented by additional experiments, especially for future CBIR applications and user platforms, which are expected to change the proposed profiles and parameters due to advances in technology.

User satisfaction with the proposed system profiles and CBIR parameters, assessed using online surveys and further analysis