Large-Scale Content-Based Image Retrieval. Project Presentation, CMPT 880: Large Scale Multimedia Systems and Cloud Computing. Under supervision of Dr. Mohamed Hefeeda. By: Ahmed Abdelsadek ([email protected])


Page 1: Large-Scale Content-Based Image Retrieval

Large-Scale Content-Based Image Retrieval

Project Presentation

CMPT 880: Large Scale Multimedia Systems and Cloud Computing

Under supervision of Dr. Mohamed Hefeeda

By: Ahmed Abdelsadek ([email protected])

Page 2: Large-Scale Content-Based Image Retrieval

Outline

• Introduction
• Project Scope
• Work Flow
• Image Features
• Indexing and Retrieval
• Matching
• Evaluation
• Conclusion

Page 3: Large-Scale Content-Based Image Retrieval

Introduction

• Current image search engines rely heavily on text to retrieve images.
  ▫ The user provides keywords, and images that have a keyword in the filename or in nearby HTML are candidates for retrieval.
• In this project we explore content-based retrieval techniques, where the query is itself an image.

Page 4: Large-Scale Content-Based Image Retrieval

Project Scope

• Similarity using local features.
• Extract features from the reference images.
• Index these features in an efficient data structure in a scalable, large-scale environment.
• Process query images.
• Search and match.
• This project is NOT:
  ▫ Recognition, classification, or categorization

Page 5: Large-Scale Content-Based Image Retrieval

Work Flow

[Figure: work-flow diagram with a building path and a querying path.]

• Building (Reference Multimedia Object): Generate Feature Points, Build KD-Tree Index Bins, Save to Distributed Storage
• Querying (Query Multimedia Object): Generate Feature Points, Direct to KD-Tree Index Bin, Load from Distributed Storage
• Matching: Searching for Nearest Neighbors, Matching Objects, Sorting and Reporting Results
• Output: Results

Page 6: Large-Scale Content-Based Image Retrieval

Image Features

• Using SIFT (scale-invariant feature transform) features.
  ▫ A SIFT feature is a selected image region (also called a keypoint) with an associated descriptor.
  ▫ A SIFT descriptor is a histogram of the image gradients surrounding a keypoint.
  ▫ Using PCA for dimension reduction.
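To make the "histogram of image gradients" idea concrete, here is a minimal sketch in Java. It is a simplification and not the descriptor VLFeat computes: real SIFT descriptors use a 4x4 grid of cells with 8 orientation bins each (128 values) plus Gaussian weighting, whereas this sketch builds a single magnitude-weighted orientation histogram over one grayscale patch. The class and method names are hypothetical.

```java
/** Simplified gradient-orientation histogram for one grayscale patch (illustration only, not full SIFT). */
final class GradientHistogramSketch {

    /**
     * Builds a magnitude-weighted histogram of gradient orientations.
     * patch[y][x] holds gray levels; border pixels are skipped so that
     * central differences are always defined.
     */
    static double[] orientationHistogram(double[][] patch, int bins) {
        double[] hist = new double[bins];
        for (int y = 1; y < patch.length - 1; y++) {
            for (int x = 1; x < patch[y].length - 1; x++) {
                double dx = patch[y][x + 1] - patch[y][x - 1];
                double dy = patch[y + 1][x] - patch[y - 1][x];
                double magnitude = Math.hypot(dx, dy);
                if (magnitude == 0.0) {
                    continue; // flat region: no orientation to vote for
                }
                double angle = Math.atan2(dy, dx);      // in [-pi, pi]
                if (angle < 0) {
                    angle += 2 * Math.PI;               // map to [0, 2*pi]
                }
                // The modulo folds an angle of exactly 2*pi back into bin 0.
                int bin = (int) (angle / (2 * Math.PI) * bins) % bins;
                hist[bin] += magnitude;
            }
        }
        return normalize(hist);
    }

    /** Scales the vector to unit L2 length; an all-zero vector is returned unchanged. */
    static double[] normalize(double[] v) {
        double sumSquares = 0.0;
        for (double value : v) {
            sumSquares += value * value;
        }
        if (sumSquares == 0.0) {
            return v;
        }
        double norm = Math.sqrt(sumSquares);
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }
}
```

A patch whose brightness increases from left to right puts all of its weight in the first bin, since every gradient points along the positive x axis.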

Page 7: Large-Scale Content-Based Image Retrieval

KD-Tree

• Using KD-trees
  ▫ Each tree level represents one dimension of a feature.
  ▫ The index is searched for the K nearest neighbours.
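As an illustration of the indexing idea, the following is a minimal in-memory KD-tree sketch in Java with exact K-nearest-neighbour search. It assumes median splits that cycle through the dimensions level by level and squared Euclidean distance; the slides do not say which split rule or distance the project uses, and the distributed, bin-based index in the project is certainly more involved. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Minimal in-memory KD-tree with exact k-nearest-neighbour search (illustrative sketch). */
final class KdTreeSketch {

    private static final class Node {
        final double[] point;
        final int axis;
        Node left;
        Node right;

        Node(double[] point, int axis) {
            this.point = point;
            this.axis = axis;
        }
    }

    private final int dims;
    private final Node root;

    KdTreeSketch(double[][] points) {
        dims = points[0].length;
        // Sort a shallow copy so the caller's array order is left untouched.
        root = build(points.clone(), 0, points.length, 0);
    }

    /** Splits on the median of the current axis; the axis cycles with the tree depth. */
    private Node build(double[][] pts, int from, int to, int depth) {
        if (from >= to) {
            return null;
        }
        int axis = depth % dims;
        Arrays.sort(pts, from, to, (a, b) -> Double.compare(a[axis], b[axis]));
        int mid = (from + to) >>> 1;
        Node node = new Node(pts[mid], axis);
        node.left = build(pts, from, mid, depth + 1);
        node.right = build(pts, mid + 1, to, depth + 1);
        return node;
    }

    /** Returns up to k stored points, nearest first. */
    double[][] nearest(double[] query, int k) {
        if (k <= 0) {
            return new double[0][];
        }
        // Max-heap on distance: the worst of the current candidates sits on top.
        PriorityQueue<double[]> heap = new PriorityQueue<>(
                (a, b) -> Double.compare(sqDist(b, query), sqDist(a, query)));
        search(root, query, k, heap);
        double[][] result = new double[heap.size()][];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    private void search(Node node, double[] query, int k, PriorityQueue<double[]> heap) {
        if (node == null) {
            return;
        }
        heap.add(node.point);
        if (heap.size() > k) {
            heap.poll(); // drop the farthest candidate
        }
        double diff = query[node.axis] - node.point[node.axis];
        Node near = diff < 0 ? node.left : node.right;
        Node far = diff < 0 ? node.right : node.left;
        search(near, query, k, heap);
        // Visit the far side only if the splitting plane is closer than the worst candidate.
        if (heap.size() < k || diff * diff < sqDist(heap.peek(), query)) {
            search(far, query, k, heap);
        }
    }

    static double sqDist(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

A caller would construct the tree once from the reference descriptors, for example `new KdTreeSketch(points)`, and then call `nearest(query, k)` for each query descriptor.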

Page 8: Large-Scale Content-Based Image Retrieval

Logical View

[Figure: logical view. Reference Feature Points and Query Feature Points feed the Multimedia Objects Matcher, which yields Similar Features, then Similar Objects, then Results.]

Page 9: Large-Scale Content-Based Image Retrieval

Physical View

[Figure: physical view of the MapReduce pipeline. Labels: Directing, Building, Map Phase, Reduce Phase, Block 1 to Block n, Physical Files on HDFS, Computing Distances Tasks (B1 vs B1, B2 vs B2, B3 vs B3, ..., Bn vs Bn), Map Phase, Distributed Cache, Queries, References, KD-Tree.]
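One way the "Directing" stage could assign a feature to a block is to walk the top few levels of a KD-tree and use the leaf reached as the bin id. The sketch below is an assumption about that step, not the project's actual code: it stores the top levels as a complete binary tree in two arrays, and a mapper could then emit the returned bin id as the key so that each reducer builds the index for one bin. The class name, array layout, and `binFor` method are hypothetical.

```java
/** Routes a feature vector to a bin by descending the top levels of a KD-tree (illustrative sketch). */
final class BinRouterSketch {

    // Internal nodes of a complete binary tree in heap layout:
    // the children of node i are 2*i + 1 (left) and 2*i + 2 (right).
    private final int[] splitAxis;
    private final double[] splitValue;

    BinRouterSketch(int[] splitAxis, double[] splitValue) {
        if (splitAxis.length != splitValue.length) {
            throw new IllegalArgumentException("one split value is needed per split axis");
        }
        this.splitAxis = splitAxis.clone();
        this.splitValue = splitValue.clone();
    }

    /** Number of bins, which is one more than the number of internal nodes. */
    int binCount() {
        return splitAxis.length + 1;
    }

    /** Returns a bin id in the range [0, binCount()). */
    int binFor(double[] feature) {
        int node = 0;
        while (node < splitAxis.length) {
            boolean goLeft = feature[splitAxis[node]] < splitValue[node];
            node = goLeft ? 2 * node + 1 : 2 * node + 2;
        }
        // Leaves follow the internal nodes in heap order.
        return node - splitAxis.length;
    }
}
```

With three internal nodes the router yields four bins. Query features can be routed with the same splits, so each query is compared only against the references in its own bin, which matches the "B1 vs B1" pairing in the diagram.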

Page 10: Large-Scale Content-Based Image Retrieval

Matching

• For each query image we extract the features and then search the index for the K nearest neighbours of each feature.
• For each query feature, each of its neighbouring features votes for the image it belongs to, with a score based on its rank.
• The 10 images with the highest totals in the voting array are reported as the most similar images.
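The voting step can be sketched as follows. The slides do not spell out how a rank turns into a score, so this sketch assumes that among K neighbours the closest one casts K votes and the farthest casts 1; ties are broken by the smaller image id purely to keep the output deterministic. The class name and the input layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Rank-weighted voting over nearest-neighbour results (illustrative sketch). */
final class RankVotingSketch {

    /**
     * neighbourImageIds[q][r] is the id of the reference image that owns the
     * r-th nearest neighbour (r = 0 is the closest) of query feature q.
     * Returns up to topN image ids, highest total score first.
     */
    static List<Integer> topImages(int[][] neighbourImageIds, int topN) {
        Map<Integer, Integer> votes = new HashMap<>();
        for (int[] neighbours : neighbourImageIds) {
            int k = neighbours.length;
            for (int rank = 0; rank < k; rank++) {
                // Assumed scoring: the closest neighbour is worth k, the farthest 1.
                votes.merge(neighbours[rank], k - rank, Integer::sum);
            }
        }
        List<Integer> ranked = new ArrayList<>(votes.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(votes.get(b), votes.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(topN, ranked.size())));
    }
}
```

Calling `topImages(neighbours, 10)` corresponds to reporting the ten most similar images.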

Page 11: Large-Scale Content-Based Image Retrieval

Evaluation

• Core KNN
  ▫ Experiments on a local machine.
  ▫ Our results vs. brute force.
• Image retrieval
  ▫ Caltech and TRECVID datasets.
  ▫ On the Amazon AWS cloud.
  ▫ We use 8 machines, each dual core with 4 GB RAM.
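A comparison against brute force could be scored as in the sketch below. It assumes that precision means the fraction of returned neighbours that also appear in the exact brute-force result for the same query and K; the slides do not define the metric, so treat this as one plausible reading. The class and method names are hypothetical, and the neighbours are identified by their index in the reference array.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Scores an approximate k-NN result against an exact brute-force scan (illustrative sketch). */
final class KnnPrecisionSketch {

    /** Exact k nearest neighbours by scanning every reference point; returns indices, nearest first. */
    static int[] bruteForceKnn(double[][] refs, double[] query, int k) {
        Integer[] order = new Integer[refs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(sqDist(refs[a], query), sqDist(refs[b], query)));
        int[] result = new int[Math.max(0, Math.min(k, order.length))];
        for (int i = 0; i < result.length; i++) {
            result[i] = order[i];
        }
        return result;
    }

    /** Fraction of the approximate neighbours that are also exact neighbours (ids assumed distinct). */
    static double precision(int[] approx, int[] exact) {
        if (approx.length == 0) {
            return 0.0;
        }
        Set<Integer> truth = new HashSet<>();
        for (int id : exact) {
            truth.add(id);
        }
        int hits = 0;
        for (int id : approx) {
            if (truth.contains(id)) {
                hits++;
            }
        }
        return (double) hits / approx.length;
    }

    static double sqDist(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

Averaging this value over many queries gives a single number per configuration, which is the kind of quantity a "Precision of KNN" plot would show.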

Page 12: Large-Scale Content-Based Image Retrieval

Precision of KNN

Page 13: Large-Scale Content-Based Image Retrieval

Scanned Bins Size

Page 14: Large-Scale Content-Based Image Retrieval

Effect of Data Size

Page 15: Large-Scale Content-Based Image Retrieval

Image Recall @ K

Page 16: Large-Scale Content-Based Image Retrieval

First Correct @ K

Page 17: Large-Scale Content-Based Image Retrieval

Implementation Details

• The system is implemented in Java.
• We use Hadoop 1.0.3.
• We run cloud experiments on AWS services:
  ▫ S3
  ▫ EMR
• We use some open-source libraries:
  ▫ For image preprocessing we use FFmpeg.
  ▫ For extracting SIFT features we use VLFeat.

Page 18: Large-Scale Content-Based Image Retrieval

Conclusion

• We implement a full pipeline for the image retrieval problem.
  ▫ The framework can easily support different types of features and different indexing methods.
• We show how a large cloud system can be built from small components.

Page 19: Large-Scale Content-Based Image Retrieval

Conclusion

• Intersection with my research
• Contributions
  ▫ Feature selection and extraction
  ▫ Implement dimension reduction
  ▫ Design and implement Map/Reduce index
  ▫ Implement image matching and ranking

Page 20: Large-Scale Content-Based Image Retrieval

Questions?

Page 21: Large-Scale Content-Based Image Retrieval

Thank you!