Large-Scale Content-Based Image Retrieval
Project Presentation
CMPT 880: Large Scale Multimedia Systems and Cloud Computing
Under supervision of Dr. Mohamed Hefeeda
By: Ahmed Abdelsadek ([email protected])
Outline
• Introduction
• Project Scope
• Work Flow
• Image Features
• Indexing and Retrieval
• Matching
• Evaluation
• Conclusion
Introduction
• Current image search engines rely heavily on text to retrieve images.
▫ The user provides keywords, and images with those keywords in the filename or in nearby HTML are candidates for retrieval.
• In this project we explore content-based retrieval techniques, where the query is itself an image.
Project Scope
• Similarity using local features.
• Extract features from the reference images.
• Index these features in an efficient data structure in a scalable, large-scale environment.
• Process query images.
• Search and match.
• This project is NOT:
▫ Recognition, classification, or categorization
Work Flow
[Figure: work flow diagram with Building, Querying, and Matching stages. Steps shown: Reference Multimedia Object; Query Multimedia Object; Generate Feature Points (once for each path); Build KD-Tree Index Bins; Direct to KD-Tree Index Bin; Distributed Storage (Save / Load); Searching for Nearest Neighbors; Matching Objects; Sorting and Reporting Results; Results.]
Image Features
• Using SIFT (scale-invariant feature transform) features.
▫ A SIFT feature is a selected image region (also called a keypoint) with an associated descriptor.
▫ A SIFT descriptor is a histogram of the image gradients surrounding a keypoint.
▫ Using PCA for dimension reduction.
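To make the "histogram of image gradients" idea concrete, here is a minimal sketch in Python (chosen for brevity; the system itself is implemented in Java and extracts features with VLFeat). It is not real SIFT: it builds a single 8-bin, magnitude-weighted orientation histogram over one patch, whereas the standard SIFT descriptor concatenates 8-bin histograms from a 4x4 grid of cells into 128 dimensions. The function name `orientation_histogram` and the toy patch are assumptions made for illustration.

```python
import math

def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations over one patch.

    `patch` is a list of rows of grey-level values. Border pixels are skipped
    because central differences need a neighbour on each side.
    """
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0.0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)  # orientation in [0, 2*pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    # L2-normalise so the descriptor does not depend on overall contrast.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Toy patch (assumed data): brightness grows left to right,
# so every gradient points along +x and falls into bin 0.
patch = [[float(x) for x in range(5)] for _ in range(5)]
descriptor = orientation_histogram(patch)
```

All of the gradient energy in this toy patch lands in the first bin; a patch with a vertical brightness ramp would fill a different bin instead.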
KD-Tree
• Using KD-Trees
▫ Each tree level represents a dimension of a feature.
▫ Searching the index for the K-nearest neighbours.
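The two points above can be sketched as a small in-memory KD-tree, written in Python for brevity. This is a textbook construction (median split, with level `depth` splitting on dimension `depth % d`) and an exact K-nearest-neighbour search; it is an assumption that the project's index follows the same scheme, and the names `build_kdtree` and `knn` are hypothetical.

```python
import heapq

def build_kdtree(points, depth=0):
    """Recursively split on the median; level `depth` uses dimension depth % d."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn(root, query, k):
    """Return the k points nearest to `query`, closest first (exact search)."""
    heap = []  # max-heap through negated distances: (-distance, point)

    def visit(node):
        if node is None:
            return
        dist = squared_distance(node["point"], query)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, node["point"]))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, node["point"]))
        diff = query[node["axis"]] - node["point"][node["axis"]]
        if diff < 0:
            near, far = node["left"], node["right"]
        else:
            near, far = node["right"], node["left"]
        visit(near)
        # The far side can only help if fewer than k points were found so far,
        # or if the splitting plane is closer than the current k-th best.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [point for _, point in sorted(heap, key=lambda item: -item[0])]

# Toy 2-D points (assumed data); real descriptors have many more dimensions.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
nearest = knn(tree, (9, 2), k=2)
```

For high-dimensional descriptors an exact search like this prunes poorly, which is one common reason to reduce dimensionality first and to accept approximate answers.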
Logical View
[Figure: logical view. Labels: Reference Features Points; Query Features Points; Multimedia Objects Matcher; Similar Features; Similar Objects; Results.]
Physical View
[Figure: physical view. Labels: Directing; Building; Queries; References; KD-Tree; Distributed Cache; Map Phase; Reduce Phase; Block 1, Block 2, Block 3, ..., Block n; Physical Files on HDFS; Computing Distances Tasks (B1 vs B1, B2 vs B2, B3 vs B3, ..., Bn vs Bn).]
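One possible reading of this view, sketched as a single-process Python simulation rather than actual Hadoop code: a map phase directs every query and reference feature to a bin using the top levels of the tree, and a reduce phase compares queries and references that landed in the same bin (the "B1 vs B1" tasks). The split rules in `SPLITS`, the record layout, and all function names are assumptions for illustration; in the diagram the tree is shared through the distributed cache.

```python
from collections import defaultdict

# Hypothetical split rules for the top two tree levels: (dimension, threshold).
SPLITS = {"root": (0, 5.0), "L": (1, 3.0), "R": (1, 6.0)}

def direct_to_bin(feature):
    """Walk the top of the tree and return the bin (leaf) id for a feature."""
    dim, threshold = SPLITS["root"]
    side = "L" if feature[dim] < threshold else "R"
    dim, threshold = SPLITS[side]
    return side + ("L" if feature[dim] < threshold else "R")

def map_phase(records):
    """Emit (bin_id, record) for each (kind, image_id, feature) record."""
    for kind, image_id, feature in records:
        yield direct_to_bin(feature), (kind, image_id, feature)

def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(values, k=1):
    """Compare the queries in one bin against the references in that bin only."""
    references = [v for v in values if v[0] == "ref"]
    for kind, query_id, qf in values:
        if kind != "query":
            continue
        ranked = sorted(
            references,
            key=lambda r: sum((a - b) ** 2 for a, b in zip(r[2], qf)),
        )
        yield query_id, [ref_id for _, ref_id, _ in ranked[:k]]

# Toy records (assumed data): one feature per image.
records = [
    ("ref", "imgA", (1.0, 1.0)),
    ("ref", "imgB", (2.0, 2.5)),
    ("ref", "imgC", (8.0, 7.0)),
    ("query", "q1", (1.2, 1.1)),
    ("query", "q2", (7.5, 8.0)),
]
groups = shuffle(map_phase(records))
results = {
    query_id: neighbours
    for values in groups.values()
    for query_id, neighbours in reduce_phase(values)
}
```

Searching only the bin a query falls into makes the result approximate, since a true neighbour may sit just across a bin boundary.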
Matching
• For each query we extract the features and then search the index for the K-NN features.
• For each query feature, each of its neighbouring features votes for the image it belongs to, with a score given by its rank.
• The 10 images with the highest scores in the voting array are reported as the most similar images.
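The voting scheme above can be sketched in a few lines of Python. The slides do not give the exact scoring formula, so this sketch assumes that the closest of K neighbours contributes K points and the farthest contributes 1; the function name `rank_images` and the input layout are also assumptions.

```python
from collections import defaultdict

def rank_images(neighbours_per_query_feature, top_n=10):
    """Rank reference images by accumulated rank-based votes.

    `neighbours_per_query_feature` holds one list per query feature, containing
    the image ids of its K nearest reference features, closest first.
    """
    votes = defaultdict(int)
    for neighbours in neighbours_per_query_feature:
        k = len(neighbours)
        for rank, image_id in enumerate(neighbours):
            votes[image_id] += k - rank  # assumed scoring: closest gets k, last gets 1
    # Highest score first; ties broken by image id so the order is deterministic.
    ordered = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_n]

# Toy input (assumed data): three query features, K = 3 neighbours each.
neighbours = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "d"]]
ranking = rank_images(neighbours)
```

Here image "a" collects 3 + 3 + 2 = 8 points and is reported first, ahead of "b" with 6.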
Evaluation
• Core KNN
▫ Experiments on a local machine.
▫ Our results vs. brute force.
• Image retrieval
▫ Caltech and TRECVID datasets.
▫ On the Amazon AWS cloud.
▫ We use 8 machines: dual core, 4 GB RAM.
Precision of KNN
Scanned Bins Size
Effect of Data Size
Image Recall @ K
First Correct @ K
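The plots for these slides are not reproduced here, and the slides do not define the metrics. The Python sketch below shows one plausible set of definitions, all of them assumptions: KNN precision as the overlap between the index's answer and the brute-force answer, recall at K as the share of relevant images found in the top K, and "first correct" as the rank of the first relevant image.

```python
def knn_precision(approximate, exact):
    """Fraction of the exact (brute-force) neighbours that the index also returned."""
    if not exact:
        return 0.0
    return len(set(approximate) & set(exact)) / len(exact)

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant images that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def first_correct_rank(ranked, relevant):
    """1-based rank of the first relevant image, or None if none was retrieved."""
    relevant = set(relevant)
    for rank, image_id in enumerate(ranked, start=1):
        if image_id in relevant:
            return rank
    return None

# Toy values (assumed data).
precision = knn_precision([1, 2, 3, 9], [1, 2, 3, 4])       # 3 of 4 neighbours agree
recall = recall_at_k(["x", "a", "y", "b"], ["a", "b"], k=2)  # only "a" is in the top 2
first = first_correct_rank(["x", "a", "y", "b"], ["a", "b"])
```

Averaging these values over all queries, for several values of K, would give curves of the kind the slide titles suggest.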
Implementation Details
• The system is implemented in Java.
• We use Hadoop 1.0.3.
• We run cloud experiments on AWS services:
▫ S3
▫ EMR
• We use some open source libraries:
▫ For image preprocessing we use FFmpeg.
▫ For extracting SIFT features we use VLFeat.
Conclusion
• We implement a full pipeline for the image retrieval problem.
▫ The framework can easily support different types of features and different indexing methods.
• We show how we can build a big cloud system from small components.
Conclusion
• Intersection with my research
• Contributions
▫ Feature Selection and Extraction
▫ Implement Dimension Reduction
▫ Design and Implement Map/Reduce Index
▫ Implement Image Matching and Ranking
Questions ?
Thank you !