ATTRIBUTE-AUGMENTED SEMANTIC HIERARCHY
Towards Bridging Semantic Gap and Intention Gap in Image Retrieval
Hanwang Zhang1, Zheng-Jun Zha2, Yang Yang1, Shuicheng Yan1, Yue Gao1, Tat-Seng Chua1
1: National University of Singapore
2: Institute of Intelligent Machines, Chinese Academy of Sciences
Bridging Semantic Gap
Semantic Concept
Semantic Hierarchy
Low-level Visual Feature
High-level Semantic
semantic
ontological
Semantic Gap Bridged? No!
Bridging Intention Gap
User Feedback
Low-level Visual Feature
User Intention
Intention Gap Bridged? No!
Attributes
☞ Component: snout, ear, etc.
☞ Appearance: furry, brown, etc.
☞ Discriminability: cat or dog? etc.
☞ Hierarchical Semantics
☞ Hierarchical Semantic Similarity
[Figure: example semantic hierarchy. Root: Animal, Vehicle. Animal: Cat, Dog. Dog: Corgi, Pug.]
Solution: Attribute-augmented Semantic Hierarchy (A2SH)
☞ Semantic hierarchy
☞ Pool of attributes
☞ Concept classifiers
☞ Attribute classifiers
General framework for Content-based Image Retrieval
[Figure: A2SH overview. Attribute pool: leg, furry, brown, wheel, shiny, glass, wet, metal, head. Example semantic path: Root, Animal, Dog, Pug.]
A Prototype of A2SH
Data: ILSVRC 2012 (ImageNet)
Concepts: 1,322 (958 leaves)
Depth: 3 to 11
Images: 1.23 million (50% training, 50% testing)
☞ 95,800 images are manually labeled with 33 attributes
☞ 2 to 26 attributes are automatically discovered for each concept node
☞ 15 to 58 attributes per concept
[Figure: example attribute labels: Tail, Leg]
Why A2SH?
Smaller Variance
Descriptive, Transferable
☞ Attributes bridge the semantic gap
[Figure: concept versus attribute, with example attributes glass, wing, wheel]
☞ A2SH bridges the intention gap
User intention is expressed as attributes through attribute feedback and image feedback
[Figure: example feedback attributes: Leg, Skin, Tail]
Feedback is automatically digested into multiple levels of the hierarchy
[Figure panels: Attribute Feedback, Image Feedback]
Concept Classifiers
Concept classifier: predicts whether an image belongs to concept c
Scheme: hierarchical one-vs-all
☞ Exploits the hierarchical relation
☞ Alleviates error propagation
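The slides do not spell out how training samples are chosen for each concept classifier. The sketch below illustrates one common reading of hierarchical one-vs-all, assumed here: images under concept c are positives and images under the siblings of c are negatives. The toy hierarchy reuses the example concepts from the slides; the helper names and image ids are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent), using the slides' example concepts.
PARENT = {"Animal": "Root", "Vehicle": "Root", "Cat": "Animal", "Dog": "Animal",
          "Corgi": "Dog", "Pug": "Dog"}

def children_of(parent_map):
    """Invert a child -> parent map into parent -> [children]."""
    kids = defaultdict(list)
    for child, parent in parent_map.items():
        kids[parent].append(child)
    return kids

def descendants(node, kids):
    """Return the node itself plus every concept below it."""
    out = [node]
    for k in kids.get(node, []):
        out.extend(descendants(k, kids))
    return out

def one_vs_all_sets(concept, parent_map, leaf_images):
    """Assumed policy: positives lie under `concept`, negatives under its siblings."""
    kids = children_of(parent_map)
    pos = [img for n in descendants(concept, kids) for img in leaf_images.get(n, [])]
    siblings = [s for s in kids[parent_map[concept]] if s != concept]
    neg = [img for s in siblings for n in descendants(s, kids)
           for img in leaf_images.get(n, [])]
    return pos, neg

# Hypothetical image ids attached to leaf concepts.
leaf_images = {"Cat": ["cat1"], "Corgi": ["corgi1"], "Pug": ["pug1"], "Vehicle": ["car1"]}
pos, neg = one_vs_all_sets("Dog", PARENT, leaf_images)
```

Under this assumed policy the Dog classifier never sees Vehicle images, because only siblings under the same parent serve as negatives.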
Attribute Classifiers
Attribute classifier: predicts the presence of an attribute a of concept c
☞ Nameable attributes: human nameable, hierarchical supervised learning
☞ Unnameable attributes: human unnameable, hierarchical unsupervised learning
☞ They together offer a comprehensive description of the multiple facets of a concept
[Figure: example nameable attributes: Ear, Snout, Eye, Furry]
Unnameable Attribute Classifiers
☞ Nameable attributes alone are not discriminative enough.
☞ Discover new attributes for concepts that share many nameable attributes.
☞ 2 to 26 unnameable attributes are discovered for each concept.
D. Parikh, K. Grauman. “Interactively Building a Discriminative Vocabulary of Nameable Attributes”, CVPR 2011.
What do we have now?
☞ Concept classifiers → Semantic path prediction
☞ Attribute classifiers → Image representation along the semantic path
Hierarchical Semantic Representation
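As a rough illustration of semantic path prediction, the sketch below walks the hierarchy top-down and follows the child whose concept classifier scores highest. The greedy descent, the stopping threshold, and the score values are all assumptions made for illustration; the slides do not state the actual inference rule.

```python
# Hypothetical hierarchy (parent -> children), using the slides' example concepts.
CHILDREN = {"Root": ["Animal", "Vehicle"], "Animal": ["Cat", "Dog"], "Dog": ["Corgi", "Pug"]}

def predict_semantic_path(scores, children, root="Root", threshold=0.5):
    """Greedy top-down descent: follow the best-scoring child at each level.

    `scores` maps concept name -> classifier confidence in [0, 1] (assumed given).
    Descent stops early when no child reaches `threshold` (an assumed rule).
    """
    path = [root]
    node = root
    while children.get(node):
        best = max(children[node], key=lambda c: scores.get(c, 0.0))
        if scores.get(best, 0.0) < threshold:
            break
        path.append(best)
        node = best
    return path

# Made-up classifier outputs for one image.
scores = {"Animal": 0.9, "Vehicle": 0.1, "Cat": 0.2, "Dog": 0.8, "Corgi": 0.3, "Pug": 0.7}
path = predict_semantic_path(scores, CHILDREN)
```

The attribute classifiers of each concept on the returned path would then describe the image at that level.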
Hierarchical Semantic Similarity
Images are represented by attributes in the context of concepts
Hierarchical semantic similarity
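The similarity formula itself did not survive in this transcript. The sketch below shows one plausible form, assumed purely for illustration: walk the concepts shared by two semantic paths and add up a Gaussian-style similarity between the per-concept attribute vectors. The function name, the kernel, and the sample vectors are hypothetical.

```python
import math

def hierarchical_similarity(path_a, attrs_a, path_b, attrs_b):
    """Sum attribute similarity over the shared prefix of two semantic paths.

    attrs_x maps concept -> attribute score vector for that image (assumed layout).
    The exp(-squared distance) kernel is an assumption, not the authors' formula.
    """
    total = 0.0
    for ca, cb in zip(path_a, path_b):
        if ca != cb:
            break  # the paths diverge; deeper concepts are not comparable
        sq_dist = sum((x - y) ** 2 for x, y in zip(attrs_a[ca], attrs_b[cb]))
        total += math.exp(-sq_dist)
    return total

# Made-up example: a pug query against a corgi image and a cat image.
query_path = ["Root", "Animal", "Dog", "Pug"]
query_attrs = {"Root": [1.0], "Animal": [0.5, 0.5], "Dog": [1.0, 0.0], "Pug": [0.2]}
corgi_path = ["Root", "Animal", "Dog", "Corgi"]
corgi_attrs = {"Root": [1.0], "Animal": [0.5, 0.5], "Dog": [1.0, 0.0], "Corgi": [0.9]}
cat_path = ["Root", "Animal", "Cat"]
cat_attrs = {"Root": [1.0], "Animal": [0.5, 0.5], "Cat": [0.1]}

sim_corgi = hierarchical_similarity(query_path, query_attrs, corgi_path, corgi_attrs)
sim_cat = hierarchical_similarity(query_path, query_attrs, cat_path, cat_attrs)
```

With this form, images that share a longer path with the query can accumulate more similarity, which matches the idea of representing images by attributes in the context of concepts.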
What do we have now?
☞ Concept classifiers → Semantic path prediction
☞ Attribute classifiers → Image representation along the semantic path
☞ Hierarchical semantic similarity function → Semantic similarity between images
Hierarchical Semantic Representation
Automatic Retrieval
Hierarchical semantic similarity
Candidate images are retrieved by semantic indexing
[Figure: semantic index over concept c and child(c); I_c denotes the candidate images]
Low complexity! Efficient!
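A minimal sketch of retrieval by semantic indexing, under stated assumptions: each gallery image is filed under the deepest concept of its predicted semantic path, and the candidate set for a query concept c is the union of the images filed under c and under child(c). The index layout and all names are hypothetical; the slides only indicate that candidates come from c and child(c).

```python
from collections import defaultdict

# Hypothetical hierarchy (parent -> children), using the slides' example concepts.
CHILDREN = {"Root": ["Animal", "Vehicle"], "Animal": ["Cat", "Dog"], "Dog": ["Corgi", "Pug"]}

def build_index(gallery_paths):
    """Inverted index: deepest predicted concept -> set of gallery image ids."""
    index = defaultdict(set)
    for image_id, path in gallery_paths.items():
        index[path[-1]].add(image_id)
    return index

def candidates(index, children, concept):
    """Candidate images for `concept`: those filed under it or under its children."""
    result = set(index.get(concept, set()))
    for child in children.get(concept, []):
        result |= index.get(child, set())
    return result

# Made-up gallery with predicted semantic paths.
gallery = {
    "g1": ["Root", "Animal", "Dog"],
    "g2": ["Root", "Animal", "Dog", "Pug"],
    "g3": ["Root", "Animal", "Cat"],
    "g4": ["Root", "Vehicle"],
}
index = build_index(gallery)
dog_candidates = candidates(index, CHILDREN, "Dog")
```

Only the candidate set needs to be scored with the similarity function, which is where the low complexity comes from: the rest of the gallery is never touched.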
Evaluation
☞ A2SH: our method
☞ hBilinear: retrieves images by a bilinear semantic metric (Deng et al., CVPR 2011)
☞ hPath: length (confidence) of the common semantic path of an image and the query
☞ hVisual: hPath + visual similarity
☞ fSemantic: flat semantic feature similarity
☞ fVisual: visual feature similarity
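To make the hPath baseline concrete, the sketch below ranks gallery images by the length of the semantic path they share with the query. Only the length variant is shown; how classifier confidence enters the score is not specified in the slides and is left out. The function names and sample paths are hypothetical.

```python
def common_path_length(path_a, path_b):
    """Number of leading concepts shared by two semantic paths."""
    n = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        n += 1
    return n

def rank_by_common_path(query_path, gallery_paths):
    """Sort gallery ids by descending shared path length (ties keep input order)."""
    return sorted(gallery_paths,
                  key=lambda g: -common_path_length(query_path, gallery_paths[g]))

# Made-up query and gallery paths.
query = ["Root", "Animal", "Dog", "Pug"]
gallery = {
    "car": ["Root", "Vehicle"],
    "cat": ["Root", "Animal", "Cat"],
    "pug": ["Root", "Animal", "Dog", "Pug"],
    "corgi": ["Root", "Animal", "Dog", "Corgi"],
}
ranking = rank_by_common_path(query, gallery)
```

Because many images share exactly the same path, this score alone produces large ties, which is presumably why hVisual adds visual similarity on top of it.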
Training: 50%, Gallery: 50% (95,800 queries)
Evaluation: Automatic Retrieval
Method      fVisual       fSemantic     hVisual       hBilinear     A2SH
Time (ms)   1.18 × 10^4   3.62 × 10^3   7.42 × 10^2   4.47 × 10^2   70.6
Effective!
Efficient!
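The reported timings can be turned into speedup factors with simple arithmetic. The snippet below only divides each baseline's time by the A2SH time from the timing table; the variable names are illustrative.

```python
# Retrieval times in milliseconds, copied from the timing table.
times_ms = {
    "fVisual": 1.18e4,
    "fSemantic": 3.62e3,
    "hVisual": 7.42e2,
    "hBilinear": 4.47e2,
    "A2SH": 70.6,
}

# Speedup of A2SH over each baseline: baseline time divided by A2SH time.
speedup = {m: t / times_ms["A2SH"] for m, t in times_ms.items() if m != "A2SH"}

for method, factor in speedup.items():
    print(f"A2SH is about {factor:.1f}x faster than {method}")
```

By these numbers A2SH is roughly 167 times faster than fVisual and roughly 6 times faster than hBilinear, the fastest baseline.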
Interactive Retrieval
☞ Attribute-level Feedback
[Figure: query image with attribute feedback on Leg and Cloth]
Zhang et al. “Attribute Feedback”, MM 2012
Evaluation: Interactive Retrieval
Method   A2SH   HF     QPM    SVM
MAP@20   0.25   0.22   0.21   0.21
(2-min fixed time)
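For readers unfamiliar with the metric, the sketch below computes MAP@20 from per-query relevance lists. Conventions for AP@k differ; this version normalizes by the number of relevant items found in the top k, which is an assumption and may not match the exact protocol used in the evaluation.

```python
def average_precision_at_k(relevant_flags, k=20):
    """AP@k for one ranked list of booleans (True means the item is relevant).

    Assumed convention: normalize by the number of relevant items in the top k.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevant_flags[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def map_at_k(all_flags, k=20):
    """Mean of AP@k over all queries."""
    return sum(average_precision_at_k(f, k) for f in all_flags) / len(all_flags)

# Made-up relevance judgments for two queries.
query_results = [[True, False, True], [False, False]]
score = map_at_k(query_results, k=20)
```

In the first made-up query the relevant items sit at ranks 1 and 3, so its AP is (1/1 + 2/3) / 2; the second query has no relevant items and contributes 0.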
Summary
A2SH
Attribute-augmented Semantic Hierarchy
SH with Attributes
Gaps bridging
Framework for CBIR
1.23 M Images
Effectiveness Verified
Data Set
☞ Only leaves have images; each concept’s images are merged bottom-up
☞ 50% / 50% split into training and testing (gallery)
☞ 100 random images per leaf from the testing set are used as queries
☞ 100 random images from each leaf’s training images are annotated with attributes
☞ Features: color, texture, edge, and multi-scale dense SIFT; LLC with max-pooling; 2-level spatial pyramid; 35,903-d feature vector
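The data preparation described above can be sketched as follows: split each leaf's images in half, then give every internal concept the union of the images of its descendants. The hierarchy, image ids, and the deterministic split are hypothetical; a real split would presumably be randomized.

```python
# Hypothetical hierarchy (parent -> children) and leaf image ids.
CHILDREN = {"Root": ["Animal", "Vehicle"], "Animal": ["Cat", "Dog"], "Dog": ["Corgi", "Pug"]}
LEAF_IMAGES = {
    "Cat": ["cat1", "cat2"],
    "Corgi": ["corgi1", "corgi2"],
    "Pug": ["pug1", "pug2"],
    "Vehicle": ["car1", "car2"],
}

def split_half(images):
    """Deterministic 50/50 split for illustration (assumed; no shuffling here)."""
    mid = len(images) // 2
    return images[:mid], images[mid:]

def merge_bottom_up(node, children, leaf_images):
    """Images of a concept: its own if it is a leaf, else the union of its children's."""
    kids = children.get(node, [])
    if not kids:
        return list(leaf_images.get(node, []))
    merged = []
    for k in kids:
        merged.extend(merge_bottom_up(k, children, leaf_images))
    return merged

# Split every leaf, then merge the training halves up the hierarchy.
train_leaves = {leaf: split_half(imgs)[0] for leaf, imgs in LEAF_IMAGES.items()}
test_leaves = {leaf: split_half(imgs)[1] for leaf, imgs in LEAF_IMAGES.items()}
dog_train = merge_bottom_up("Dog", CHILDREN, train_leaves)
animal_train = merge_bottom_up("Animal", CHILDREN, train_leaves)
```

Splitting at the leaves before merging keeps the training and gallery sets disjoint at every level of the hierarchy.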