Supervised Hashing with Kernels Wei Liu (Columbia), Jun Wang (IBM),
Rongrong Ji (Columbia), Yu‐Gang Jiang (Fudan), and Shih‐Fu Chang (Columbia)
June 2012
Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions
Fast Nearest Neighbor Search
• Exhaustive search (O(n) time) is inefficient.
Tree‐Based Indexing
• O(log n) search time.
• Impractical for high dimensionality.
[Figure: a KD‐tree partitioning of the data space]
Locality‐Sensitive Hashing
• Sublinear search time for ε‐approximate NN.
• Long hash bits (>=1k) and multiple hash tables.
[Figure: a random hash function maps each feature vector to one bit; concatenated bits form a binary code, e.g. the query maps to 101]
[Gionis, Indyk, and Motwani 1999] [Datar et al. 2004]
Hashing with Compact Codes
• O(1) search time with short hash bits.
Related Works
• Three main categories:
  – Unsupervised Hashing: LSH, PCAH, ITQ, KLSH, SH, AGH
  – Semi‐Supervised Hashing: SSH, WeaklySH
  – Supervised Hashing: RBM, BRE, MLH, LDAH, Our Approach
Supervision
• Semantic Supervision
• Metric Supervision
[Figure: example pairs labeled similar / dissimilar under each type of supervision]
Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions
Principle: Preserve Supervised Information
• The hashing quality can be boosted by leveraging supervised information: similar and dissimilar pairs.
[Figure: similar pairs map to the same bit and dissimilar pairs to different bits (0 vs. 1): an implicit classification]
Encode Supervised Information
• Encode as a pairwise label matrix S:
  S_ij = 1 for similar pairs, S_ij = -1 for dissimilar pairs, S_ij = 0 for uncertain pairs.
• The labeled data X_l = {x_1, ..., x_l} of l samples.
• Objective: learn r hash functions for r hash bits given X_l and S.
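For illustration only (not from the slides), a minimal Python/numpy sketch of building such a pairwise label matrix from class labels, treating same‐class pairs as similar and cross‐class pairs as dissimilar; metric supervision would additionally leave unknown pairs at 0:

    import numpy as np

    def pairwise_label_matrix(labels):
        """S[i,j] = 1 for similar pairs, -1 for dissimilar pairs.

        Entries for pairs whose relation is unknown would be set to 0.
        """
        labels = np.asarray(labels)
        same = labels[:, None] == labels[None, :]
        return np.where(same, 1.0, -1.0)

    S = pairwise_label_matrix([0, 0, 1])  # x1, x2 similar; x3 dissimilar
    # S = [[ 1.  1. -1.]
    #      [ 1.  1. -1.]
    #      [-1. -1.  1.]]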
Previous Formulations
• SSH/OKH [Wang, Kumar, He, Liu, Chang 2010]
• BRE [Kulis&Darrell 2009]: penalizes the Hamming distance between H(xi) and H(xj) against the target distance.
• MLH [Norouzi&Fleet 2011]: hinge loss on Hamming distances.
• Goal: make Hamming distances between codes respect the given similar/dissimilar pairs.
Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions
Proposed Idea: Code Inner Products
• Optimizing Hamming distances can yield compact yet discriminative hash codes, but is hard to implement.
• We propose to optimize code inner products instead:
  code inner product ≡ Hamming distance (for r‐bit codes with entries ±1, code_r(x_i) · code_r(x_j) = r - 2 D_h(x_i, x_j)).
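A quick numerical check of this equivalence (a sketch, not the authors' code): for ±1 codes the inner product is an affine function of the Hamming distance, so fitting one fits the other:

    import numpy as np

    r = 3
    c1 = np.array([1, -1, 1])   # r-bit code of x1
    c3 = np.array([-1, 1, -1])  # r-bit code of x3

    hamming = int(np.sum(c1 != c3))  # 3 differing bits (max distance)
    inner = int(c1 @ c3)             # -3 (min inner product)
    assert inner == r - 2 * hamming  # inner product = r - 2 * Hamming distance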
Hamming Distances
[Figure: labeled data x1, x2, x3, with x1 and x2 similar and x3 dissimilar to both. Supervised hashing should give x1 and x2 identical codes, e.g. (1, -1, 1) and (1, -1, 1), at Hamming distance 0, and give x3 an opposite code, e.g. (-1, 1, -1), at max distance from them.]
• Optimization on Hamming distances.
Code Inner Products
[Figure: for the same labeled data (x1 and x2 similar, x3 dissimilar), stacking the codes gives the code matrix
H = [ 1 -1  1
      1 -1  1
     -1  1 -1 ]  (rows: x1, x2, x3),
and (1/r) H Hᵀ = [ 1  1 -1
                   1  1 -1
                  -1 -1  1 ]
exactly matches the pairwise label matrix S.]
• Optimization on code inner products: fitting (1/r) H Hᵀ to S.
Code Learning
• Leads to a clean matrix‐form code learning framework (H ∈ {-1, 1}^(l×r), one row per sample, one column per single hash bit):
  min_H || (1/r) H Hᵀ - S ||_F^2
  i.e., reduce the gap between code similarity and semantic similarity.
• Easy to extend to a kernelized formulation.
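A minimal sketch of this objective on the three‐sample example from the previous slide (numpy; H and S as in the figure, not the released implementation):

    import numpy as np

    H = np.array([[ 1, -1,  1],   # code of x1
                  [ 1, -1,  1],   # code of x2 (similar to x1)
                  [-1,  1, -1]])  # code of x3 (dissimilar)
    S = np.array([[ 1,  1, -1],
                  [ 1,  1, -1],
                  [-1, -1,  1]])
    r = H.shape[1]

    # Frobenius-norm objective: || (1/r) H H^T - S ||_F^2
    objective = np.linalg.norm(H @ H.T / r - S, 'fro') ** 2
    print(objective)  # 0.0: these codes reproduce S exactly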
Kernel‐Based Hash Functions
• Following KLSH, construct a hash function using a kernel function κ(·,·) and m anchor samples:
  h(x) = sgn(aᵀ k̄(x)), where a is the model parameter and k̄(x) is the kernel vector k(x) against the m anchors, with zero‐mean normalization applied to k(x).
• In matrix form over the l labeled samples: H = sgn(K̄ A), with K̄ the l×m centered kernel matrix (l samples × m anchors) and A the m×r model parameters.
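A sketch of this construction under assumptions: an RBF kernel (the kernel is left generic here) and the centering stated above; `rbf_kernel`, `hash_codes`, and `width` are illustrative names, and A would come from the learning procedure on the next slides:

    import numpy as np

    def rbf_kernel(X, anchors, width=1.0):
        """k(x): kernel values between each sample and the m anchors."""
        d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * width ** 2))

    def hash_codes(X, anchors, A, k_mean):
        """H = sgn(k_bar(X) A): codes in {-1, +1}, one column per bit.

        k_bar(x) = k(x) - k_mean is the zero-mean normalized kernel vector,
        with k_mean the mean of k(.) over the training samples.
        """
        K_bar = rbf_kernel(X, anchors) - k_mean   # l x m, centered
        return np.where(K_bar @ A >= 0, 1, -1)    # A: m x r model parameters

    # Usage sketch: k_mean = rbf_kernel(X_train, anchors).mean(axis=0)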
Sequential Optimization
• Rewrite the objective function over the r‐bit code matrix as a sum of single‐bit terms, using the cumulative residue R_(k-1) = r·S - Σ_(t<k) h_t h_tᵀ (matrix: r bits; vector: kth bit).
• A sequential idea: optimize only one vector a_k at a time, provided with the previously optimized k-1 vectors a_1, ..., a_(k-1): one hash bit at a time.
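The residue bookkeeping of this greedy loop, as a sketch (numpy); `solve_one_bit` is a hypothetical stand‐in for either sgn() strategy on the next slide:

    import numpy as np

    def sequential_learn(K_bar, S, r, solve_one_bit):
        """Greedily learn r parameter vectors a_1..a_r, one hash bit at a time.

        K_bar: l x m centered kernel matrix; S: l x l pairwise label matrix.
        solve_one_bit(K_bar, R) returns the next m-vector a_k.
        """
        A, R = [], r * S.astype(float)               # cumulative residue R_0 = r*S
        for k in range(r):
            a_k = solve_one_bit(K_bar, R)            # fit the kth bit to residue R
            h_k = np.where(K_bar @ a_k >= 0, 1, -1)  # kth bit of all l samples
            R = R - np.outer(h_k, h_k)               # subtract what bit k explains
            A.append(a_k)
        return np.column_stack(A)                    # m x r model parameters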
Deal with sgn()
• We propose two methods to handle sgn():
  – Spectral Relaxation, solved by a generalized SVD
  – Sigmoid Smoothing, solved by gradient descent
• Sigmoid smoothing replaces sgn(x) with φ(x) = 2/(1 + e^(-x)) - 1, a smooth approximation to sgn(x) (φ(x) ≈ sgn(x) for |x| > 6).
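For binary h_k, minimizing || h_k h_kᵀ - R_(k-1) ||_F^2 is equivalent to maximizing h_kᵀ R_(k-1) h_k, since || h_k h_kᵀ ||_F^2 is constant. The sketch below (assuming the standard scaled‐logistic φ, consistent with the |x| > 6 note above) smooths that single‐bit objective so gradient methods apply:

    import numpy as np

    def phi(x):
        """Smooth surrogate for sgn: phi(x) = 2/(1+exp(-x)) - 1, in (-1, 1).
        For |x| > 6, phi(x) is within about 0.005 of sgn(x)."""
        return 2.0 / (1.0 + np.exp(-x)) - 1.0

    def smoothed_bit_objective(a_k, K_bar, R):
        """Smoothed h^T R h with h = phi(K_bar a_k); differentiable in a_k,
        so it can be optimized by gradient ascent (or descent on its negation)."""
        h = phi(K_bar @ a_k)
        return h @ R @ h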
Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions
CIFAR‐10
• 60K object images from 10 classes, 1K query images.
• Hamming radius 2 precision in terms of semantic labels.
• 1K labeled examples are used for (semi‐)supervised hashing.
• KSH0 uses spectral relaxation; KSH uses sigmoid smoothing.
CIFAR‐10
Method | Train Time (48 bits, sec) | Test Time (48 bits, sec)
SSH    | 2.1                       | 0.9×10−5
LDAH   | 0.7                       | 0.9×10−5
BRE    | 494.7                     | 2.9×10−5
MLH    | 3666.3                    | 1.8×10−5
KSH0   | 7.0                       | 3.3×10−5
KSH    | 156.1                     | 4.3×10−5
Significant training speedup for KSH0/KSH over BRE and MLH.
Tiny‐1M
• 1M tiny images from the web, 2K query images.
• Pseudo labels: top 5% L2 nearest neighbors as ground truths.
• Hamming radius 2 precision in terms of L2 neighbors.
• 5K pseudo‐labeled examples are used for (semi‐)supervised hashing.
Tiny‐1M: Hamming Ranking
• KSH achieves the highest precision and recall.
Tiny‐1M: Visual Search Results
[Figure: retrieval examples; KSH returns the most visually relevant results]
Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions
Conclusions
• A novel inner‐product‐based formulation to preserve supervised information in hashing.
• A sequential code learning procedure: one bit at a time.
• A new smoothing method for binary code optimization.
• Significant performance gains over the state of the art.
• Code to be released soon.