Supervised Hashing with Kernels - Columbia University · 2012. 8. 10.

Supervised Hashing with Kernels

Wei Liu (Columbia), Jun Wang (IBM), Rongrong Ji (Columbia), Yu‐Gang Jiang (Fudan), and Shih‐Fu Chang (Columbia)

June 2012

Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions

CVPR 2012

Fast Nearest Neighbor Search
• Exhaustive search (O(n) time) is inefficient.


Tree‐Based Indexing
• O(log n) search time.
• Impractical for high dimensionality.

[Figure: KD‐tree]

Locality‐Sensitive Hashing

• Sublinear search time for approximate NN.
• Long hash bits (>= 1K) and multiple hash tables.

[Figure: random hash functions map a feature vector to binary hash bits; the example query has code 101]

[Gionis, Indyk, and Motwani 1999][Datar et al. 2004]

Hashing with Compact Codes


• O(1) search time with short bits (

Related Works
• Three main categories:
• Unsupervised Hashing: LSH, PCAH, ITQ, KLSH, SH, AGH
• Semi‐Supervised Hashing: SSH, WeaklySH
• Supervised Hashing: RBM, BRE, MLH, LDAH, Our Approach

Supervision

[Figure: two forms of supervision, semantic supervision and metric supervision, each illustrated with similar and dissimilar pairs]

Outline
• Motivations
• Problem
• Our Approach
• Experiments
• Conclusions


Principle: Preserve Supervised Information

• The hashing quality could be boosted by leveraging supervised information: similar and dissimilar pairs.

[Figure: similar and dissimilar pairs assigned hash bits 0 and 1, an implicit classification]

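Encode Supervised Information
• Encode as a pairwise label matrix $S \in \mathbb{R}^{l \times l}$:

$$S_{ij} = \begin{cases} 1, & (x_i, x_j) \text{ is a similar pair} \\ -1, & (x_i, x_j) \text{ is a dissimilar pair} \\ 0, & \text{otherwise (uncertain)} \end{cases}$$

• The labeled data $\mathcal{X}_l = \{x_1, \dots, x_l\}$ of $l$ samples.
• Objective: learn r hash functions for r hash bits given $\mathcal{X}_l$ and S.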
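As a minimal sketch of the encoding step, the snippet below builds a pairwise label matrix from class labels. It assumes semantic supervision (same class means similar), the values 1, -1, and 0 for similar, dissimilar, and uncertain pairs, and a sample always being similar to itself; the function name `pairwise_label_matrix` and the use of `None` for an unknown label are illustrative choices, not part of the talk.

```python
def pairwise_label_matrix(labels):
    """Build S from class labels: 1 = similar, -1 = dissimilar, 0 = uncertain.

    `labels` holds one class id per sample, or None when the label is unknown.
    """
    l = len(labels)
    S = [[0] * l for _ in range(l)]
    for i in range(l):
        for j in range(l):
            if i == j:
                S[i][j] = 1          # assumed: a sample is similar to itself
            elif labels[i] is None or labels[j] is None:
                continue             # uncertain pair stays 0
            else:
                S[i][j] = 1 if labels[i] == labels[j] else -1
    return S

# Three samples: the first two share a class, the third does not.
S = pairwise_label_matrix(["cat", "cat", "dog"])
for row in S:
    print(row)
```

With metric supervision, the same matrix could instead be filled from distance thresholds rather than class labels.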

Previous Formulations
• SSH/OKH [Wang, Kumar, He, Liu, Chang 2010]
• BRE [Kulis & Darrell 2009]: based on the Hamming distance between H(xi) and H(xj)
• MLH [Norouzi & Fleet 2011]: hinge loss

Goal



Proposed Idea: Code Inner Products

• Optimizing Hamming distances can yield compact yet discriminative hash codes, but is hard to implement.

• We propose to optimize code inner products.


code inner product ≡ Hamming distance
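The equivalence can be checked numerically. The sketch below assumes codes with entries in {-1, 1}, for which the inner product of two r-bit codes equals r minus twice their Hamming distance (each matching bit contributes +1 and each differing bit contributes -1). The helper names are illustrative.

```python
from itertools import product

def hamming_distance(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def inner_product(a, b):
    """Inner product of two codes with entries in {-1, +1}."""
    return sum(x * y for x, y in zip(a, b))

r = 4
codes = list(product([-1, 1], repeat=r))
# Check <a, b> = r - 2 * d_H(a, b) for every pair of r-bit codes.
identity_holds = all(
    inner_product(a, b) == r - 2 * hamming_distance(a, b)
    for a in codes for b in codes
)
print(identity_holds)
```

Because the mapping is affine and decreasing, a large inner product corresponds to a small Hamming distance, so fitting inner products constrains Hamming distances as well.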

Hamming Distances

[Figure: optimization on Hamming distances. Supervised hashing maps the labeled data x1, x2 (similar) and x3 to hash codes: x1 and x2 both receive (1, -1, 1), at Hamming distance 0, while x3 receives (-1, 1, -1), at the maximum distance.]

Code Inner Products

[Figure: optimization on code inner products. Supervised hashing maps the labeled data x1, x2 (similar) and x3 to a code matrix with rows (1, -1, 1), (1, -1, 1), (-1, 1, -1). The product of the code matrix and its transpose is fitted to r times the pairwise label matrix S, whose rows are (1, 1, -1), (1, 1, -1), (-1, -1, 1).]
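The toy example on this slide can be verified directly. The sketch below takes the three 3-bit codes and the 3×3 pairwise label matrix from the slide and checks that the code matrix times its transpose equals r times S; the variable names and the use of the squared Frobenius norm as the fitting error are assumptions made for illustration.

```python
import numpy as np

# Codes of x1, x2, x3 from the toy example (one row per sample, r = 3 bits).
H = np.array([[ 1, -1,  1],
              [ 1, -1,  1],
              [-1,  1, -1]])
# Pairwise labels: x1 and x2 are similar, x3 is dissimilar to both.
S = np.array([[ 1,  1, -1],
              [ 1,  1, -1],
              [-1, -1,  1]])

r = H.shape[1]
gram = H @ H.T                                  # all pairwise code inner products
fit_error = np.linalg.norm(gram - r * S, ord="fro") ** 2
print(gram)
print(fit_error)
```

Here the fit is exact, so the error is zero; with real data the codes can only approximate r times S.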

Code Learning
• Leads to a clean matrix‐form code learning framework, where $H_l \in \{1, -1\}^{l \times r}$ is the code matrix (each row is a sample, each column a single hash bit):

$$\min_{H_l \in \{1,-1\}^{l \times r}} \; \left\| H_l H_l^\top - rS \right\|_F^2$$

The objective reduces the gap between code similarity and semantic similarity.

• Easy to extend to a kernelized formulation.

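Kernel‐Based Hash Functions
• Following KLSH, construct a hash function using a kernel function $\kappa$ and m anchor samples $x_{(1)}, \dots, x_{(m)}$:

$$h(x) = \operatorname{sgn}\big(a^\top \bar{k}(x)\big)$$

where $k(x) = [\kappa(x_{(1)}, x), \dots, \kappa(x_{(m)}, x)]^\top$, $\bar{k}(x)$ is $k(x)$ after zero‐mean normalization, and $a \in \mathbb{R}^m$ is the model parameter.

[Figure: code matrix = sgn(kernel matrix × model parameter), where the kernel matrix has l rows (samples) and m columns (anchors)]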
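A kernel-based hash function of this form can be sketched as follows. Several details are assumptions for illustration only: the Gaussian RBF kernel and its bandwidth `sigma`, anchors drawn at random from the data, random placeholder parameters `A` (one column per bit, where a real system would learn them), and mapping sgn(0) to +1.

```python
import numpy as np

def rbf_kernel(X, anchors, sigma=1.0):
    """Gaussian RBF kernel values between rows of X and the anchor samples."""
    sq_dists = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def kernel_hash(X, anchors, A, mean, sigma=1.0):
    """Return codes in {-1, 1}: sign of the centered kernel features times A."""
    K_bar = rbf_kernel(X, anchors, sigma) - mean    # zero-mean normalization
    return np.where(K_bar @ A >= 0, 1, -1)          # sgn(0) mapped to +1

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                       # 100 samples, 8-dim features
anchors = X[rng.choice(100, size=10, replace=False)]  # m = 10 anchors
mean = rbf_kernel(X, anchors).mean(axis=0)          # per-anchor mean over the data
A = rng.normal(size=(10, 4))                        # placeholder parameters, r = 4 bits
codes = kernel_hash(X, anchors, A, mean)
print(codes.shape)
```

Hashing a new sample costs m kernel evaluations plus an m×r product, independent of the database size.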

Sequential Optimization

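• Rewrite the objective function as

$$\min_{A} \; \Big\| \sum_{k=1}^{r} \operatorname{sgn}(\bar{K}_l a_k)\, \operatorname{sgn}(\bar{K}_l a_k)^\top - rS \Big\|_F^2$$

where $A = [a_1, \dots, a_r]$, $\bar{K}_l$ is the zero‐mean‐normalized kernel matrix between the l labeled samples and the m anchors, the sum is a matrix covering all r bits, and each vector $\operatorname{sgn}(\bar{K}_l a_k)$ holds the kth bit of the labeled samples.

• A sequential idea: optimize one hash bit at a time. At step k, optimize only the vector $a_k$, given the previously optimized k‐1 vectors $a_1^*, \dots, a_{k-1}^*$:

$$\min_{a_k} \; \Big\| \operatorname{sgn}(\bar{K}_l a_k)\, \operatorname{sgn}(\bar{K}_l a_k)^\top - R_{k-1} \Big\|_F^2, \qquad R_{k-1} = rS - \sum_{t=1}^{k-1} \operatorname{sgn}(\bar{K}_l a_t^*)\, \operatorname{sgn}(\bar{K}_l a_t^*)^\top$$

where $R_{k-1}$ is the cumulative residue.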
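The one-bit-at-a-time idea with a cumulative residue can be illustrated on code vectors directly. This is a simplified sketch, not the kernelized procedure of the talk: it assumes each bit's code vector is chosen as the sign of the leading eigenvector of the current residue (a relaxation of maximizing h^T R h over h in {-1, 1}^l), and it skips the kernel-based hash functions entirely. The function name `sequential_codes` is hypothetical.

```python
import numpy as np

def sequential_codes(S, r):
    """Greedy fit of H H^T to r * S, one bit (column of H) at a time."""
    l = S.shape[0]
    H = np.zeros((l, r), dtype=int)
    R = r * S.astype(float)                  # residue before the first bit
    for k in range(r):
        # Relaxed maximizer of h^T R h: leading eigenvector of the residue
        # (eigh returns eigenvalues in ascending order).
        _, eigvecs = np.linalg.eigh(R)
        h = np.where(eigvecs[:, -1] >= 0, 1, -1)
        H[:, k] = h
        R -= np.outer(h, h)                  # cumulative residue for the next bit
    return H, R

S = np.array([[ 1,  1, -1],
              [ 1,  1, -1],
              [-1, -1,  1]])
H, R = sequential_codes(S, r=3)
print(H @ H.T)
print(np.abs(R).max())
```

Minimizing the squared Frobenius norm of h h^T - R is the same as maximizing h^T R h, because h^T h is fixed at l for any code vector; this is why each step only needs the residue.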

Deal with sgn()
• We propose two methods to handle sgn():
• Spectral Relaxation, solved by generalized SVD.
• Sigmoid Smoothing, solved by gradient descent, where $\varphi(x) = 2/(1 + e^{-x}) - 1$ is a smooth approximation to sgn(x) (accurate for |x| > 6).
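The smoothing idea can be sketched for a single hash bit. The code assumes the sigmoid-shaped function phi(x) = 2 / (1 + exp(-x)) - 1 in place of sgn, and a per-bit objective of the form -phi(K a)^T R phi(K a) with a symmetric residue matrix R; the random `K` and `R` below are stand-ins for the kernel matrix and residue, and all function names are illustrative. The analytic gradient is compared against central finite differences.

```python
import numpy as np

def phi(x):
    """Sigmoid-shaped surrogate for sgn(x): 2 / (1 + exp(-x)) - 1."""
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def smoothed_objective(a, K, R):
    """g(a) = -phi(K a)^T R phi(K a) for one hash bit (R symmetric)."""
    b = phi(K @ a)
    return -b @ R @ b

def smoothed_gradient(a, K, R):
    """Gradient of smoothed_objective, using phi'(x) = (1 - phi(x)^2) / 2."""
    b = phi(K @ a)
    return -K.T @ ((R @ b) * (1.0 - b * b))

# phi is already close to sgn once |x| exceeds 6.
gap_at_6 = 1.0 - phi(6.0)

# Check the analytic gradient against central finite differences.
rng = np.random.default_rng(0)
K = rng.normal(size=(20, 5))                 # stand-in for the l x m kernel matrix
labels = rng.integers(0, 2, size=20)
R = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)  # symmetric stand-in residue
a = 0.1 * rng.normal(size=5)
eps = 1e-6
numeric = np.array([
    (smoothed_objective(a + eps * e, K, R) - smoothed_objective(a - eps * e, K, R)) / (2 * eps)
    for e in np.eye(5)
])
analytic = smoothed_gradient(a, K, R)
print(gap_at_6, np.max(np.abs(numeric - analytic)))
```

Since the smoothed objective is differentiable, any gradient-based optimizer can be applied to it, which is not possible with sgn() in place.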



CIFAR‐10


• 60K object images from 10 classes, 1K query images.

• Hamming radius 2 precision in terms of semantic labels.

• 1K labeled examples are used for (semi‐)supervised hashing.

• KSH0: spectral relaxation; KSH: sigmoid smoothing.
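The Hamming radius 2 precision used in this evaluation can be computed along the following lines. This is a sketch under stated assumptions: codes have entries in {-1, 1}, a retrieved item counts as correct when its semantic label matches the query's, and a query that retrieves nothing within the radius scores zero precision. The function name and the tiny data set are hypothetical.

```python
import numpy as np

def hamming_radius_precision(query_codes, query_labels, db_codes, db_labels, radius=2):
    """Mean precision of items retrieved within `radius` bits of each query."""
    r = db_codes.shape[1]
    # Hamming distance from inner products: d = (r - <q, x>) / 2.
    dists = (r - query_codes @ db_codes.T) // 2
    precisions = []
    for d_row, label in zip(dists, query_labels):
        hits = d_row <= radius
        if hits.any():
            precisions.append(np.mean(db_labels[hits] == label))
        else:
            precisions.append(0.0)           # assumed: empty result counts as 0
    return float(np.mean(precisions))

db_codes = np.array([[ 1,  1,  1,  1],
                     [ 1,  1,  1, -1],
                     [ 1,  1, -1, -1],
                     [-1, -1, -1, -1]])
db_labels = np.array([0, 0, 1, 1])
query_codes = np.array([[ 1,  1,  1,  1],
                        [-1, -1, -1,  1]])
query_labels = np.array([0, 1])
precision = hamming_radius_precision(query_codes, query_labels, db_codes, db_labels)
print(precision)
```

In this example the first query retrieves three items, two of them correct, and the second retrieves one correct item, so the mean precision is 5/6.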

CIFAR‐10

Training and test time at 48 bits:

Method   Train Time   Test Time
SSH           2.1     0.9×10^-5
LDAH          0.7     0.9×10^-5
BRE         494.7     2.9×10^-5
MLH        3666.3     1.8×10^-5
KSH0          7.0     3.3×10^-5
KSH         156.1     4.3×10^-5

Significant speedup: KSH0 and KSH train much faster than BRE and MLH.

Tiny‐1M


• 1M tiny images from the web, 2K query images.

• Pseudo labels: top 5% L2 nearest neighbors as ground truth.

• Hamming radius 2 precision in terms of L2 neighbors.

• 5K pseudo‐labeled examples are used for (semi‐)supervised hashing.
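The pseudo-labeling rule above can be sketched as follows, assuming that "top 5%" means the 5% of database points closest to each query in L2 distance. The function name `pseudo_ground_truth`, the brute-force distance computation, and the random data are illustrative assumptions; at the scale of 1M images a real pipeline would need a more memory-efficient approach.

```python
import numpy as np

def pseudo_ground_truth(queries, database, fraction=0.05):
    """Boolean matrix: entry (i, j) is True when database[j] is among the
    `fraction` nearest L2 neighbors of queries[i]."""
    sq = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(axis=2)
    k = max(1, int(round(fraction * database.shape[0])))
    nearest = np.argsort(sq, axis=1)[:, :k]          # indices of the k closest points
    truth = np.zeros(sq.shape, dtype=bool)
    np.put_along_axis(truth, nearest, True, axis=1)
    return truth

rng = np.random.default_rng(0)
database = rng.normal(size=(40, 3))
queries = database[:5]                               # reuse a few points as queries
truth = pseudo_ground_truth(queries, database)
print(truth.sum(axis=1))
```

The same boolean matrix can serve both as evaluation ground truth and, restricted to the labeled subset, as a source of similar pairs for training.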

Tiny‐1M: Hamming Ranking


KSH achieves highest precision and recall.


Tiny‐1M: Visual Search Results

[Figure: visual search results, annotated "most visually relevant"]



Conclusions
• A novel formulation based on code inner products to preserve supervised information in hashing.
• A sequential code learning procedure: one bit at a time.
• A new smoothing method for binary code optimization.
• Significant performance gains over the state of the art.
• Code to be released soon.
