Automatic Image Annotation Using Group Sparsity


Shaoting Zhang1, Junzhou Huang1, Yuchi Huang1, Yang Yu1, Hongsheng Li2, Dimitris Metaxas1

1 CBIM, Rutgers University, NJ
2 IDEA Lab, Lehigh University, PA

Introduction

• Goal: automatically assign relevant text keywords to any given image, reflecting its content.

• Previous methods:
– Topic models [Barnard et al., J. Mach. Learn. Res.’03; Putthividhya et al., CVPR’10]
– Mixture models [Carneiro et al., TPAMI’07; Feng et al., CVPR’04]
– Discriminative models [Grangier et al., TPAMI’08; Hertz et al., CVPR’04]
– Nearest neighbor based methods [Makadia et al., ECCV’08; Guillaumin et al., ICCV’09]

Introduction

• Limitations:
– Features are often preselected, yet the properties of different features and feature combinations are not well investigated in the image annotation task.
– Feature selection is not well investigated in this application.

• Our method and contributions:
– Use feature selection to solve the annotation problem.
– Use a clustering prior and a sparsity prior to guide the selection.

Outline

• Regularization based Feature Selection
– Annotation framework
– L2 norm regularization
– L1 norm regularization
– Group sparsity based regularization

• Obtain Image Pairs

• Experiments

Regularization based Feature Selection

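• Given a list of similar/dissimilar image pairs (P1, P2):
– X: one row per pair, built from the features F_{P1} and F_{P2} of the two images
– Y: one label per pair, with entries 1 or -1
– w: the feature weights to be learned

$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \|Xw - Y\|_2^2$$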
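The pair-based least-squares setup above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: each row of X is taken to be the element-wise absolute difference of the two images' feature vectors, similar pairs are labeled 1 and dissimilar pairs -1, and the names `build_pair_system`, `fit_least_squares`, and the toy data are hypothetical, not taken from the slides.

```python
import numpy as np

def build_pair_system(features, pairs, labels):
    """Stack one row per image pair into X and the pair labels into Y.

    Assumption: a row of X is |F_P1 - F_P2| (element-wise absolute difference).
    """
    p1 = np.array([a for a, _ in pairs])
    p2 = np.array([b for _, b in pairs])
    X = np.abs(features[p1] - features[p2])
    Y = np.asarray(labels, dtype=float)  # assumed: 1 = similar, -1 = dissimilar
    return X, Y

def fit_least_squares(X, Y):
    """Solve argmin_w ||Xw - Y||_2^2 with NumPy's least-squares routine."""
    w, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return w

# Hypothetical toy data: 6 images with 4 features each, 5 labeled pairs.
rng = np.random.default_rng(0)
features = rng.random((6, 4))
pairs = [(0, 1), (2, 3), (4, 5), (0, 5), (1, 4)]
labels = [1, -1, 1, 1, -1]

X, Y = build_pair_system(features, pairs, labels)
w = fit_least_squares(X, Y)
```

Without a regularizer, nothing in this fit pushes weights toward zero, which is what motivates the penalized variants on the following slides.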


• Annotation framework

[Figure: annotation framework. Labels: testing input, training data, weights, similarity, high similarity.]


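$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda\|w\|_2^2$$

• L2 regularization
• Robust, solvable in closed form: $\hat{w} = (X^T X + \lambda I)^{-1} X^T Y$ (with the factor $1/n$ absorbed into $\lambda$)
• No sparsity

[Figure: histogram of weights. Axes: w, %.]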
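As a minimal sketch, the closed-form L2 solution can be computed by solving a linear system rather than forming the inverse explicitly. The function name `ridge_closed_form` and the toy data are hypothetical; `lam` here multiplies the identity directly, so to match an objective with a 1/n factor on the data term one would pass n times the regularization weight.

```python
import numpy as np

def ridge_closed_form(X, Y, lam):
    """Return (X^T X + lam * I)^{-1} X^T Y via a linear solve."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

# Hypothetical toy data: 30 pairs, 5 features, labels in {1, -1}.
rng = np.random.default_rng(0)
X = rng.random((30, 5))
Y = np.where(rng.random(30) < 0.5, 1.0, -1.0)

w = ridge_closed_form(X, Y, lam=0.5)
```

The solution is generally dense: every feature receives a nonzero weight, consistent with the "no sparsity" remark.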


$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda\|w\|_1$$

• L1 regularization
• Convex optimization
• Basis pursuit, Grafting, Shooting, etc.
• Sparsity prior

[Figure: histogram of weights. Axes: w, %.]
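The L1-regularized problem has no closed form, so it is solved iteratively. The sketch below uses proximal gradient descent (ISTA) with soft-thresholding, which is a different solver from the basis pursuit, grafting, and shooting methods named on the slide; it is shown only because it is short. The function names and toy data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink every entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    """Minimize (1/n) * ||Xw - Y||_2^2 + lam * ||w||_1 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L = (2/n) * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = n / (2.0 * np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = (2.0 / n) * X.T @ (X @ w - Y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

def lasso_objective(X, Y, w, lam):
    """Value of the L1-regularized objective at w."""
    return np.sum((X @ w - Y) ** 2) / X.shape[0] + lam * np.sum(np.abs(w))

# Hypothetical toy data: only the first two of eight features drive the labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
Y = np.sign(X[:, 0] - 0.5 * X[:, 1])

w = lasso_ista(X, Y, lam=0.1)
```

Larger values of `lam` drive more entries of `w` to exactly zero, which is the sparsity prior the slide refers to.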


$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \sum_{j=1}^{m} \|w_{G_j}\|_2$$

• Group sparsity[1]

• L2 inside the same group, L1 for different groups

• Benefits: removal of whole feature groups

• Projected-gradient[2]

[1] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68:49–67, 2006.
[2] E. Berg, M. Schmidt, M. Friedlander, and K. Murphy. Group sparsity via linear-time projection. Technical report TR-2008-09, 2008. http://www.cs.ubc.ca/~murphyk/Software/L1CRF/index.html

[Figure: weights of feature groups (e.g., RGB, HSV); a group's weights are either all = 0 or ≠ 0.]
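To make the "L2 inside a group, L1 across groups" behavior concrete, here is a minimal sketch that minimizes the group-sparse objective with proximal gradient descent and block soft-thresholding. Note that this is not the projected-gradient solver of [2] used in the slides; it is a simpler stand-in that targets the same penalty. The function names, the non-overlapping index groups, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def group_soft_threshold(v, groups, t):
    """Shrink each group's L2 norm by t; groups with norm <= t become exactly zero."""
    out = np.zeros_like(v, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(v[idx])
        if norm > t:
            out[idx] = (1.0 - t / norm) * v[idx]
    return out

def group_lasso(X, Y, groups, lam, n_iter=500):
    """Minimize (1/n) * ||Xw - Y||_2^2 + lam * sum_j ||w_Gj||_2 by proximal gradient.

    Assumption: `groups` is a list of non-overlapping index arrays covering all columns.
    """
    n, p = X.shape
    step = n / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1/L for the smooth term
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = (2.0 / n) * X.T @ (X @ w - Y)
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

# Hypothetical toy data: two feature groups of three columns each;
# the labels depend only on the second group.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
Y = np.sign(X[:, 3:] @ np.array([1.0, -2.0, 1.5]))
groups = [np.arange(0, 3), np.arange(3, 6)]

w = group_lasso(X, Y, groups, lam=0.1)
```

Because the threshold acts on a group's norm rather than on individual entries, a feature group is either kept as a whole or removed as a whole, which is the benefit listed on the slide.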

Outline

• Regularization based Feature Selection

• Obtain Image Pairs
– Only rely on keyword similarity
– Also rely on feedback information

• Experiments

Obtain Image Pairs

• The previous method [1] relies solely on keyword similarity, which introduces a lot of noise.

[1] A. Makadia, V. Pavlovic, and S. Kumar. A new baseline for image annotation. In ECCV, pages 316–329, 2008.

[Figure: distance histogram of similar pairs; distance histogram of all pairs.]

Obtain Image Pairs

• Inspired by relevance feedback and the expectation-maximization method.

$$\hat{w} = \arg\min_{w \in \mathbb{R}^p} \frac{1}{n}\|Xw - Y\|_2^2 + \lambda \sum_{j=1}^{m} \|w_{G_j}\|_2$$

• k1 nearest: candidates of similar pairs
• k2 farthest: candidates of dissimilar pairs
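One way to read the k1-nearest / k2-farthest step is sketched below. It assumes that the learned score x·w (with x the absolute feature difference of a pair) acts as a similarity, where higher means closer because similar pairs are labeled 1, and that candidates are drawn per image from all other images. The slides do not spell out these details, so the function `select_pairs` and its behavior are hypothetical.

```python
import numpy as np

def select_pairs(features, w, k1, k2):
    """For each image, propose its k1 most similar and k2 least similar images.

    Assumption: similarity of images i and j is |f_i - f_j| @ w, higher = more similar.
    """
    n = features.shape[0]
    similar, dissimilar = [], []
    for i in range(n):
        scores = np.abs(features[i] - features) @ w
        others = np.array([j for j in range(n) if j != i])
        # Sort the other images from most similar to least similar.
        order = others[np.argsort(-scores[others], kind="stable")]
        similar += [(i, int(j)) for j in order[:k1]]
        dissimilar += [(i, int(j)) for j in order[len(order) - k2:]]
    return similar, dissimilar

# Hypothetical toy example: one feature, weight -1 so that a small distance gives a high score.
features = np.array([[0.0], [0.1], [1.0]])
w = np.array([-1.0])
similar, dissimilar = select_pairs(features, w, k1=1, k2=1)
```

In an EM-like loop, one would alternate between solving for w on the current pairs and re-selecting pairs with the updated w; the stopping rule is not given in the slides.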

Outline

• Regularization based Feature Selection

• Obtain Image Pairs

• Experiments
– Experimental settings
– Evaluation of regularization methods
– Evaluation of generality
– Some annotation results

Experimental Settings

• Data protocols
– Corel5K (5k images)
– IAPR TC12 [1] (20k images)

• Evaluation
– Average precision
– Average recall
– Number of keywords recalled (N+)

[1] M. Grubinger, P. D. Clough, H. Müller, and T. Deselaers. The IAPR TC-12 benchmark: a new evaluation resource for visual information systems. 2006.
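The three evaluation measures can be sketched as follows, assuming the common per-keyword convention in the annotation literature: precision and recall are computed for each keyword and then averaged over keywords, and N+ counts keywords that are correctly predicted for at least one image. The exact protocol is not stated on the slide, so this convention, the function name, and the toy matrices are assumptions.

```python
import numpy as np

def annotation_metrics(true, pred):
    """Return (mean per-keyword precision, mean per-keyword recall, N+).

    `true` and `pred` are binary (images x keywords) matrices.
    Keywords never predicted get precision 0; keywords never present get recall 0.
    """
    true = np.asarray(true, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    correct = (true & pred).sum(axis=0).astype(float)
    n_pred = pred.sum(axis=0).astype(float)
    n_true = true.sum(axis=0).astype(float)
    precision = np.divide(correct, n_pred, out=np.zeros_like(correct), where=n_pred > 0)
    recall = np.divide(correct, n_true, out=np.zeros_like(correct), where=n_true > 0)
    n_plus = int((correct > 0).sum())
    return float(precision.mean()), float(recall.mean()), n_plus

# Hypothetical toy example: 3 images, 3 keywords.
true = [[1, 0, 1],
        [0, 1, 0],
        [1, 1, 0]]
pred = [[1, 0, 0],
        [0, 1, 1],
        [0, 1, 0]]
avg_precision, avg_recall, n_plus = annotation_metrics(true, pred)
```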

Experimental Settings

• Features
– RGB, HSV, LAB
– Opponent, rg histogram
– Transformed color distribution
– Color from Saliency [1]
– Haar, Gabor [2]
– SIFT [3], HOG [4]

[1] X. Hou and L. Zhang. Saliency detection: A spectral residual approach. In CVPR, 2007.
[2] A. Makadia, V. Pavlovic, and S. Kumar. A new baseline for image annotation. In ECCV, pages 316–329, 2008.
[3] K. van de Sande, T. Gevers, and C. Snoek. Evaluating color descriptors for object and scene recognition. PAMI, 99(1), 2010.
[4] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886–893, 2005.

Evaluation of Regularization Methods

[Figure: precision, recall, and N+ on Corel5K and IAPR TC12.]

Evaluation of Generality

• Weights computed from Corel5K, then applied to IAPR TC12.

[Figure: precision, recall, and N+, each plotted against λ.]

Some Annotation Results

Conclusions and Future Work

• Conclusions
– Proposed a feature selection framework that uses both sparsity and clustering priors to annotate images.
– The sparse solution improves scalability.
– Image pairs obtained with relevance feedback perform much better than pairs from keyword similarity alone.

• Future work
– Different grouping methods.
– Automatically find groups (dynamic group sparsity).
– More priors (combine with other methods).
– Extend this framework to object recognition.

Thanks for listening
