Lei Wang
VILA group
School of Computing and Information Technology
University of Wollongong, Australia
11 December 2019
Learning Kernel-matrix-based Representation for Fine-grained Image Recognition
Introduction
• Fine-grained image recognition
Image courtesy of “Wei et al., Deep learning for fine-grained image analysis: A survey”
Introduction
• Feature: how to represent an image?
– Scale, rotation, illumination, occlusion, deformation, …
– Differences with respect to other classes
– Ultimate goal: “Invariant and discriminative”
Cat:
1. Before year 2000
• Hand-crafted, global features
– Color, texture, shape, structure, etc.
– An image becomes a feature vector
• Shallow classifiers
– K-nearest neighbor, SVMs, Boosting, …
2. Days of the Bag-of-Features model (2003–2012)
• Local invariant features, via interest point detection or dense sampling
• Invariant to view angle, rotation, scale, illumination, clutter, ...
• An image becomes “a set of feature vectors”
3. Era of Deep Learning (since 2012)
Deep local descriptors
[Figure: a CNN maps an image (“Cat”) to a height × width × depth feature map; each spatial location gives a deep local descriptor]
Again, an image becomes “a set of feature vectors”
Image(s): a set of points/vectors
Applications: image set classification, action recognition, neuroimaging analysis, object recognition
How to pool a set of points/vectors to obtain a global visual representation?
Pooling operation
• A set of local descriptors {x1, x2, …, xn}: how to pool?
• Max pooling, average (sum) pooling, etc.
• Covariance pooling (second-order pooling)
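As an illustrative sketch (my own NumPy code, not from the talk), first-order pooling collapses the descriptor set into a d-vector, while covariance pooling keeps a d × d second-order summary:

```python
import numpy as np

def covariance_pool(X):
    """Covariance (second-order) pooling: turn n local descriptors
    (rows of X, shape (n, d)) into one d x d covariance matrix."""
    Xc = X - X.mean(axis=0, keepdims=True)   # centre the descriptors
    return (Xc.T @ Xc) / (X.shape[0] - 1)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))   # n = 100 descriptors of dimension d = 8

avg = X.mean(axis=0)            # average pooling    -> d-vector
mx = X.max(axis=0)              # max pooling        -> d-vector
C = covariance_pool(X)          # covariance pooling -> d x d matrix

assert C.shape == (8, 8)
assert np.allclose(C, C.T)      # symmetric by construction
```

Unlike the first-order summaries, C also records how feature components co-vary across the descriptor set, which is what the rest of the talk builds on.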
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end Learning KSPD in deep neural networks
• Conclusion
Introduction on Covariance representation
Use a covariance matrix as a feature representation
(Image from http://www.statsref.com/HTML/index.html?multivariate_distributions.html)
A covariance matrix belongs to the set of Symmetric Positive Definite (SPD) matrices; it resides on a manifold instead of the whole Euclidean space.
How to measure the similarity of two SPD matrices?
Introduction on SPD matrix
Similarity measures for SPD matrices:
• Geodesic distance
• Euclidean mapping
• Kernel method
Geodesic distance
• Förstner W, Moonen B. A metric for covariance matrices. Geodesy: The Challenge of the 3rd Millennium, 2003
• Fletcher P T. Principal geodesic analysis on symmetric spaces: Statistics of diffusion tensors. Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis, 2004
• Pennec X, Fillard P, Ayache N. A Riemannian framework for tensor computing. IJCV, 2006
• Lenglet C. Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing. Journal of Mathematical Imaging and Vision, 2006
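For instance, the affine-invariant geodesic distance used in the Pennec et al. line of work, d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F, can be sketched via eigendecomposition (illustrative NumPy code, not from any of the cited papers):

```python
import numpy as np

def spd_logm(S):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(w)) @ V.T

def spd_inv_sqrt(S):
    """Inverse matrix square root of an SPD matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def airm_distance(A, B):
    """Affine-invariant Riemannian metric:
    d(A, B) = || log(A^{-1/2} B A^{-1/2}) ||_F."""
    iA = spd_inv_sqrt(A)
    return np.linalg.norm(spd_logm(iA @ B @ iA), 'fro')

# Sanity checks on random SPD matrices
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)); A = M @ M.T + 4 * np.eye(4)
M = rng.normal(size=(4, 4)); B = M @ M.T + 4 * np.eye(4)
assert np.isclose(airm_distance(A, A), 0.0, atol=1e-8)
assert np.isclose(airm_distance(A, B), airm_distance(B, A))  # symmetry
```

The whitening by A^{-1/2} is what makes the distance invariant to affine transformations of the underlying features.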
Euclidean mapping
• Veeraraghavan A, et al. Matching shape sequences in video with applications in human movement analysis. IEEE TPAMI, 2005
• Arsigny V, et al. Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine, 2006
• Tuzel O, et al. Pedestrian detection via classification on Riemannian manifolds. IEEE TPAMI, 2008
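The Log-Euclidean approach of Arsigny et al. flattens the manifold with a single matrix logarithm, after which ordinary Euclidean geometry applies; a minimal sketch (my own illustrative code):

```python
import numpy as np

def spd_logm(S):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_distance(A, B):
    """Log-Euclidean metric: map each SPD matrix to the tangent
    space by the matrix log, then use the ordinary Frobenius
    (Euclidean) distance there."""
    return np.linalg.norm(spd_logm(A) - spd_logm(B), 'fro')

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)); A = M @ M.T + 3 * np.eye(3)
M = rng.normal(size=(3, 3)); B = M @ M.T + 3 * np.eye(3)
assert np.isclose(log_euclidean_distance(A, A), 0.0, atol=1e-8)
assert log_euclidean_distance(A, B) > 0
```

Because each matrix is mapped independently, the (often expensive) log needs to be computed only once per matrix, which is the main practical appeal over the geodesic distance.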
Kernel methods
• Sra S. Positive definite matrices and the S-divergence. arXiv:1110.1773, 2011
• Harandi M, et al. Sparse coding and dictionary learning for SPD matrices: A kernel approach. ECCV, 2012
• Wang R, et al. Covariance discriminative learning: A natural and efficient approach to image set classification. CVPR, 2012
• Vemulapalli R, Pillai J K, Chellappa R. Kernel learning for extrinsic classification of manifold features. CVPR, 2013
• Jayasumana S, et al. Kernel methods on the Riemannian manifold of symmetric positive definite matrices. CVPR, 2013
• Quang M H, et al. Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces. NIPS, 2014
Integration with deep learning
• Lin et al. Bilinear CNN Models for Fine-grained Visual Recognition. ICCV, 2015
• Ionescu et al. Matrix Backpropagation for Deep Networks with Structured Layers. ICCV, 2015
• Gao et al. Compact Bilinear Pooling. CVPR, 2016
• Li et al. Is Second-order Information Helpful for Large-scale Visual Recognition? ICCV, 2017
• Huang et al. A Riemannian Network for SPD Matrix Learning. AAAI, 2017
• Lin and Maji. Improved Bilinear Pooling with CNN. BMVC, 2017
• Wang et al. G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition. CVPR, 2017
• Cui et al. Kernel Pooling for Convolutional Neural Networks. CVPR, 2017
• Koniusz et al. A Deeper Look at Power Normalizations. CVPR, 2018
Integration with deep learning (continued)
• Yu et al. Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition. ECCV, 2018
• Li et al. Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization. CVPR, 2018
• Wei et al. Kernelized Subspace Pooling for Deep Local Descriptors. CVPR, 2018
• Yu and Salzmann. Statistically-motivated Second-order Pooling. ECCV, 2018
• Lin et al. Second-order Democratic Aggregation. ECCV, 2018
• Gao et al. Global Second-order Pooling Convolutional Networks. CVPR, 2019
• Wang et al. Deep Global Generalized Gaussian Networks. CVPR, 2019
• Brooks et al. Riemannian Batch Normalization for SPD Neural Networks. NeurIPS, 2019
• Zheng et al. Learning Deep Bilinear Transformation for Fine-grained Image Representation. NeurIPS, 2019
• Fang et al. Bilinear Attention Networks for Person Retrieval. ICCV, 2019
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end Learning KSPD in deep neural networks
• Conclusion
Motivation
Covariance Matrix
Covariance matrix needs to be estimated from data
Motivation
• The covariance estimate becomes singular when the feature dimension d is high and the sample size n is small (n < d)
• A covariance matrix only characterises the linear correlation between feature components
Introduction
Applications with high dimensions but small samples
High dimensions: 50 ~ 400; small samples: 10 ~ 300
Introduction
Look into the covariance representation: entry (i, j) pairs the i-th feature with the j-th feature, i.e. just a linear kernel function (an inner product of centered feature vectors)!
Proposed kernel-matrix representation (ICCV15)
Let’s use a kernel function instead: replace each linear-kernel entry of the covariance matrix with a nonlinear kernel, turning the covariance matrix into a kernel matrix M.
Advantages:
• Models nonlinear relationships between features
• For many kernels, M is guaranteed to be nonsingular, no matter what the feature dimension (d) and sample size (n) are
• Maintains the d × d size of the covariance representation, and therefore the computational load
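A minimal NumPy sketch of this idea (function names are mine, not from the paper's code): the representation's entry (i, j) changes from the linear kernel between centered feature rows to, e.g., an RBF kernel, while the d × d size is kept:

```python
import numpy as np

def cov_rp(X):
    """Covariance representation: entry (i, j) is the linear kernel
    between the centered i-th and j-th feature rows of X (d x n)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return (Xc @ Xc.T) / X.shape[1]

def ker_rp_rbf(X, sigma=1.0):
    """Kernel-matrix representation: replace the linear kernel by an
    RBF kernel between feature rows. Returns a d x d SPD matrix."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 10))   # d = 16 features, only n = 10 samples
C = cov_rp(X)
M = ker_rp_rbf(X, sigma=2.0)

# With n < d the covariance estimate is rank-deficient (singular),
# while the RBF kernel matrix stays strictly positive definite.
assert np.linalg.matrix_rank(C) < 16
assert np.all(np.linalg.eigvalsh(M) > 0)
```

The assertions demonstrate the nonsingularity advantage on the slide: the Gaussian kernel is strictly positive definite for distinct feature rows regardless of d and n.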
Application to skeletal action recognition
* Cov_JH_SVM uses a kernel function to map each of the n samples into an infinite-dimensional space and implicitly computes a covariance matrix there.
Application to skeletal action recognition
Application to object recognition (handcrafted features)
Application to scene recognition (extracted deep features)
[Chart: comparison on the MIT Indoor Scenes dataset (classification accuracy in percentage), with AlexNet (F7) and VGG-19 (Conv5) features, for Fisher Vector (CVPR15)*, Cov-RP, and Ker-RP (RBF)]
* Cimpoi et al., Deep filter banks for texture recognition and segmentation, CVPR2015
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end Learning KSPD in deep neural networks
• Conclusion
Covariance representation: integration with deep learning
• Bilinear CNN Models for Fine-grained Visual Recognition, Lin et al., ICCV2015
• Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al., ICCV2015
• G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition, Wang et al., CVPR2017
• Improved Bilinear Pooling with CNN, Lin and Maji, BMVC2017
• Is Second-order Information Helpful for Large-scale Visual Recognition?, Li et al., ICCV2017
Proposed DeepKSPD (ECCV18)
Motivation
• The kernel-matrix-based SPD (KSPD) representation has neither been developed upon deep local descriptors nor jointly learned via deep networks in an end-to-end manner
• The existing matrix backpropagation for learning the covariance representation via deep networks encounters a numerical stability issue
Proposed DeepKSPD
Architecture and layers
Matrix backpropagation: a normalisation H = f(K) is applied to the kernel matrix K. How does the loss gradient with respect to H propagate back to the gradient with respect to K?
Existing matrix backpropagation: Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al., ICCV2015
Our derivation builds on a result from the operator-theory literature (1951), the Daleckii-Krein formula for differentiating a function of a symmetric matrix.
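As a sketch of how that 1951 operator-theory result (the Daleckii-Krein formula) yields the backward pass of a structured layer H = f(K), here is my own illustrative code (not the paper's implementation), verified against central finite differences:

```python
import numpy as np

def spd_fn(K, f):
    """Forward pass: H = f(K) for symmetric K, via eigendecomposition."""
    w, V = np.linalg.eigh(K)
    return V @ np.diag(f(w)) @ V.T

def spd_fn_vjp(K, grad_H, f, fprime):
    """Backward pass via the Daleckii-Krein formula:
    grad_K = V (L * (V^T grad_H V)) V^T, where
    L_ij = (f(w_i) - f(w_j)) / (w_i - w_j) for distinct eigenvalues,
    and L_ii = f'(w_i) on the diagonal (and near-ties)."""
    w, V = np.linalg.eigh(K)
    diff = w[:, None] - w[None, :]
    fdiff = f(w)[:, None] - f(w)[None, :]
    L = np.where(np.abs(diff) > 1e-12,
                 fdiff / np.where(diff == 0, 1.0, diff),
                 np.broadcast_to(fprime(w)[:, None], diff.shape))
    G = (grad_H + grad_H.T) / 2          # symmetrise incoming gradient
    return V @ (L * (V.T @ G @ V)) @ V.T

# Gradient check on H = log(K) against central finite differences
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)); K = M @ M.T + 2 * np.eye(4)   # SPD input
G = rng.normal(size=(4, 4)); G = (G + G.T) / 2             # dLoss/dH
E = rng.normal(size=(4, 4)); E = (E + E.T) / 2             # perturbation
gK = spd_fn_vjp(K, G, np.log, lambda x: 1.0 / x)
eps = 1e-6
num = (np.sum(G * spd_fn(K + eps * E, np.log)) -
       np.sum(G * spd_fn(K - eps * E, np.log))) / (2 * eps)
assert np.isclose(num, np.sum(gK * E), rtol=1e-4)
```

The divided-difference matrix L handles repeated eigenvalues gracefully via f', which is one source of the improved numerical stability discussed above.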
Existing matrix backpropagation (Ionescu et al., ICCV2015) vs. the proposed matrix backpropagation: what is their relationship?
Proposed DeepKSPD
Generalize to matrix α-rooting normalization
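A sketch of the matrix α-rooting normalisation itself, applied eigendecomposition-wise (illustrative code; the clipping guard is my addition, not from the paper):

```python
import numpy as np

def alpha_rooting(K, alpha):
    """Matrix alpha-rooting normalisation K -> K^alpha.
    alpha = 0.5 recovers matrix square-root normalisation;
    in DeepKSPD alpha is treated as a learnable parameter."""
    w, V = np.linalg.eigh(K)
    w = np.clip(w, 1e-10, None)      # guard against round-off negatives
    return V @ np.diag(w ** alpha) @ V.T

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
K = M @ M.T + np.eye(5)              # an SPD "kernel matrix"
H = alpha_rooting(K, 0.5)

assert np.allclose(H @ H, K)         # alpha = 0.5: H is the square root of K
assert np.allclose(alpha_rooting(K, 1.0), K)
```

Raising eigenvalues to a power α < 1 compresses the spectrum, which counteracts the burstiness of large eigenvalues in second-order representations.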
Experimental Result
Fine-grained Image Recognition
Experimental Result (ECCV18)
Fine-grained Image Recognition
Experimental Result
Numerical stability of backpropagation
Experimental Result
DeepKSPD vs DeepCOV
Experimental Result
Ablation study
• Learning the width θ in the GRBF kernel
• Learning α in the matrix α-rooting normalisation
Our archive website: https://saimunur.github.io/spd-archive/
[Chart: accuracy (%) of SPD and related methods on the CUB-200-2011 dataset]
[Chart: accuracy (%) of SPD and related methods on the FGVC-Aircraft dataset]
[Chart: accuracy (%) of SPD and related methods on the Stanford Cars dataset]
Research trends on learning SPD representation
• Compactness of second-order feature representation & Computational efficiency
• Efficient training of SPD structural layers by considering the underlying manifold structure
• Second-order correlation across layers
• Deeply integrated into convolutional neural networks
• More applications explored
– Generic and fine-grained image recognition
– Image segmentation, person re-identification and retrieval
– Action parsing & analysis, image super-resolution
– More to be explored…
Key references
1. R. Wang, H. Guo, L. S. Davis and Q. Dai, "Covariance discriminative learning: A natural and efficient approach to image set classification," 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, 2012, pp. 2496-2503.
2. S. Jayasumana, R. Hartley, M. Salzmann, H. Li and M. Harandi, "Kernel Methods on the Riemannian Manifold of Symmetric Positive Definite Matrices," 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, 2013, pp. 73-80.
3. T. Lin, A. RoyChowdhury and S. Maji, "Bilinear CNN Models for Fine-Grained Visual Recognition," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 1449-1457.
4. P. Li, J. Xie, Q. Wang and W. Zuo, "Is Second-Order Information Helpful for Large-Scale Visual Recognition?" 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2089-2097.
5. L. Wang, J. Zhang, L. Zhou, C. Tang and W. Li, "Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, pp. 4570-4578.
6. M. Engin, L. Wang, L. Zhou and X. Liu, "DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition," The European Conference on Computer Vision (ECCV), Munich, 2018, pp. 629-645.
7. Second- and Higher-order Representations in Computer Vision, Tutorial at ICCV2019.
8. SPD Representations Methods Archive, https://saimunur.github.io/spd-archive/
Q&A
Images courtesy of Google Images