Riemannian Sparse Coding for Positive Definite Matrices
Anoop Cherian¹  Suvrit Sra²
¹LEAR Project-team, Inria Grenoble - Rhône-Alpes  ²MPI for Intelligent Systems, Tübingen, Germany
Introduction
Covariance Descriptors
Geometry of Positive Definite Matrices
Our Contributions
1. Riemannian Sparse Coding Formulation
2. Efficient Optimization
Experiments and Results
Region covariance descriptors are covariance matrices used as visual data descriptors in several computer vision applications. Examples include people tracking, diffusion MRI, object recognition, etc.
Covariance Descriptor: say we extract d different features at each of the p pixels in a patch, giving vectors $x_i \in \mathbb{R}^d$, $i = 1, \dots, p$. Then

$$X = \frac{1}{p-1} \sum_{i=1}^{p} (x_i - \mu)(x_i - \mu)^T$$

where $\mu$ is the mean of the $x_i$'s.
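As a quick illustration, here is a minimal NumPy sketch of the descriptor computation above; the function name and the small regularizer eps are our additions, not part of the poster.

```python
import numpy as np

def covariance_descriptor(F, eps=1e-6):
    """Region covariance of per-pixel features F with shape (p, d):
    X = 1/(p-1) * sum_i (x_i - mu)(x_i - mu)^T."""
    p, d = F.shape
    mu = F.mean(axis=0)
    C = F - mu                      # center the features
    X = C.T @ C / (p - 1)           # the covariance descriptor
    # Tiny ridge (our choice) keeps X strictly positive definite when
    # the features in the patch are nearly linearly dependent.
    return X + eps * np.eye(d)
```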
Advantages
• Multi-feature fusion
• Compact
• Real-time computable
• Robust to static noise
• Robust to illumination
• Robust to affine transforms
X lies in the space of d × d symmetric positive definite (SPD) matrices, $\mathbb{S}^d_{++}$, and is called the Covariance Descriptor
• SPD matrices form a Riemannian manifold inside Euclidean space, owing to their positive definiteness property
• Distances along the manifold are not straight lines, but curved geodesics!
Motivated by the great success of sparse coding for vectorial data, we propose a novel scheme for sparse coding SPD matrices using a dictionary whose atoms are themselves SPD matrices.
1. Our novel sparse coding formulation uses a loss function defined via the affine invariant Riemannian distance, the natural distance on this Riemannian manifold
2. We propose a computationally efficient optimization scheme
3. We show conditions under which our formulation is convex
4. Experiments on several computer vision applications demonstrate significant promise of our approach over the state of the art
Let $\mathcal{B}$ be a dictionary with n atoms $B_1, B_2, \dots, B_n$, each $B_i \in \mathbb{S}^d_{++}$, and let X be the input matrix to be sparse coded. Then, our sparse coding objective is:

$$\min_{\alpha \ge 0} \; d_R^2\Big(X, \sum_{i=1}^{n} \alpha_i B_i\Big) + \lambda \|\alpha\|_1 \qquad (1)$$

where the first term is the affine invariant Riemannian distance $d_R(X, Y) = \|\mathrm{Log}(X^{-1/2} Y X^{-1/2})\|_F$ and the second term promotes sparsity.
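A hedged NumPy/SciPy sketch of objective (1); the function names are ours. $d_R$ is computed from the generalized eigenvalues of (Y, X), which coincide with the eigenvalues of $X^{-1/2} Y X^{-1/2}$.

```python
import numpy as np
from scipy.linalg import eigh

def d_R(X, Y):
    """Affine invariant Riemannian distance ||Log(X^{-1/2} Y X^{-1/2})||_F,
    computed via the generalized eigenvalues of (Y, X)."""
    lam = eigh(Y, X, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def objective(alpha, B, X, lam_reg):
    """Objective (1): d_R^2(X, sum_i alpha_i B_i) + lam_reg * ||alpha||_1.
    B is an (n, d, d) stack of SPD atoms; alpha >= 0, so ||alpha||_1 = sum(alpha)."""
    S = np.einsum('i,ijk->jk', alpha, B)
    return d_R(X, S) ** 2 + lam_reg * alpha.sum()
```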
Prior Work                                    Loss function based on
Sivalingam et al., ECCV 2010                  LogDet divergence
Sra and Cherian, ECML 2011                    Frobenius distance
Ho et al., ICML 2013                          Log-Euclidean distance
Harandi et al., ECCV 2012;
Jayasumana et al., CVPR 2013;
Li et al., ICCV 2013                          Kernel methods
In contrast to prior methods, our formulation is based on the intrinsic geometry of the SPD manifold (while other methods use proxy distances).
We use projected gradient descent to minimize (1), which gives the following sequence of iterates over the sparse coefficient vector α:

$$\alpha^{k+1} = P_+\big(\alpha^k - \eta_k \nabla f(\alpha^k)\big) \qquad (2)$$

where f is the objective in (1) and $P_+$ denotes the projection onto the feasible set. There are three major computational challenges in each iteration of (2):
1. Computing the step size $\eta_k$: we use Spectral Projected Gradient [Birgin et al. '01]
2. Computing the gradient $\nabla f(\alpha^k)$ efficiently (see paper for the algorithm)
3. The projection step: we truncate the negative values, for efficiency
Computational complexity: $O(nd^2 + d^3)$ for computing the gradient in each iteration
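A minimal sketch of iteration (2), assuming the gradient formula $\nabla_i f = 2\,\mathrm{tr}(M^{-1}\mathrm{Log}(M)\, X^{-1/2} B_i X^{-1/2})$ with $M = X^{-1/2} S X^{-1/2}$; we use a fixed step size in place of the paper's Spectral Projected Gradient rule, so this is a simplification, not the authors' algorithm.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, solve

def grad_f(alpha, B, X, lam_reg):
    """Gradient of d_R^2(X, S) + lam_reg * sum(alpha), S = sum_j alpha_j B_j:
    grad_i = 2 tr(X^{-1/2} M^{-1} Log(M) X^{-1/2} B_i), M = X^{-1/2} S X^{-1/2}."""
    Xih = np.linalg.inv(np.real(sqrtm(X)))      # X^{-1/2}
    S = np.einsum('i,ijk->jk', alpha, B)
    M = Xih @ S @ Xih
    G = Xih @ solve(M, np.real(logm(M))) @ Xih  # X^{-1/2} M^{-1} Log(M) X^{-1/2}
    return 2.0 * np.einsum('ab,iba->i', G, B) + lam_reg

def sparse_code(B, X, lam_reg=0.1, eta=1e-3, iters=500):
    alpha = np.full(B.shape[0], 1.0 / B.shape[0])  # positive start keeps S SPD
    for _ in range(iters):
        step = alpha - eta * grad_f(alpha, B, X, lam_reg)
        # Truncate negative values (the projection step); the tiny floor is
        # our choice, to keep sum_i alpha_i B_i nonsingular for logm.
        alpha = np.maximum(step, 1e-12)
    return alpha
```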
3. Convexity Properties
The objective in (1) is not convex in α in general. But surprisingly, it becomes convex under some constraints: the objective is convex on the set $\{\alpha \ge 0 : \sum_{i} \alpha_i B_i \preceq X\}$, where $d_R$ is the affine invariant Riemannian distance used in (1) above and $\preceq$ denotes the Löwner order. See paper for the precise statement and proof.
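A small numerical check of the constraint set (our reading of the condition; see the paper for the precise statement): $\sum_i \alpha_i B_i \preceq X$ holds iff every generalized eigenvalue of (S, X) is at most 1.

```python
import numpy as np
from scipy.linalg import eigh

def in_convex_region(alpha, B, X, tol=1e-10):
    """True if S = sum_i alpha_i B_i <= X in the Loewner order,
    i.e. all generalized eigenvalues of (S, X) are <= 1."""
    S = np.einsum('i,ijk->jk', alpha, B)
    return bool(np.all(eigh(S, X, eigvals_only=True) <= 1.0 + tol))
```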
Simulated Experiments
Matrix size fixed at 10 × 10; number of dictionary atoms fixed at 200
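A hypothetical generator for the simulated setup above (10 × 10 matrices, 200 atoms); the poster does not specify how the atoms were drawn, so the recipe below is only illustrative.

```python
import numpy as np

def random_spd(d, rng):
    """Draw a well-conditioned random SPD matrix (illustrative recipe)."""
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)

rng = np.random.default_rng(0)
B = np.stack([random_spd(10, rng) for _ in range(200)])  # dictionary: (200, 10, 10)
X = random_spd(10, rng)                                  # an input to sparse code
```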
Real Data Experiments
Dataset         Features       Dims     #Classes   Dataset Size   Method of Evaluation
Brodatz texture SIG            5 × 5    110        10,000         SVM classifier
ETH80 object    SIGC + Laws    19 × 19  8          3,280          SVM classifier
ETHZ people     SIGC + Hue     18 × 18  146        8,580          NN classifier
RGB-D object    SIG + 3G       18 × 18  51         15,000         NN classifier
S = spatial features, I = intensity, C = color, G = gradient, Laws = texture filters, 3G = 3D gradients. We used an 80%-20% train-test split for the SVM experiments and a 20%-80% split for the NN classifier.
[Result plots: Brodatz Textures, ETH80 Objects, ETHZ Person Re-identification, RGB-D Objects]
LE-SC: Guo et al., AVSS '10; K-Stein: Harandi et al., ECCV '12; K-LE-SC: Li et al., ICCV '13; TSC: Sivalingam et al., ECCV '10; GDL: Sra and Cherian, ECML '11