
Unsupervised Learning Jointly With Image Clustering

Jianwei Yang, Devi Parikh, Dhruv Batra

Virginia Tech

https://filebox.ece.vt.edu/~jw2yang/

A huge amount of images!
Learning without annotation effort
What do we need to learn?
An open problem
A hot problem
Various methodologies

Learning distribution (structure): Clustering

Jain, Anil K., M. Narasimha Murty, and Patrick J. Flynn. "Data clustering: a review." ACM Computing Surveys (CSUR) 31.3 (1999): 264-323.

K-means (image credit: Jesse Johnson)
Hierarchical clustering
Spectral clustering, Manor et al., NIPS'04
Graph cut, Shi et al., TPAMI'00
DBSCAN, Ester et al., KDD'96 (image credit: Jesse Johnson)
EM algorithm, Dempster et al., JRSS'77
NMF, Xu et al., SIGIR'03 (image credit: Conrad Lee)
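To make the first item on this list concrete, here is a minimal k-means sketch in NumPy; it is illustrative only and not part of the original slides.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means: alternate point assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

Hierarchical (agglomerative) clustering, by contrast, starts from many small clusters and repeatedly merges the most similar pair; that bottom-up view is what the approach later in this deck builds on.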

Learning distribution (structure): Subspace Analysis

PCA (image credit: Jesse Johnson)
ICA (image credit: Shylaja et al.)
t-SNE, Maaten et al., JMLR'08
Subspace clustering, Vidal et al.
Sparse coding, Olshausen et al., Vision Research'97

Learning representation (feature)

Yoshua Bengio, Aaron Courville, and Pierre Vincent. "Representation learning: A review and new perspectives." IEEE Transactions on Pattern Analysis and Machine Intelligence 35.8 (2013): 1798-1828.

Autoencoder, Hinton et al., Science'06 (image credit: Jesse Johnson)
DBN, Hinton et al., Science'06
DBM, Salakhutdinov et al., AISTATS'09
Bengio et al., TPAMI'13
VAE, Kingma et al., arXiv'13 (image credit: Fast Forward Labs)
GAN, Goodfellow et al., NIPS'14
DCGAN, Radford et al., arXiv'15 (image credit: Mike Swarbrick Jones)

Most Recent CV Works

Spatial context, Doersch et al., ICCV'15
Temporal context, Wang et al., ICCV'15
Solving jigsaw puzzles, Noroozi et al., ECCV'16
Context encoder, Pathak et al., CVPR'16
Ego-motion, Jayaraman et al., ICCV'15
Visual concept clustering, Huang et al., CVPR'16
Graph constraint, Li et al., ECCV'16
TAGnet, Wang et al., SDM'16
Deep embedding, Xie et al., ICML'16

Our Work

Joint Unsupervised Learning (JULE) of Deep Representations and Image Clusters


Outline

• Intuition

• Approach

• Experiments

• Extensions


Intuition

Meaningful clusters can provide supervisory signals to learn image representations.
Good representations help to obtain meaningful clusters.

Option 1: cluster images first, and then learn representations.
Option 2: learn representations first, and then cluster images.
Ours: cluster images and learn representations progressively.

[Figure: good representations lead to good clusters, which in turn lead to better representations; poor representations lead to poor clusters.]

Approach

• Framework

• Objective

• Algorithm & Implementation


Approach: Framework

Convolutional Neural Network: representation learning
Agglomerative Clustering: image clustering

Overall objective: $\arg\min_{y,\theta} \mathcal{L}(y, \theta \mid I)$

Image clustering (forward pass, CNN parameters $\theta$ fixed): $\arg\min_{y} \mathcal{L}(y \mid \theta, I)$
Representation learning (backward pass, cluster labels $y$ fixed): $\arg\min_{\theta} \mathcal{L}(\theta \mid y, I)$

Approach: Recurrent Framework

The optimization is cast as a recurrent process: at each time-step, clustering is conducted in the forward pass and representation learning in the backward pass.

A backward pass at every time-step is time-consuming and prone to over-fitting.
How about updating once for multiple time-steps?

Partial unrolling: divide all T time-steps into P periods.
In each period, we merge clusters multiple times and update the CNN parameters at the end of the period.
P is determined by a hyper-parameter that will be introduced later.
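A minimal sketch of this partially unrolled loop, assuming hypothetical callables `merge_once` (one agglomerative merge with the CNN fixed) and `train_cnn` (one CNN update from the current clusters); the per-period merge count uses an assumed unrolling-rate rule, not necessarily the paper's exact schedule.

```python
from typing import Callable, List

def partially_unrolled_training(
    clusters: List[List[int]],
    merge_once: Callable[[List[List[int]]], List[List[int]]],
    train_cnn: Callable[[List[List[int]]], None],
    n_target: int,
    unroll_rate: float = 0.9,
) -> List[List[int]]:
    """Partial unrolling: split the T merge steps into P periods. Within a period,
    merge clusters several times with the CNN fixed; update the CNN once at the end."""
    while len(clusters) > n_target:
        # Assumed rule: shrink the number of clusters by a fraction (1 - unroll_rate) per period.
        n_merges = max(1, int(len(clusters) * (1.0 - unroll_rate)))
        n_merges = min(n_merges, len(clusters) - n_target)
        for _ in range(n_merges):
            clusters = merge_once(clusters)   # forward pass: clustering
        train_cnn(clusters)                   # backward pass: representation learning
    return clusters

# Toy usage: fuse the first two clusters each step; the CNN update is a no-op stub.
clusters = [[i] for i in range(40)]
clusters = partially_unrolled_training(
    clusters,
    merge_once=lambda cs: [cs[0] + cs[1]] + cs[2:],
    train_cnn=lambda cs: None,
    n_target=10,
)
```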

Approach: Objective Function

Overall loss: $\arg\min_{y,\theta} \mathcal{L}(y, \theta \mid I)$, optimized by alternating between $\arg\min_{y} \mathcal{L}(y \mid \theta, I)$ (clustering) and $\arg\min_{\theta} \mathcal{L}(\theta \mid y, I)$ (representation learning).

Approach: Objective Function

Loss at time-step t (two variants):

Conventional agglomerative clustering strategy: the loss is based only on the affinity measure between the two clusters, so the pair with the largest affinity is merged.

Proposed agglomerative clustering strategy: for the i-th cluster, consider its K_c nearest neighbor clusters; the loss combines the affinity between the i-th cluster and its nearest neighbor with the differences between this affinity and the affinities to the other neighbor clusters. The two clusters minimizing this loss are merged.
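The slides do not reproduce the exact formula, so the sketch below is only an illustrative neighbor-aware merge score assembled from the annotations above (an affinity matrix A, the i-th cluster, its K_c nearest neighbor clusters, and affinity differences); the weighting `lam` and the averaging are assumptions, not the paper's definition.

```python
import numpy as np

def merge_loss(affinity: np.ndarray, i: int, j: int, K_c: int = 5, lam: float = 1.0) -> float:
    """Illustrative loss for merging clusters i and j (lower is better). It rewards a
    high affinity A(C_i, C_j) and additionally rewards pairs whose affinity stands out
    from C_i's affinities to its other K_c nearest neighbor clusters."""
    a_ij = affinity[i, j]
    # K_c nearest neighbor clusters of cluster i, excluding i itself and j.
    order = np.argsort(-affinity[i])
    neighbors = [k for k in order if k not in (i, j)][:K_c]
    diff = np.mean([a_ij - affinity[i, k] for k in neighbors]) if neighbors else 0.0
    return -(a_ij + lam * diff)
```

The conventional strategy corresponds to dropping the difference term and scoring a pair by `-affinity[i, j]` alone.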

Approach: Objective Function

Loss in the forward pass of period p (merge clusters): the CNN parameters are fixed.
Loss in the backward pass of period p (update the CNN parameters): the cluster labels are fixed.

Approach: Objective Function

Forward pass: a simple greedy algorithm. At each time-step, merge the two clusters that minimize the loss.
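A self-contained sketch of this greedy forward pass over an affinity matrix; here the per-step loss is simply the negative pairwise affinity (the conventional strategy), and the post-merge affinity update is a size-weighted average, both assumptions made for illustration.

```python
import numpy as np

def greedy_forward_pass(affinity: np.ndarray, n_target: int):
    """Greedy forward pass (sketch): at each time-step, merge the pair of clusters that
    minimizes the loss. The loss here is -A(C_i, C_j); the neighbor-aware score sketched
    earlier could be swapped in for the proposed strategy."""
    A = affinity.astype(float).copy()
    np.fill_diagonal(A, -np.inf)
    sizes = np.ones(len(A))
    active = list(range(len(A)))
    merges = []
    while len(active) > n_target:
        sub = A[np.ix_(active, active)]
        a, b = np.unravel_index(np.argmax(sub), sub.shape)   # argmin of -A(C_i, C_j)
        i, j = active[a], active[b]
        # Merge C_j into C_i: size-weighted average affinity to the remaining clusters.
        A[i, :] = (sizes[i] * A[i, :] + sizes[j] * A[j, :]) / (sizes[i] + sizes[j])
        A[:, i] = A[i, :]
        np.fill_diagonal(A, -np.inf)
        sizes[i] += sizes[j]
        active.remove(j)
        merges.append((i, j))
    return merges, active
```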

Approach: Objective Function

Backward pass: consider all previous periods.
A cluster-based loss is not suitable for batch optimization.
Approximation: recall the cluster-based loss and convert it to a sample-based loss built from intra-sample and inter-sample affinity terms, which takes the form of a weighted triplet loss.
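The slides only name the result (a weighted triplet loss), so below is a generic weighted triplet loss written in PyTorch for illustration; the margin, the squared-distance choice, and the per-triplet weights are assumptions, and this is not the authors' code.

```python
import torch

def weighted_triplet_loss(anchor, positive, negative, weights, margin=0.2):
    """Generic weighted triplet loss: pull each anchor toward a sample from its own
    cluster (positive) and push it away from a sample in a neighboring cluster
    (negative), weighting each triplet individually."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)   # squared distance to positive
    d_neg = (anchor - negative).pow(2).sum(dim=1)   # squared distance to negative
    return (weights * torch.relu(d_pos - d_neg + margin)).mean()

# Toy usage with random 64-D embeddings for a batch of 8 triplets.
a, p, n = torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 64)
loss = weighted_triplet_loss(a, p, n, weights=torch.ones(8))
```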

Approach: Algorithm & Implementation

Input: raw image data. The target number of clusters is assumed to be known.
Randomly initialize the CNN parameters; start from initial clusters with about 4 samples per cluster on average.
Train the CNN for about 20 epochs.
We can go back and retrain the model, but it improves only slightly.
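One simple way to obtain initial clusters with roughly 4 samples each is to over-segment the data, for example with k-means and K = N / 4; this is an illustrative assumption, not necessarily the initialization the authors use.

```python
import numpy as np
from sklearn.cluster import KMeans

def initial_over_segmentation(features: np.ndarray, samples_per_cluster: int = 4, seed: int = 0):
    """Over-segment the data so each initial cluster holds ~4 samples on average."""
    n_clusters = max(1, len(features) // samples_per_cluster)
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(features)

# Toy usage: 100 samples with 32-D features -> about 25 initial clusters.
labels = initial_over_segmentation(np.random.rand(100, 32))
```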

Experiments

• Datasets

• Network Architecture

• Image Clustering

• Representation Learning


Experiments: Datasets

Dataset (images, classes, image size):
MNIST (70000, 10, 28x28)
USPS (11000, 10, 16x16)
COIL-20 (1440, 20, 128x128)
COIL-100 (7200, 100, 128x128)
UMist (575, 20, 112x92)
FRGC (2462, 20, 32x32)
CMU-PIE (2856, 68, 32x32)
YouTube Faces (1000, 41, 55x55)

Experiments: Settings

Two important parameters.
Set the number of layers so that the output feature map is about 10x10.
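As a rough illustration of that rule, the snippet below picks the number of stride-2 downsampling stages so the final feature map lands near 10x10, assuming each stage halves the spatial size; the actual per-dataset architectures may differ.

```python
import math

def n_downsampling_stages(input_size: int, target_size: int = 10) -> int:
    """Number of stride-2 stages so that input_size / 2**n is close to target_size."""
    return max(0, round(math.log2(input_size / target_size)))

# e.g. 28x28 (MNIST) -> 1 stage (28 -> 14); 128x128 (COIL) -> 4 stages (128 -> 8).
for size in (16, 28, 32, 55, 128):
    print(size, "->", n_downsampling_stages(size))
```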

Experiments: Clustering Performance

+6.43% NMI over the best performance of existing approaches, averaged over all datasets.
+12.76% AC over the best performance of existing approaches, averaged over all datasets.
Average gains of +21.5% and +25.7% NMI in two further comparisons: our clustering performance vs. that of existing clustering approaches using raw image data, and clustering performance when our representation is fed to existing clustering algorithms.
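For reference, NMI and AC (clustering accuracy under the best label permutation) can be computed as follows; this is a standard sketch using scikit-learn and SciPy, not the authors' evaluation code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """AC: accuracy under the best one-to-one mapping between predicted cluster IDs
    and ground-truth classes (Hungarian algorithm on the co-occurrence counts)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    counts = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        counts[p, t] += 1
    rows, cols = linear_sum_assignment(-counts)   # maximize matched samples
    return counts[rows, cols].sum() / len(y_true)

# NMI comes directly from scikit-learn:
# nmi = normalized_mutual_info_score(y_true, y_pred)
```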

Experiments: Clustering Visualization

Cluster visualizations shown for COIL-20, COIL-100, USPS, and MNIST-test.

Experiments: Clustering: Ablation Study

Experiments: Clustering: Verification

Experiments: Clustering: Time Cost

Experiments: Representation Learning

Representation transfer: testing the generalization of our learnt (unsupervised) representation to LFW face verification.
Representation learning: evaluation on CIFAR-10 classification.

Extensions: Data Visualization


Conclusion

• A new unsupervised learning method that jointly learns deep representations and image clusters, casting the problem as a recurrent optimization problem;

• In the recurrent framework, clustering is conducted during the forward pass, and representation learning during the backward pass;

• A unified loss function is used in both the forward and backward passes;

• It outperforms the state of the art on a number of datasets;

• It can also learn plausible representations for image recognition.

Thanks!

https://github.com/jwyang/joint-unsupervised-learning
