55
Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Shih-Fu Chang Department of Electrical Engineering Department of Computer Science Columbia University Joint work with Jun Wang (IBM), Tony Jebara (Columbia U), Wei Liu, Junfeng He, and Yu-Gang Jiang (Fudan U) 1

Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph-based Techniques for Searching Large-Scale Noisy

Multimedia Data

Shih-Fu ChangDepartment of Electrical Engineering

Department of Computer Science

Columbia University

Joint work with Jun Wang (IBM), Tony Jebara (Columbia U), Wei Liu, Junfeng He, and Yu-Gang Jiang (Fudan U)

1

Page 2: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph-based Semi-Supervised Learning• Given a small set of labeled data and a large number of

unlabeled data in a high-dimensional feature space– Build sparse graphs with local connectivity– Propagate information over graphs of large data sets– Hopefully robust to noise and scalable to gigantic sets

2

Input samples with sparse labelsInput samples with sparse labels Label propagation on graphLabel propagation on graph Label inference resultsLabel inference results

Unlabeled Positive NegativeNegativePositive

Page 3: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Intuition Capture local structures via sparse graph

Linear Classifier

Nonlinear ClassifierGraph Semi-Supervised Learning

Through Spare Graph Construction (e.g., kNN)

Page 4: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph Construction

Label Propagation

Image\Video dataProcessing (denoising, cropping …)

Possible Applications:Propagating Labels in Interactive Search & Auto Re‐ranking

Feature Extraction

Compute Similarity

Applications

Search, Browsing

User Interface

Interactive browse / label

Interactive Mode

4S.-F. Chang, Columbia U.

Top-rank resultsExisting

Ranking/filtering System

Automatic Mode

Re-ranking over large set

No predefined Category

Page 5: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Example: Web Search Reranking

Keyword Search

Web Search

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Google Search “Statue of Liberty”

Page 6: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Application: Web Search Reranking

Keyword Search

Web Images

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Rerank

Page 7: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Application: Web Search Reranking

Keyword Search

Web Images

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Google Search “Tiger”

Page 8: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Application: Web Search Reranking

Keyword Search

Web Images

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Rerank

How to Handle  Noisy Labels before Propagation?Scalability? 

Page 9: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Background ReviewGiven a dataset of labeled samples , and unlabeled samplesundirected graph of samples as vertices and edges weighted by sample similarity

Define weight matrix ; vertex degree

Page 10: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Example1

1

22

1

Weight matrix Node degree

Label matrix

classes

samples

1 0 00 1 00 0 10 0 1? ? ?

F

1 0 00 1 00 0 10 0 10.1 0.2 0.9

Label predictionGraph-based

SSL

D

1 0 0 0 00 3 0 0 00 0 3 0 00 0 0 4 00 0 0 0 3

Page 11: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Some Options of Constructing Sparse Graph

Distance Threshold K-Nearest Neighbor Graph

B-Matched Graph

1 and

(Huang and Jebara, AISTATS 2007)(Jebara, Wang, and Chang, ICML 2009)

max

max

Page 12: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Several Ways of Constructing Sparse Graphs

Distance threshold Rank threshold (kNN) B-Match

k,b=4

k,b=6

Page 13: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Examples of Graph Construction

(KNN) (B-Matching)

k = 4 b = 4

Page 14: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph Construction – Edge Weighting

Binary Weighting

Gaussian Kernel Weighting

Locally Linear Reconstruction Weighting

Page 15: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Measure Smoothness: Graph Laplacian

Graph Laplacian , and normalized Laplacian

smoothness of function f over graph,

Multi-class

Page 16: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Classical Methods:

• Predict a graph function (F) via cost optimization

Local and Global Consistency - LGC (Zhou et al, NIPS 04)

Gaussian Random Fields – GRF (Zhu et al, ICML03)

prediction function function smoothness empirical loss

0

(Zhu et al ICML03, Zhou et al NIPS04, Joachim ICML03)

∗→

Page 17: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Compare method-graphs-weights

B-matching tends to outperform kNN

B-Matching particularly good for GTAM + local linear (LLR) weight

Empirical Observations

17

GTAMGTAMGTAMGTAMGTAMGTAM

(Jebara, Wang, and Chang, ICML 2009)

Page 18: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

18

Noisy Label and other Challenges

Unbalanced Labels

Ill Label Locations

Noisy Data and Labels

LGC Propagation

GRF Propagation

Page 19: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Label Unbalance ‐ A Quick FixNormalize labels within each class based on node degrees

Example:

Node degree matrix

Label matrix

classes

samples

Page 20: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Change uni‐variate optimization to bi‐variateformulation:

Dealing with Noisy Labels‐‐ Graph Transduction via Alternate Minimization 

( GTAM, Wang, Jebara, & Chang, ICML, 2008) ( LDST, Wang, Jiang, & Chang, CVPR, 2009)

Page 21: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Alternate Optimization

Then, search optimal integer Y given F*

First, given Y solve continuous valued

Gradient decent search

Page 22: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Alternate Minimization for Label Tuning

Iteratively repeat the above procedure

Add label:Delete label:

Q =

0.8 0.10.23 0.250.31 0.070.17 0.04

(1,1)

(3,1)

=

1 00 10 00 0

Example:

=

0 00 11 00 0

Page 23: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Convergence procedure(non-monotonic due to discrete step size)

Example – Toy Data

Unlabeled Positive Negative

Label propagation by GTAM

Consider adding label only

Page 24: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Decline of the cost function Q over iterations (with vs. without label tuning)

Iteration # 2

Iteration # 6

Initial Labels

Label Diagnosis and Self Tuning( LDST, Wang, Jian, & Chang, CVPR, 2009)

Add label:

Delete label:

Page 25: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Application: Web Search Reranking

Keyword Search

Web Images

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Google Search “Tiger”

Page 26: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Application: Web Search Reranking

Keyword Search

Web Images

Top images as +Bottom imgs as ‐

Label Diagnosis Diffusion

Rerank

Page 27: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Figure 4. Example images of text search results from flickr.com. A total of nine text queries are used: dog, tiger, panda, bird, flower, airplane, forbidden city, statue of liberty, golden bridge.

Page 28: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Effects of Graph‐based reranking

VisualRank: Jing & Baluja, 08

Page 29: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph Construction

Label Propagation

Image\Video dataProcessing (denoising, cropping …)

Possible Applications:Propagating Labels in Interactive Search & Auto Re‐ranking

Feature Extraction

Compute Similarity

Applications

Search, Browsing

User Interface

Interactive browse / label

Interactive Mode

29S.-F. Chang, Columbia U.

No predefined Category

Page 30: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Use image graph to tune & propagate information

Use EEG brain signals to detect target of interest

(joint work with Sajda et al, ACMMM 2009, J. of Neural Engineering, May 2011)

Application: Brain Machine Interface for Image Retrieval-- denoise unreliable labels from brain signal decoding

Page 31: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

31

The ParadigmDatabase (any target that may interest users)

Page 32: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

32

Database

Neural (EEG) decoder

EEG-scores

The Paradigm

Page 33: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

33

Database

Neural (EEG) decoder

Exemplar labels (noisy)

Graph-based Semi-Supervised

Learning

The Paradigm

image features

prediction score

Page 34: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

34

Pre-triage Post-triage

The Paradigm

Page 35: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

35

Pre-triage Post-triage

The Paradigm

Human inspects only a small

sample set via BCI

Machine filters out noise and retrieves targets from very

large DB

• General: no predefined target models, no keyword

• High Throughput: neuro‐vision as bootstrap of fast computer vision

Page 36: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

36

The Neural Signatures of “Recognition”D. Linden, Neuroscientist, 2005, the Oddball Effect

Novel (P3a)

NovelTarget Standard

Target (P3b)

time

Standard

Target

Novel

Page 37: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

37

Effect of graph-based reranking (BCI test)

Top (noisy) results of Brain EEG signal detection

Top results after graph‐based label denoising

& propagation

P‐R  curve significantly improved

Page 38: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

38

More Example Results

Top 20 results of EEG detection

Top 20 results of Hybrid System (BCI‐VPM)

Top 20 results of EEG detection

Top 20 results of Hybrid System (BCI‐VPM)

Page 39: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Graph over million points and more• k-NN graph construction + label prediction

• infeasible for large-scale tasks• Idea: AnchorGraph Regularization

complexity: # anchors m << n

(W. Liu, J. He, S.-F. Chang, ICML2010) 0 1000 2000 3000 4000 5000

0

2

4

6

8

10

12

14x 10

10 Time Complexity

data size n

time

Page 40: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Active topic in research• Large‐scale spectral analysis (Fergus et al, ‘09)

– Approximate solutions as linear combinations of a small number of eigenfucntions of graph Laplacian

– Elegant solutions with linear complexity– But only applicable to ideal data distributions(separable uniform or Gaussian)

• Matrix approximation via Nyström (Zhang et al, ‘09)

– Complexity – But may not be positive semidefinite ‐> non‐convex

W =

Page 41: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Idea: Build low-rank graph via anchors• Use anchor points to “abstract” the graph structure• Compute data-to-anchor similarity: sparse local embedding

• Data-to-data similarity W = inner product in the embedded space

data points

anchor points

x8

x4

u1

u2

u5

u4

u6

u3

x1

Z11

Z12

Z16

W14=0

W18>0

(Liu, He, Chang, ICML10)

Page 42: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Probabilistic Intuition• Affinity between samples i and j,  Wij

= probability of two‐step Markov random walk

AnchorGraph: sparse, positive semi-definite

, where = diag( ), m<< n

Page 43: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

AnchorGraph Regression• Apply the same sparse embedding principle to labels

• The whole graph regularization process becomes low-rank

Small matrix inversion

Predicted function over graph = embedding matrix ∙ inferred labels on anchors

∗ ∗

Page 44: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Intuition: Anchor Graph SSL

initial labelslabel mapping inlabel mapping out

Use low‐rank ARG to infer optimal labels on anchors and samples

Predict optimal labels in the anchor space (~100 labels)

Propagate to original sample space (~million labels)

Page 45: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Performance -small data set

Method Error Rate (%) Time (seconds)1NN 20.15 0.12

LGC with 6NN graph

8.79 403.02

GFHF with 6NN graph

5.19 413.28

AGR^0 7.40 10.20AGR 6.56 16.57

40x speedup

• USPS-Train: 7,291 images of digits, 10 classes, 10 samples per class• AGR^0: K-means anchors and naïve Z

AGR: K-means anchors and optimized Z

accuracy comparable to analytical optimum

Page 46: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Large Data Set Evaluation• 630,000 MNIST images over 10 classes, 100 labeled images only• Conventional analytical solutions infeasible• Among scalable solutions ‐ reduce error rates by 30%‐50%

Method Error Rate (%) Training Time (seconds)

1NN 39.65 5.46Eigenfunction (‘09) 36.94 44.08

PVM (‘09) 29.37 266.89AGR^0 24.71 232.37AGR 19.75 331.72

30%-50% gain

Page 47: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Extension to Web-Scale• Techniques described above not scalable to Web‐scale or dynamic data sets– Cannot handle cases when n = ~ billions – For dynamic data, updating graph is expensive

• Preferred: learn Inductive Models to handle novel dynamic data

47

Page 48: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Data Subsampling & Learn Inductive Model

48

Web-scale database

one million data pointsone million data points

anchor points

subsampling

seed labels

Anchor GraphConstruction

data-to-anchor map z(x)

Anc

hor G

raph

Reg

ular

izat

ion

anchors’ labels a

x

novel data point xz(x) × f(x)=z(x)aT

predict x’s label

Page 49: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

background

truck

ship

horse

frog

dog

deer

cat

bird

automobile

airplane

training images test images

ARG over 80M Tiny Images + CIFAR‐10

Page 50: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Method 1NN LinearSVM

EigenFunction

PVM AGR

1K Anchors

2kAnchors

1K Anchors

2kAnchors

Accuracy(%)

51.66±0.28

60.14±0.34

53.86±0.35

60.55±0..32

60.95±0.41

62.39±0.33

64.23±0.28

TrainingTime (s) 0 8.00 149.83 213.88 517.82 206.60 477.61

Test Time     (s)

6.29e‐4 2.66e‐6 1.39e‐4 5.79e‐5 1.27e‐4 6.20e‐5 1.39e‐4

80Million Tiny 

Images1Million samples(1% labels from CIFAR-10)

ARG as inductive modelNovel test sample

Learn ARG

Page 51: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Additional Issues

• Multi‐edge Graph– Multiple relation edges between nodes

• Multi‐feature Graph– Build graphs in multiple feature spaces– Joint optimization

• Label tuning vs. Active Learning

51

Page 52: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Image‐Based Multi‐Edge Graph 

52

two images with the same tag

dog, flower dog, bird

• one edge connecting the two regions sharing the tag, but not all

• How to propagate label over multiple edges?

Liu et al, ACM Multimedia 2010

Page 53: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Extension to Multi-Feature Graphs

Feature 1

Feature K

0 50 100 150 200-0.1

0

0.1

0.2

0.3

0 20 40 60 80 1000

0.2

0.4

0.6

0.8

1

Graph 1 Graph K

User Input

Ranking list

Label Propagation

How to handle noisy labels in multiple graphs?

How to handle noisy labels in 

multiple graphs?

Page 54: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

Multi-graph SSL vs. single-graph

Caltech 101 data set

Improve performance by 20%-80%

Page 55: Graph-based Techniques for Searching Large-Scale Noisy …sfchang/papers/UCLA IPAM Chang 2012.pdf · 2012-01-20 · Graph-based Techniques for Searching Large-Scale Noisy Multimedia

References and Tools

1. X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi‐supervised learning using Gaussian fields and harmonic functions. ICML, 2003.

2. D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Scholkopf. Learning with local and global consistency. NIPS, 2004.

3. W. Liu, J. He, and S.‐F. Chang. Large graph construction for scalable semi‐supervised learning. ICML, 2010. Software: http://www.ee.columbia.edu/∼wliu/Anchor Graph.zip.

4. W. Liu, J. Wang, S. Kumar, and S.‐F. Chang. Hashing with graphs. ICML, 2011.5. J. Wang, T. Jebara, and S.‐F. Chang. Graph transduction via alternating minimization. ICML, 

2008.6. J. Wang, Y.‐G. Jiang, and S.‐F. Chang. Label diagnosis through self tuning for web image 

search. CVPR, 2009.7. W. Liu, J. Jun, and S.‐F. Chang, Robust and Scalable Graph‐Based Semi‐Supervised Learning. 

In Review, IEEE Proceedings, 2012.8. J. Wang, E. Pohlmeyer, B. Hanna, Y.‐G. Jiang, P. Sajda, and S.‐F. Chang, 

“Brain State Decoding for Rapid Image Retrieval,” ACM Multimedia Conference, 2009.9. J. Wang, A. Kumar, S.‐F. Chang, “Semi‐Supervised Hashing for Scalable Image Retrieval”, 

CVPR 2010.

55