Lecture 24: Less Labels - Meta-Learning for Few-Shot Recognition (Georgia Institute of Technology)


CS 4803 / 7643: Deep Learning

Zsolt Kira, Georgia Tech

Topics:
– (Continued) Low-label ML formulations

Administrative

• Projects!
– Poster details are out on Piazza
– Note: No late days for anything project-related!
– Also note: Keep track of your GCP usage and costs! Set limits on spending.


Meta-Learning for Few-Shot Recognition

• Key idea: We want to learn from a few examples (called the support set) to make predictions on a query set for novel classes
– Assume: We have a larger labeled dataset for a different set of categories (the base classes)

• How do we test this? With an N-way k-shot test:
– k: number of labeled examples per class in the support set
– N: number of classes (the "confusers") among which we have to choose the target class
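Since the slide figures are not preserved, a minimal sketch of how an N-way k-shot episode is typically sampled may help; the class_to_images layout and the helper name are illustrative assumptions, not from the slides:

```python
import random

def sample_episode(class_to_images, n_way=5, k_shot=1, n_query=15):
    """Sample one N-way k-shot episode from a dict mapping
    class label -> list of examples (an assumed data layout)."""
    classes = random.sample(list(class_to_images), n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        examples = random.sample(class_to_images[c], k_shot + n_query)
        # First k examples form the support set, the rest the query set;
        # labels are re-indexed 0..N-1 within the episode.
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query
```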


[Figure: an example episode, showing the target class within the support set and the corresponding query set]

Normal Approach

• Do what we always do: fine-tuning
– Train a classifier on the base classes
– Freeze the features
– Learn classifier weights for the new classes using a small amount of labeled data (at "inference" time!)
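A minimal sketch of this baseline, assuming a backbone feature extractor already trained on the base classes; the names (backbone, feat_dim) and the optimizer settings are illustrative, not the exact recipe from the lecture:

```python
import torch
import torch.nn as nn

def finetune_head(backbone, support_x, support_y, n_way, feat_dim, steps=100):
    """Fine-tuning baseline sketch: freeze the pre-trained feature
    extractor and train only a new linear head on the support set."""
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad_(False)            # freeze features
    head = nn.Linear(feat_dim, n_way)      # new classifier for novel classes
    opt = torch.optim.SGD(head.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        with torch.no_grad():
            feats = backbone(support_x)    # features stay fixed
        loss = loss_fn(head(feats), support_y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```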

A Closer Look at Few-shot Classification, Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

Cons of Normal Approach

• The training we do on the base classes does not take the eventual task into account
• There is no notion that we will be performing a series of N-way tests
• Idea: simulate what we will see at test time


Meta-Training Approach

• Set up a collection of smaller tasks (episodes) during training that simulate what we will be doing during testing (see the sketch below)
– Can optionally pre-train features on held-out base classes (not typical)
• The testing stage is now the same, but with new classes
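A minimal sketch of that episodic training loop, reusing the hypothetical sample_episode helper from above and assuming an episodic model that exposes a loss over one (support, query) pair:

```python
def meta_train(model, optimizer, class_to_images, num_iterations=1000):
    """Episodic meta-training sketch: every iteration samples one
    simulated few-shot task from the base classes and trains the model
    on its query loss (model.episode_loss is an assumed interface)."""
    for _ in range(num_iterations):
        support, query = sample_episode(class_to_images, n_way=5, k_shot=1)
        loss = model.episode_loss(support, query)  # adapt on support, score on query
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```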


Meta-Learning Approaches

• Learning a model conditioned on the support set


Meta-Learner

• How to parametrize learning algorithms?
• Two approaches to defining a meta-learner:
– Take inspiration from a known learning algorithm
  • kNN/kernel machine: Matching Networks (Vinyals et al. 2016)
  • Gaussian classifier: Prototypical Networks (Snell et al. 2017)
  • Gradient descent: Meta-Learner LSTM (Ravi & Larochelle, 2017), MAML (Finn et al. 2017)
– Derive it from a black-box neural network
  • MANN (Santoro et al. 2016)
  • SNAIL (Mishra et al. 2018)

Slide Credit: Hugo Larochelle


Matching Networks

Slide Credit: Hugo Larochelle
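The figure for this slide did not survive extraction. As a reminder of the idea: a matching network classifies a query by a similarity-weighted vote over the support labels, with similarities computed in a learned embedding space. A minimal sketch, assuming a generic embed feature extractor and omitting the full context embeddings of the paper:

```python
import torch
import torch.nn.functional as F

def matching_net_predict(embed, support_x, support_y, query_x, n_way):
    """Sketch of a matching network's classifier: each query's class
    distribution is an attention-weighted combination of the support
    labels, with cosine-similarity attention (Vinyals et al. 2016)."""
    s = F.normalize(embed(support_x), dim=1)   # (N*k, d) support embeddings
    q = F.normalize(embed(query_x), dim=1)     # (Q, d) query embeddings
    attn = F.softmax(q @ s.t(), dim=1)         # similarity -> attention weights
    onehot = F.one_hot(support_y, n_way).float()
    return attn @ onehot                       # (Q, N) class probabilities
```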

Prototypical Networks

Slide Credit: Hugo Larochelle
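These figures are also lost. The idea of prototypical networks: embed the support examples, average them per class into prototypes, and classify a query by its distance to the nearest prototype. A minimal sketch, again with embed as an assumed feature extractor:

```python
import torch

def protonet_predict(embed, support_x, support_y, query_x, n_way):
    """Sketch of a prototypical network: prototypes are per-class means
    of embedded support examples; queries are classified by (negative)
    Euclidean distance to each prototype (Snell et al. 2017)."""
    s = embed(support_x)                                   # (N*k, d)
    q = embed(query_x)                                     # (Q, d)
    prototypes = torch.stack([s[support_y == c].mean(dim=0)
                              for c in range(n_way)])      # (N, d)
    dists = torch.cdist(q, prototypes)                     # (Q, N) distances
    return (-dists).softmax(dim=1)  # nearer prototype -> higher probability
```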

More Sophisticated Meta-Learning Approaches

• Learn gradient descent:
– Parameter initialization and update rules

• Learn just an initialization and use normal gradient descent (MAML)


Meta-Learner LSTM

Slide Credit: Hugo Larochelle
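The figures for these slides are not preserved. The core idea of the Meta-Learner LSTM is to treat the learner's parameters as the cell state of an LSTM, so that learned input and forget gates generalize the learning rate and weight decay of gradient descent. A heavily simplified sketch of one such update; gate_net is a hypothetical small network producing the two gates:

```python
import torch

def lstm_meta_update(theta, grad, loss, gate_net):
    """One Meta-Learner-LSTM-style parameter update (simplified sketch of
    Ravi & Larochelle, 2017). The parameters theta act as the cell state:
        theta_t = f_t * theta_{t-1} - i_t * grad_t
    where the input gate i_t plays the role of a learning rate and the
    forget gate f_t a learned weight decay. gate_net is a hypothetical
    network mapping per-parameter features to the two gates."""
    feats = torch.stack([grad, loss.expand_as(grad), theta], dim=-1)
    i_t, f_t = torch.sigmoid(gate_net(feats)).unbind(dim=-1)
    return f_t * theta - i_t * grad
```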

Meta-Learning Algorithm



Model-Agnostic Meta-Learning (MAML)

Slide Credit: Hugo Larochelle, Sergey Levine
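Since the MAML figures are lost: MAML learns only an initialization, such that a few ordinary gradient steps on the support set yield good query-set performance. The sketch below is the first-order variant (often called FOMAML), which skips backpropagating through the inner-loop steps; episode_loss_fn is an assumed helper:

```python
import copy
import torch

def fomaml_step(model, meta_opt, episode_loss_fn, support, query,
                inner_lr=0.01, inner_steps=5):
    """One first-order MAML meta-update (sketch). Full MAML would also
    backpropagate through the inner-loop gradient steps."""
    fast = copy.deepcopy(model)                # copy of the current initialization
    inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for _ in range(inner_steps):               # adapt on the support set
        loss = episode_loss_fn(fast, support)
        inner_opt.zero_grad()
        loss.backward()
        inner_opt.step()
    query_loss = episode_loss_fn(fast, query)  # evaluate the adapted weights
    query_loss.backward()                      # gradients land on `fast`
    meta_opt.zero_grad()
    # First-order approximation: apply the adapted model's gradients
    # directly to the shared initialization.
    for p, fp in zip(model.parameters(), fast.parameters()):
        p.grad = fp.grad.clone() if fp.grad is not None else None
    meta_opt.step()
    return query_loss.item()
```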

Comparison

Slide Credit: Sergey Levine


Experiments

Slide Credit: Hugo Larochelle

Memory-Augmented Neural Network

Slide Credit: Hugo Larochelle

But beware

Slide Credit: Hugo Larochelle

A Closer Look at Few-shot Classification, Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

Distribution Shift

• What if there is a distribution shift (cross-domain)?
• Lesson: Methods that are successful within-domain might be worse across domains!


Random Task Proposals


Does it Work?


Discussions


• What is the right definition of distributions over problems?
– Varying number of classes / examples per class (meta-training vs. meta-testing)?
– Semantic differences between meta-training vs. meta-testing classes?
– Overlap in meta-training vs. meta-testing classes (see the recent "low-shot" literature)?
• Move from static to interactive learning
– How should this impact how we generate episodes?
– Meta-active learning? (few successes so far)

Slide Credit: Hugo Larochelle
