CS 2750: Machine Learning
Other Topics
Prof. Adriana Kovashka, University of Pittsburgh
April 13, 2017
Plan for last lecture
• Overview of other topics and applications
– Reinforcement learning
– Active learning
– Domain adaptation
– Unsupervised feature learning using context
– Ranking
Reinforcement learning
Reinforcement learning
• So far we’ve considered offline learning, where we first learn a model and then make predictions
• Reinforcement learning is a type of online learning
• It lies between supervised and unsupervised learning
Reinforcement learning
• You have an agent acting in an environment, exploring possible behaviors with the intent of maximizing some reward
• For example, the agent wants to learn how to play some game so that it wins frequently
Reinforcement learning
• States
• Actions
• Rewards
https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
Reinforcement learning
• States – e.g. image of board
• Actions – up/down
• Rewards – if won, +1, if lost, -1
http://karpathy.github.io/2016/05/31/rl/
Q-Learning
https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
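The core Q-learning update can be sketched in a few lines. The toy two-state chain, the action names, and all hyperparameter values below are illustrative, not from the lecture:

```python
# Minimal tabular Q-learning sketch. Q[s][a] estimates the expected
# discounted return of taking action a in state s.

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

# Toy 2-state chain: from state 0, action "right" reaches state 1 (reward +1).
states, actions = [0, 1], ["left", "right"]
Q = {s: {a: 0.0 for a in actions} for s in states}

for _ in range(100):
    q_learning_update(Q, s=0, a="right", r=1.0, s_next=1)
    q_learning_update(Q, s=1, a="left", r=0.0, s_next=0)

# After repeated updates, "right" in state 0 has the higher estimated value.
```

The max over next-state actions is what makes this Q-learning (off-policy) rather than SARSA, which would use the action actually taken next.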
Policy gradients
• Wait until the end of the game to update the model parameters; once we know whether we won or lost, use the outcome as the gradient signal to backprop
• Credit assignment: which actions should be rewarded if we won?
– A reward is given for a certain action taken in a certain state
– If we won, reward all actions that led to the win
– Penalize all actions that led to a loss
http://karpathy.github.io/2016/05/31/rl/
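The reward-as-gradient idea above can be sketched as a REINFORCE-style update. The two-action bandit setup, the "action 0 always wins" reward, and all values below are illustrative stand-ins for a real game:

```python
import numpy as np

# Policy-gradient sketch: sample actions from a stochastic policy,
# wait for the outcome, then scale each action's log-probability
# gradient by the final reward (+1 if won, -1 if lost).

rng = np.random.default_rng(0)
w = np.zeros(2)                      # logits over two actions (up/down)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for episode in range(500):
    p = softmax(w)
    a = rng.choice(2, p=p)           # sample an action from the policy
    reward = 1.0 if a == 0 else -1.0 # pretend action 0 always wins
    grad_logp = -p                   # d log p(a) / d logits ...
    grad_logp[a] += 1.0              # ... = one_hot(a) - p
    w += 0.1 * reward * grad_logp    # reward-weighted gradient ascent

# The policy now strongly prefers the winning action.
```

In a real game the reward would only arrive at the end of an episode and be applied to every action taken during it, which is exactly the credit-assignment heuristic described above.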
Learning to play Atari games w/ RL
Games
Total reward collected
Mnih et al., “Playing Atari with Deep Reinforcement Learning”, 2013
Learning to localize objects w/ RL
Caicedo and Lazebnik, “Active Object Localization with Deep Reinforcement Learning”, ICCV 2015
Active learning
Pool-based sampling
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Selective sampling (stream-based)
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Query synthesis
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Uncertainty sampling
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Measures of uncertainty
• Least confident
• Smallest margin (between the highest- and 2nd-highest-probability labels)
• Highest entropy
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
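The three uncertainty measures above can be computed directly from a classifier’s predicted class probabilities; the probability vectors below are hypothetical:

```python
import numpy as np

def least_confident(p):
    """1 - probability of the most likely label (higher = more uncertain)."""
    return 1.0 - np.max(p)

def smallest_margin(p):
    """Gap between the top two class probabilities (smaller = more uncertain)."""
    top2 = np.sort(p)[-2:]
    return top2[1] - top2[0]

def entropy(p):
    """Shannon entropy of the predictive distribution (higher = more uncertain)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

confident = np.array([0.9, 0.05, 0.05])
uncertain = np.array([0.4, 0.35, 0.25])
# The uncertain sample scores higher on least-confident and entropy and
# lower on margin, so all three measures would select it for labeling.
```

The three measures agree on this example but can disagree with many classes, since margin and least-confident ignore the tail of the distribution while entropy uses all of it.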
Actively choosing sample and annotation type
Kovashka et al., “Actively Selecting Annotations Among Objects and Attributes“, ICCV 2011
Expected entropy reduction on all data
By predicting entropy change over all data, selection accounts for the impact of all desired interactions between labels and data.
Our entropy-based selection function seeks to maximize the expected object label entropy reduction. We measure object class entropy on the labeled and unlabeled image sets:
We seek maximal expected entropy reduction, which is equivalent to minimum entropy after the label addition:
Kovashka et al., “Actively Selecting Annotations Among Objects and Attributes“, ICCV 2011
Object label depends on attribute labels
Kovashka et al., “Actively Selecting Annotations Among Objects and Attributes“, ICCV 2011
Choose object or attribute label
The expected entropy scores for object label and attribute label additions can be expressed as follows. Note that the two formulations are comparable, since both measure the entropy of the object class.
Then the best (image, label) choice can be made as: where x ranges over unlabeled images and q ranges over possible label types.
Kovashka et al., “Actively Selecting Annotations Among Objects and Attributes“, ICCV 2011
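The selection-rule equations on this slide appear to have been lost in extraction; a hedged reconstruction of the general form (the notation is mine and may differ from the paper’s):

```latex
% Weight each possible answer l by its predicted probability, then pick
% the (image, label-type) pair minimizing the expected object entropy:
(x^*, q^*) = \arg\min_{x,\, q} \; \sum_{l \in \mathcal{L}_q} P(l \mid x)\,
             H\!\big(O \mid L \cup \{(x, l)\}\big)
```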
Query by committee
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Cluster-based
Settles, “Active Learning” (Synthesis Lectures on AI and ML), 2012
Domain adaptation
The same class looks different in different domains
Adaptive SVM
• Target domain:
• Auxiliary (source) domain:
• Standard SVM:
Yang et al., “Adapting SVM Classifiers to Data with Shifted Distributions”, ICDM Workshops 2007
Adaptive SVM
• Adaptive SVM objective:
• Adaptive SVM dual problem:
• Adaptive SVM prediction:
(the auxiliary term is learned on the auxiliary domain with a standard SVM and contributes the prediction from that domain)
Yang et al., “Adapting SVM Classifiers to Data with Shifted Distributions”, ICDM Workshops 2007
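The Adaptive SVM formulas on these slides appear to have been lost in extraction; a hedged reconstruction of Yang et al.’s formulation (notation may differ slightly from the paper):

```latex
% Standard SVM on the auxiliary (source) domain yields f^a(x).
% Adaptive SVM learns a perturbation \Delta f(x) = w^\top \phi(x) on the target data:
\min_{w}\;\; \frac{1}{2}\lVert w\rVert^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.}\quad
y_i\, f^a(x_i) + y_i\, w^\top \phi(x_i) \ge 1 - \xi_i,\qquad \xi_i \ge 0
% The final prediction combines the auxiliary classifier with the learned offset:
f(x) = f^a(x) + w^\top \phi(x)
```

Intuitively, the regularizer keeps the adapted classifier close to the auxiliary one, so only a few target-domain labels are needed.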
Personalized image search
• Allow user to “whittle away” irrelevant images via comparative feedback on attributes of results
• But different users might perceive attributes differently
Kovashka et al., “WhittleSearch: Image Search with Relative Attribute Feedback”, CVPR 2012
“Like this… but with curlier hair”
Semantic visual attributes
• High-level descriptive properties shared by objects
• Human-understandable and machine-detectable
• Middle ground between user and system
Examples: smiling, large lips, long hair, natural, perspective, open, high heel, red, ornaments, metallic
Users perceive attributes differently
• There may be valid perceptual differences within an attribute, yet existing methods assume a single monolithic attribute model is sufficient
Binary attribute: “Formal?” → user labels: 50% “yes”, 50% “no”
Relative attribute: “More ornamented?” → user labels: 50% “first”, 20% “second”, 30% “equally”
Kovashka and Grauman, “Attribute Adaptation for Personalized Image Search”, ICCV 2013
Learning user-specific attributes
• Treat as a domain adaptation problem
• Adapt generic attribute model with minimal user-specific labeled examples
Standard approach: vote on the crowd’s “formal” / “not formal” labels
Our idea: adapt the crowd’s generic model using each user’s own labels
Kovashka and Grauman, “Attribute Adaptation for Personalized Image Search”, ICCV 2013
Learning adapted attributes
• Adapting binary attribute classifiers: given user-labeled data and a generic model, learn an adapted model
Yang et al., “Adapting SVM Classifiers to Data with Shifted Distributions”, ICDM Workshops 2007
Learning adapted attributes
(Figure: the generic boundary between “formal” and “not formal” examples shifts to an adapted boundary for the user)
Adapted attribute accuracy
• Results over all 3 datasets, 32 attributes, and 75 users
• Generic learns a model from the crowd (no personalization)
• Our method most accurately captures perceived attributes
Kovashka and Grauman, “Attribute Adaptation for Personalized Image Search”, ICCV 2013
Domain adaptation w/ metric learning
Saenko et al., “Adapting visual category models to new domains”, ECCV 2010
Colors = domains, shapes = classes
• We want to learn to relate two domains: x is a sample from one domain, y is a sample from the other
• Constraints in learned space:
• Use nearest neighbor classifier in learned space
Domain adaptation with metric learning
Saenko et al., “Adapting visual category models to new domains”, ECCV 2010
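Nearest-neighbor classification in a learned metric space can be sketched as below. The metric W here is hand-picked for illustration (Saenko et al. learn it from cross-domain constraints), and the toy data and names are mine:

```python
import numpy as np

# Mahalanobis-style distance in a learned space: d(x, y) = (x-y)^T W (x-y).

def mahalanobis(x, y, W):
    d = x - y
    return d @ W @ d

def nn_classify(x, train_X, train_labels, W):
    """Label x with the class of its nearest neighbor under metric W."""
    dists = [mahalanobis(x, t, W) for t in train_X]
    return train_labels[int(np.argmin(dists))]

# Toy cross-domain setup: the second feature dimension shifts between
# domains, so a metric that down-weights it relates the domains better.
W = np.diag([1.0, 0.01])
train_X = np.array([[0.0, 5.0], [1.0, -5.0]])   # source-domain samples
train_labels = ["cat", "dog"]
query = np.array([0.0, -4.0])                    # target-domain sample
# Under the Euclidean metric the shifted dimension dominates and the
# query matches "dog"; the learned-style metric recovers "cat".
```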
Invariant representations w/ deep nets
q_d is the probability that a sample belongs to the d-th domain
Tzeng et al., “Simultaneous Deep Transfer Across Domains and Tasks”, ICCV 2015
Invariant representations w/ deep nets
Bousmalis et al., “Domain Separation Networks”, NIPS 2016
Unsupervised feature learning using context
Skip-gram model (word embeddings)
WE(king) – WE(man) + WE(woman) = WE(queen)
Mikolov et al., “Distributed Representations of Words and Phrases…”, NIPS 2013
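The analogy arithmetic above can be illustrated with toy vectors. These hand-made 3-d embeddings (real word2vec vectors are learned and typically ~300-d) loosely encode “royalty” and “gender” dimensions:

```python
import numpy as np

# Hand-made toy word embeddings for the king - man + woman analogy.
WE = {
    "king":  np.array([0.9,  0.9, 0.1]),
    "queen": np.array([0.9, -0.9, 0.1]),
    "man":   np.array([0.1,  0.9, 0.3]),
    "woman": np.array([0.1, -0.9, 0.3]),
    "apple": np.array([0.0,  0.0, 1.0]),
}

def nearest(vec, exclude):
    """Most cosine-similar vocabulary word to vec, skipping excluded words."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in WE if w not in exclude), key=lambda w: cos(WE[w], vec))

target = WE["king"] - WE["man"] + WE["woman"]
# nearest(target, exclude={"king", "man", "woman"}) recovers "queen"
```

Excluding the query words is standard practice, since the input words themselves are usually the closest vectors to the analogy point.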
Context prediction for images
(Figure: a central patch A and a second patch B sampled from one of 8 surrounding positions, numbered 1–8)
Doersch et al., “Unsupervised Visual Representation Learning by Context Prediction”, ICCV 2015
Randomly sample a patch, then sample a second patch from one of its neighboring locations
(Figure: the two patches pass through twin CNNs; a classifier predicts the relative position)
Relative position task: 8 possible locations
Doersch et al., “Unsupervised Visual Representation Learning by Context Prediction”, ICCV 2015
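Generating training pairs for this pretext task requires no labels at all; a sketch (patch size, sampling scheme, and names are illustrative simplifications of the paper’s setup):

```python
import numpy as np

# Relative-position pretext task: sample a patch and one of its 8
# neighbors; the "label" is which position the neighbor came from.

OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           ( 0, -1),          ( 0, 1),
           ( 1, -1), ( 1, 0), ( 1, 1)]   # the 8 possible locations

def sample_pair(image, patch=8, rng=np.random.default_rng(0)):
    """Return (center_patch, neighbor_patch, position_label in 0..7)."""
    H, W = image.shape[:2]
    r = int(rng.integers(patch, H - 2 * patch))   # row of center patch
    c = int(rng.integers(patch, W - 2 * patch))   # col of center patch
    label = int(rng.integers(8))
    dr, dc = OFFSETS[label]
    center = image[r:r + patch, c:c + patch]
    neighbor = image[r + dr * patch:r + (dr + 1) * patch,
                     c + dc * patch:c + (dc + 1) * patch]
    return center, neighbor, label
```

The twin CNNs are then trained to predict `label` from the patch pair; the paper additionally jitters patch positions so the network cannot cheat with low-level cues.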
(Figure: the learned patch embedding places an input patch near its nearest neighbors. Note: the embedding connects across instances!)
Doersch et al., “Unsupervised Visual Representation Learning by Context Prediction”, ICCV 2015
Ranking
Relative attributes
• We need to compare images by attribute “strength”
• e.g. bright, smiling, natural
Parikh and Grauman, “Relative attributes”, ICCV 2011
Learning relative attributes
• We want to learn a spectrum (ranking model) for an attribute, e.g. “brightness”.
• Supervision consists of:
Ordered pairs
Similar pairs
Parikh and Grauman, “Relative attributes”, ICCV 2011
Learning relative attributes
• Learn a ranking function r_m(x) = w_m^T x (x: image features, w_m: learned parameters) that best satisfies the constraints
Parikh and Grauman, “Relative attributes”, ICCV 2011
Max-margin learning to rank formulation
(Figure: images mapped to relative attribute scores)
Joachims, “Optimizing search engines using clickthrough data”, KDD 2002
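The max-margin formulation can be sketched with a pairwise hinge loss; plain gradient descent stands in here for the QP solver of Joachims’ method, and the “brightness” features and pairs are toy data:

```python
import numpy as np

# Learn a linear ranking function r(x) = w^T x from ordered pairs
# (x_i ranked above x_j) with a pairwise hinge loss.

def train_ranker(ordered_pairs, dim, lr=0.1, epochs=200, margin=1.0):
    """ordered_pairs: list of (x_i, x_j) with x_i ranked above x_j."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_i, x_j in ordered_pairs:
            # hinge: penalize unless w^T x_i exceeds w^T x_j by the margin
            if w @ (x_i - x_j) < margin:
                w += lr * (x_i - x_j)
    return w

# Toy "brightness" attribute: the first feature correlates with brightness.
bright = np.array([0.9, 0.2])
medium = np.array([0.5, 0.7])
dark   = np.array([0.1, 0.4])
pairs = [(bright, medium), (medium, dark), (bright, dark)]
w = train_ranker(pairs, dim=2)
# Learned scores respect the ordering: r(bright) > r(medium) > r(dark)
```

Similar pairs would add constraints that the score difference be small; the full formulation also regularizes ||w||, which this sketch omits.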
(Figure: the rank margin between ordered training pairs projected onto w_m)