
PRanking with Ranking

Based on joint work with Yoram Singer at the Hebrew University of Jerusalem

Koby Crammer Technion – Israel Institute of Technology

Problem

[Figure: the machine's predicted ranks compared against the user's ratings; the ranking loss in this example is 3]

Ranking – Formal Description

•  Instances x ∈ ℝⁿ

•  Labels y ∈ {1, …, k}

•  Structure: a total order 1 ≺ 2 ≺ … ≺ k over the labels

•  Ranking rule H : ℝⁿ → {1, …, k}

•  Ranking Loss |H(x) − y|

Online Framework

•  Algorithm works in rounds
•  On each round the ranking algorithm:
   –  Gets an input instance
   –  Outputs a rank as prediction
   –  Receives the correct rank-value
   –  Computes loss
   –  Updates the rank-prediction rule

Problem Setting
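The round structure above can be sketched as a generic loop (a minimal sketch; the `learner` interface with `predict`/`update` methods is a hypothetical placeholder, not the authors' code):

```python
def online_ranking(learner, stream):
    """Run the online protocol: predict, observe the true rank, suffer loss, update."""
    total_loss = 0
    for x, y in stream:                 # round t: get an input instance
        y_hat = learner.predict(x)      # output a rank as prediction
        total_loss += abs(y - y_hat)    # receive the correct rank, compute the loss
        learner.update(x, y, y_hat)     # update the rank-prediction rule
    return total_loss
```

Any concrete ranker (such as PRank later in the deck) plugs into this loop by supplying the two methods.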

[Figure: rank scale 1 2 3 4 5; x1 is preferred over x2]

Goal

•  Algorithm's loss
•  Loss of a fixed function
•  Regret
•  No statistical assumptions over the data
•  The algorithm should do well irrespective of the specific sequence of inputs and target labels

$L_t = \sum_{i=1}^{t} |y_i - \hat{y}_i|$

$L_t(f) = \sum_{i=1}^{t} |y_i - f(x_i)|$

Regret: $L_t - \inf_{f \in \mathcal{F}} L_t(f)$

Background

Binary Classification

[Figure: hyperplane w separating classes 1 and 2]

The Perceptron Algorithm (Rosenblatt, 1958)

•  Maintain a hyperplane w
•  Get a new instance x
•  Classify x: ŷ = sign(w · x)
•  Update (in case of a mistake): w ← w + y x

[Figure: the hyperplane w rotating toward the misclassified example (x, 1)]
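The three Perceptron steps can be sketched in a few lines (a minimal NumPy sketch, not the original algorithm's code; the `epochs` stopping rule is my own addition):

```python
import numpy as np

def perceptron(X, Y, epochs=10):
    """Rosenblatt's Perceptron: classify with sign(w.x); on a mistake add y*x."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            if np.sign(w @ x) != y:    # mistake (sign 0 also counts as a mistake)
                w = w + y * x          # rotate the hyperplane toward the example
    return w
```

This "update only on a mistake" pattern is exactly the conservative style PRank inherits below.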

A Function Class for Ranking

Our Approach to Ranking

•  Project onto a direction w
•  Apply thresholds to the projected value

[Figure: projection axis split by thresholds into rank regions 4 3 2 1]
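Project-then-threshold can be sketched directly (a minimal sketch: predict the first rank level whose threshold exceeds the projected score):

```python
import numpy as np

def rank(w, b, x):
    """Predict the rank of x: project onto w, then return the first level r
    (1-based) whose threshold b[r-1] exceeds the score, or k if none does."""
    score = np.dot(w, x)
    for r, b_r in enumerate(b, start=1):   # b holds the k-1 sorted thresholds
        if score < b_r:
            return r
    return len(b) + 1                      # score cleared every threshold: top rank
```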

Update of a Specific Algorithm

•  Don't fix it if it's not broken
•  As little change as possible
•  One step at a time

PRank

•  Maintain a direction w and thresholds b₁ ≤ … ≤ b_{k−1}, one per boundary between rank levels
•  Rank a new instance x: ŷ = min { r : w · x < b_r }, or k if no threshold exceeds w · x
•  Get the correct rank y (the correct rank interval on the projection axis)
•  Compute the error-set E: the thresholds lying on the wrong side of w · x relative to y

[Figure: projection axis with thresholds separating rank levels 1–5]

PRank – Update

•  Direction w, thresholds b₁ ≤ … ≤ b_{k−1}
•  Rank a new instance x, get the correct rank y, compute the error-set E
•  Update:
   –  Each violated threshold moves one step toward its correct side: b_r ← b_r − y_r for r ∈ E, where y_r = +1 if y > r and −1 otherwise
   –  The direction moves with the sum of the violations: w ← w + (Σ_{r∈E} y_r) x

The PRank Algorithm

•  Maintain w and the thresholds
•  Get an instance x
•  Predict ŷ
•  Get the true rank y
•  Compute the error set E
•  Mistake? If yes, update; either way, continue to the next round
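The PRank rounds above fit in a short loop (a sketch of the algorithm as described on these slides; variable names are mine):

```python
import numpy as np

def prank(stream, dim, k):
    """Online PRank: a direction w plus k-1 thresholds b, updated on mistakes."""
    w, b = np.zeros(dim), np.zeros(k - 1)
    total_loss = 0
    for x, y in stream:                           # true rank y is in {1, ..., k}
        score = w @ x
        y_hat = next((r + 1 for r, b_r in enumerate(b) if score < b_r), k)
        total_loss += abs(y - y_hat)
        y_r = np.where(np.arange(1, k) < y, 1.0, -1.0)    # +1 iff true rank > level r
        tau = np.where(y_r * (score - b) <= 0, y_r, 0.0)  # nonzero on the error set E
        w = w + tau.sum() * x                     # move the direction
        b = b - tau                               # move each violated threshold one step
    return w, b, total_loss
```

On a correctly ranked round `tau` is all zeros, so the rule is conservative: nothing changes unless there is a mistake.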

Analysis: Two Lemmas

Consistency

•  Can the thresholds ever get out of order (e.g. b₄ < b₂ < b₃ < b₁ on the projection axis)?
•  No: the order of the thresholds is preserved after each round of PRank: b₁ ≤ b₂ ≤ … ≤ b_{k−1}.
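The lemma can be spot-checked numerically: feeding the PRank update arbitrary, even inconsistent, labels never unsorts the thresholds (a sketch, not a proof; the update rule is restated inline):

```python
import numpy as np

rng = np.random.default_rng(0)
k, dim = 5, 3
w, b = np.zeros(dim), np.zeros(k - 1)
for _ in range(1000):
    x = rng.normal(size=dim)
    y = int(rng.integers(1, k + 1))              # an arbitrary rank in {1, ..., k}
    y_r = np.where(np.arange(1, k) < y, 1.0, -1.0)
    tau = np.where(y_r * (w @ x - b) <= 0, y_r, 0.0)
    w, b = w + tau.sum() * x, b - tau
    assert np.all(np.diff(b) >= 0), "thresholds out of order"
print("threshold order preserved for 1000 random rounds")
```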

Given:

•  An arbitrary input sequence

Easy case:

•  Assume there exists a model that ranks all the input instances correctly
   –  The total loss the algorithm suffers is bounded

Hard case:

•  In general
   –  A "regret" is bounded

Regret Bound

$L_t - \inf_{f \in \mathcal{F}} \tilde{L}_t(f)$

Ranking Margin

Margin(x, y) = min_r y_r (w · x − b_r): the smallest signed distance from the projected point to any threshold (y_r = +1 if y > r, −1 otherwise)

Margin = min over all examples (x, y) of Margin(x, y)

[Figure: projected points at rank levels 1 2 4 5 3, with their distances to the adjacent thresholds]
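The per-example margin above can be computed directly (a sketch; `w` and `b` are assumed to rank the example correctly, otherwise the value comes out non-positive):

```python
import numpy as np

def ranking_margin(w, b, x, y):
    """min over levels r of y_r * (w.x - b[r-1]), with y_r = +1 iff y > r."""
    y_r = np.where(np.arange(1, len(b) + 1) < y, 1.0, -1.0)
    return float(np.min(y_r * (w @ x - b)))
```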

Mistake Bound

Given:

•  An input sequence (x₁, y₁), …, (x_T, y_T)
•  The norm of the instances is bounded: ‖x_t‖ ≤ R
•  The sequence is ranked correctly by a normalized ranker with margin γ > 0

Then: the number of mistakes PRank makes is bounded; the total ranking loss satisfies Σ_t |ŷ_t − y_t| ≤ (k − 1)(R² + 1) / γ².

Exploit Structure

                 Range structure   Constraint
Classification   None              Under-constrained
Regression       Metric            Over-constrained
Ranking          Order             Accurately constrained

Other Approaches

•  Treat ranking as classification or regression (e.g. Basu, Hirsh, Cohen 1998)

•  Reduce a ranking problem into a classification problem over pairs of examples (e.g. Freund, Iyer, Schapire, Singer 1998; Herbrich, Graepel, Obermayer 2000)

   –  Not simple to combine preference predictions over pairs into a single consistent ordering

   –  No simple adaptation for online settings

Empirical Study

An Illustration

•  Five concentric ellipses
•  Training set of 50 points
•  Three approaches:
   –  PRank (ranking)
   –  MC-Perceptron (classification)
   –  Widrow-Hoff (regression)
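The synthetic setup can be reproduced roughly as follows (a sketch; the band edges and the squared-feature trick are my own choices, with squared coordinates the elliptic bands become linearly rankable):

```python
import numpy as np

rng = np.random.default_rng(1)
# a training set of 50 points; the rank of a point is the index of the
# concentric elliptic band it falls in (band edges below are illustrative)
pts = rng.uniform(-1.0, 1.0, size=(50, 2))
r2 = pts[:, 0] ** 2 + (pts[:, 1] / 0.5) ** 2     # elliptic "radius" squared
edges = [0.2, 0.5, 1.0, 2.0]                     # four boundaries -> five bands
ranks = np.searchsorted(edges, r2) + 1           # ranks in {1, ..., 5}
feats = pts ** 2                                 # r2 is linear in these features
```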

EachMovie database

•  74,424 registered users
•  1,648 listed movies
•  Users' rankings of movies
•  7,451 users saw more than 100 movies
•  1,801 users saw more than 200 movies

Ranking Loss, 100 Viewers

[Plot: ranking loss vs. round for Widrow-Hoff (regression, over-constrained), MC-Perceptron (classification, under-constrained), and PRank (accurately constrained)]

Ranking Loss, 200 Viewers

[Plot: ranking loss vs. round for Widrow-Hoff (regression), MC-Perceptron (classification), and PRank]

Demonstration

(1) The user chooses movies from this list

(2) Movies chosen and ranked by the user

(3) Press the 'learn' key; the system learns the user's taste

(4) The system re-ranks the training set

(5) The system re-ranks a fresh set of yet-unseen movies

(6) Press the 'flip' button to see which movies you should not view

(7) The flipped list

•  Many alternatives to formulate ranking
•  Choose the one that best models your problem
•  Exploit and incorporate structure
•  Specifically:

   –  An online algorithm for ranking problems via projections and a conservative update of the projection's direction and the threshold values

   –  Experiments on a synthetic dataset and on the EachMovie dataset indicate that the PRank algorithm performs better than algorithms for classification and regression
