Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models

Michiel Stock¹, Krzysztof Dembczyński², Bernard De Baets¹ & Willem Waegeman¹

¹ Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University
² Institute of Computing Science, Poznan University of Technology, Poznan 60-695, Poland

ECML-PKDD 2016

Top-K inference for SEP-LR

Michiel Stock, twitter: @michielstock

Top-K retrieval
Problem description
Applications
Scoring model
Example: recipe recommendation
The threshold algorithm
Algorithm
Illustration
Theoretical properties
Experimental results
Extensions
Conclusions

KERMIT



Problem statement

Top-K inference: for a given query, a set of targets and a pairwise scoring model, find the K targets with the highest scores.

$$S_x^K = \arg\max_{S \in [\mathcal{Y}]^K} \; \min_{y \in S} s(x, y),$$

with

the query $x$
the targets $y \in \mathcal{Y}$ with $|\mathcal{Y}| = M$
the scoring model $s(x, y)$, which quantifies how well a target matches a query
the top-K set $S_x^K$



Application: item recommendation

Searching for recommended books for a user of a website


Application: image tagging

Searching for relevant tags for an image


Application: molecular docking

Searching for small compounds that bind to a given protein


The score model

Separable linear relational model:

$$s(x, y) = u(x)^\top t(y) = \sum_{r=1}^{R} u_r(x)\, t_r(y).$$

x = query, u(x) = model vector for x
y = target, t(y) = model vector for y

1. Collaborative filtering (e.g. matrix factorization).
2. Multi-label classification (e.g. multi-label classifier).
3. Content-based filtering/dyadic prediction (e.g. pairwise regression).


Item recommendation: x = user, u(x) = u_x (latent feature vector of the user); y = book, t(y) = t_y (latent feature vector of the book).


Image tagging: x = image, u(x) = Φ(x) (image features); y = tag, t(y) = w_y (weight vector for the tag).


Molecular docking: x = disease protein, u(x) = Φ(x) (protein features); y = candidate drug, t(y) = WΨ(y) (weighted drug features).


Example: recipe recommendation

Recipe retrieval: for a set of ingredients, find the best recipe from a database.

query: x = {egg, chocolate, bacon}

recipe                egg  cream  sugar  chocolate  tomato  pasta  cheese  bacon
spaghetti Bolognese     0      0      0          0       1      1       1      0
scrambled egg           1      0      0          0       0      0       1      1
pasta carbonara         1      1      0          0       0      1       1      1
crème brûlée            1      1      1          0       0      0       0      0
cheesecake              1      0      1          0       0      0       1      0
chocolate bacon pie     1      0      1          1       0      0       0      1
chocolate mousse        1      1      1          1       0      0       0      0
meal salad              1      0      0          0       1      0       1      1
whipped cream           0      1      1          0       0      0       0      0



Transformation:

1. weighting ingredients: $t(y)'_r = t(y)_r \sqrt{\log(1/p_r)}$, with $p_r$ = probability of ingredient $r$
2. normalizing: $t(y)''_r = t(y)'_r / \|t(y)'\|_2$

recipe                egg  cream  sugar  chocolate  tomato  pasta  cheese  bacon
spaghetti Bolognese     0      0      0          0     .65    .65     .40      0
scrambled egg         .39      0      0          0       0      0     .60    .70
pasta carbonara       .25    .45      0          0       0    .62     .39    .45
crème brûlée          .39    .70    .60          0       0      0       0      0
cheesecake            .42      0    .64          0       0      0     .64      0
chocolate bacon pie   .28      0    .43        .69       0      0       0    .51
chocolate mousse      .28    .51    .43        .69       0      0       0      0
meal salad            .28      0      0          0     .69      0     .43    .51
whipped cream           0    .76    .65          0       0      0       0      0
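The two transformation steps can be checked numerically. The sketch below is only an illustration: it assumes that p_r is the fraction of recipes in the database that contain ingredient r and that the logarithm is natural, neither of which is stated on the slide. Under these assumptions the rounded output agrees with the weighted table. The names `B`, `weights` and `T` are illustrative.

```python
import numpy as np

# Binary recipe-by-ingredient matrix; columns are
# egg, cream, sugar, chocolate, tomato, pasta, cheese, bacon.
B = np.array([
    [0, 0, 0, 0, 1, 1, 1, 0],  # spaghetti Bolognese
    [1, 0, 0, 0, 0, 0, 1, 1],  # scrambled egg
    [1, 1, 0, 0, 0, 1, 1, 1],  # pasta carbonara
    [1, 1, 1, 0, 0, 0, 0, 0],  # creme brulee
    [1, 0, 1, 0, 0, 0, 1, 0],  # cheesecake
    [1, 0, 1, 1, 0, 0, 0, 1],  # chocolate bacon pie
    [1, 1, 1, 1, 0, 0, 0, 0],  # chocolate mousse
    [1, 0, 0, 0, 1, 0, 1, 1],  # meal salad
    [0, 1, 1, 0, 0, 0, 0, 0],  # whipped cream
], dtype=float)

# Assumption: p_r is the fraction of recipes that contain ingredient r.
p = B.mean(axis=0)

# Step 1: weight each ingredient by sqrt(log(1 / p_r)) (natural log assumed).
weights = np.sqrt(np.log(1.0 / p))
T_weighted = B * weights

# Step 2: scale every recipe vector to unit Euclidean length.
T = T_weighted / np.linalg.norm(T_weighted, axis=1, keepdims=True)

print(np.round(T, 2))
```

Rare ingredients such as chocolate receive a larger weight than common ones such as egg, so a match on a rare ingredient contributes more to the score.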



query: x = {egg, chocolate, bacon}, $t(x) = [0.16, 0, 0, 0.87, 0, 0, 0, 0.47]^\top$

Calculate all scores in O(MR)

recipe                egg  cream  sugar  chocolate  tomato  pasta  cheese  bacon  score
spaghetti Bolognese     0      0      0          0     .65    .65     .40      0      0
scrambled egg         .39      0      0          0       0      0     .60    .70    .52
pasta carbonara       .25    .45      0          0       0    .62     .39    .45    .33
crème brûlée          .39    .70    .60          0       0      0       0      0    .12
cheesecake            .42      0    .64          0       0      0     .64      0    .13
chocolate bacon pie   .28      0    .43        .69       0      0       0    .51    .90
chocolate mousse      .28    .51    .43        .69       0      0       0      0    .62
meal salad            .28      0      0          0     .69      0     .43    .51    .37
whipped cream           0    .76    .65          0       0      0       0      0      0
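A minimal sketch of the exhaustive O(MR) baseline: a single matrix-vector product scores all M targets. The matrix `T` holds the rounded weighted table. The query vector `q` is built here by applying the same weighting and normalization to the ingredient set {egg, chocolate, bacon}, with p_r taken as the fraction of recipes containing ingredient r (an assumption, not stated on the slide). Under that assumption `q` is roughly 0.31, 0.77 and 0.56 on egg, chocolate and bacon, which differs from the printed t(x), and the resulting scores agree with the score column to within rounding.

```python
import numpy as np

recipes = [
    "spaghetti Bolognese", "scrambled egg", "pasta carbonara",
    "creme brulee", "cheesecake", "chocolate bacon pie",
    "chocolate mousse", "meal salad", "whipped cream",
]

# Rounded weighted table; columns are
# egg, cream, sugar, chocolate, tomato, pasta, cheese, bacon.
T = np.array([
    [0,   0,   0,   0,   .65, .65, .40, 0  ],
    [.39, 0,   0,   0,   0,   0,   .60, .70],
    [.25, .45, 0,   0,   0,   .62, .39, .45],
    [.39, .70, .60, 0,   0,   0,   0,   0  ],
    [.42, 0,   .64, 0,   0,   0,   .64, 0  ],
    [.28, 0,   .43, .69, 0,   0,   0,   .51],
    [.28, .51, .43, .69, 0,   0,   0,   0  ],
    [.28, 0,   0,   0,   .69, 0,   .43, .51],
    [0,   .76, .65, 0,   0,   0,   0,   0  ],
])

# Assumption: p_r is the fraction of recipes that contain ingredient r.
p = (T > 0).mean(axis=0)
weights = np.sqrt(np.log(1.0 / p))

# Query {egg, chocolate, bacon}: weight the present ingredients, then normalize.
q = np.zeros(T.shape[1])
q[[0, 3, 7]] = weights[[0, 3, 7]]
q /= np.linalg.norm(q)

# M dot products of length R: O(MR) work for one query.
scores = T @ q

K = 1
top = np.argsort(-scores)[:K]
print([recipes[i] for i in top], np.round(scores, 2))
```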


Chocolate bacon pie


The threshold algorithm

Get R sorted lists $L_1, L_2, \ldots, L_R$ with pointers to the targets, sorted according to partial score $u_r(x)\, t_r(y)$.

lower bound = lowest score in $S_x^K$

upper bound = $\sum_{r=1}^{R} u_r(x)\, t_r(y_{L_r(d)})$

$S_x^K \leftarrow \emptyset$; $d \leftarrow 1$
while lower bound < upper bound do
    for r ∈ 1 … R do
        pop target $y_{L_r(d)}$
        score $y_{L_r(d)}$ (if new) and update $S_x^K$
    end for
    d ← d + 1
end while
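The pseudocode above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the query vector is assumed elementwise nonnegative, so that one offline ordering of each column of `T` is also the ordering by partial score. The upper bound is evaluated at the next unread position of each list, and indices are 0-based. The names `build_sorted_lists` and `threshold_topk` are hypothetical.

```python
import heapq
import numpy as np

def build_sorted_lists(T):
    """Offline step: for every feature r, target indices by decreasing T[:, r]."""
    return [np.argsort(-T[:, r], kind="stable") for r in range(T.shape[1])]

def threshold_topk(u, T, lists, K):
    """Return (top-K as [(index, score)], number of targets scored).

    Assumes u >= 0 elementwise, so descending order of T[:, r] is also
    descending order of the partial score u[r] * T[y, r].
    """
    M, R = T.shape
    seen = set()
    heap = []  # min-heap holding the K best (score, index) pairs so far
    for d in range(M):
        for r in range(R):
            y = int(lists[r][d])
            if y in seen:
                continue  # already scored via another list
            seen.add(y)
            s = float(u @ T[y])
            if len(heap) < K:
                heapq.heappush(heap, (s, y))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, y))
        if d + 1 == M:
            break  # every target has been scored
        # Any target not seen yet sits at depth d + 1 or deeper in every list,
        # so its score cannot exceed this sum.
        upper = sum(u[r] * T[lists[r][d + 1], r] for r in range(R))
        if len(heap) == K and heap[0][0] >= upper:
            break  # lower bound >= upper bound: the top-K set is final
    top = sorted(((y, s) for s, y in heap), key=lambda pair: -pair[1])
    return top, len(seen)

# Reduced recipe table (egg, chocolate, bacon) and the printed query vector.
T = np.array([
    [0, 0, 0], [.39, 0, .70], [.25, 0, .45], [.39, 0, 0], [.42, 0, 0],
    [.28, .69, .51], [.28, .69, 0], [.28, 0, .51], [0, 0, 0],
])
u = np.array([0.16, 0.87, 0.47])

lists = build_sorted_lists(T)
top, n_scored = threshold_topk(u, T, lists, K=1)
print(top, n_scored)  # index 5 is the sixth recipe; 4 of 9 targets are scored
```

A query with negative entries would need the corresponding lists to be read from the other end; that case is left out of this sketch.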


Illustration of threshold algorithm

Simplify the problem: remove ingredients that do not contribute to the score.

query: x = {egg, chocolate, bacon}, $t(x) = [0.16, 0.87, 0.47]^\top$

recipe                egg  chocolate  bacon
spaghetti Bolognese     0          0      0
scrambled egg         .39          0    .70
pasta carbonara       .25          0    .45
crème brûlée          .39          0      0
cheesecake            .42          0      0
chocolate bacon pie   .28        .69    .51
chocolate mousse      .28        .69      0
meal salad            .28          0    .51
whipped cream           0          0      0


Make sorted lists (offline): make R sorted lists, one per ingredient, with the targets (recipes) sorted according to $t(y)_r$.

Sorted list for egg:

recipe                id  egg
cheesecake             5  .42
scrambled egg          2  .39
crème brûlée           4  .39
chocolate bacon pie    6  .28
chocolate mousse       7  .28
meal salad             8  .28
pasta carbonara        3  .25
spaghetti Bolognese    1    0
whipped cream          9    0

Sorted lists of recipe ids (L1 = egg, L4 = chocolate, L8 = bacon):

rank  L1  L4  L8
1      5   6   2
2      2   7   6
3      4   1   8
4      6   2   3
5      7   3   1
6      8   4   4
7      3   5   5
8      1   8   7
9      9   9   9
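The offline step amounts to one sort per ingredient. The short sketch below rebuilds the three lists from the reduced table. It assumes that ties are broken by ascending recipe id, which is consistent with the lists shown but not stated on the slide. The names `table` and `sorted_list` are illustrative.

```python
# Reduced weighted table (egg, chocolate, bacon), keyed by recipe id 1..9.
table = {
    1: (0.00, 0.00, 0.00),  # spaghetti Bolognese
    2: (0.39, 0.00, 0.70),  # scrambled egg
    3: (0.25, 0.00, 0.45),  # pasta carbonara
    4: (0.39, 0.00, 0.00),  # creme brulee
    5: (0.42, 0.00, 0.00),  # cheesecake
    6: (0.28, 0.69, 0.51),  # chocolate bacon pie
    7: (0.28, 0.69, 0.00),  # chocolate mousse
    8: (0.28, 0.00, 0.51),  # meal salad
    9: (0.00, 0.00, 0.00),  # whipped cream
}

def sorted_list(column):
    """Recipe ids by decreasing value in `column`; ties keep ascending id."""
    # sorted() is stable, so ids with equal values stay in ascending order.
    return sorted(sorted(table), key=lambda i: -table[i][column])

L1, L4, L8 = (sorted_list(c) for c in range(3))
print(L1, L4, L8)
```

Because the lists depend only on the targets, they can be built once and reused for every query.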


Threshold algorithm: pop targets from the top of the lists and process them until lower bound > upper bound.


$t(x) = [0.16, 0.87, 0.47]^\top$

recipe                id  egg  chocolate  bacon
spaghetti Bolognese    1    0          0      0
scrambled egg          2  .39          0    .70
pasta carbonara        3  .25          0    .45
crème brûlée           4  .39          0      0
cheesecake             5  .42          0      0
chocolate bacon pie    6  .28        .69    .51
chocolate mousse       7  .28        .69      0
meal salad             8  .28          0    .51
whipped cream          9    0          0      0

To score: {2, 5, 6}


Scores: scrambled egg (id 2) .52, cheesecake (id 5) .13, chocolate bacon pie (id 6) .90

$S_x^1 = \{6\}$

lower bound = 0.90
upper bound = .39 × .16 + .69 × .87 + .47 × .51 = 0.93


To score: {7}

Score: chocolate mousse (id 7) .62

$S_x^1 = \{6\}$

lower bound = 0.90
upper bound = .39 × .16 + 0 × .87 + .47 × .51 = 0.30


Theoretical properties of the threshold algorithm

Guaranteed to find the correct top-K set in finite time, for any size K of the top-K set.

Instance optimal: no algorithm that does not make wild guesses has a lower time complexity than the threshold algorithm (up to linear scaling), for any query and any set of targets.


Threshold algorithm for collaborative filtering

[Figure: two panels, "Memory-based (sparse)" and "Model-based (dense)". x-axis: size of database (M), logarithmic, 10^1 to 10^6. y-axis: relative cost threshold, logarithmic, 10^-5 to 10^0. Legend: K = 1, 5, 10, 25, 50; data = Recipes, BookCrossing, Audioscrobbler, Movielens100K, Movielens1M; R = 10, R = 250.]

Extensions

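Partial threshold algorithm: stop computing a score when it cannot improve the top-K set. No improvement is possible if

\[
s(x, y) \le \sum_{r=1}^{l} u_r(x)\, t_r(y) + \sum_{r=l+1}^{R} u_r(x)\, t_r\bigl(y_{L_r(d)}\bigr) \le \text{lower bound} .
\]

Here the first l terms of the score of y have been computed exactly, and the remaining R − l terms are bounded by the feature values at the current depth d of the sorted lists.

Halted threshold algorithm: stop the threshold algorithm before it terminates.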
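The partial scoring test can be sketched in Python as follows. This is an illustrative reading of the inequality rather than the authors' code. It assumes non-negative query weights and that the target is not above the current depth in any list. The function name `partial_score` and its arguments are hypothetical, and the numbers are borrowed from the recipe illustration, with scores recomputed from the three displayed features.

```python
import numpy as np

def partial_score(u, t_y, frontier, lower_bound):
    """Score one target term by term; stop once it cannot beat lower_bound.

    u           : query weights u_r(x), assumed non-negative
    t_y         : features t_r(y) of the target being scored
    frontier    : feature values t_r(y_{L_r(d)}) at the current depth d
    lower_bound : score of the K-th best target found so far
    Returns (score, R) if fully scored, or (None, l) if pruned after l terms.
    """
    R = len(u)
    # remaining[l] = sum over r >= l of u_r * frontier_r: bound on the terms not yet evaluated.
    remaining = np.concatenate([np.cumsum((u * frontier)[::-1])[::-1], [0.0]])
    partial = 0.0
    for l in range(R):
        if partial + remaining[l] <= lower_bound:
            return None, l  # the target cannot improve the top-K set
        partial += float(u[l] * t_y[l])
    return partial, R

u = np.array([0.16, 0.87, 0.47])
frontier = np.array([0.39, 0.69, 0.51])   # values at the second position of L1, L4, L8
carbonara = np.array([0.25, 0.00, 0.45])  # pasta carbonara
print(partial_score(u, carbonara, frontier, lower_bound=0.8848))
```

With the lower bound at 0.8848, the call prunes pasta carbonara after evaluating a single term, because the one exact term plus the bound on the other two already falls below the lower bound.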

Queries under a microscope

[Figure: panels for 100 queries on the Audioscrobbler data and 100 queries on the Uniprot data. For each data set, one panel plots the lower bound (logarithmic axis) against the iteration, and one panel plots the fraction of queries (0.0 to 1.0) that found the correct top-5 and the fraction that terminated against the iteration (logarithmic axis).]

Take-home messages

Many machine learning methods can be formulated as separable linear relational models.

The simple and exact threshold algorithm efficiently solves the top-K inference problem for these models.

Future research: a probabilistic threshold algorithm and extensions to non-separable linear relational models.

Exact and efficient top-K inference for multi-target prediction by querying separable linear relational models

Michiel Stock¹, Krzysztof Dembczyński², Bernard De Baets¹ & Willem Waegeman¹

¹ Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University

² Institute of Computing Science, Poznan University of Technology, Poznan 60-695, Poland

ECML-PKDD 2016