70
The Effects of Time on Query Flow Graph-based Models for Query Suggestion Ranieri Baraglia, Franco Maria Nardini Raffaele Perego, Fabrizio Silvestri HPC Lab, ISTI-CNR, Pisa O D E R U D W R U \ Carlos Castillo, Debora Donato Yahoo! Research Barcelona martedì 4 maggio 2010

The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Embed Size (px)

DESCRIPTION

Ranieri Baraglia, Carlos Castillo, Debora Donato, Franco Maria Nardini, Raffaele Perego and Fabrizio Silvestri: "The Effects of Time on Query Flow Graph-based Models for Query Suggestion". In proceedings of RIAO. Paris, France, 2010. Slides prepared by Franco Maria Nardini

Citation preview

Page 1: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Effects of Time on Query Flow Graph-based Models for

Query SuggestionRanieri Baraglia, Franco Maria Nardini

Raffaele Perego, Fabrizio Silvestri

HPC Lab, ISTI-CNR, Pisa

Carlos Castillo, Debora Donato

Yahoo! Research Barcelona

martedì 4 maggio 2010

Page 2: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Outline

martedì 4 maggio 2010

Page 3: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Outline

• Introduction

• Aims of this Work

• The Query-Flow Graph

• Evaluating the Aging Effect

• Combating the Aging Effect

• Distributed QFG Building

• Conclusions & Future Works

martedì 4 maggio 2010

Page 4: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Introduction

martedì 4 maggio 2010

Page 5: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Introduction

• Web search engines use query recommender systems to improve users’ search experience;

martedì 4 maggio 2010

Page 6: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Introduction

• Web search engines use query recommender systems to improve users’ search experience;

• Query recommender systems give hints to users on possible “interesting queries”:

• relative to their information needs;

martedì 4 maggio 2010

Page 7: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Introduction

• Web search engines use query recommender systems to improve users’ search experience;

• Query recommender systems give hints to users on possible “interesting queries”:

• relative to their information needs;

• Query recommender systems exploit the knowledge of past web search engines users:

• recorded in query logs.

martedì 4 maggio 2010

Page 8: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Aims of this Work

martedì 4 maggio 2010

Page 9: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Aims of this Work

• to show that time has negative effects on a query recommender model:

• the model becomes unable to generate good suggestions as time passes;

• bursty queries;

martedì 4 maggio 2010

Page 10: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Aims of this Work

• to show that time has negative effects on a query recommender model:

• the model becomes unable to generate good suggestions as time passes;

• bursty queries;

• to extend a state-of-the-art recommender system by providing a methodology for dealing efficiently with evolving data;

• to define a “good” strategy to update the model;

• to define an distributed/parallel algorithm to update the model;

martedì 4 maggio 2010

Page 11: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

martedì 4 maggio 2010

Page 12: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• QFG [Boldi et al., CIKM’08] is a compact and powerful representation of Web Search engine users’ behavior;

barcelona fc

<T>

0.506

barcelona fcwebsite

0.043barcelona fc

fixtures0.031

realmadrid

0.017

barcelonaweather

0.523

barcelonahotels

0.018

barcelonaweatheronline

0.100

barcelona

0.018

0.011

0.439

cheapbarcelona

hotels

0.072

luxurybarcelona

hotels

0.029

0.080

0.416

0.043

0.023

Figure 2: A portion of the query flow graph usingthe weighting scheme based on relative frequencies,described on Section 4.

Let f(s, q) and f(q, t) indicate the number of times query qis the first and last query of a session, respectively.

The weight we use is:

w!(q, q!) =

!f(q,q!)

f(q) if (w(q, q!) > !) ! (q = s) ! (q = t)

0 otherwise,

which uses the chaining probabilities w(q, q!) basically todiscard pairs that have a probability of less than ! to bepart of the same chain.

By construction, the sum of the weights of the edges go-ing out from each node is equal to 1. The result of such anormalization can be viewed as the transition matrix P of aMarkov chain.

In Figure 2 we show a small snapshot of the query flowgraph we produce with this weighting scheme. This containsthe query “barcelona” and some of its followers up to adepth of 2, selected in decreasing order of count. Also theterminal node t is present in the figure. Note that the sum ofoutgoing edges from each node does not reach 1 just becausenot all outgoing edges (and relative destination nodes) arereported.

5. FINDING CHAINSIn this section we describe our first application of the

query-flow graph: finding chains of queries in user sessions.As we have already mentioned, finding chains is a very im-portant problem as it allows improving query-log analysis,user profiling, mining user behavior, and more. For thisapplication we use the first weighing scheme described inSection 4 based on chaining probabilities.

The problem we consider is the following. We are given asupersession S = "q1, q2, . . . , qk# of one particular user. We

are also given the query-flow graph, which has been com-puted with the sessions of S as part of its input. The chain-finding problem can also be defined in the case that thesessions of S have not participated in the construction ofthe query-flow graph. However, in this paper we focus onthe former case and we leave the latter for future work.

One of the challenges of the problem we consider arisesfrom our definition of chains: we allow chains not to be con-secutive in the supersession S; in other words, the super-session S may contain many intertwined chains such as theones shown in the Table 1. Previous work has mostly focusedon the case where all chains are consecutive.

Chain #1 Chain #2

. . . . . .football results january 2nd pointui forumroyal carribean cruises audi ipswichholidays golfers elbowmotherwell football club cox ipswich... ...

Table 1: Two fragments from actual sessions con-taining non-consecutive chains.

The chain-finding problem can be formalized as follows:let us define a chain cover of S = "q1, q2, . . . qk# as a par-tition of the set {1, . . . , k} into subsets C1, . . . , Ch. Eachset Cu = {iu1 < · · · < iu!u

} is thought of as a chain Cu ="s, qiu

1, . . . , qiu

!u, t#, that is associated the probability

P (Cu) = P (s, qiu1)P (qiu

1, qiu

2) . . . P (qiu

!u"1, qiu

!u)P (qiu

!u, t)

and we want to find a chain cover maximizing P (C1) . . . P (Ch).When a query appears more than once, “duplicate” nodes

for that query are added to the formulation, which makes thedescription of the algorithm slightly more complicated thanwhat is presented here. For simplicity of the presentation weomit the details related to queries appearing more than oncebelow, which are not fundamental to the understanding ofthe algorithm.

We separate this problem into two subproblems: sessionreordering and session breaking. The session reordering prob-lem is to ensure that all the queries belonging to the samesearch mission are consecutive. Then, the session breakingproblem is much easier as it only needs to deal with non-intertwined chains.

5.1 Session re-ordering by ATSPWe formulate the session re-ordering problem as an in-

stance of the Assymmetric Traveler Salesman Problem (ATSP).Let w(q, q!) be a weight defined as a chaining probabilityfrom Section 4. Given the session S = "q1, q2, . . . qk#, con-sider a directed weighted graph GS = (V, E, h) with nodesV = {s, q1, . . . , qk, t}, edges E and edge weights h definedas h(qi, qj) = $ log w(qi, qj) . An edge (qi, qj) exists in E ifw(qi, qj) > 0.

An optimal ordering is a permutation " of "1, 2, . . . k# thatmaximizes

k"1"

i=1

w(q"(i), q"(i+1)).

This is equivalent to finding a Hamiltonian path of minimumweight in this graph.

613

martedì 4 maggio 2010

Page 13: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• QFG [Boldi et al., CIKM’08] is a compact and powerful representation of Web Search engine users’ behavior;

• QFG is a graph composed by:

1. a set of nodes, V = Q ∪ {s,t};

2. a set of directed edges, E ⊆ V x V:

• (q, q’) are connected if they are consecutive at least one time in at least one session;

3. a weighting function w = E --> (0, 1]:

• assigning a weight w(q, q’) to each edge;

barcelona fc

<T>

0.506

barcelona fcwebsite

0.043barcelona fc

fixtures0.031

realmadrid

0.017

barcelonaweather

0.523

barcelonahotels

0.018

barcelonaweatheronline

0.100

barcelona

0.018

0.011

0.439

cheapbarcelona

hotels

0.072

luxurybarcelona

hotels

0.029

0.080

0.416

0.043

0.023

Figure 2: A portion of the query flow graph usingthe weighting scheme based on relative frequencies,described on Section 4.

Let f(s, q) and f(q, t) indicate the number of times query qis the first and last query of a session, respectively.

The weight we use is:

w!(q, q!) =

!f(q,q!)

f(q) if (w(q, q!) > !) ! (q = s) ! (q = t)

0 otherwise,

which uses the chaining probabilities w(q, q!) basically todiscard pairs that have a probability of less than ! to bepart of the same chain.

By construction, the sum of the weights of the edges go-ing out from each node is equal to 1. The result of such anormalization can be viewed as the transition matrix P of aMarkov chain.

In Figure 2 we show a small snapshot of the query flowgraph we produce with this weighting scheme. This containsthe query “barcelona” and some of its followers up to adepth of 2, selected in decreasing order of count. Also theterminal node t is present in the figure. Note that the sum ofoutgoing edges from each node does not reach 1 just becausenot all outgoing edges (and relative destination nodes) arereported.

5. FINDING CHAINSIn this section we describe our first application of the

query-flow graph: finding chains of queries in user sessions.As we have already mentioned, finding chains is a very im-portant problem as it allows improving query-log analysis,user profiling, mining user behavior, and more. For thisapplication we use the first weighing scheme described inSection 4 based on chaining probabilities.

The problem we consider is the following. We are given asupersession S = "q1, q2, . . . , qk# of one particular user. We

are also given the query-flow graph, which has been com-puted with the sessions of S as part of its input. The chain-finding problem can also be defined in the case that thesessions of S have not participated in the construction ofthe query-flow graph. However, in this paper we focus onthe former case and we leave the latter for future work.

One of the challenges of the problem we consider arisesfrom our definition of chains: we allow chains not to be con-secutive in the supersession S; in other words, the super-session S may contain many intertwined chains such as theones shown in the Table 1. Previous work has mostly focusedon the case where all chains are consecutive.

Chain #1 Chain #2

. . . . . .football results january 2nd pointui forumroyal carribean cruises audi ipswichholidays golfers elbowmotherwell football club cox ipswich... ...

Table 1: Two fragments from actual sessions con-taining non-consecutive chains.

The chain-finding problem can be formalized as follows:let us define a chain cover of S = "q1, q2, . . . qk# as a par-tition of the set {1, . . . , k} into subsets C1, . . . , Ch. Eachset Cu = {iu1 < · · · < iu!u

} is thought of as a chain Cu ="s, qiu

1, . . . , qiu

!u, t#, that is associated the probability

P (Cu) = P (s, qiu1)P (qiu

1, qiu

2) . . . P (qiu

!u"1, qiu

!u)P (qiu

!u, t)

and we want to find a chain cover maximizing P (C1) . . . P (Ch).When a query appears more than once, “duplicate” nodes

for that query are added to the formulation, which makes thedescription of the algorithm slightly more complicated thanwhat is presented here. For simplicity of the presentation weomit the details related to queries appearing more than oncebelow, which are not fundamental to the understanding ofthe algorithm.

We separate this problem into two subproblems: sessionreordering and session breaking. The session reordering prob-lem is to ensure that all the queries belonging to the samesearch mission are consecutive. Then, the session breakingproblem is much easier as it only needs to deal with non-intertwined chains.

5.1 Session re-ordering by ATSPWe formulate the session re-ordering problem as an in-

stance of the Assymmetric Traveler Salesman Problem (ATSP).Let w(q, q!) be a weight defined as a chaining probabilityfrom Section 4. Given the session S = "q1, q2, . . . qk#, con-sider a directed weighted graph GS = (V, E, h) with nodesV = {s, q1, . . . , qk, t}, edges E and edge weights h definedas h(qi, qj) = $ log w(qi, qj) . An edge (qi, qj) exists in E ifw(qi, qj) > 0.

An optimal ordering is a permutation " of "1, 2, . . . k# thatmaximizes

k"1"

i=1

w(q"(i), q"(i+1)).

This is equivalent to finding a Hamiltonian path of minimumweight in this graph.

613

martedì 4 maggio 2010

Page 14: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

martedì 4 maggio 2010

Page 15: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• two weighting schemes:

• relative frequencies: counting query occurrences;

• chaining probabilities: (q,q’) in the same chain

• classification on a set of features (text, n-grams, session) over all sessions where (q,q’) are consecutive;

martedì 4 maggio 2010

Page 16: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• two weighting schemes:

• relative frequencies: counting query occurrences;

• chaining probabilities: (q,q’) in the same chain

• classification on a set of features (text, n-grams, session) over all sessions where (q,q’) are consecutive;

• noisy edges: edges with low probability are removed;

martedì 4 maggio 2010

Page 17: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

martedì 4 maggio 2010

Page 18: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• Query recommendation:

• random walk with restart on the graph;

• considering history of the users (on the preference vector);

martedì 4 maggio 2010

Page 19: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

The Query-Flow Graph

• Query recommendation:

• random walk with restart on the graph;

• considering history of the users (on the preference vector);

• A score is associated to each suggestion;

martedì 4 maggio 2010

Page 20: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

martedì 4 maggio 2010

Page 21: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

• Experiments on the AOL query log:

martedì 4 maggio 2010

Page 22: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

• Experiments on the AOL query log:

• 20 millions queries;

martedì 4 maggio 2010

Page 23: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

• Experiments on the AOL query log:

• 20 millions queries;

• 650,000 different users;

martedì 4 maggio 2010

Page 24: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

• Experiments on the AOL query log:

• 20 millions queries;

• 650,000 different users;

• 3 months (03/01/2006 --> 05/31/2006).

martedì 4 maggio 2010

Page 25: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Framework

• Experiments on the AOL query log:

• 20 millions queries;

• 650,000 different users;

• 3 months (03/01/2006 --> 05/31/2006).

• Three segments of the query log:

!"#$%&'()$ !"#$*+',-$ !"#$%&.$ /!#)$%&.$

M2 M1

martedì 4 maggio 2010

Page 26: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Assumptions

martedì 4 maggio 2010

Page 27: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Assumptions

The AOL data-set contains about 20 million queries is-

sued by about 650, 000 different users, submitted to the AOL

search portal over a period of three months from 1st March,

2006 to 31st May, 2006. Each query record comes with the

user ID, timestamp, and the list of results returned to the

user. After the controversial discussion followed to its initial

public delivery, AOL has withdrawn the query log from their

servers and is not offering it for download anymore. We de-

cided to run experiments on that log anyways, because of

the following reasons. First of all, the log spans a long pe-

riod of time and this allows us to show how models for query

suggestions degrade in a more realistic way. Second, we are

not disclosing any sensitive information neither about users

nor about usage behavior. Therefore, we are not breaching

into the privacy of any specific user. Last, but not least, the

query log is still available on the web. Everybody can easily

find and download it. Indeed we consider this a strong point

in favor of its use: the availability of data allows the repeata-

bility of experiments which is an important requirement for

any scientific work.

To assess the aging effects on QFG models we conducted

several experiments to evaluate the impact of different fac-

tors. The log has been split into three different segments.Two of them have been used for training and the third one

for testing. The three segments correspond to the three dif-

ferent months of users activities recorded in the query log.

We fixed the test set – i.e. the set of queries from which

we generate recommendations – to be the queries submit-

ted in the last month. We also have conducted experiments

with different training granularities, based on weekly and bi-

weekly training sets. Results on those shorter training seg-

ments are consistent with those presented in the following,

and we are omitting them for brevity.

The QFGs over the two monthly training segments have

been constructed according to the algorithm presented by

Boldi et al. in [4]. This method uses chaining probabilitiesmeasured by means of a machine learning method. The ini-

tial step was thus to extract those features from each train-

ing log, and storing them into a compressed graph repre-

sentation. In particular we extracted 25 different features

(time-related, session and textual features) for each pair of

queries (q, q�) that are consecutive in at least one session of

the query log.

Table 1 shows the number of nodes and edges of the dif-

ferent graphs corresponding to each query log segment used

for training.

time window id nodes edges

March 06 M1 3,814,748 6,129,629

April 06 M2 3,832,973 6,266,648

Table 1: Number of nodes and edges for the graphscorresponding to the two different training seg-ments.

It is important to remark that we have not re-trained the

classification model for the assignment of weights associated

with QFG edges. We reuse the one that has been used in [4]

for segmenting users sessions into query chains1. This is

another point in favor of QFG-based models. Once you train

the classifier to assign weights to QFG edges, you can reuse

it on different data-sets without losing in effectiveness. We

1We thank the authors of [4] for providing us their model.

want to point out, indeed, that what we are evaluating in

this work is that QFGs themselves age much faster than the

model used to build them. This is a subtle difference we

want to state clearly.

Once the QFG has been built, the query recommendation

methods are based on the probability of being at a certain

node after performing a random walk over the query graph.

This random walk starts at the node corresponding to the

query for which we want to generate a suggestion. At each

step, the random walker either remains in the same node

with a probability α, or it follows one of the out-links with

probability equal to 1 − α; in the latter case, out-links are

followed proportionally to w(i, j). In all the experiments

we computed the stable vector of the random walk on each

QFG by using α = 0.15. Actually, the stable vector is com-

puted according to a Random Walk with Restart model [13].

Instead of restarting the random walk from a query chosen

uniformly at random, we restart the random walk only from

a given set of nodes. This is done by using a preference

vector v, much in the spirit of the Topic-based PageRank

computation [10], defined as follows. Let q1, . . . , qn be a

query chain (q1 is the most recently submitted query). The

preference vector v is defined in the following way: vq = 0

for all q /∈ q1, . . . , qn and vqi ∝ βi. β is a weighting factor

that we set in all of our experiments to be β = 0.90.

5. EVALUATING THE AGING EFFECT

One of the main goals of this paper is to show that time

has some negative effects on the quality of query suggestions

generated by QFG-based models. It is also worth remark-

ing that we can safely extend the discussion that follows also

to suggestion models different from QFG-based ones. As a

matter of fact, the presence of “bursty” [12] topics could re-

quire frequent model updates whatever model we are using.

To validate our hypothesis about the aging of QFG-based

models we have conducted experiments on models built on

the two different training segments described in the above

section.

In order to assess the various reasons why a QFG-based

model ages we have considered, for each segment, two classes

of queries, namely F1, and F3, which respectively corre-

spond to queries having a strong decrease and a strong in-

crease in frequency. F1 is the set of the 30 queries that are

among the 1,000 most frequent queries in the first month

(M1) but whose frequency has had the greater drop in the

last month covered by the query log (M3). Conversely, F3

is the set of the 30 queries among the 1,000 most frequent

queries in the test log M3 whose frequency has the greater

drop in the first part of the log M1. Actually, to make the

assessment more significant, we do not include queries that

are too similar, and we do not include queries containing

domain names within the query string. Figure 1 graphically

show where the selected queries for each class fall when we

plot the popularity of the top-1000 most frequent queries in

M1 (M3) by considering query ids assigned according to

frequencies in M3 (M1).

Some examples of queries in F1 are: “shakira”, “ameri-canidol”, “ecards”, “nfl”. Such queries are related to par-

ticular events in March 2006, for instance singer Shakira

in March 2006 released a new album. Some examples of

queries in F3 are: “mothers day gift”, “mothers day poems”,“memorial day”, “da vinci code”. As in the previous case,

F3 queries are strongly related to that particular period of

• M1, M2 are used for training;

• two different QFGs;

martedì 4 maggio 2010

Page 28: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Assumptions

• Queries in the third month are used for testing;

The AOL data-set contains about 20 million queries is-

sued by about 650, 000 different users, submitted to the AOL

search portal over a period of three months from 1st March,

2006 to 31st May, 2006. Each query record comes with the

user ID, timestamp, and the list of results returned to the

user. After the controversial discussion followed to its initial

public delivery, AOL has withdrawn the query log from their

servers and is not offering it for download anymore. We de-

cided to run experiments on that log anyways, because of

the following reasons. First of all, the log spans a long pe-

riod of time and this allows us to show how models for query

suggestions degrade in a more realistic way. Second, we are

not disclosing any sensitive information neither about users

nor about usage behavior. Therefore, we are not breaching

into the privacy of any specific user. Last, but not least, the

query log is still available on the web. Everybody can easily

find and download it. Indeed we consider this a strong point

in favor of its use: the availability of data allows the repeata-

bility of experiments which is an important requirement for

any scientific work.

To assess the aging effects on QFG models we conducted

several experiments to evaluate the impact of different fac-

tors. The log has been split into three different segments.Two of them have been used for training and the third one

for testing. The three segments correspond to the three dif-

ferent months of users activities recorded in the query log.

We fixed the test set – i.e. the set of queries from which

we generate recommendations – to be the queries submit-

ted in the last month. We also have conducted experiments

with different training granularities, based on weekly and bi-

weekly training sets. Results on those shorter training seg-

ments are consistent with those presented in the following,

and we are omitting them for brevity.

The QFGs over the two monthly training segments have

been constructed according to the algorithm presented by

Boldi et al. in [4]. This method uses chaining probabilitiesmeasured by means of a machine learning method. The ini-

tial step was thus to extract those features from each train-

ing log, and storing them into a compressed graph repre-

sentation. In particular we extracted 25 different features

(time-related, session and textual features) for each pair of

queries (q, q�) that are consecutive in at least one session of

the query log.

Table 1 shows the number of nodes and edges of the dif-

ferent graphs corresponding to each query log segment used

for training.

time window id nodes edges

March 06 M1 3,814,748 6,129,629

April 06 M2 3,832,973 6,266,648

Table 1: Number of nodes and edges for the graphscorresponding to the two different training seg-ments.

It is important to remark that we have not re-trained the

classification model for the assignment of weights associated

with QFG edges. We reuse the one that has been used in [4]

for segmenting users sessions into query chains1. This is

another point in favor of QFG-based models. Once you train

the classifier to assign weights to QFG edges, you can reuse

it on different data-sets without losing in effectiveness. We

1We thank the authors of [4] for providing us their model.

want to point out, indeed, that what we are evaluating in

this work is that QFGs themselves age much faster than the

model used to build them. This is a subtle difference we

want to state clearly.

Once the QFG has been built, the query recommendation

methods are based on the probability of being at a certain

node after performing a random walk over the query graph.

This random walk starts at the node corresponding to the

query for which we want to generate a suggestion. At each

step, the random walker either remains in the same node

with a probability α, or it follows one of the out-links with

probability equal to 1 − α; in the latter case, out-links are

followed proportionally to w(i, j). In all the experiments

we computed the stable vector of the random walk on each

QFG by using α = 0.15. Actually, the stable vector is com-

puted according to a Random Walk with Restart model [13].

Instead of restarting the random walk from a query chosen

uniformly at random, we restart the random walk only from

a given set of nodes. This is done by using a preference

vector v, much in the spirit of the Topic-based PageRank

computation [10], defined as follows. Let q1, . . . , qn be a

query chain (q1 is the most recently submitted query). The

preference vector v is defined in the following way: vq = 0

for all q /∈ q1, . . . , qn and vqi ∝ βi. β is a weighting factor

that we set in all of our experiments to be β = 0.90.

5. EVALUATING THE AGING EFFECT

One of the main goals of this paper is to show that time

has some negative effects on the quality of query suggestions

generated by QFG-based models. It is also worth remark-

ing that we can safely extend the discussion that follows also

to suggestion models different from QFG-based ones. As a

matter of fact, the presence of “bursty” [12] topics could re-

quire frequent model updates whatever model we are using.

To validate our hypothesis about the aging of QFG-based

models we have conducted experiments on models built on

the two different training segments described in the above

section.

In order to assess the various reasons why a QFG-based

model ages we have considered, for each segment, two classes

of queries, namely F1, and F3, which respectively corre-

spond to queries having a strong decrease and a strong in-

crease in frequency. F1 is the set of the 30 queries that are

among the 1,000 most frequent queries in the first month

(M1) but whose frequency has had the greater drop in the

last month covered by the query log (M3). Conversely, F3

is the set of the 30 queries among the 1,000 most frequent

queries in the test log M3 whose frequency has the greater

drop in the first part of the log M1. Actually, to make the

assessment more significant, we do not include queries that

are too similar, and we do not include queries containing

domain names within the query string. Figure 1 graphically

show where the selected queries for each class fall when we

plot the popularity of the top-1000 most frequent queries in

M1 (M3) by considering query ids assigned according to

frequencies in M3 (M1).

Some examples of queries in F1 are: “shakira”, “ameri-canidol”, “ecards”, “nfl”. Such queries are related to par-

ticular events in March 2006, for instance singer Shakira

in March 2006 released a new album. Some examples of

queries in F3 are: “mothers day gift”, “mothers day poems”,“memorial day”, “da vinci code”. As in the previous case,

F3 queries are strongly related to that particular period of

• M1, M2 are used for training;

• two different QFGs;

martedì 4 maggio 2010

Page 29: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Experimental Assumptions

• Queries in the third month are used for testing;

• We evaluate the aging effect by measuring the quality of suggestions produced by models on M1, and M2;

• If the model ages M2 outperforms M1, in terms of quality of suggestions;

The AOL data-set contains about 20 million queries is-

sued by about 650, 000 different users, submitted to the AOL

search portal over a period of three months from 1st March,

2006 to 31st May, 2006. Each query record comes with the

user ID, timestamp, and the list of results returned to the

user. After the controversial discussion followed to its initial

public delivery, AOL has withdrawn the query log from their

servers and is not offering it for download anymore. We de-

cided to run experiments on that log anyways, because of

the following reasons. First of all, the log spans a long pe-

riod of time and this allows us to show how models for query

suggestions degrade in a more realistic way. Second, we are

not disclosing any sensitive information neither about users

nor about usage behavior. Therefore, we are not breaching

into the privacy of any specific user. Last, but not least, the

query log is still available on the web. Everybody can easily

find and download it. Indeed we consider this a strong point

in favor of its use: the availability of data allows the repeata-

bility of experiments which is an important requirement for

any scientific work.

To assess the aging effects on QFG models we conducted

several experiments to evaluate the impact of different fac-

tors. The log has been split into three different segments.Two of them have been used for training and the third one

for testing. The three segments correspond to the three dif-

ferent months of users activities recorded in the query log.

We fixed the test set – i.e. the set of queries from which

we generate recommendations – to be the queries submit-

ted in the last month. We also have conducted experiments

with different training granularities, based on weekly and bi-

weekly training sets. Results on those shorter training seg-

ments are consistent with those presented in the following,

and we are omitting them for brevity.

The QFGs over the two monthly training segments have

been constructed according to the algorithm presented by

Boldi et al. in [4]. This method uses chaining probabilitiesmeasured by means of a machine learning method. The ini-

tial step was thus to extract those features from each train-

ing log, and storing them into a compressed graph repre-

sentation. In particular we extracted 25 different features

(time-related, session and textual features) for each pair of

queries (q, q�) that are consecutive in at least one session of

the query log.

Table 1 shows the number of nodes and edges of the dif-

ferent graphs corresponding to each query log segment used

for training.

time window id nodes edges

March 06 M1 3,814,748 6,129,629

April 06 M2 3,832,973 6,266,648

Table 1: Number of nodes and edges for the graphscorresponding to the two different training seg-ments.

It is important to remark that we have not re-trained the

classification model for the assignment of weights associated

with QFG edges. We reuse the one that has been used in [4]

for segmenting users sessions into query chains1. This is

another point in favor of QFG-based models. Once you train

the classifier to assign weights to QFG edges, you can reuse

it on different data-sets without losing in effectiveness. We

1We thank the authors of [4] for providing us their model.

want to point out, indeed, that what we are evaluating in

this work is that QFGs themselves age much faster than the

model used to build them. This is a subtle difference we

want to state clearly.

Once the QFG has been built, the query recommendation

methods are based on the probability of being at a certain

node after performing a random walk over the query graph.

This random walk starts at the node corresponding to the

query for which we want to generate a suggestion. At each

step, the random walker either remains in the same node

with a probability α, or it follows one of the out-links with

probability equal to 1 − α; in the latter case, out-links are

followed proportionally to w(i, j). In all the experiments

we computed the stable vector of the random walk on each

QFG by using α = 0.15. Actually, the stable vector is com-

puted according to a Random Walk with Restart model [13].

Instead of restarting the random walk from a query chosen

uniformly at random, we restart the random walk only from

a given set of nodes. This is done by using a preference

vector v, much in the spirit of the Topic-based PageRank

computation [10], defined as follows. Let q1, . . . , qn be a

query chain (q1 is the most recently submitted query). The

preference vector v is defined in the following way: vq = 0

for all q /∈ q1, . . . , qn and vqi ∝ βi. β is a weighting factor

that we set in all of our experiments to be β = 0.90.

5. EVALUATING THE AGING EFFECT

One of the main goals of this paper is to show that time

has some negative effects on the quality of query suggestions

generated by QFG-based models. It is also worth remark-

ing that we can safely extend the discussion that follows also

to suggestion models different from QFG-based ones. As a

matter of fact, the presence of “bursty” [12] topics could re-

quire frequent model updates whatever model we are using.

To validate our hypothesis about the aging of QFG-based

models we have conducted experiments on models built on

the two different training segments described in the above

section.

In order to assess the various reasons why a QFG-based

model ages we have considered, for each segment, two classes

of queries, namely F1, and F3, which respectively corre-

spond to queries having a strong decrease and a strong in-

crease in frequency. F1 is the set of the 30 queries that are

among the 1,000 most frequent queries in the first month

(M1) but whose frequency has had the greater drop in the

last month covered by the query log (M3). Conversely, F3

is the set of the 30 queries among the 1,000 most frequent

queries in the test log M3 whose frequency has the greater

drop in the first part of the log M1. Actually, to make the

assessment more significant, we do not include queries that

are too similar, and we do not include queries containing

domain names within the query string. Figure 1 graphically

show where the selected queries for each class fall when we

plot the popularity of the top-1000 most frequent queries in

M1 (M3) by considering query ids assigned according to

frequencies in M3 (M1).

Some examples of queries in F1 are: “shakira”, “ameri-canidol”, “ecards”, “nfl”. Such queries are related to par-

ticular events in March 2006, for instance singer Shakira

in March 2006 released a new album. Some examples of

queries in F3 are: “mothers day gift”, “mothers day poems”,“memorial day”, “da vinci code”. As in the previous case,

F3 queries are strongly related to that particular period of

• M1, M2 are used for training;

• two different QFGs;

martedì 4 maggio 2010

Page 30: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect

martedì 4 maggio 2010

Page 31: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect

1

10

100

1000

10000

100000

1e+06

1 10 100 1000

Top 1000 queries in month 1 on month 1

Top 1000 queries in month 3 on month 1

!"#"$%"&'()"*+",'

martedì 4 maggio 2010

Page 32: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect

• Two classes of test queries:

• F1: 30 queries highly frequent in M1 having a large drop in the test month (ex. shakira).

• F3: 30 queries highly frequent in the test month having a large drop in M1 (ex. da vinci code, mothers day gift);

1

10

100

1000

10000

100000

1e+06

1 10 100 1000

Top 1000 queries in month 1 on month 1

Top 1000 queries in month 3 on month 1

!"#"$%"&'()"*+",'

martedì 4 maggio 2010

Page 33: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect

• Two classes of test queries:

• F1: 30 queries highly frequent in M1 having a large drop in the test month (ex. shakira).

• F3: 30 queries highly frequent in the test month having a large drop in M1 (ex. da vinci code, mothers day gift);

• F1, F3 contain very diverse queries;

1

10

100

1000

10000

100000

1e+06

1 10 100 1000

Top 1000 queries in month 1 on month 1

Top 1000 queries in month 3 on month 1

!"#"$%"&'()"*+",'

martedì 4 maggio 2010

Page 34: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (II)

martedì 4 maggio 2010

Page 35: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (II)

lost harley davidson5138 50973742 26522162 26152001 23411913 23411913 2341

!"

#!!!"

$!!!"

%!!!"

&!!!"

'!!!"

(!!!"

#" $" %" &" '" ("

)*+,"

-./)01"2.342+*5"

martedì 4 maggio 2010

Page 36: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (II)

• When k suggestions share the same score, those are useless;

lost harley davidson5138 50973742 26522162 26152001 23411913 23411913 2341

!"

#!!!"

$!!!"

%!!!"

&!!!"

'!!!"

(!!!"

#" $" %" &" '" ("

)*+,"

-./)01"2.342+*5"

martedì 4 maggio 2010

Page 37: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (II)

• When k suggestions share the same score, those are useless;

• Same suggestion score:

• same probability on the graph;

• the model is not able to give a priority to recommendations;

lost harley davidson5138 50973742 26522162 26152001 23411913 23411913 2341

!"

#!!!"

$!!!"

%!!!"

&!!!"

'!!!"

(!!!"

#" $" %" &" '" ("

)*+,"

-./)01"2.342+*5"

martedì 4 maggio 2010

Page 38: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (II)

• When k suggestions share the same score, those are useless;

• Same suggestion score:

• same probability on the graph;

• the model is not able to give a priority to recommendations;

• Confirmed by an user-study on F1, and F3;

lost harley davidson5138 50973742 26522162 26152001 23411913 23411913 2341

!"

#!!!"

$!!!"

%!!!"

&!!!"

'!!!"

(!!!"

#" $" %" &" '" ("

)*+,"

-./)01"2.342+*5"

martedì 4 maggio 2010

Page 39: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (III)

martedì 4 maggio 2010

Page 40: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (III)

• Working hypothesis:

• useful recommendations do not share the same recommendation score;

martedì 4 maggio 2010

Page 41: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (III)

• Working hypothesis:

• useful recommendations do not share the same recommendation score;

• Automatic evaluation;

• 400 highly frequent queries in the test month;

• evaluating the number of useful recommendations;

• k = 3;

martedì 4 maggio 2010

Page 42: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (IV)

martedì 4 maggio 2010

Page 43: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (IV)

Set QueryM1 M2

Score Suggestions Score Suggestions

F3

da vinci

49743 da vinci’s self portrait black and white 73219 da vinci and math47294 the vitruvian man 33769 da vinci biography35362 last supper da vinci 31383 da vinci code on portrait31307 leonardo da vinci 29565 flying machines30234 post it 28432 inventions by leonardo da vinci30234 handshape 20stories 26003 leonardo da vinci paintings

23343 friends church23343 jerry c website

survivor

8533 watch survivor episodes 7392 survivor preview8083 survivor island 7298 watch survivor episodes4391 2006 survivor 7110 survivor island album4310 bubba gump hat 4578 survivor edition 20064310 big shorts 3801 cbs the great race

3801 chicken and broccoli

lost

16522 lost fan site 5138 lost season 23634 abcpreview.go.com 3742 lost update2341 altricious 2162 lost easter eggs2341 5 year treasury rate 1913 abcpreview.go.com2341 1-800-sos-radon 1913 antique curio cabinets

1913 4440

F1

anna nicole smith

11113 anna nicole smith nude 23497 anna nicole smith recent news11101 anna nicole smith - supreme court 18694 anna nicole smith and supreme court11054 anna nicole smith diet 18546 anna nicole smith and playboy10274 anna nicole smith with liposuction 16836 anna nicole smith pictures4677 cameron diaz video 15289 anna nicole smith free nude pics4677 calcination 13436 anna nicole smith diet

6642 branson tractors6642 bandlinoblu jeans

harley davidson

5097 harley davidson ny 5749 harley davidson premium sound system owners manual2652 american harley davidson 3859 automatic motorcycles2615 2002 harley davidson ultra classic 3635 harley davidson credit2602 adamec harley davidson 3618 cherokee harley davidson2341 air fight 2103 harley davidson sporster2341 928 zip code 1965 2002 harley davidson classic2341 antispy ware 1394 regions banking

1394 aol email only1394 adultactioncamcom

shakira

10989 shakira video 3281 hips don’t lie7522 shakira albums 3281 shakira hips don’t lie video7522 shakira my hips don’t lie 3175 shakira video5836 shakira biography 3042 shakira wallpaper3864 70s music funk 2811 shakira album3864 97.1zht 2592 shakira nude

1868 cant fight the moonlight1868 free video downloads

Table 2: Some examples of recommendations generated on different QFG models. Queries used to gener-

ate recommendations are taken from different query sets. For each query we present the most important

recommendations with their assigned relative scores.

reduces the “noise” on the data and generates more preciseknowledge on which recommendations are computed. Fur-thermore, the increase is quite independent from the thresh-old level, i.e. by increasing the threshold from 0.5 to 0.75the overall quality is, roughly, constant.

filteringthreshold

average numberof useful sugges-tions on M1

average numberof useful sugges-tions on M2

0 2.84 2.910.5 5.85 6.230.65 5.85 6.230.75 5.85 6.18

Table 4: Recommendation statistics obtained by us-

ing the automatic evaluation method on a set of 400

queries drawn from the most frequent in the third

month.

We further break down the overall results shown in Table 4to show the number of queries on which the QFG-basedmodel generated a given number of useful suggestions. Weplot this histogram to compare those numbers on M1 andM2 in Figure 2. To highlight more the effect of incrementalupdates we show in Figure 3 the total number of querieshaving at least a certain number of useful recommendation.For example, the third bucket shows how many queries haveat least three useful suggestions. For each bucket, results forM2 are always better than the ones for M1.

First of all, when we train a QFG-based model on M2

the percentage of queries having 0 useful results is remark-ably lower than those measured on the model trained on

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

Figure 2: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

Figure 3: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

• Results:

martedì 4 maggio 2010

Page 44: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (IV)

Set QueryM1 M2

Score Suggestions Score Suggestions

F3

da vinci

49743 da vinci’s self portrait black and white 73219 da vinci and math47294 the vitruvian man 33769 da vinci biography35362 last supper da vinci 31383 da vinci code on portrait31307 leonardo da vinci 29565 flying machines30234 post it 28432 inventions by leonardo da vinci30234 handshape 20stories 26003 leonardo da vinci paintings

23343 friends church23343 jerry c website

survivor

8533 watch survivor episodes 7392 survivor preview8083 survivor island 7298 watch survivor episodes4391 2006 survivor 7110 survivor island album4310 bubba gump hat 4578 survivor edition 20064310 big shorts 3801 cbs the great race

3801 chicken and broccoli

lost

16522 lost fan site 5138 lost season 23634 abcpreview.go.com 3742 lost update2341 altricious 2162 lost easter eggs2341 5 year treasury rate 1913 abcpreview.go.com2341 1-800-sos-radon 1913 antique curio cabinets

1913 4440

F1

anna nicole smith

11113 anna nicole smith nude 23497 anna nicole smith recent news11101 anna nicole smith - supreme court 18694 anna nicole smith and supreme court11054 anna nicole smith diet 18546 anna nicole smith and playboy10274 anna nicole smith with liposuction 16836 anna nicole smith pictures4677 cameron diaz video 15289 anna nicole smith free nude pics4677 calcination 13436 anna nicole smith diet

6642 branson tractors6642 bandlinoblu jeans

harley davidson

5097 harley davidson ny 5749 harley davidson premium sound system owners manual2652 american harley davidson 3859 automatic motorcycles2615 2002 harley davidson ultra classic 3635 harley davidson credit2602 adamec harley davidson 3618 cherokee harley davidson2341 air fight 2103 harley davidson sporster2341 928 zip code 1965 2002 harley davidson classic2341 antispy ware 1394 regions banking

1394 aol email only1394 adultactioncamcom

shakira

10989 shakira video 3281 hips don’t lie7522 shakira albums 3281 shakira hips don’t lie video7522 shakira my hips don’t lie 3175 shakira video5836 shakira biography 3042 shakira wallpaper3864 70s music funk 2811 shakira album3864 97.1zht 2592 shakira nude

1868 cant fight the moonlight1868 free video downloads

Table 2: Some examples of recommendations generated on different QFG models. Queries used to gener-

ate recommendations are taken from different query sets. For each query we present the most important

recommendations with their assigned relative scores.

reduces the “noise” on the data and generates more preciseknowledge on which recommendations are computed. Fur-thermore, the increase is quite independent from the thresh-old level, i.e. by increasing the threshold from 0.5 to 0.75the overall quality is, roughly, constant.

filteringthreshold

average numberof useful sugges-tions on M1

average numberof useful sugges-tions on M2

0 2.84 2.910.5 5.85 6.230.65 5.85 6.230.75 5.85 6.18

Table 4: Recommendation statistics obtained by us-

ing the automatic evaluation method on a set of 400

queries drawn from the most frequent in the third

month.

We further break down the overall results shown in Table 4to show the number of queries on which the QFG-basedmodel generated a given number of useful suggestions. Weplot this histogram to compare those numbers on M1 andM2 in Figure 2. To highlight more the effect of incrementalupdates we show in Figure 3 the total number of querieshaving at least a certain number of useful recommendation.For example, the third bucket shows how many queries haveat least three useful suggestions. For each bucket, results forM2 are always better than the ones for M1.

First of all, when we train a QFG-based model on M2

the percentage of queries having 0 useful results is remark-ably lower than those measured on the model trained on

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

Figure 2: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

Figure 3: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

• Results:

• Average number of useful suggestions is greater in M2 than in M1;

• Filtering process helps a lot;

martedì 4 maggio 2010

Page 45: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (V)

martedì 4 maggio 2010

Page 46: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (V)

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

• On a histogram (cumulative distribution):

martedì 4 maggio 2010

Page 47: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Evaluating the Aging Effect (V)

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2

• On a histogram (cumulative distribution):

• Results on M2 are always better than those on M1:

• less queries without suggestions;martedì 4 maggio 2010

Page 48: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect

martedì 4 maggio 2010

Page 49: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect

• QFG recommender models age:

• Average recommendation quality degrades;

• Recommendations should not be influenced by time;

martedì 4 maggio 2010

Page 50: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect

• QFG recommender models age:

• Average recommendation quality degrades;

• Recommendations should not be influenced by time;

• Update of the model vs. rebuilding it “from scratch”;

martedì 4 maggio 2010

Page 51: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (II)

martedì 4 maggio 2010

Page 52: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (II)M1. Furthermore, for Figure 3 we can observe that a model

trained on M2 has a larger percentage of queries for whichthe number of useful suggestions is at least 4.

This confirms our hypothesis that QFG-based recommen-dation models age and have to be updated in order to alwaysgenerate useful suggestions.

6. COMBATING AGING IN QUERY-FLOWGRAPHS

Models based on Query Flow Graphs age quite rapidly interms of their performance for generating useful recommen-dations.

The pool of queries we use to generate recommendationmodels (M1) contains both frequent and time-sensitive queries.We consider time-sensitive those queries that are both fre-quent and have a large variation with respect to their valuein the previous month(s) (e.g. > 50% of positive variation).Time-sensitive queries are related for instance to movies,new products being launched, fashion, and in general withnews events generating query bursts. In Figure 1 it is easyto identify time-sensitive queries by looking at those withthe greater variation between their frequencies over the twomonths.

As Tables 3 and 8 show, if we compute recommendationson older models the average recommendation quality de-grades. Indeed, “stable” queries, i.e. those queries that are(relatively) frequent in each period of the year, also have tobe taken into account by the model. Therefore, we must beable to give suggestions that are not too much influencedby time or, in other words, to update the recommendationmodel instead of rebuilding a new model from scratch dis-regarding older knowledge.

There are two ways of building an updated QFG: i) re-building the model by considering the whole set of queriesfrom scratch, or ii) using the old QFG-based model andadding fresh queries to it.

A straightforward option is to compute a new QFG rep-resenting the two months. The new part of the query log(M2) is merged with the old one (M1) obtaining a newdata-set to be used for building the new model. The mergeddata-set can be processed from scratch by performing all thesteps necessary to build a QFG (i.e. preprocessing, featuresgeneration, compression, normalization, chain graph gener-ation, random walk computation). The old model is simplydiscarded.

The other option is computing recommendations on an“incremental model” that is built by merging the old QFGswith the one obtained from the new queries. In this way:i) we refresh the model with new data covering time-relatedqueries; ii) the “old” part of the model contributes to main-tain quality on frequent and time-unrelated queries. Themethodology we are proposing incrementally extends therecommendation model with a sub-model built on more re-cent data. A sound incremental approach has to consistentlyupdate the old model with fresh data continuously or afterfixed periods of time. Another advantage of this approach isthat the new model can be built by spending only the timeneeded to build a QFG from a relatively small set of newqueries, plus the cost of the merging process.

Let us introduce an example to show the main differencesamong the two approaches in terms of computational time.Table 5 shows the total elapsed times to create different

QFGs. Suppose the model used to generate recommenda-tions consists of a portion of data representing one month(for M1 and M2) or two months (for M12) of the querylog. The model is being updated every 15 days (for M1

and M2) or every 30 days (for M12). By using the first ap-proach, we pay 22 (44) minutes every 15 (30) days to rebuildthe new model from scratch on a new set of data obtainedfrom the last two months of the query log. Instead, by usingthe second approach, we need to pay only 15 (32) minutesfor updating the one-month (two-months) QFG.

“From scratch” “Incremental”Dataset strategy [min.] strategy [min.]

M1 (March 2006) 21 14M2 (April 2006) 22 15

M12 (March and April) 44 32

Table 5: Time needed to build a Query Flow Graphfrom scratch and using our “incremental” approach(from merging two QFG representing an half ofdata).

The time spent in updating incrementally the model is, inpractice, shorter than the time needed to build the modelfrom scratch (in our case it is almost two third (i.e. 32%)that time). The process of merging two QFGs can be per-formed using an open-source Java tool [8] that implementsa graph algebra giving users the possibility to perform someoperations on WebGraph [7] encoded graphs. In our exper-iments we used three different QFGs built on M1, M2, andM12 defined as in the first column of Table 5.

query M1 M2 M12

mothers day 2 3 3da vinci 4 6 7lost 2 4 6

philippines 2 2 3

Table 6: Manual assessment of the number of usefulrecommendations generated for some time-relatedqueries on the three different models.

Table 6 shows the importance of having an incrementalapproach for time-related queries. It is evident from thetable that the model built on M12 always gives the bestrecommendations, in terms of quality, with respect to thetwo separate models M1 and M2.Table 7 shows example query suggestions generated by

the system, to demonstrate the improved quality of the rec-ommendations. These results suggest a positive effect of theincremental-update method on the recommendation quality.As in the previous section we evaluated the quality of the

recommendations, also by using the automatic procedureabove described. The results are shown on Table 8. We haveused the same set of 400 queries for which recommendationswere generated using the QFG built on M12.Again, the results suggested by the anecdotal evidence,

are confirmed by the assessment procedure. The model builton the two merged train segments is better than the singleM2 model (which was, in turn, better than the model builton M1). Improvements, in this case, are quite significantand range from 25% to 28% in accuracy.Using the automatic evaluation method we have investi-

• Solution: incremental update of M1 by means of “fresh data” in M2

• Graph algebra [Bordino et al., 2008];

• Some measures on the two different approaches:

martedì 4 maggio 2010

Page 53: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (II)M1. Furthermore, for Figure 3 we can observe that a model

trained on M2 has a larger percentage of queries for whichthe number of useful suggestions is at least 4.

This confirms our hypothesis that QFG-based recommen-dation models age and have to be updated in order to alwaysgenerate useful suggestions.

6. COMBATING AGING IN QUERY-FLOWGRAPHS

Models based on Query Flow Graphs age quite rapidly interms of their performance for generating useful recommen-dations.

The pool of queries we use to generate recommendationmodels (M1) contains both frequent and time-sensitive queries.We consider time-sensitive those queries that are both fre-quent and have a large variation with respect to their valuein the previous month(s) (e.g. > 50% of positive variation).Time-sensitive queries are related for instance to movies,new products being launched, fashion, and in general withnews events generating query bursts. In Figure 1 it is easyto identify time-sensitive queries by looking at those withthe greater variation between their frequencies over the twomonths.

As Tables 3 and 8 show, if we compute recommendationson older models the average recommendation quality de-grades. Indeed, “stable” queries, i.e. those queries that are(relatively) frequent in each period of the year, also have tobe taken into account by the model. Therefore, we must beable to give suggestions that are not too much influencedby time or, in other words, to update the recommendationmodel instead of rebuilding a new model from scratch dis-regarding older knowledge.

There are two ways of building an updated QFG: i) re-building the model by considering the whole set of queriesfrom scratch, or ii) using the old QFG-based model andadding fresh queries to it.

A straightforward option is to compute a new QFG rep-resenting the two months. The new part of the query log(M2) is merged with the old one (M1) obtaining a newdata-set to be used for building the new model. The mergeddata-set can be processed from scratch by performing all thesteps necessary to build a QFG (i.e. preprocessing, featuresgeneration, compression, normalization, chain graph gener-ation, random walk computation). The old model is simplydiscarded.

The other option is computing recommendations on an“incremental model” that is built by merging the old QFGswith the one obtained from the new queries. In this way:i) we refresh the model with new data covering time-relatedqueries; ii) the “old” part of the model contributes to main-tain quality on frequent and time-unrelated queries. Themethodology we are proposing incrementally extends therecommendation model with a sub-model built on more re-cent data. A sound incremental approach has to consistentlyupdate the old model with fresh data continuously or afterfixed periods of time. Another advantage of this approach isthat the new model can be built by spending only the timeneeded to build a QFG from a relatively small set of newqueries, plus the cost of the merging process.

Let us introduce an example to show the main differencesamong the two approaches in terms of computational time.Table 5 shows the total elapsed times to create different

QFGs. Suppose the model used to generate recommenda-tions consists of a portion of data representing one month(for M1 and M2) or two months (for M12) of the querylog. The model is being updated every 15 days (for M1

and M2) or every 30 days (for M12). By using the first ap-proach, we pay 22 (44) minutes every 15 (30) days to rebuildthe new model from scratch on a new set of data obtainedfrom the last two months of the query log. Instead, by usingthe second approach, we need to pay only 15 (32) minutesfor updating the one-month (two-months) QFG.

“From scratch” “Incremental”Dataset strategy [min.] strategy [min.]

M1 (March 2006) 21 14M2 (April 2006) 22 15

M12 (March and April) 44 32

Table 5: Time needed to build a Query Flow Graphfrom scratch and using our “incremental” approach(from merging two QFG representing an half ofdata).

The time spent in updating incrementally the model is, inpractice, shorter than the time needed to build the modelfrom scratch (in our case it is almost two third (i.e. 32%)that time). The process of merging two QFGs can be per-formed using an open-source Java tool [8] that implementsa graph algebra giving users the possibility to perform someoperations on WebGraph [7] encoded graphs. In our exper-iments we used three different QFGs built on M1, M2, andM12 defined as in the first column of Table 5.

query M1 M2 M12

mothers day 2 3 3da vinci 4 6 7lost 2 4 6

philippines 2 2 3

Table 6: Manual assessment of the number of usefulrecommendations generated for some time-relatedqueries on the three different models.

Table 6 shows the importance of having an incrementalapproach for time-related queries. It is evident from thetable that the model built on M12 always gives the bestrecommendations, in terms of quality, with respect to thetwo separate models M1 and M2.Table 7 shows example query suggestions generated by

the system, to demonstrate the improved quality of the rec-ommendations. These results suggest a positive effect of theincremental-update method on the recommendation quality.As in the previous section we evaluated the quality of the

recommendations, also by using the automatic procedureabove described. The results are shown on Table 8. We haveused the same set of 400 queries for which recommendationswere generated using the QFG built on M12.Again, the results suggested by the anecdotal evidence,

are confirmed by the assessment procedure. The model builton the two merged train segments is better than the singleM2 model (which was, in turn, better than the model builton M1). Improvements, in this case, are quite significantand range from 25% to 28% in accuracy.Using the automatic evaluation method we have investi-

• Solution: incremental update of M1 by means of “fresh data” in M2

• Graph algebra [Bordino et al., 2008];

• Some measures on the two different approaches:

• Incremental updates: 2/3 of the time w.r.t. “from scratch” strategy;

• Evaluation on the same set of 400 queries;

martedì 4 maggio 2010

Page 54: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (III)

martedì 4 maggio 2010

Page 55: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (III)

Query Set Query M12

F3

da vinci

86251 da vinci and math85436 da vinci biography83427 da vinci code on portrait82119 da vinci’s self portrait black and

white80945 flying machines79563 inventions by leonardo da vinci74346 leonardo da vinci paintings30426 friends church

survivor

7250 survivor preview7250 watch survivor episodes7236 survivor island panama7110 survivor island exile4578 survivor edition 20063980 survivor games3717 big shorts3717 baltimore bulk trash

lost

13258 lost fan site3716 lost season 23640 abcpreview.go.com3640 lost chat room3582 lost update3272 lost easter eggs1913 henry gale1858 44401858 1-800-sos-radon1858 4440

F1

anna nicolesmith

15174 anna nicole smith recent news14876 anna nicole smith and supreme court13567 anna nicole smith court appeal12768 anna nicorle smith and playboy10509 anna nicole smith pictures9832 anna nicole smith free nude pics9411 anna nicole smith show8971 anna nicole smith diet8880 branson tractors8880 bandlinoblu jeans

harleydavidson

5969 harley davidson premium sound sys-tem owners manual

4073 harley davidson ny4001 automatic motorcycles3738 cherokee harley davidson3038 harley davidson credit2562 custom harley davidson2494 harley davidson sporster2166 2002 harley davidson classic2085 regions banking2085 1998 dodge ram ground effects kits2085 adultactioncamcom

shakira

4174 hips don’t lie4174 shakira albums4140 shakira hips don’t lie video3698 shakira video3135 shakira nude3099 shakira wallpaper3020 shakira biography3018 shakira aol music2015 free video downloads

Table 7: Some examples of recommendations gen-

erated on different QFG models. Queries used to

generate recommendations are taken from differentquery sets.

filtering

threshold

average number

of useful sugges-

tions on M2

average number

of useful sugges-

tions on M120 2.91 3.64

0.5 6.23 7.95

0.65 6.23 7.94

0.75 6.18 7.9

Table 8: Recommendation statistics obtained by us-

ing the automatic evaluation method on a relatively

large set of 400 queries drawn from the most fre-

quent in the third month.

gated the main reasons why we obtain such an improvement.

Looking at the different bars in Figure 4 we can observe that

values that have had the greatest improvement are those cor-

responding to a number of suggestions larger than 8 (with

the only exceptions of the cases of 10 and 14 suggestions).

In particular the improvement is remarkable in the case of

“more than 18”useful suggestions given. Figure 5 also shows

the total number of queries having at least a certain num-

ber of useful recommendations. Again, results for M12 are

remarkably better than those referring to M1, and M2.

To conclude, we have shown that recommendations given

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 4: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 5: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

using QFG-based models are sensitive to the aging of the

query base on which they are built. We have also shown

the superiority of QFG-based models built on queries drawn

from larger period of time and we have shown how to build

such a models without the need of retraining them from

scratch.

7. DISTRIBUTED QFG BUILDINGIn this section we present a method to build QFGs that

exploit the observation made above on the feasibility of an

incremental approach to QFG-based model update. We

present an approach for building a QFG-based model on

a distributed architecture.

7.1 Divide-and-Conquer ApproachThe first approach exploits a parallel divide-and-conquer

computation that proceeds by dividing the query log into

m distinct segments, building the QFG in parallel on each

processor available and the iteratively merging the differentsegments until a single QFG is obtained.

The process is depicted in Figure 6. Our algorithm builds

a QFG as follows:

1. the query log is split into m parts. In the example

represented in Figure 6 files were split into 15 days

intervals;

2. the features requested to build the QFG are extracted

from the data contained in each interval. Each interval,

is processed in parallel on the different machines of the

cloud;

3. each part is compressed using the WebGraph frame-

work, obtaining a partial data-graph;

• Results:

martedì 4 maggio 2010

Page 56: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (III)

Query Set Query M12

F3

da vinci

86251 da vinci and math85436 da vinci biography83427 da vinci code on portrait82119 da vinci’s self portrait black and

white80945 flying machines79563 inventions by leonardo da vinci74346 leonardo da vinci paintings30426 friends church

survivor

7250 survivor preview7250 watch survivor episodes7236 survivor island panama7110 survivor island exile4578 survivor edition 20063980 survivor games3717 big shorts3717 baltimore bulk trash

lost

13258 lost fan site3716 lost season 23640 abcpreview.go.com3640 lost chat room3582 lost update3272 lost easter eggs1913 henry gale1858 44401858 1-800-sos-radon1858 4440

F1

anna nicolesmith

15174 anna nicole smith recent news14876 anna nicole smith and supreme court13567 anna nicole smith court appeal12768 anna nicorle smith and playboy10509 anna nicole smith pictures9832 anna nicole smith free nude pics9411 anna nicole smith show8971 anna nicole smith diet8880 branson tractors8880 bandlinoblu jeans

harleydavidson

5969 harley davidson premium sound sys-tem owners manual

4073 harley davidson ny4001 automatic motorcycles3738 cherokee harley davidson3038 harley davidson credit2562 custom harley davidson2494 harley davidson sporster2166 2002 harley davidson classic2085 regions banking2085 1998 dodge ram ground effects kits2085 adultactioncamcom

shakira

4174 hips don’t lie4174 shakira albums4140 shakira hips don’t lie video3698 shakira video3135 shakira nude3099 shakira wallpaper3020 shakira biography3018 shakira aol music2015 free video downloads

Table 7: Some examples of recommendations gen-

erated on different QFG models. Queries used to

generate recommendations are taken from differentquery sets.

filtering

threshold

average number

of useful sugges-

tions on M2

average number

of useful sugges-

tions on M120 2.91 3.64

0.5 6.23 7.95

0.65 6.23 7.94

0.75 6.18 7.9

Table 8: Recommendation statistics obtained by us-

ing the automatic evaluation method on a relatively

large set of 400 queries drawn from the most fre-

quent in the third month.

gated the main reasons why we obtain such an improvement.

Looking at the different bars in Figure 4 we can observe that

values that have had the greatest improvement are those cor-

responding to a number of suggestions larger than 8 (with

the only exceptions of the cases of 10 and 14 suggestions).

In particular the improvement is remarkable in the case of

“more than 18”useful suggestions given. Figure 5 also shows

the total number of queries having at least a certain num-

ber of useful recommendations. Again, results for M12 are

remarkably better than those referring to M1, and M2.

To conclude, we have shown that recommendations given

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 4: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 5: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

using QFG-based models are sensitive to the aging of the

query base on which they are built. We have also shown

the superiority of QFG-based models built on queries drawn

from larger period of time and we have shown how to build

such a models without the need of retraining them from

scratch.

7. DISTRIBUTED QFG BUILDINGIn this section we present a method to build QFGs that

exploit the observation made above on the feasibility of an

incremental approach to QFG-based model update. We

present an approach for building a QFG-based model on

a distributed architecture.

7.1 Divide-and-Conquer ApproachThe first approach exploits a parallel divide-and-conquer

computation that proceeds by dividing the query log into

m distinct segments, building the QFG in parallel on each

processor available and the iteratively merging the differentsegments until a single QFG is obtained.

The process is depicted in Figure 6. Our algorithm builds

a QFG as follows:

1. the query log is split into m parts. In the example

represented in Figure 6 files were split into 15 days

intervals;

2. the features requested to build the QFG are extracted

from the data contained in each interval. Each interval,

is processed in parallel on the different machines of the

cloud;

3. each part is compressed using the WebGraph frame-

work, obtaining a partial data-graph;

• Results:

• Average number of useful suggestion is greater in M12 than in M2, or in M1;

martedì 4 maggio 2010

Page 57: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (IV)

martedì 4 maggio 2010

Page 58: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (IV)

Query Set Query M12

F3

da vinci

86251 da vinci and math85436 da vinci biography83427 da vinci code on portrait82119 da vinci’s self portrait black and

white80945 flying machines79563 inventions by leonardo da vinci74346 leonardo da vinci paintings30426 friends church

survivor

7250 survivor preview7250 watch survivor episodes7236 survivor island panama7110 survivor island exile4578 survivor edition 20063980 survivor games3717 big shorts3717 baltimore bulk trash

lost

13258 lost fan site3716 lost season 23640 abcpreview.go.com3640 lost chat room3582 lost update3272 lost easter eggs1913 henry gale1858 44401858 1-800-sos-radon1858 4440

F1

anna nicolesmith

15174 anna nicole smith recent news14876 anna nicole smith and supreme court13567 anna nicole smith court appeal12768 anna nicorle smith and playboy10509 anna nicole smith pictures9832 anna nicole smith free nude pics9411 anna nicole smith show8971 anna nicole smith diet8880 branson tractors8880 bandlinoblu jeans

harleydavidson

5969 harley davidson premium sound sys-tem owners manual

4073 harley davidson ny4001 automatic motorcycles3738 cherokee harley davidson3038 harley davidson credit2562 custom harley davidson2494 harley davidson sporster2166 2002 harley davidson classic2085 regions banking2085 1998 dodge ram ground effects kits2085 adultactioncamcom

shakira

4174 hips don’t lie4174 shakira albums4140 shakira hips don’t lie video3698 shakira video3135 shakira nude3099 shakira wallpaper3020 shakira biography3018 shakira aol music2015 free video downloads

Table 7: Some examples of recommendations gen-

erated on different QFG models. Queries used to

generate recommendations are taken from differentquery sets.

filtering

threshold

average number

of useful sugges-

tions on M2

average number

of useful sugges-

tions on M120 2.91 3.64

0.5 6.23 7.95

0.65 6.23 7.94

0.75 6.18 7.9

Table 8: Recommendation statistics obtained by us-

ing the automatic evaluation method on a relatively

large set of 400 queries drawn from the most fre-

quent in the third month.

gated the main reasons why we obtain such an improvement.

Looking at the different bars in Figure 4 we can observe that

values that have had the greatest improvement are those cor-

responding to a number of suggestions larger than 8 (with

the only exceptions of the cases of 10 and 14 suggestions).

In particular the improvement is remarkable in the case of

“more than 18”useful suggestions given. Figure 5 also shows

the total number of queries having at least a certain num-

ber of useful recommendations. Again, results for M12 are

remarkably better than those referring to M1, and M2.

To conclude, we have shown that recommendations given

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 4: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 5: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

using QFG-based models are sensitive to the aging of the

query base on which they are built. We have also shown

the superiority of QFG-based models built on queries drawn

from larger period of time and we have shown how to build

such a models without the need of retraining them from

scratch.

7. DISTRIBUTED QFG BUILDINGIn this section we present a method to build QFGs that

exploit the observation made above on the feasibility of an

incremental approach to QFG-based model update. We

present an approach for building a QFG-based model on

a distributed architecture.

7.1 Divide-and-Conquer ApproachThe first approach exploits a parallel divide-and-conquer

computation that proceeds by dividing the query log into

m distinct segments, building the QFG in parallel on each

processor available and the iteratively merging the differentsegments until a single QFG is obtained.

The process is depicted in Figure 6. Our algorithm builds

a QFG as follows:

1. the query log is split into m parts. In the example

represented in Figure 6 files were split into 15 days

intervals;

2. the features requested to build the QFG are extracted

from the data contained in each interval. Each interval,

is processed in parallel on the different machines of the

cloud;

3. each part is compressed using the WebGraph frame-

work, obtaining a partial data-graph;

• On a histogram (cumulative distribution):

martedì 4 maggio 2010

Page 59: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Combating the Aging Effect (IV)

Query Set Query M12

F3

da vinci

86251 da vinci and math85436 da vinci biography83427 da vinci code on portrait82119 da vinci’s self portrait black and

white80945 flying machines79563 inventions by leonardo da vinci74346 leonardo da vinci paintings30426 friends church

survivor

7250 survivor preview7250 watch survivor episodes7236 survivor island panama7110 survivor island exile4578 survivor edition 20063980 survivor games3717 big shorts3717 baltimore bulk trash

lost

13258 lost fan site3716 lost season 23640 abcpreview.go.com3640 lost chat room3582 lost update3272 lost easter eggs1913 henry gale1858 44401858 1-800-sos-radon1858 4440

F1

anna nicolesmith

15174 anna nicole smith recent news14876 anna nicole smith and supreme court13567 anna nicole smith court appeal12768 anna nicorle smith and playboy10509 anna nicole smith pictures9832 anna nicole smith free nude pics9411 anna nicole smith show8971 anna nicole smith diet8880 branson tractors8880 bandlinoblu jeans

harleydavidson

5969 harley davidson premium sound sys-tem owners manual

4073 harley davidson ny4001 automatic motorcycles3738 cherokee harley davidson3038 harley davidson credit2562 custom harley davidson2494 harley davidson sporster2166 2002 harley davidson classic2085 regions banking2085 1998 dodge ram ground effects kits2085 adultactioncamcom

shakira

4174 hips don’t lie4174 shakira albums4140 shakira hips don’t lie video3698 shakira video3135 shakira nude3099 shakira wallpaper3020 shakira biography3018 shakira aol music2015 free video downloads

Table 7: Some examples of recommendations gen-

erated on different QFG models. Queries used to

generate recommendations are taken from differentquery sets.

filtering

threshold

average number

of useful sugges-

tions on M2

average number

of useful sugges-

tions on M120 2.91 3.64

0.5 6.23 7.95

0.65 6.23 7.94

0.75 6.18 7.9

Table 8: Recommendation statistics obtained by us-

ing the automatic evaluation method on a relatively

large set of 400 queries drawn from the most fre-

quent in the third month.

gated the main reasons why we obtain such an improvement.

Looking at the different bars in Figure 4 we can observe that

values that have had the greatest improvement are those cor-

responding to a number of suggestions larger than 8 (with

the only exceptions of the cases of 10 and 14 suggestions).

In particular the improvement is remarkable in the case of

“more than 18”useful suggestions given. Figure 5 also shows

the total number of queries having at least a certain num-

ber of useful recommendations. Again, results for M12 are

remarkably better than those referring to M1, and M2.

To conclude, we have shown that recommendations given

0

12,5

25

37,5

50

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 4: Histogram showing the number of queries

(on the y axis) having a certain number of useful

recommendations (on the x axis). Results are eval-

uated automatically.

0

100

200

300

400

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 >18

M1 M2 M12

Figure 5: Histogram showing the total number of

queries (on the y axis) having at least a certain num-

ber of useful recommendations (on the x axis). For

instance the third bucket shows how many queries

have at least three useful suggestions.

using QFG-based models are sensitive to the aging of the

query base on which they are built. We have also shown

the superiority of QFG-based models built on queries drawn

from larger period of time and we have shown how to build

such a models without the need of retraining them from

scratch.

7. DISTRIBUTED QFG BUILDINGIn this section we present a method to build QFGs that

exploit the observation made above on the feasibility of an

incremental approach to QFG-based model update. We

present an approach for building a QFG-based model on

a distributed architecture.

7.1 Divide-and-Conquer ApproachThe first approach exploits a parallel divide-and-conquer

computation that proceeds by dividing the query log into

m distinct segments, building the QFG in parallel on each

processor available and the iteratively merging the differentsegments until a single QFG is obtained.

The process is depicted in Figure 6. Our algorithm builds

a QFG as follows:

1. the query log is split into m parts. In the example

represented in Figure 6 files were split into 15 days

intervals;

2. the features requested to build the QFG are extracted

from the data contained in each interval. Each interval,

is processed in parallel on the different machines of the

cloud;

3. each part is compressed using the WebGraph frame-

work, obtaining a partial data-graph;

• On a histogram (cumulative distribution):

• Results on M12 are always better than M1, and M2;

• large improvement of queries with at least four good suggestions;

martedì 4 maggio 2010

Page 60: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Distributed QFG Building

martedì 4 maggio 2010

Page 61: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Distributed QFG Building

• a parallel way to update QFGs: Divide-and-Conquer approach;

• the query log is split in m parts;

• parallel extraction of the features;

• compressing step;

• merging graphs;

• final operations (normalization, pagerank, etc.);

4. using the graph algebra described in [8], each partialgraph is iteratively merged. Each iteration is done inparallel on the different available nodes of the cloud;

5. the final resulting data-graph is now processed withother steps [4] (normalization, chain extraction, ran-dom walk) to obtain the complete and usable QFG.

!"#$%&'#

!#()*+,#-./#

01)2()*+,'#3456&#7)8#

!#()*+,#-./#

!"#$%&'# !"#$%&'# !"#$%&'#

-./# -./# -./# -./#

9#()*+,'#-./#

Figure 6: Example of the building of a two monthsquery flow graph with a “parallel” approach.

Table 9 summarizes the computational costs of building aQFG in a distributed way. The main benefit of this approachis to significantly reduce the time needed to perform thepreliminary generation step.

generation of 15-days data-graph 6 min 15 secmerge of two 15-days data-graphs 5 min 23 sec

merge of two one-month data-graphs 11 min 04 secTOTAL 22 min 42 sec

Table 9: Time needed to build a two-months data-graph using our “incremental” approach and split-ting the query log in 4 parts.

8. CONCLUSION AND FUTURE WORK

In this paper we have studied the effect of time on recom-mendations generated using Query Flow Graphs [4] (QFGs).These models aggregate information in a query log by pro-viding a markov-chain representation of the query reformu-lation process followed by multiple users. In this paperwe have shown how to extend QFG-based recommendationmodels to evolving data. We have shown that the inter-ests of search-engine users change over time and new top-ics may become popular, while other that focused for sometime the attention of the crowds can suddenly loose impor-tance. The knowledge extracted from query logs can thussuffer from an aging effect, and the models used for recom-mendations rapidly become unable to generate useful andinteresting suggestions. We have shown that the building ofa new fresh QFG from scratch is expensive. To overcome

this problem we have introduced an incremental algorithmfor updating an existing QFG. The solution proposed allowsthe recommendation model to be kept always updated byincrementally adding fresh knowledge and deleting the agedone.In order to validate our claims and assess our methodol-

ogy, we built different QFGs from the query log of a real-world search engine, and we analyze the quality of the rec-ommendation models obtained from these graphs to showthat they inexorably age. It is worth noticing that a com-prehensive user-study is (almost) impossible on this kindof task. To assess the effect of aging with a survey, wewould need to make users aware of the context and newsevents happened in the period to which the query log isreferred (March-May 2006). Then, we have proposed a gen-eral methodology for dealing with aging QFG models thatallows the recommendation model to be kept up-to-datedin a remarkably lower time than that required for buildinga new model from scratch. As a side result we have pro-posed a parallel/distributed solution allowing to make QFGcreation/update operations scalable.

9. REFERENCES[1] R. Baeza-Yates, C. Hurtado, and M. Mendoza. Query

Recommendation Using Query Logs in Search Engines,volume 3268/2004 of LNCS. Springer, 2004.

[2] R. Baraglia, C. Castillo, D. Donato, F. M. Nardini,R. Perego, and F. Silvestri. Aging effects on query flowgraphs for query suggestion. In Proc. CIKM’09. ACM,2009.

[3] D. Beeferman and A. Berger. Agglomerative clusteringof a search engine query log. In Proc. KDD’00. ACM,2000.

[4] P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis,and S. Vigna. The query-flow graph: model andapplications. In Proc. CIKM’08. ACM, 2008.

[5] P. Boldi, F. Bonchi, C. Castillo, D. Donato, andS. Vigna. Query suggestions using query-flow graphs.In Proc. WSCD’09. ACM, 2009.

[6] P. Boldi, F. Bonchi, C. Castillo, and S. Vigna. From’dango’ to ’japanese cakes’: Query reformulationmodels and patterns. In Proc WI’09. IEEE CS, 2009.

[7] P. Boldi and S. Vigna. The webgraph framework i:compression techniques. In Proc. WWW’04. ACMPress, 2004.

[8] I. Bordino and D. Laguardia. Algebra for the jointmining of query log graphs.http://www.dis.uniroma1.it/∼bordino/qlogs algebra/,2008.

[9] B. M. Fonseca, P. Golgher, B. Possas, B. Ribeiro-Neto,and N. Ziviani. Concept-based interactive queryexpansion. In Proc. CIKM’05. ACM, 2005.

[10] T. H. Haveliwala. Topic-sensitive pagerank. In Proc.WWW’02. ACM, 2002.

[11] R. Jones, B. Rey, O. Madani, and W. Greiner.Generating query substitutions. In Proc. WWW’06.ACM Press, 2006.

[12] J. Kleinberg. Bursty and hierarchical structure instreams. In Proc. KDD’02. ACM, 2002.

[13] H. Tong, C. Faloutsos, and J.-Y. Pan. Fast randomwalk with restart and its applications. In Proc.ICDM’06. IEEE CS, 2006.

martedì 4 maggio 2010

Page 62: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Conclusions

martedì 4 maggio 2010

Page 63: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Conclusions

• We study the effects of time on QFG-based query recommender systems;

martedì 4 maggio 2010

Page 64: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Conclusions

• We study the effects of time on QFG-based query recommender systems;

• We built different QFGs from the AOL query log;

• we analyze the quality of recommendation;

• we show that recommendation models ages;

• we introduce an “incremental” algorithm for updating the model;

• we propose a parallel/distributed way of building QFGs;

martedì 4 maggio 2010

Page 65: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Future Works

martedì 4 maggio 2010

Page 66: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Future Works

• to define a strategy for merging graphs assigning different weights to each subgraph;

• more importance to “fresh” data;

martedì 4 maggio 2010

Page 67: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Future Works

• to define a strategy for merging graphs assigning different weights to each subgraph;

• more importance to “fresh” data;

• to compare the robustness of QFG recommender systems with other query recommenders with respect to aging;

martedì 4 maggio 2010

Page 68: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Future Works

• to define a strategy for merging graphs assigning different weights to each subgraph;

• more importance to “fresh” data;

• to compare the robustness of QFG recommender systems with other query recommenders with respect to aging;

• to design a MapReduce algorithm to build and update efficiently QFGs recommender systems;

martedì 4 maggio 2010

Page 69: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

Questions?

Thank you for your attention!

martedì 4 maggio 2010

Page 70: The Effects of Time on Query Flow Graph-based Models for Query Suggestion

References

• [Boldi et al., CIKM’08]: The Query Flow Graph: model and applications. Boldi, Bonchi, Castillo, Donato, Gionis, Vigna. CIKM’08.

• [Boldi et al., WSCD’09]: Query Suggestions using Query-Flow Graphs. Boldi, Bonchi, Castillo, Donato, Vigna. WSCD’09.

• [Bordino et al., 2008]: Algebra for the joint mining of query log graphs, 2008.

martedì 4 maggio 2010