Query recommendation by search filters and criteria prediction

Machine Learning for Search Criteria Prediction

Dhwaj Raj

Convention

Query

=

Search Critera

=

Set of search filters chosen by user

User searches : at our real estate portal

2 3 4 5 6 7 8 9 100

10000

20000

30000

40000

50000

60000

7000069121

30963

17332

95615911 3819 2481 1595 1206

Number of searches in an active user session

UsersOver1 week

Long tail samples ...

Average Searches in active user session = 3.6

Aim of Query/ Search Criteria Recommendation

● One of the major ways to improve user experience.

● Minimize search actions while fulfulling the user intent.

● Facilitate and Guide users through their information-seeking tasks.

● Help them explore alternate ways to express their intent.

Introduction

● We propose search log mining to capture collective intelligence for providing query suggestions by predicting user's future search queries.

● When enough history is observed for a given query chain, future queries can be accurately modeled.

● Understanding what users are querying in the current search sessions, and predict what they will search next have many potential applications such as query recommendation, web page re-ordering, advertisement arrangement, and so on.

Suggested Applications

● query recommendation

Suggested Applications* Page Re-ordering

Suggested Applications* Ad Placement


● Use prediction from history chains for query based personalization

● Using feedback to search relevance.


● AND MANY OTHER APPLICATIONS …..

● Also, This technology of action chain based predictions can be used for domains other than search query as well. Such as listing recommendation, page action prediction etc.

Search Query Chains

● User1: A → B → C● User2: B → C → E● User3: A → B → C → E● User4: B → C → F● User5: A → E

● Where A, B, C etc are search queries/criterias● UserNew1: B → C → ?● UserNew2: A → ?

Basis of Model

● Following experiments were done:

1. Distribution of 2 action query chain does not correlate with high order query chains. For e.g distribution of E → F does not correlate well with distribution of B → E → F, or C → E → F or B → C → E → F

– Observation: probability of step1 → step2 also depends conditionally on the action history i.e. on past actions used to reach step1.

Basis of Model

2. Search criteria is a set of filter entities:

A,B,C can be represented as (e1,e2,e3,e4,e5...)

Where e1,e2 etc are entity types for e.g. proptype, city, locality, budget etc.

● Experiments show that probability of a criteria state A,B or C depends conditionally on set of entities present in the criteria.

For eg. Probability of A(resi,sector105noida,underconstr) → B(resi,sector110noida,underconstr)

has a different probability distribution thanA(resi,sector105noida,readyPosess) → B(resi,sector110noida,readyPosess)

and

A(resi,sector105noida) → B(resi,sector110noida)

Basis of Model

3. Experiments were to done to observe correlation among distribution of history chains.● Observation:

A → B → C → E

correlate with B → A → C → E but same history order chains have higher correlation.

● Conclusion: Order matters in history but unordered history also correlate well,

so order should be given improtance but should not be sole criteria for model.

● Use bayesian independence but give reward to same order.P( next | A → B → C → D) = k. P(next | B → A → C → D)

Markov Chains & decision process

At each time step, the process is in some state s, and the decision maker may choose any action a that is available in state s. The process responds at the next time step by randomly moving into a new state s', and giving the decision maker a corresponding reward R_a(s,s').

Conditianal Probability computations to make these chain graphs memoryless:

Constructing query-flow graphs

● query flow graphs are generally stored as a transition matrix.

● Each cell is filled with computed conditional probability.

A B C

AA 0.1 0 0.3

AB 0.9 0 0

AC 0 0.2 0

BA 0 0.1 0

BC 0.6 0.3 0

A B C

A 0 0 0.02

B 0 0 0

C 0.1 0.5 0

D 0 0 0

E 0 0.1 0

Sub-Chains & Sub-Criteria-ChainsHelp reduce sparseness.

● training from log: resi_flat_sale_sector12noida → resi_flat_sale_sector22noida

● Testing for : resi_flat_sale_sector12noida_10-25laks → ?

Sub Chains● Window shifting algorithm

A → B → C → D → E → F → G → H transform to : (A → B → C, B → C → D, A → B, B → C, etc...)

● Skip sequence

A → B → C → D → E → F → G → H transform to : ( A → B → D, A → B → E etc..)

Sub Criteria Chains● resi_flat_sale_sector12noida_10-25laks_unfurnished → resi_flat_sale_sector10noida_25-

30laks_unfurnished

Transforms to (resi_flat_sale_sector12noida → resi_flat_sale_sector10noida, resi_flat_sale_sector12noida_10-25laks → resi_flat_sale_sector10noida_25-30laks, resi_flat_sale_sector12noida_unfurnished → resi_flat_sale_sector10noida_unfurnished)

● Only minor criteria items get replaced because major criteria is representative of an unique search focus.

Query Log Mining

● Visitor session info● Visitor search actions● Active Session : current session tracking keeps

session alive for days. Written optimized procedure queries to change vam id for every change in major criteria and for every idle time of 1hr.

● Clustering similar entity values to reduce sparseness.

Assigning Weights

Storing query flow graphs

● Transition matrices are very sparse and eat up un-necessary space for 0 prob pairs.

● To overcome these we use state transition map:

● These query flow graph will be queried very frequently.

State seq next prob

A->B->C D 0.02

A->B->C E 0.05

A->B C 0.1

Retrieval -1● Query flow graph can be seen as a key value

map where key is (stateseq) and val is (next,probability) pair.

● To utilize the system for various applications including real time interaction, the retrieval part should be really fast (>100ms).

● Key set size > 30 million● Inverted index storages like lucene etc are not

used because insertion is very slow for given key set size.

Retrieval - 2● Relational databases like mysql etc yielded slow insertion and

slower retrieval.

● In memory structures like redis or hashmaps were not able to occupy all keys.

● Finally we used Cassandra NoSQL which is giving a very good insertion and retrieval performance.

● Advantage of cassandra : can be expanded and extended to our need and it's cool to use.

● Ref: using_cassandra

Prediction

1. Predicting next entity type to be clicked. For e.g. next filter will be on locality, or furnishing or budget etc.

2. Predicting next entity value to be clicked. For e.g. for each entity type which value will be chosen next such as which locality or which budget range.

Learned weights

QueryFlow

TransitionMap

Input{curr state,

past state seq}

Sub Chains Generator

Sub Criteria Chains generatorState Fetcher

Predictor

Entity type predictionEntity value Prediction

Already prepared in training phase

What is not covered in this project● Some of the work on query recommendation focuses on measures of

query attribute similarity based on domain knowledgebase but we have not included this in project scope.

● For 99acres it can be seen as locality affinity etc.

● Why not : maintainance issues.

● Why not : we do not want too much dependency on domain specific data rather it should stay as plug and play model working only on query logs.

● Why not: domain similarity unable to capture crowd's wisdom– Conditional nature of attribute dependency. For e.g user's choice for next locality also

depend on choice of budget, bedroom etc.

– For eg. A user searching for sector xx is presented sector yy due to affinity but users are aware of the current affairs and they do not want to search in yy because HC has banned work there.

– A user is searching for jaypee kosmos and is presented supertech upcountry(near) but users had some knowledge of hidden attributes like FRR and green space so he makes next search to jaypee aman(far).

– Our Assumption: user crowd have more knowledge of current trends and ground realities so entity relations can be better captured from their behaviors.

Results● In user behavior modelling, it is very difficult to say that model is x%

accurate for new user.

● Even if we model accurately and predict accurately majority of users may have different search focuses. So accuracy can be measured truly by presenting the prediction to the user and then evaluating that how many users we influenced.

● So in user behavior modeling we can conclude with data that our model fits well for already seen users/history.

● In our case, a full fledged demo was created with 10 days of search_criteria training.

● On same validation set it provides accuracy of 79%.

● On separate training and fresh test sets accuracy of 67%.

● Demo has been hosted on local server for testing (link to be distributed).

● Thank you.

Technology

Query recommendation by search filters and criteria prediction