Personalized Web Page Ranking using Collaborative Filtering MADE BY: HARSHIT AGGARWAL 10103417

Page 1: Personalized Web Page Ranking Using Collaborative Filtering

Personalized Web Page Ranking using Collaborative Filtering

MADE BY:

HARSHIT AGGARWAL 10103417

Page 2: Personalized Web Page Ranking Using Collaborative Filtering

What is the first thing that comes to mind when you hear the phrase "search engine"?

Page 3: Personalized Web Page Ranking Using Collaborative Filtering

What algorithm does Google use?

PAGE RANK ALGORITHM

Page 4: Personalized Web Page Ranking Using Collaborative Filtering

PageRank

PageRank is a numeric value assigned to a web page according to its importance. A page's overall search ranking also reflects factors such as unique content, backlinks, site structure, anchor text, and many more.

PageRank is a link analysis algorithm and it assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set.

The original PageRank algorithm was described by Lawrence Page and Sergey Brin in several publications. It is given by

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

where

PR(A) is the PageRank of page A,

PR(Ti) is the PageRank of the pages Ti which link to page A,

C(Ti) is the number of outbound links on page Ti, and

d is a damping factor which can be set between 0 and 1.
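The PageRank formula is usually evaluated iteratively: start every page at the same value and reapply the formula until the values settle. The following is a minimal sketch of that idea, not Google's implementation. The three-page graph, the function name `pagerank`, the value d = 0.85, and the fixed iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(Ti)/C(Ti)) over a small link graph.

    links maps each page to the list of pages it links to.
    d = 0.85 is an assumed damping factor; any value between 0 and 1 works.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {p: 1.0 for p in pages}  # every page starts with the same rank
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(Ti)/C(Ti) over every page Ti that links to this page.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C receives links from both A and B, so it ends up with the highest rank, while B, which is linked only from A, ends up with the lowest.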

Page 5: Personalized Web Page Ranking Using Collaborative Filtering

Importance and Need of Page Ranking

Adds weight and authority to the website

Denotes a web page's popularity

Advertisers use your PageRank as a basis for ad placements

Because the web is enormous, its data must be mined before it is shown to the user. The data is then ranked so that the most relevant pages come up first, which saves the user time and makes the user happier.

Page 6: Personalized Web Page Ranking Using Collaborative Filtering

BUT GOOGLE STILL HAS SOME ISSUES?

CONFUSED?

Page 7: Personalized Web Page Ranking Using Collaborative Filtering

LET ME GIVE YOU AN EXAMPLE.

LET'S GOOGLE "LEOPARD".

Page 8: Personalized Web Page Ranking Using Collaborative Filtering

YOU MIGHT ASK: WHAT'S WRONG WITH THE RESULTS?

THE SEARCH RESULT RETRIEVED IS GREAT IF YOU ARE A BIOLOGIST.

NOW THINK LIKE A COMPUTER GEEK WHO WAS ACTUALLY SEARCHING FOR LEOPARD, THE MAC OS.

Page 9: Personalized Web Page Ranking Using Collaborative Filtering

ANOTHER EXAMPLE: GOOGLE “MOUSE”

THIS RESULT WILL MAKE THE BIOLOGIST UNHAPPY

Page 10: Personalized Web Page Ranking Using Collaborative Filtering

WHAT'S THE SOLUTION?

PERSONALIZED PAGE RANKING

For a given query, a personalized web search can provide different search results for different users, or organize search results differently for each user, based upon their interests, preferences, and information needs. Personalized web search differs from generic web search, which returns identical search results to all users for identical queries, regardless of varied user interests and information needs.
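To make the contrast with generic search concrete, here is a small sketch of reordering one generic result list for two different users. The result titles, the `topic` labels, the interest scores, and the function name `personalize` are all hypothetical and stand in for whatever user model a real system would build.

```python
def personalize(results, interest):
    """Reorder a generic result list by one user's interest in each topic.

    interest maps a topic label to a score; unknown topics count as 0.0.
    sorted() is stable, so equally interesting results keep their generic order.
    """
    return sorted(results, key=lambda r: interest.get(r["topic"], 0.0), reverse=True)


# The same generic results for the query "leopard" (hypothetical data).
generic = [
    {"title": "Leopard (animal)", "topic": "biology"},
    {"title": "Mac OS X Leopard", "topic": "computing"},
]

# Assumed interest profiles for two users.
biologist = {"biology": 0.9, "computing": 0.2}
geek = {"biology": 0.1, "computing": 0.8}

print([r["title"] for r in personalize(generic, biologist)])
print([r["title"] for r in personalize(generic, geek)])
```

The biologist sees the animal first and the computer geek sees the operating system first, even though both typed the same query.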

Page 11: Personalized Web Page Ranking Using Collaborative Filtering

COLLABORATIVE FILTERING

Page 12: Personalized Web Page Ranking Using Collaborative Filtering

HOW DO WE ACHIEVE IT?

We make users rate pages on the basis of their tastes.

We find other users similar to these users.

We then predict how users would have rated pages they have not yet rated.

Using these predictions, different users then get different search results.
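The steps above can be sketched as a small user-based collaborative filter. The slides do not say which similarity measure or prediction rule is used, so this sketch assumes cosine similarity over co-rated pages and a similarity-weighted average; the user names, page identifiers, and ratings are invented for illustration.

```python
from math import sqrt


def similarity(a, b):
    """Cosine similarity over the pages both users have rated (an assumed choice)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(a[p] ** 2 for p in shared))
    norm_b = sqrt(sum(b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, user, page):
    """Predict a rating as the similarity-weighted average of other users' ratings."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or page not in their:
            continue
        w = similarity(ratings[user], their)
        num += w * their[page]
        den += w
    return num / den if den else None  # None when no similar user rated the page


# Hypothetical ratings on a 1-5 scale; bob has not rated "mac-os-tips".
ratings = {
    "alice": {"mac-os-leopard": 5, "leopard-animal": 1, "mac-os-tips": 5},
    "bob": {"mac-os-leopard": 5, "leopard-animal": 1},
    "carol": {"mac-os-leopard": 1, "leopard-animal": 5, "mac-os-tips": 1},
}

print(predict(ratings, "bob", "mac-os-tips"))
```

Because bob's ratings match alice's far more closely than carol's, the predicted rating lands much nearer to alice's 5 than to carol's 1, and pages can then be ordered for bob by these predicted ratings.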

Page 13: Personalized Web Page Ranking Using Collaborative Filtering

What is the problem here?

People are reluctant to give their personal information to websites after the PRISM surveillance disclosures.

Page 14: Personalized Web Page Ranking Using Collaborative Filtering

How do we tackle it?

At present, we assume that every user uses a single IP address, and we identify the user by that IP address.
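One way to picture this assumption is a ratings store keyed by IP address rather than by a user account. The sketch below is only illustrative: the names `ratings_by_ip` and `record_rating`, the page identifiers, and the addresses (taken from ranges reserved for documentation) are assumptions, not part of the presented system.

```python
from collections import defaultdict

# ratings_by_ip[ip][page] -> rating; the IP address stands in for a user ID.
ratings_by_ip = defaultdict(dict)


def record_rating(ip, page, rating):
    """Store a rating under the requesting IP address instead of a user account."""
    ratings_by_ip[ip][page] = rating


# Two hypothetical visitors, each identified only by IP address.
record_rating("203.0.113.7", "mac-os-leopard", 5)
record_rating("203.0.113.7", "leopard-animal", 1)
record_rating("198.51.100.4", "leopard-animal", 5)

print(dict(ratings_by_ip))
```

No name or login is stored, but the scheme treats everyone behind one address as a single user and treats one person on two addresses as two users.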

Page 15: Personalized Web Page Ranking Using Collaborative Filtering

Limitations of the Solution:

If two pages get the same score, they cannot be ranked properly, because we cannot tell which of the two is more relevant.

Biased users: a user may give false ratings to a particular page and alter its rank.

Data loss during crawling, as some pages cannot be crawled because of the site provider's privacy restrictions.

Cold start for new users as well as new content.