Pageranking

DESCRIPTION

How page ranking in search engines works

Page 1: Pageranking

Page Ranking

presented by

Arvind, Chintan, Raveendra

Page 2: Pageranking

Motivation

● The World Wide Web was released in 1991, and within a few years it justified its name.

● In January 2001, the number of hosts stood at 110 million and the number of web sites had reached 30 million.

● A search engine must search through these millions of sites to give the most 'relevant' results to the user.

● This is where the concept of ranking pages becomes useful.

Page 3: Pageranking

Goals of page ranking

• A page must have a high PageRank if there are many pages that point to it.

• A page must also have a high rank if some of the pages that point to it have a high PageRank themselves.

• Pages that are well cited from many places around the web (like http://www.iisc.ernet.in/) are worth looking at.

• Pages that have perhaps only one citation, but from something like the Yahoo! homepage, are also generally worth looking at.

Page 4: Pageranking

Example

• A problem similar to page ranking arises in rating sports teams.

• Consider the ranking of cricket teams. Their performance in a tournament is shown below.

• Can win count alone suffice for rating teams?

• Here we would like to rate A higher than B, since A won against B.

Won (W) / Lost (L)   India   Australia   Pakistan   Kenya   Total Wins
India (A)            -       W           W          L       2
Australia (B)        L       -           W          W       2
Pakistan (C)         L       L           -          W       1
Kenya (D)            W       L           L          -       1

Page 5: Pageranking

Example (...contd.)

• Consider the graph where an edge is drawn from loser to winner.

• First assign equal weights (w) to everyone and then assign them new weights (w') as follows:

$$ w'_a = \sum_{i} \frac{w_i}{k(i)} $$

where the sum runs over every team i that lost against a, and k(i) is the total number of losses of team i.

This is because we want a team to rise further in the ranking for winning against a team that is already highly ranked than for winning against a team that is not highly rated, as shown in the 2nd figure.
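As a worked illustration of the rule $w'_a = \sum_i w_i / k(i)$, the first update for the four teams can be written out by hand. The starting value of 1/4 per team is an assumption (the slide only says the weights start equal); the loss counts k(A) = 1, k(B) = 1, k(C) = 2, k(D) = 2 are read off the results table.

```latex
\begin{aligned}
w'_A &= \frac{w_B}{k(B)} + \frac{w_C}{k(C)} = \frac{1/4}{1} + \frac{1/4}{2} = \frac{3}{8} \\
w'_B &= \frac{w_C}{k(C)} + \frac{w_D}{k(D)} = \frac{1/4}{2} + \frac{1/4}{2} = \frac{1}{4} \\
w'_C &= \frac{w_D}{k(D)} = \frac{1/4}{2} = \frac{1}{8} \\
w'_D &= \frac{w_A}{k(A)} = \frac{1/4}{1} = \frac{1}{4}
\end{aligned}
```

The new weights still sum to 1, and A already moves ahead of B even though both teams have two wins.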

Page 6: Pageranking

Example (...contd.)

• Weights get refined in successive iterations, as shown in the diagrams beside.

• Continuing like this, we converge to an equilibrium state, as shown in the figure below:
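The repeated refinement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `lost_to`, `update` and `iterate_to_equilibrium`, the starting weight of 0.25 per team, and the stopping tolerance are not from the slides; the loss lists come from the results table.

```python
# Teams each team lost to (A=India, B=Australia, C=Pakistan, D=Kenya).
# Every edge of the graph runs from a loser to the team that beat it.
lost_to = {
    "A": ["D"],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["B", "C"],
}

def update(weights):
    """One refinement step: each loser i hands w_i / k(i) to every team that beat it."""
    new_weights = {team: 0.0 for team in weights}
    for loser, winners in lost_to.items():
        share = weights[loser] / len(winners)  # k(i) = number of losses of team i
        for winner in winners:
            new_weights[winner] += share
    return new_weights

def iterate_to_equilibrium(weights, tol=1e-12, max_iter=10_000):
    """Repeat the update until no weight changes by more than tol (assumed stopping rule)."""
    for _ in range(max_iter):
        new_weights = update(weights)
        change = max(abs(new_weights[t] - weights[t]) for t in weights)
        weights = new_weights
        if change < tol:
            break
    return weights

equilibrium = iterate_to_equilibrium({team: 0.25 for team in lost_to})
for team, weight in equilibrium.items():
    print(team, round(weight, 4))
```

For this tournament the weights settle near 4/13, 3/13, 2/13 and 4/13 for A, B, C and D. Kenya ends level with India because its single win was against the top-rated team.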

Page 7: Pageranking

Eigen Vectors

• Speaking in terms of matrices, we are using a matrix M normalized along the columns:

$$ M = \begin{pmatrix} 0 & 1 & \tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 0 & 0 & \tfrac{1}{2} \\ 1 & 0 & 0 & 0 \end{pmatrix} $$

• Then we multiply M by the initial weight vector W to get a new weight vector W', which is again multiplied by M to get W'', and so on until we get a vector W'_i such that

$$ W'_{i+1} = M \, W'_i = W'_i $$
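To make the matrix view concrete, the sketch below writes out the 4×4 matrix M for the cricket example (rows and columns ordered A, B, C, D), checks that every column sums to 1, and applies M twice to an equal starting vector. The helper `mat_vec` and the starting value of 1/4 per team are assumptions made for illustration; exact fractions are used so that the numbers can be compared by hand.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Column-normalized matrix for the cricket example. Column i holds 1/k(i)
# in the row of every team that beat team i (order: A, B, C, D).
M = [
    [0, 1, half, 0],
    [0, 0, half, half],
    [0, 0, 0, half],
    [1, 0, 0, 0],
]

def mat_vec(matrix, vector):
    """Plain matrix-vector product for lists of numbers."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

column_sums = [sum(row[j] for row in M) for j in range(len(M))]

W = [Fraction(1, 4)] * 4   # equal starting weights (assumed to sum to 1)
W1 = mat_vec(M, W)         # W'  = M * W
W2 = mat_vec(M, W1)        # W'' = M * W'

print(column_sums)
print(W1)
print(W2)
```

Because every column of M sums to 1, each multiplication redistributes the weights without changing their total.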

Page 8: Pageranking

Eigen Vectors (...contd.)

• This is nothing but the eigenvalue problem with eigenvalue 1.

• That is, we want to solve the equation M * W = W, i.e. we want to find an eigenvector W with eigenvalue 1.

● We could have used this concept directly to find the required weights for the teams.
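One way to find the weights directly, without iterating, is to solve (M - I) * W = 0 together with the condition that the weights sum to 1. The sketch below does this for the cricket matrix with exact fractions. The normalization to 1 and the small Gauss-Jordan helper `solve` are assumptions for illustration; a numerical library's eigenvector routine would serve equally well.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Column-normalized matrix for the cricket example (order: A, B, C, D).
M = [
    [0, 1, half, 0],
    [0, 0, half, half],
    [0, 0, 0, half],
    [1, 0, 0, 0],
]

def solve(A, b):
    """Gauss-Jordan elimination over exact fractions; assumes a unique solution."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

n = len(M)
# Rows of (M - I) * W = 0. The rows are linearly dependent, so the last one
# is replaced by the normalization w_A + w_B + w_C + w_D = 1.
A = [[M[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
A[-1] = [1] * n
b = [0] * (n - 1) + [1]

W = solve(A, b)
print(W)
```

The solution is W = (4/13, 3/13, 2/13, 4/13) for A, B, C and D, which rates A above B as desired.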

Page 9: Pageranking

Extending to web pages

• We can use the same concept to find weights for web pages and rank them.

[Figure: link graph over the pages csa_showcase.com, yahooindia.com, linux.org, bogus.com and waste.com. Edges are labeled with the weights 1/3 (six edges), 1/2 (two edges) and 1 (two edges).]

Here, M =

Solving the equation M * W = W gives us the following weights:

csa_showcase.com   0.4
linux.org          0.4
yahooindia.com     0.2
bogus.com          0
waste.com          0
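The same iteration carries over to pages and links. The slide's link graph is not recoverable from the text, so the `links` dictionary below is a hypothetical structure, chosen only because it matches the edge labels (six edges of 1/3, two of 1/2, two of 1) and reproduces the reported weights; the real graph may differ. The function name `rank_pages` and the stopping tolerance are also assumptions.

```python
# HYPOTHETICAL link structure (page -> pages it links to). It is one graph
# consistent with the edge labels and weights on the slide, not the original.
links = {
    "csa_showcase.com": ["linux.org"],
    "linux.org": ["csa_showcase.com", "yahooindia.com"],
    "yahooindia.com": ["csa_showcase.com"],
    "bogus.com": ["csa_showcase.com", "linux.org", "yahooindia.com"],
    "waste.com": ["csa_showcase.com", "linux.org", "yahooindia.com"],
}

def rank_pages(links, tol=1e-12, max_iter=10_000):
    """Iterate W <- M * W, where each page splits its weight equally over its links.

    Assumes every page has at least one outgoing link.
    """
    weights = {page: 1.0 / len(links) for page in links}
    for _ in range(max_iter):
        new_weights = {page: 0.0 for page in links}
        for page, targets in links.items():
            share = weights[page] / len(targets)
            for target in targets:
                new_weights[target] += share
        change = max(abs(new_weights[p] - weights[p]) for p in links)
        weights = new_weights
        if change < tol:
            break
    return weights

weights = rank_pages(links)
for page, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{page:18s} {weight:.2f}")
```

In this hypothetical graph nothing links to bogus.com or waste.com, so their weight drops to 0 after the first step, while the three pages that link to each other share the whole weight.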

Page 10: Pageranking

Add-ons

• PageRank computation can be viewed as finding the stationary distribution of a Markov chain.

• The eigenvalue computation W = M * W can be viewed as finding a fixed point. This is similar to the equation X = F(X), whose convergent value can be found iteratively, say by the Newton-Raphson method.
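The two remarks can be tied together in a short derivation. It assumes, as in the cricket example, that every column of M sums to 1; the symbols $F$, $X_k$ and $W^{*}$ are introduced here for illustration.

```latex
\begin{aligned}
&\text{Fixed-point iteration:} && X_{k+1} = F(X_k), \qquad F(X) = M X \\
&\text{Total weight is preserved:} && \sum_a (M W)_a = \sum_i \Big( \sum_a M_{ai} \Big) W_i = \sum_i W_i \\
&\text{A limit is a fixed point:} && X_k \to W^{*} \;\Longrightarrow\; W^{*} = M W^{*}
\end{aligned}
```

Reading M_{ai} as the probability of moving from i to a, W* is a stationary distribution of the corresponding Markov chain. The iteration is not guaranteed to converge for every M; it does converge when the chain is irreducible and aperiodic.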

Page 11: Pageranking

Reference

• Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. “The PageRank Citation Ranking: Bringing Order to the Web.” Technical Report, Stanford University, 1998.