Amy N. Langville Mathematics Department College of Charleston [email protected] Math Meet 2/20/10

Google-opoly


Page 1: Google- opoly

Amy N. Langville
Mathematics Department
College of Charleston
[email protected]

Math Meet 2/20/10

Page 2: Google- opoly

Outline

Short History of Web Search
Link Analysis and Google’s PageRank
The Random Surfer
Google-opoly
March Madness
Conclusion

Page 3: Google- opoly

Thesis

1998

Page 7: Google- opoly

Pre-1998 Web

Trip back in time to 1995
– How did you find information then?
– Better question: how old were you then?

Page 8: Google- opoly

Inverted Index

Main tool of pre-1998 search engines
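
As a minimal sketch of the idea, an inverted index maps each term to the set of pages that contain it, so a query becomes a lookup rather than a scan of every page. The page names and text below are made up for illustration, and the function names are hypothetical; real engines add much more (stemming, word positions, compression).

```python
from collections import defaultdict

# Hypothetical toy corpus: page id -> page text (not from the talk).
pages = {
    "page1": "math is fun",
    "page2": "google ranks pages with math",
    "page3": "fun with google",
}

def build_inverted_index(docs):
    """Map each lowercase term to the set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index

def lookup(index, query):
    """Return the pages that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_inverted_index(pages)
print(sorted(lookup(index, "math")))
print(sorted(lookup(index, "google fun")))
```

Note that the lookup only says which pages match; it says nothing about which match should be listed first.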

Page 17: Google- opoly

Problems with the Inverted Index

• Too many pages
• Spam: human eyes vs. spider eyes

Learn how to make millions

Win an iPod

Text 8 if you’re awake

Page 18: Google- opoly

Link Analysis

• Pre-1998 engines used only text analysis.

• Link analysis saved search from SEOs and built companies like Google, Yahoo, Ask.

• Nearly every major search engine uses link analysis.

[Timeline: text analysis before 1998, link analysis from 1998 onward]

Page 19: Google- opoly

Link Analysis

[Timeline: text analysis before 1998, link analysis from 1998 onward]

Page 20: Google- opoly
Page 21: Google- opoly

Moral #1

Sometimes being perceived as an expert forces you to become one.

Page 22: Google- opoly

What happens when you google?

All the old text analysis + the new link analysis

Page 23: Google- opoly

What happens when you google?

[Figure: ranked list of results, numbered 1 through 8]

Page 24: Google- opoly

Why are rankings so important?

Page 25: Google- opoly

Web as a graph

Each node is a webpage.

Each arrow is a hyperlink.
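
One simple way to hold such a graph in code is a dictionary from each page to the pages it links to. The six-page web below is a hypothetical example, not necessarily the one pictured on the slide.

```python
# Hypothetical six-page web: each key is a page (node),
# each listed value is a page it links to (an arrow).
web = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

nodes = sorted(web)
# Every (source, destination) pair is one hyperlink.
arrows = [(src, dst) for src in nodes for dst in web[src]]
print(len(nodes), "pages,", len(arrows), "hyperlinks")
```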

Page 26: Google- opoly

In-links vs. Out-links
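
A short sketch of the distinction, on a hypothetical six-page web: a page's out-links are the links it contains, and its in-links are the links on other pages that point to it. The helper names are illustrative only.

```python
# Hypothetical web graph: page -> pages it links to.
web = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

def out_links(graph, page):
    """Pages that `page` points to."""
    return list(graph[page])

def in_links(graph, page):
    """Pages that point to `page`."""
    return [src for src, targets in graph.items() if page in targets]

print("out-links of 3:", out_links(web, 3))
print("in-links of 2:", in_links(web, 2))
```

A page controls its own out-links but not its in-links, which is why in-links are the more trustworthy signal for ranking.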

Page 27: Google- opoly

A Trip to Google-topia

Emmie

Randy, the Random Surfer

video clip

Page 28: Google- opoly

A Random Walk on the Web graph
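
A random walk can be sketched in a few lines: from the current page, pick one of its out-links uniformly at random and follow it. The graph is a hypothetical example and `random_walk` is an illustrative name; this plain version simply stops if it reaches a page with no out-links.

```python
import random

# Hypothetical web graph: page -> pages it links to.
web = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

def random_walk(graph, start, steps, rng):
    """Follow uniformly random out-links; stop early at a page with none."""
    path = [start]
    page = start
    for _ in range(steps):
        if not graph[page]:
            break  # nowhere to go from here
        page = rng.choice(graph[page])
        path.append(page)
    return path

path = random_walk(web, 1, 20, random.Random(0))
print(path)
```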

Page 29: Google- opoly
Page 30: Google- opoly
Page 31: Google- opoly
Page 32: Google- opoly
Page 33: Google- opoly

Matrix Notation
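
As a sketch of the matrix view, the random walk can be written as a table of probabilities: entry (i, j) is the chance of moving from page i to page j in one click. The name `H` and the example graph are assumptions for illustration, and the code assumes no page lists the same link twice.

```python
# Hypothetical web graph: page -> pages it links to.
web = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

def hyperlink_matrix(graph):
    """Row i holds the one-click probabilities out of the i-th page."""
    pages = sorted(graph)
    position = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    H = [[0.0] * n for _ in range(n)]
    for page in pages:
        targets = graph[page]
        for target in targets:
            # Each out-link is equally likely to be clicked.
            H[position[page]][position[target]] = 1.0 / len(targets)
    return H

H = hyperlink_matrix(web)
for row in H:
    print([round(x, 2) for x in row])
```

Rows for pages with out-links sum to 1; the row for a page with no out-links is all zeros.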

Page 34: Google- opoly

BUT THERE ARE SOME PROBLEMS!

Page 35: Google- opoly
Page 36: Google- opoly
Page 37: Google- opoly

The surfer gets stuck!

This is called a dangling node.

How does Google fix this?

Page 38: Google- opoly

The surfer can “teleport”

We add a link from the dangling node to every other node.

When web surfing, this is equivalent to typing an address in the URL bar.

Page 39: Google- opoly

Probability Matrix

We must also take this into consideration for our probability matrix.
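
A minimal sketch of that adjustment, following the slide's wording: any all-zero row (a dangling node) is replaced by equal probabilities for every other page. The matrix below is a made-up three-page example and `fix_dangling` is a hypothetical name. Many textbook treatments instead spread 1/n over all n pages, including the dangling page itself.

```python
def fix_dangling(H):
    """Replace each all-zero row with equal odds for every other page.

    Assumes at least two pages.
    """
    n = len(H)
    S = []
    for i, row in enumerate(H):
        if any(row):
            S.append(list(row))  # page has out-links: keep its row
        else:
            S.append([0.0 if j == i else 1.0 / (n - 1) for j in range(n)])
    return S

# Hypothetical three-page web in which the second page is a dangling node.
H = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
S = fix_dangling(H)
for row in S:
    print(row)
```

After the fix every row sums to 1, so the surfer always has somewhere to go.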

Page 40: Google- opoly

Dangling nodes and teleportation

video clip

Page 41: Google- opoly

Let’s look at another problem.

Page 42: Google- opoly
Page 43: Google- opoly
Page 44: Google- opoly
Page 45: Google- opoly
Page 46: Google- opoly
Page 47: Google- opoly
Page 48: Google- opoly
Page 49: Google- opoly

Our surfer gets stuck in webpages 4, 5, and 6.

This is called a cycle.

How do we fix this?

Page 50: Google- opoly

Cycling

video clip

Page 51: Google- opoly

Full Teleportation

At any time, the surfer might type an address into the URL bar instead of following a link.

We add an extra link from every vertex to every other vertex.

Page 52: Google- opoly

Surfing vs. teleporting

Do people use the URL bar as much as they use hyperlinks?

Google doesn’t think so. They think you use the URL bar only about 15% of the time.
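
The 85/15 split can be sketched as a blend of two matrices: 85% of the link-following probabilities plus 15% of a uniform jump. The sketch below uses a common formulation in which a teleport can land on any page, including the current one. The names `S`, `G`, and `follow_prob` are assumptions for illustration, and the three-page cycle is a made-up example.

```python
def google_matrix(S, follow_prob=0.85):
    """Blend link-following with uniform teleportation.

    S must already have rows that sum to 1 (no dangling nodes).
    """
    n = len(S)
    teleport = (1.0 - follow_prob) / n  # teleport share given to each page
    return [[follow_prob * s + teleport for s in row] for row in S]

# Hypothetical three-page cycle: 1 -> 2 -> 3 -> 1.
S = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
G = google_matrix(S)
for row in G:
    print([round(x, 3) for x in row])
```

Every entry of the result is positive, so the surfer can no longer be trapped in a cycle.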

Page 53: Google- opoly

Computing PageRank by observing Randy

video clip
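
In the same spirit, PageRank can be estimated by simulating a surfer for many steps and recording the fraction of time spent on each page. This is a rough sketch under stated assumptions: the graph is hypothetical, the surfer follows a link 85% of the time, and both teleporting and leaving a dangling node pick uniformly among all pages.

```python
import random
from collections import Counter

# Hypothetical web graph: page -> pages it links to.
web = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

def observe_surfer(graph, steps, follow_prob=0.85, seed=0):
    """Return the share of steps the simulated surfer spends on each page."""
    rng = random.Random(seed)
    pages = sorted(graph)
    page = pages[0]
    visits = Counter()
    for _ in range(steps):
        visits[page] += 1
        links = graph[page]
        if links and rng.random() < follow_prob:
            page = rng.choice(links)   # click a hyperlink
        else:
            page = rng.choice(pages)   # teleport via the URL bar
    return {p: visits[p] / steps for p in pages}

shares = observe_surfer(web, 50000)
for page, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(page, round(share, 3))
```

The longer the surfer is observed, the closer these shares get to the exact PageRank values.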

Page 54: Google- opoly

Summary of Ranking

Search query

Pull out relevant webpages from inverted index

Use PageRank and other information to rank webpages

Page 56: Google- opoly

Creators of Google

Sergey Brin and Larry Page

Computer Science majors

Now entire PhD programs in information retrieval

The world’s largest eigenvector computation
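
The eigenvector in question is the ranking vector that stays unchanged when multiplied by the surfer's probability matrix. One standard way to find it is the power method: start with equal ranks and multiply repeatedly until the numbers settle. The sketch below is an illustration on a made-up two-page matrix, not a description of Google's production system.

```python
def power_iteration(G, tol=1e-10, max_iter=1000):
    """Repeat rank <- rank * G until the ranks stop changing."""
    n = len(G)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Row vector times matrix: new_rank[j] = sum_i rank[i] * G[i][j].
        new_rank = [sum(rank[i] * G[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank

# Hypothetical two-page probability matrix (each row sums to 1).
G = [
    [0.50, 0.50],
    [0.25, 0.75],
]
rank = power_iteration(G)
print([round(r, 4) for r in rank])
```

For this small matrix the ranks settle at one third and two thirds; for the Web the same idea is applied to a matrix with billions of rows.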

Page 57: Google- opoly

Moral #2

Take a leave of absence for brilliant ideas.

Page 58: Google- opoly

More on PageRank

SIAM’s WhydoMath? Project
– url = http://dev.whydomath.org/node/google/index.html

DDL on PageRank
– url = http://spinner.cofc.edu/~langvillea/DISSECTION-LAB/ClarePageRankModule/1_WebLetter.html?referrer=webcluster&

LOCI: Google-opoly
– url = http://mathdl.maa.org/mathDL/23/?pa=content&sa=viewDocument&nodeId=3355

Page 59: Google- opoly

Moral #3

The more ways you can view a problem, the more likely you are to truly understand it, and hence, solve it.

Page 60: Google- opoly

Google-opoly

applets

Page 61: Google- opoly

March Madness

How should teams vote?

• Losing teams give one vote to each team that beats them.

• Losing teams vote with margin of victory.

• Both winning and losing teams vote with # points scored.
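
The three voting schemes above can be sketched by tallying weighted votes from a list of game results. The teams and scores below are invented for illustration, and the third scheme is read here as each team giving its opponent as many votes as the opponent scored, which is one interpretation of the bullet.

```python
from collections import defaultdict

# Hypothetical results: (winner, winner_points, loser, loser_points).
games = [
    ("A", 80, "B", 70),
    ("B", 65, "C", 60),
    ("A", 75, "C", 72),
]

def tally_votes(results, scheme):
    """Return {(voter, recipient): votes} under the chosen scheme."""
    votes = defaultdict(int)
    for winner, w_pts, loser, l_pts in results:
        if scheme == "wins":
            votes[(loser, winner)] += 1              # one vote per loss
        elif scheme == "margin":
            votes[(loser, winner)] += w_pts - l_pts  # margin of victory
        elif scheme == "points":
            votes[(loser, winner)] += w_pts          # points scored on the loser
            votes[(winner, loser)] += l_pts          # points scored on the winner
        else:
            raise ValueError("unknown scheme: " + scheme)
    return dict(votes)

for scheme in ("wins", "margin", "points"):
    print(scheme, tally_votes(games, scheme))
```

Treating each vote as a weighted link from voter to recipient turns the season into a graph, which can then be ranked the same way as the Web.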

Page 62: Google- opoly

Point Differential Voting

Page 63: Google- opoly
Page 64: Google- opoly
Page 65: Google- opoly
Page 66: Google- opoly

Moral #4

Now is a great time to do math.

Page 67: Google- opoly

Conclusion

PageRank is a sophisticated algorithm that set Google apart

The Web can be represented with graphs and matrices

PageRank’s idea of voting has many applications

Page 68: Google- opoly

Acknowledgements

Tim Chartier
Carl Meyer
Emmie Douglas
Kathryn Pedings
Clare Rodgers
Erich Kreutzer
Ben Kovanich
Ryan Dumville
Luke Ingram
Anjela Govan
Nick Dovidio
Yoshi Yamamoto
Neil Goodson
Colin Stephenson