Search Engine and SEO

Presented by Yanni Li


Various Components of Search Engine

History

Meta Tag
-- an HTML element that describes the properties of a webpage or website

However, it was soon found that search result rankings offer large room for profit. Some webmasters abused meta tags by including irrelevant keywords to artificially increase impressions for their websites and increase their ad revenues.
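To make the meta tag abuse concrete, here is a minimal Python sketch that reads the keywords meta tag from a page and flags keywords that never appear in the visible text. The function names, the sample page, and the "every word of the keyword must appear in the body" test are assumptions for illustration, not how any real search engine checks relevance.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated values of <meta name="keywords" ...>."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.extend(
                k.strip().lower() for k in content.split(",") if k.strip()
            )


def meta_keywords(html):
    """Return the declared meta keywords of an HTML document, lowercased."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


def irrelevant_keywords(html, body_text):
    """Hypothetical relevance test: a keyword is flagged when at least one
    of its words does not occur anywhere in the visible body text."""
    words = set(body_text.lower().split())
    return [
        k for k in meta_keywords(html)
        if not all(part in words for part in k.split())
    ]
```

For a page that declares "seo, cheap flights, search engine" but only talks about search engines and SEO, `irrelevant_keywords` would return just the "cheap flights" entry.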

What is SEO?

Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a web site from search engines via "natural" or unpaid search results.

SEO has developed into a profession.

Before starting, the first thing to understand is how SEs rank websites.

SE Ranks Documents by Scores

Generally, SEs rank documents by their estimation of the usefulness of a document for a user query. Most SE systems assign a numeric score to every document and rank documents by this score.

Different SEs use different scoring mechanisms.

Google makes heavy use of the structure present in hypertext.

Google (1)

The simplest case is a single word query.

In order to rank a document with a single word query, Google looks at that document's hit list for that word. Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font ...), each of which has its own type-weight.

Google (2)

The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off so that more than a certain count will not help. Google takes the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document.
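The single word scoring described above can be sketched in Python as follows. The type-weight values, the cap of 5, and the "linear, then flat" taper are all assumptions chosen for illustration; the slides do not give the real values or the exact shape of the count-weight function.

```python
# Hypothetical type-weights, one per hit type (values are made up).
TYPE_WEIGHTS = {
    "title": 10.0,
    "anchor": 8.0,
    "url": 6.0,
    "large_font": 3.0,
    "small_font": 1.0,
}

# Assumed cap: hits beyond this count add nothing to the score.
COUNT_CAP = 5


def count_weight(count, cap=COUNT_CAP):
    """Assumed taper: grows linearly with the count, then stays flat."""
    return float(min(count, cap))


def ir_score(hit_counts, type_weights=TYPE_WEIGHTS):
    """Dot product of the count-weight vector with the type-weight vector.

    hit_counts maps a hit type to the number of hits of that type
    in the document's hit list for the query word.
    """
    return sum(
        count_weight(hit_counts.get(hit_type, 0)) * weight
        for hit_type, weight in type_weights.items()
    )


def rank(docs):
    """Order document names by descending IR score."""
    return sorted(docs, key=lambda name: ir_score(docs[name]), reverse=True)
```

Under these assumed weights, a document with a single title hit outranks a document that repeats the word 100 times in small plain text, because the count-weight stops growing at the cap.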

Two Kinds of SEO

White Hat SEO
-- conforms to the search engines' guidelines and involves no deception
-- creates content for users and search engines

Black Hat SEO
-- tends to deceive search engines
-- the content a search engine indexes and ranks isn't the same as the content a user will see

Some White Hat SEO Techniques

Domain selection
-- choose a domain that contains keywords

Design search-engine-friendly webpages
-- search engines don't like too much Flash, JavaScript...
-- make the site easy and fast to crawl

Write articles of a suitable length
-- too short: won't have a high rank
-- too long: loses keyword density, so low rank; users also tend to close the article at first glance

Keep each article to a compact theme
-- a long article covering a number of different topics that are not closely related won't rank very well in search engines
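The article length point rests on keyword density. A minimal Python sketch, assuming the common definition (occurrences of a single-word keyword divided by the total number of words) and a simple tokenizer of my own choosing:

```python
import re


def keyword_density(text, keyword):
    """Share of words in text that equal keyword (case-insensitive).

    Assumes a single-word keyword and a naive tokenizer that keeps
    letters, digits, and apostrophes.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)
```

With this definition, padding an article with off-topic text keeps the keyword count the same while the word total grows, so the density drops.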

Some Black Hat SEO Techniques

Doorway pages
-- automatically generate a large number of keyword pages
-- visitors to these pages are automatically redirected to the home page

Cloaked pages

Keyword stuffing

Link spam
-- set up multiple web pages pointing to a target web page to boost the latter's total in-links
-- it is easy to build a new webpage, so this spam is growing rapidly
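To illustrate why link spam works against a naive in-link count, here is a small Python sketch. The link graph, the page names, and the rule of counting each distinct source page once are assumptions for illustration only.

```python
from collections import Counter


def in_link_counts(links):
    """Count distinct in-links per target page.

    links is an iterable of (source, target) pairs. Duplicate pairs and
    self-links are ignored (an assumption of this sketch).
    """
    return Counter(
        target for source, target in set(links) if source != target
    )


# Hypothetical link graphs: one organic link, then 50 spam pages added.
organic = [("a", "target"), ("b", "other")]
spam = [(f"spam{i}", "target") for i in range(50)]
```

Counting in-links for `organic` gives the target page 1 in-link; after adding the 50 spam pages it has 51, which is why a ranking signal based on raw in-link totals is easy to inflate.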

Battle between SE and Spammer

Search Engine       Spammer
Meta Tag            Irrelevant Keywords
Term Frequency      Keyword Stuffing
Link Analysis...    Link Spam...

References

[1] Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, Cambridge, 2009.

[2] Sergey Brin, Lawrence Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Thank You!