A Seminar Report on

Deep Web

By Sabique Ahmed Khan, Computer Science, 3rd year

What is the Deep Web?

Surface Web

• WWW content indexed by typical search engines, e.g. Google, Bing, etc.

• Contains only a fraction of the unstructured content available online today.

• Surface search results are based on “relevancy by popularity”.

Background of Deep Web

How big is the Deep Web?

• 550 billion documents

• 500 times the content of the “surface web”

• Google has identified 1.2 billion documents

• An Internet search typically searches 0.03% (about 1/3000) of available content.
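The figures above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it takes the slide's numbers at face value, and the variable and function names are made up for this example.

```python
from fractions import Fraction

# Figures quoted on the slide (taken at face value, not verified here).
DEEP_WEB_DOCS = 550_000_000_000      # 550 billion documents
DEEP_TO_SURFACE_RATIO = 500          # deep web is "500 times" the surface web
GOOGLE_DOCS = 1_200_000_000          # documents identified by Google


def implied_surface_docs(deep_docs: int, ratio: int) -> int:
    """Surface web size implied by the deep web size and the quoted ratio."""
    return deep_docs // ratio


def as_percent(fraction: Fraction) -> float:
    """Convert an exact fraction to a percentage."""
    return float(fraction) * 100


surface_docs = implied_surface_docs(DEEP_WEB_DOCS, DEEP_TO_SURFACE_RATIO)
one_in_3000 = as_percent(Fraction(1, 3000))
google_share = as_percent(Fraction(GOOGLE_DOCS, DEEP_WEB_DOCS))

print(f"Implied surface web size: {surface_docs:,} documents")
print(f"1/3000 as a percentage:   {one_in_3000:.4f}%")
print(f"Google's 1.2 billion as a share of 550 billion: {google_share:.3f}%")
```

Under these assumptions, the 500:1 ratio implies a surface web of roughly 1.1 billion documents, which is close to the 1.2 billion that Google is said to have identified, and 1/3000 works out to about 0.033%, consistent with the rounded 0.03% figure.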

What’s in the Deep Web?

Why use the Deep Web?

• The Deep Web is massive: approximately 500 times larger than what is visible to conventional search engines, with much higher quality throughout.

• It is fast and economical, and provides in-depth knowledge.

• Deep Web coverage is broad and relevant.

• Deep Web searchable databases and search engines together total more than 250,000 sites.

How Search Engines Work

• Search engines obtain their listings in two ways: authors may submit their own web pages, or the search engines “crawl” or “spider” documents by following hypertext links from one page to another.

• Crawlers work by recording every hypertext link in every page they index while crawling.
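The link-following behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine is implemented: the `PAGES` dictionary is a made-up, in-memory stand-in for a website (no network access is performed), and the `LinkExtractor` and `crawl` names are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def crawl(pages, seed):
    """Breadth-first crawl: index a page, record its links, follow them."""
    indexed = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:          # link points to a page we cannot fetch
            continue
        indexed.append(url)
        for link in extract_links(html):
            if link not in seen:  # avoid revisiting pages
                seen.add(link)
                queue.append(link)
    return indexed


# Hypothetical toy site. The results page exists, but no hyperlink
# points to it: it is only produced when a user submits the search form.
PAGES = {
    "/home": '<a href="/about">About</a> <a href="/search">Search</a>',
    "/about": '<a href="/home">Home</a>',
    "/search": '<form action="/results"><input name="q"></form>',
    "/results?q=tor": "<p>Generated only in response to a query.</p>",
}

indexed = crawl(PAGES, "/home")
print(indexed)
```

In this toy site the crawler indexes `/home`, `/about` and `/search`, but never reaches `/results?q=tor`, because that page sits behind a search form rather than a hyperlink. Content of this kind is what ends up in the Deep Web.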

The Deep Web can be accessed via the Tor software, which gives users anonymity.

Julian Assange, the founder of WikiLeaks, uses the Tor browser to access confidential US government documents.

The Tor Project gets about 80% of its budget from the US government; the rest comes from other groups.

The Deep Web has been involved in $1.2 billion worth of transactions.

Bitcoin is the dominant currency of the Deep Web.

Edward Snowden also used the Deep Web to leak files on the NSA's mass surveillance program PRISM.

The Deep Web contains 7,500 TB of information, compared to 19 TB on the Surface Web.

Conclusion

• The Deep Web thus appears to be a critical source when it is imperative to find a “needle in a haystack”.

• At present, the Internet is functionally divided into two areas:

1% of the information content is in the surface web (Yahoo, Google, etc.) and

99% of it is in the deep web.