Module 3 - Internet

Module 3 - Internet. Search Engines Search engine anatomy Different search engines Effective searching techniques

Search Engines

Search engine anatomy

Different search engines

Effective searching techniques

Search Engines

Why are they needed? A multitude of pages exists on the web. How do you locate the ones most relevant to your needs?

Anatomy of a Search Engine

Spider (a.k.a. robot, webbot)

A program that traverses the web and stores the contents of all searchable web pages.

Web sites can deny spiders access to some resources by using a robots.txt file. E.g. try http://www.usask.ca/robots.txt

User-agent: *
Disallow: /testing
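How a well-behaved spider might honour the two rules above can be sketched with Python's standard `urllib.robotparser`. The rules are fed in directly rather than downloaded, and the two page URLs in the usage note are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Load the two rules from the slide without touching the network.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /testing",
])

def may_crawl(url, agent="*"):
    """Return True if the rules allow the given agent to fetch url."""
    return rules.can_fetch(agent, url)
```

With these rules, `may_crawl("http://www.usask.ca/testing/page.html")` is refused, while `may_crawl("http://www.usask.ca/index.html")` is allowed.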

Anatomy…

Spider…

Indexing Software

Indexes the web pages into an easily searchable database collection.

Interface for queries

Allows users to enter keywords and other combinations. Searches are performed within the indexed database.
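The indexing and query steps can be illustrated with a tiny inverted index. This is a minimal sketch, not how any particular search engine is built; the page URLs and texts are hypothetical.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, *keywords):
    """Return the pages that contain every keyword (an AND query), sorted."""
    sets = [index.get(word.lower(), set()) for word in keywords]
    return sorted(set.intersection(*sets)) if sets else []

pages = {  # hypothetical pages a spider might have stored
    "example.org/fishing": "bass fishing tips",
    "example.org/music": "bass guitar music",
}
index = build_index(pages)
```

A query such as `search(index, "bass", "fishing")` looks only at the index, never at the pages themselves, which is what keeps searching fast.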

Different Search Engines

www.yahoo.com

Directory listing organised into various categories, like the Yellow Pages in our phone book. All pages are hand linked.

“Yet Another Hierarchical Officious Oracle”; ‘yahoo’ is also from Gulliver’s Travels.

www.altavista.com

“A view from above”. First truly huge indexed database of web pages.

www.google.com

“googol”: 1 followed by 100 zeros. Top search engine today, with over 100 million queries a day.

Why Google?

Relevant results are ranked at the top (first) page of a query.

Why is relevance important? A typical user rarely goes beyond the first page.

How is relevance measured? By the number of links that point to the same page, not just by the number of times a keyword is repeated.

Careful here: “If enough people say a lie to be true, it becomes the truth.” (Goebbelsian lies)

Googlebomb: “talentless hack”

Googlewhack: ‘the search for the one’! E.g. ceremonial overstuffing
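The idea of ranking by incoming links can be sketched as a simple count. This is only an illustration of the principle on the slide, not Google's actual algorithm; the page names and link graph are made up.

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Order pages by how many distinct other pages link to them."""
    inbound = Counter()
    for source, targets in links.items():
        for target in set(targets):      # count each linking page once
            if target != source:         # ignore self-links
                inbound[target] += 1
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(inbound, key=lambda page: (-inbound[page], page))

links = {  # hypothetical link graph: page -> pages it links to
    "a.html": ["c.html", "b.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "c.html"],
}
ranking = rank_by_inbound_links(links)
```

Here `c.html` ranks first because three different pages point to it. The same mechanism explains a Googlebomb: many pages agreeing on a link can push a page to the top.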

Effective Searching

Composing the right keywords in the query saves time and frustration.

AND, OR, NOT

AND: combines two keywords. Specifies that both keywords should be found on the resulting web page.

OR: combines two keywords. Specifies that one or both keywords should be found on the web page.

NOT: operates on a single keyword. Ensures that this keyword is not found in any page returned.

Examples:

vacation london OR paris

bass AND fishing NOT music
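The two example queries can be expressed as tests on the set of words in a page. This sketch assumes that `vacation london OR paris` means "vacation AND (london OR paris)"; real search engines may group the terms differently.

```python
def page_words(text):
    """Lower-case a page and split it into a set of words."""
    return set(text.lower().split())

def bass_and_fishing_not_music(text):
    """Evaluate the query: bass AND fishing NOT music."""
    w = page_words(text)
    return "bass" in w and "fishing" in w and "music" not in w

def vacation_london_or_paris(text):
    """Evaluate the query: vacation london OR paris (assumed grouping)."""
    w = page_words(text)
    return "vacation" in w and ("london" in w or "paris" in w)
```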

Effective Searching..

+/- signs

+ indicates a keyword must be present in the result.

- indicates a keyword must not be present.

The signs are usually stuck to the keyword.

Examples:

+bass +fish -music

star wars episode +1

Quotation marks “ ”

Groups a set of keywords; the resulting page should have these in the exact same order.

Can be used in combination with other methods.

Examples:

“star wars episode 1”

“to kill a mocking bird” -movie
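A rough sketch of how a page could be checked against `+`, `-` and quoted terms. It uses plain ASCII quotes and treats unsigned keywords as optional; both are assumptions of this example, not rules of any specific search engine.

```python
import shlex

def matches(query, text):
    """Return True if text satisfies the +word, -word and "phrase" terms."""
    lowered = text.lower()
    words = set(lowered.split())
    for term in shlex.split(query.lower()):   # keeps "quoted phrases" together
        if term.startswith("-"):
            if term[1:] in words:
                return False                  # excluded keyword is present
        elif term.startswith("+"):
            if term[1:] not in words:
                return False                  # required keyword is missing
        elif " " in term:
            if term not in lowered:
                return False                  # phrase must appear in this order
        # plain keywords are treated as optional in this sketch
    return True
```

For example, `matches('"to kill a mocking bird" -movie', page_text)` accepts a page about the novel and rejects one that also contains the word "movie".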

Networking and Telecommunication

Topics

Linking Up: Network Basics

Connecting to the Internet

Networks: Near and Far

Communication Software

Linking Up: Network Basics

A computer network is any system of two or more computers that are linked together.

How do networks impact systems?

People share computer hardware, thus reducing costs.

People share data and software programs, thus increasing efficiency and production.

Linking Up: Network Basics

The Internet is a network of networks: a globally connected network that links various organisations and individuals.

The Web is not the Internet. The WWW is one particular use of the Internet; email and FTP (File Transfer Protocol) are other such uses.

Connecting to the Internet

The amount of information that can be transmitted in a given amount of time is defined as the bandwidth. It is impacted by:

Physical media that make up the network

Amount of network traffic

Software protocols of the network

Communication à la Modem

A modem (modulator-demodulator) is a hardware device that connects a computer’s serial port to a telephone line (for remote access).

It may be internal, on the system board, or an external modem sitting in a box linked to a serial port.

Modem transmission speed is measured in bits per second (bps); modems generally transmit at 28.8 Kbps to 56 Kbps.

Connecting to the Internet

Direct connections using T1 or T3 lines: 1.5 Mbps to 45 Mbps

Dial-up connections: modems

Broadband connections:

DSL (Digital Subscriber Line): 300 Kbps to 1.3 Mbps

Cable modems: 10 Mbps
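To get a feel for these speeds, the ideal transfer time is the size in bits divided by the bandwidth. The sketch below uses the nominal rates listed above plus an assumed 56 Kbps dial-up modem and an illustrative 1 MB file; real transfers are slower because of traffic and protocol overhead.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second

ONE_MEGABYTE = 1_000_000  # bytes; an illustrative file size

links = {
    "56 Kbps modem": 56_000,
    "DSL (1.3 Mbps)": 1_300_000,
    "T1 (1.5 Mbps)": 1_500_000,
    "Cable modem (10 Mbps)": 10_000_000,
    "T3 (45 Mbps)": 45_000_000,
}
for name, bps in links.items():
    print(f"{name}: {transfer_seconds(ONE_MEGABYTE, bps):.1f} s")
```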

Networks: Near and Far……

Networks Near and Far

Local-area network (LAN)

Computers are linked within a building or cluster of buildings.

Each computer and peripheral is an individual node on the network.

Nodes are connected by cables, which may be either twisted pair (copper wires) or coaxial cable.

Wide-Area Networks

A network that extends over a long distance.

Each network site is a node on the WAN.

Made up of LANs linked by phone lines, microwave towers, and communication satellites.

Data is transmitted over a common pathway called a backbone.

Example: CANet3 http://www.canet3.net/stats/CAnet3map/CAnet3map.htm

CANet3: Canadian backbone

Protocols for Communication……

Communication Software

Protocol: a set of rules for the exchange of data between a terminal and a computer or between two computers.

TCP/IP: Transmission Control Protocol / Internet Protocol

Messages are broken into packets (1500 bytes).

Packets are numbered and sent over the network.
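Breaking a message into numbered packets, and putting it back together, can be sketched as follows. The `(number, total, payload)` tuple is a simplification invented for this example; real TCP/IP headers carry far more information.

```python
PACKET_SIZE = 1500  # bytes per packet, as on the slide

def to_packets(message: bytes, size: int = PACKET_SIZE):
    """Split a message into (number, total, payload) packets, numbered from 1."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [(n, len(chunks), chunk) for n, chunk in enumerate(chunks, start=1)]

def reassemble(packets):
    """Rebuild the message from packets that may have arrived in any order."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, _, payload in ordered)

message = b"x" * 4000           # a 4000-byte message needs three packets
packets = to_packets(message)
```

Because every packet carries its number, `reassemble` gives back the original message even when the packets arrive out of order.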

Communication Software

IP defines the addressing system.

Example: 128.236.24.161 (4 bytes, each 0 to 255)

Every packet includes the source IP, destination IP and the packet number (e.g. 7 of 13).

TCP is an end-to-end protocol: packets are reliably transmitted from one computer to another.

Lost packets are re-transmitted.
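The "4 bytes, 0 to 255" structure of the example address can be inspected with Python's standard `ipaddress` module. A small illustration:

```python
import ipaddress

address = ipaddress.IPv4Address("128.236.24.161")
raw = address.packed          # the 4 bytes behind the dotted notation
octets = list(raw)            # each value is between 0 and 255
as_number = int(address)      # the same address as one 32-bit number

def is_valid_ipv4(text):
    """Return True if text is a well-formed dotted IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False
```

An address such as `256.1.1.1` is rejected by `is_valid_ipv4` because 256 does not fit in one byte.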

Communication Software

Communication software establishes a protocol that is followed by the computer’s hardware.

Different forms:

Client/server model: one or more computers act as dedicated servers and all the remaining computers act as clients. Example: a Web server and client browsers.

Peer-to-peer model: every computer on the network is both client and server. Examples: Napster, Gnutella.

Many networks are hybrids, using features of the client/server and peer-to-peer models.

Client/Server Model

Client software sends requests from the user to the server

eg. http://www.cs.usask.ca

Server software responds to client requests by providing data
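The request/response exchange can be imitated on a single machine with a connected pair of sockets. This is a toy protocol invented for illustration (one line in, one reply out), not HTTP; the page table is hypothetical.

```python
import socket
import threading

def serve_once(conn, pages):
    """Server side: read one request and reply with the matching data."""
    with conn:
        request = conn.recv(1024).decode("utf-8").strip()
        body = pages.get(request, "not found")
        conn.sendall(body.encode("utf-8"))

def fetch(path, pages):
    """Client side: send a request to a one-shot server and return its reply."""
    client, server = socket.socketpair()      # two connected endpoints
    worker = threading.Thread(target=serve_once, args=(server, pages))
    worker.start()
    with client:
        client.sendall((path + "\n").encode("utf-8"))
        reply = client.recv(1024).decode("utf-8")
    worker.join()
    return reply

pages = {"/index.html": "<h1>Home</h1>"}      # hypothetical server content
```

Calling `fetch("/index.html", pages)` plays the role of the browser; `serve_once` plays the role of the Web server.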

Internet Addresses…

Internet Addresses

The host is named using DNS (domain name system), which translates host names into IP addresses.

Address: 128.233.130.63 is www.cs.usask.ca

Address: 216.239.51.101 is www.google.com

It is easier to remember names than strings of numbers.
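The name-to-number lookup can be pictured as a table. The sketch below holds only the two examples from the slide (the addresses may no longer be current); real DNS is a distributed, hierarchical database rather than one dictionary.

```python
# A toy name table with the two examples from the slide.
HOSTS = {
    "www.cs.usask.ca": "128.233.130.63",
    "www.google.com": "216.239.51.101",
}

def resolve(name):
    """Return the IP address recorded for name, or None if it is unknown."""
    return HOSTS.get(name.lower().rstrip("."))
```

On a networked machine, the standard library call `socket.gethostbyname(name)` asks the system's resolver to do the real lookup.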

Internet Domains

Top level domains include:

.edu - educational sites

.com - commercial sites

.gov - government sites

.mil - military sites

.net - network administration sites

.org - nonprofit organizations

.ca - Canada

Addressing Computers

Unique IP numbers

Why are they needed? They are similar to a house address.

DNS servers

Arranged in a hierarchy, with 4 top-level servers in the US.

Multiple computers can be mapped on to the same domain name, e.g. www.yahoo.com

Gateways

Take care of routing packets in and out of a LAN.

Routers

Take care of routing packets across multiple network nodes.

Addressing Persons

[email protected]

Examples:

[email protected]

User President, whose mail is stored on the host whitehouse in the government domain of the USA.

User abc123 at the server for Computer Science, University of Saskatchewan, Canada.

Internet Email Addresses

An Internet address includes: [email protected]

username is the person’s “mailbox”

hostname is the name of the host computer and is followed by one or more domains separated by periods:

– host.subdomain.domain : @mail.usask.ca

– host.domain : @hotmail.com

– host.subdom.subdom.domain : @finance.sk.gov.ca
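Splitting an address into its mailbox and host parts can be sketched as below. The sample address in the usage note combines the user name abc123 and the host mail.usask.ca from these slides purely for illustration.

```python
def split_address(address):
    """Split user@host.domain into the mailbox name and the dotted host parts."""
    username, _, hostname = address.rpartition("@")
    if not username or not hostname:
        raise ValueError(f"not an email address: {address!r}")
    return username, hostname.split(".")
```

For example, `split_address("abc123@mail.usask.ca")` separates the mailbox `abc123` from the host, subdomain and domain `mail`, `usask`, `ca`.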

Web Addresses

Dissecting a Web page address:

http://www.vote-smart.org/help/database.html

http:// : protocol for Web pages

www.vote-smart.org/ : path to the host

help/database.html : resource page
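The same dissection can be done in code with the standard `urllib.parse` module. A short illustration using the address from the slide:

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.vote-smart.org/help/database.html")

protocol = parts.scheme    # the protocol for Web pages
host = parts.netloc        # the host that stores the page
resource = parts.path      # the resource page on that host
```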

Addressing Resources

URL: Uniform Resource Locator

Web: http://www.cs.usask.ca/index.html

A Web server stores Web pages and sends pages to client Web browsers.

FTP: ftp://ftp.cs.usask.ca

File transfer protocol (FTP) allows users to download files from remote servers to their computers and to upload files.

Telnet: telnet://scrooge.usask.ca

Allows users to log in to remote computers.

Other resources: Gopher, NNTP (newsgroups)

Cookies

Cookies: what are they? They are files created on your computer by a website to store information about you.

To accept or not?

Benefits:

Stores some of your personal information (so you need not repeat it).

Allows pages to be customised to your preferences, e.g. layouts, advertisements…

Privacy issues:

Do you want your browsing patterns to be used by a company/organisation?
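What a site stores in a cookie is just a small name/value pair. The sketch below uses Python's standard `http.cookies` module; the cookie name `layout` and its value are hypothetical preferences chosen for illustration.

```python
from http.cookies import SimpleCookie

# The site asks the browser to remember a preference.
cookie = SimpleCookie()
cookie["layout"] = "compact"
header = cookie.output()          # the header line sent to the browser

# On a later visit the browser sends the value back and the site reads it.
returned = SimpleCookie("layout=compact")
preference = returned["layout"].value
```

The same mechanism that remembers a layout can also remember which pages were visited, which is where the privacy question comes from.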

Email, Viruses and Internet Issues

Topics

E-mail: Access Protocols

Other Internet Applications: Chat, Newsgroups

Netiquette: some tips

Intranets and Extranets

Viruses

Internet: Ethical and Political issues

Email on the Internet

Email formats include:

ASCII text: can be viewed by any mail client program

HTML: displays text formatting, pictures, and links to Web pages

SMTP: Simple Mail Transfer Protocol

Asynchronous form of communication

UUCP: Unix to Unix Copy

Email on the Internet

What appears on the screen depends on the type of Internet connection you have and the mail program you use.

Popular graphical email programs include Eudora, Outlook and Netscape Communicator.

Email on the Internet

IMAP vs POP: Internet Message Access Protocol vs Post Office Protocol

Messages remain on the email server vs messages are downloaded to your computer and deleted from the mail server.

Online vs offline access.

Retrieve messages in any order vs “in-order” retrieval.

Number of messages is limited by your e-mail server vs limited by your hard-disk size.

Mailing Lists & Network News

Mailing lists allow you to participate in email discussion groups on special-interest topics. E-mails are sent to the whole group.

A newsgroup is a public discussion on a particular subject, consisting of notes written to a central Internet site and redistributed through a worldwide newsgroup network called Usenet.

Protocol used: NNTP (Network News Transfer Protocol)

I-HELP is a similar application, more like a message board.

Newsgroups could be of local interest too: usask.forsale

Real-Time Communication

Users are logged in at the same time.

Instant Messaging for exchanging instant messages with on-line friends and co-workers

Chat Rooms for conversing with multiple people in real-time

Internet telephony (IP telephony) for long-distance toll-free telephone service

Videoconferencing for two-way meetings

Rules of Thumb: Netiquette

Say what you mean and say it with care.

Keep it short and to the point.

Proof-read your messages.

Learn the “nonverbal” language of the Net. :)

Keep your cool.

Don’t be a source of spam (Internet junk mail).

Lurk before you leap.

Check your FAQs (Frequently Asked Questions).

Give something back.

Intranets and Extranets

Intranets are self-contained intra-organizational networks that offer email, newsgroups, file transfer, Web publishing and other Internet-like services.

Firewalls prevent unauthorized communication and secure sensitive internal data.

Gateways, where the firewalls exist, act as the gatekeeper.

Intranets and Extranets

Extranets are private TCP/IP networks designed for outside use by customers, clients and business partners of the organisation.

Electronic data interchange - EDI - a set of specifications for ordering, billing, and paying for parts and services over private networks

Viruses

Viruses are programs that could damage your data and hinder a computer’s normal functioning. A virus can:

Activate itself: executable files, boot sector, macros

Replicate itself: through e-mail attachments

Do “something”: destroy contents

Trojan horses are malicious programs disguised as useful software.

Worms are programs that could travel across the network and replicate themselves.

Anti-virus programs check for known viruses. Strains are identified by “unique” strings and their actions.
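Identifying a strain by a “unique” string can be sketched as a search for known byte patterns. The two signatures below are invented for this example; real scanners use large, regularly updated databases and also watch what programs do.

```python
# Hypothetical signatures made up for this sketch.
SIGNATURES = {
    "demo-strain-a": b"\xde\xad\xbe\xef",
    "demo-strain-b": b"EVIL_MACRO",
}

def scan(data: bytes):
    """Return the names of all known signatures found in data, sorted."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)
```

A file whose bytes contain none of the patterns comes back with an empty list, which only means that no known strain was found.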

Internet Issues: Ethical and Political Dilemmas

Copyright laws: how do they apply to online content, especially across international boundaries?

Filtering software to combat inappropriate content; parental controls.

Digital cash to make on-line transactions easier and safer

Encryption software to prevent credit card theft

Digital signatures to prevent email forgery

Digital divide: computer haves versus have-nots.

Next Class

HTML

This text coded as HTML:

<H1>Welcome to Computer Confluence</H1><b>Publishing on the Web</b>

Appears like this on the screen …