Introduction to Web Browsers and Basic Search Strategies
Using Search Engines
Davina Pruitt-Mentle
EDUC 698M
EDUC 698M Davina Pruitt-Mentle 2
Outline
• History (WWW & Internet)
• Search tools
• Search Engines vs. Subject Directory
• Meta search Engines
• Steps for Searching
• Effective Strategies
• Narrow or broaden a search?
• Wildcards
Internet History
• Internet made up of thousands of networks worldwide
• No one in charge of Internet - No governing body
• Internet backbone owned by private companies
Looking at the Net
Taken from: http://www.cio.com/WebMaster/sem2_net.html
Understanding the Map
• Computers use TCP/IP to communicate (Transmission Control Protocol/Internet Protocol)
• Computers use client/server architecture
Internet Providers:
• Research and Educational Institutions
• Government and Military Entities
• Businesses
• Private Organizations
• Commercial Providers
Internet Protocols
• Email (Simple Mail Transfer Protocol)
• Telnet (Login to remote host computer)
• FTP (File Transfer Protocol) - transfers files between server and client
• HTTP (HyperText Transfer Protocol)
History
• WWW or Web or W3 includes all information, text, images, audio, video, and computational services that are accessible from the Internet
• July 8, 1999 Nature - approximately 800 million pages of publicly accessible information(1)
• Web continues to grow, tripling in size over the past two years(2)
(1) Steve Lawrence & C. Lee Giles, “Accessibility of Information on the Web,” Nature 400 (July 8, 1999), 107
(2) OCLC Office of Research, “June 1999 Web Statistics” Web Characterization Project
WWW
• System of Internet servers that support hypertext to access several Internet protocols on a single interface
• Almost all protocols accessible on Internet are accessible on web (email - FTP - Telnet - etc)
• In addition, the WWW has its own protocol: HyperText Transfer Protocol (HTTP)
HTTP
• Hypertext - means of information retrieval
• Contains links that connect to other documents
• Links selected by user
• Virtual “web” of connections
HTTP (cont)
• Hypertext is produced with HTML (HyperText Markup Language)
• A way of writing a document with “tags” added that tell the browser how to present the information, e.g. <b> Bold </b> displays the word Bold in bold type
More History
• Web initially conceived in 1989 by Tim Berners-Lee at CERN (European Particle Physics Lab in Switzerland)
• Needed a wide variety of information to be shared and distributed to many different computers and platforms
• “Universal readership”
Web Popular Because:
• Easy to use
• Easy to navigate
• Combines words, graphics, sound, video
• Easy to Publish
• Plethora of information
• Reach larger audience
Summary: Web vs. Internet
• What is the relationship between the web and the Internet?
• The Internet contains physical components
– computers
– networks
– services
Web vs. Internet
• The Internet connects thousands of computers across the world, but it is the web that allows communication to occur
• Web - abstraction and common set of services on top of the Internet
• Web - set of protocols and tools that let us share information with each other
Directed Search Strategies
Davina Pruitt-Mentle
July Design Institute
July 20, 2000
How Do I Find Information on the Internet?
• Join an email discussion or USENET newsgroup
• Go directly to a site if you have the address
• Browse
• Explore subject directory
• Conduct Search
How Does Information Get Indexed by the Search Tools
• A publisher of a web page can register the site with the search engine or directory
• Database collects data autonomously
Browsers
• Netscape Navigator (Communicator)
– Product of Netscape (now owned by AOL)
– Originally was dominant
– Multi-platform (all operating systems)
• Internet Explorer
– Product of Microsoft
– Current dominant browser
– Not available for all operating systems
• Browser compatibility problems can cause web page problems
Netscape Search
• 1. Access to different search engines
• 2. Type words or phrases into text entry box
• 3. Click button
• 4. Preserve favorite search engine
Internet Explorer Search
• Separate panel in browser
• Uses Microsoft Network search
Internet Explorer Search
• Direct access to only Microsoft Network’s search engines
• Allows easy access to different types of search– Web pages– People– Businesses– Maps
Internet Keywords
• Type straight into the location bar of Netscape/Explorer
• Simple words instead of URL (Uniform Resource Locator)
• Words tie to websites
• Can be tied to language preference
• Example: Typing in maryland converts to http://www.state.md.us/
Know Your URLs
• “Address” of a file on the Internet
• Contains type of protocol followed by the computer name, directory and file name
• Examples
– http://www.capecod.net/Wixon/wixon.htm
– gopher://gopher.boombox.micro/
– ftp://wuarchive.wustl.edu/pub/windows/psp3.zip
– mailto:[email protected]
Anatomy of a Web Address
• protocol://host/path/filename
See handout “Anatomy of a Web Address”
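As a rough illustration of the protocol://host/path/filename pattern, the sketch below uses Python's standard urllib.parse module to take one of the example addresses apart. The function name anatomy and the dictionary keys are illustrative choices, not something taken from the handout.

```python
from urllib.parse import urlsplit

def anatomy(url):
    """Split a web address into protocol, host, directory path, and file name."""
    parts = urlsplit(url)
    # Everything before the last "/" is the directory; the rest is the file name.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.netloc,
        "path": directory + "/",
        "filename": filename,
    }

info = anatomy("http://www.capecod.net/Wixon/wixon.htm")
for name, value in info.items():
    print(name, "=", value)
```

For the address above this reports http as the protocol, www.capecod.net as the host, /Wixon/ as the path, and wixon.htm as the file name.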
Two Basic Approaches to Searching
(although not really “basic”)
• Search Engines
• Subject Directories
Search Engines vs. Directories
Search Engines
• Computer-built index of information on the web
• More inclusive
• Used to find specific resources
• Searchable by keyword
• Excessive “hits”
• Every page of a Website is indexed
• Better for general searches, but can be used to find specific information
Directories
• Human-aided, organized list
• May be general or subject-specific
• May be able to “search” directory
– Google - general
– NetTech Educational Technology Coordinator Website - subject specific
• User has control of browsing
• Fixed vocabulary
• Links go to Website home pages only
• Better at general searches
What are Search Engines?
• Designed to assist you in searching through the enormous amount of information on the Web
• No single search tool has everything
• Each engine is a large database which utilizes different search techniques and tools (spiders or robots) to build indexes to the Internet (some also utilize submissions and administration)
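One way to picture the index that a spider or robot builds is as an "inverted index": a table that maps each word to the pages containing it. The sketch below is a deliberately small illustration under that assumption; the page addresses and text are made up, and real engines store far more (word positions, link data, ranking signals).

```python
from collections import defaultdict

# Hypothetical pages a spider might have fetched (address -> page text).
pages = {
    "http://example.com/a": "Oral history of women in El Salvador",
    "http://example.com/b": "History of the rocket",
    "http://example.com/c": "Women in rocketry",
}

def build_index(pages):
    """Map every lowercased word to the set of pages it appears on."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

index = build_index(pages)
print(sorted(index["history"]))
print(sorted(index["women"]))
```

Looking up a keyword is then a single dictionary access, which is why a search can answer quickly without rereading every page.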
Which Search Engine?
• Yahoo
• Altavista
• Excite
• Google
• Northern Light
• Hotbot
• Infoseek
See Handout - “The Little Search Engine that Could”
How to Choose
Consider
• Size of the database (# of URLs)
• Currency of the database (updates)
• Search interface
• Help screens
• Search features
• Results listed (# of documents retrieved)
• Relevance of results
More About Search Engines
• Searches for matching terms (keywords or several keywords)
• Results “ranked” by relevancy (for some)
• Can search by
– subject or category
– keyword
• Learn about each search engine’s description, options, and rules and restrictions
GO TO
http://www.google.com/help.html
• Searches for exact matches
• Try different versions of your search term
– Example: “Boston hotel” vs. “Boston hotels”
• Rephrase query
– Example: “cheap plane tickets” vs. “cheap airplane tickets”
• Automatically places “and” between words (narrows search)
• To reduce search
– add more terms in original search
– refine search within the current search results (adding terms to the first words will return a subset of the original query)
• Exclude a word by using a - sign
– Example: to search for bass but not speaker, enter bass -speaker
• Does not support “or” operator
• Does not support “stemming” or “wildcard” searches
• Not case sensitive
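The rules on this slide (implied “and”, a - sign to exclude, no stemming, not case sensitive) can be mimicked in a few lines. This is only a sketch of the behavior described above, not how Google itself is implemented; the function name matches and the sample page texts are made up for illustration.

```python
def matches(query, text):
    """Return True if the text satisfies every term of the query.

    Terms are combined with an implied AND; a term starting with "-"
    must NOT appear. Words are compared exactly (no stemming), ignoring case.
    """
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            if term[1:] in words:
                return False      # excluded word is present
        elif term not in words:
            return False          # required word is missing
    return True

print(matches("bass -speaker", "Largemouth bass fishing tips"))
print(matches("bass -speaker", "Bass speaker reviews"))
print(matches("Boston hotel", "boston hotels downtown"))
```

The last call shows why trying both “hotel” and “hotels” is worthwhile: without stemming, the singular form does not match a page that only uses the plural.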
• Finds street maps
– Just enter a U.S. street address, including zip code or city/state, into the search box
– Google recognizes query as a map request
• Try your address
Phrase Searches and Connectors
• Phrase searches are useful when searching for famous sayings or specific names, e.g. “Gone with the Wind”
• Phrase connectors are recognized
– Hyphens
– Slashes
– Periods
– Equal signs
– Apostrophes
• Example: mother-in-law
Stop Words
• Stop words are ignored
• These rarely help narrow a search, and they slow it down
– http
– com
– certain single digits
– certain single letters
• To include a stop word, put [space]+ in front of it
• Examples
– Star Wars, Episode 1 becomes Star wars episode +1
– OS/2 becomes OS/ +2
***Don’t forget the space before the + and - signs
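The stop-word behavior can be sketched as a filter that drops ignored terms unless they carry a leading + sign. The stop-word list below is an assumption for illustration only; the slide names http, com, and “certain” single digits and letters without giving the full list.

```python
# Assumed, illustrative stop-word list (the real list is not given in the slides).
STOP_WORDS = {"http", "com", "a", "i", "1", "2"}

def effective_terms(query):
    """Return the terms that would actually be searched for."""
    kept = []
    for term in query.lower().split():
        if term.startswith("+") and len(term) > 1:
            kept.append(term[1:])      # "+" forces a stop word to be included
        elif term not in STOP_WORDS:
            kept.append(term)          # ordinary word, keep it
        # otherwise: a stop word without "+", silently ignored
    return kept

print(effective_terms("star wars episode 1"))
print(effective_terms("star wars episode +1"))
```

Without the + sign the episode number is dropped from the search; with it, the number is kept as a search term.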
How to Interpret Results
See Handout
Combines in one search a very large full-text Web-page database (~160 million pages) with over 5,400 searchable full-text published (print) journals and an array of online news resources
You may access both relevant web-pages and relevant journals and news releases
Results are tagged either WWW, like other search tools, or Special Collection (published, fee-for-viewing journal articles or other publications)
GO TO
http://www.northernlight.com/docs/specoll_help_overview.html
• To obtain an item from the Special Collection:
– Click on link
– Decide if you are willing to pay fee
• Page provides citation so you can locate publication in library
Unique Folders Approach
• Results grouped in folders listed at left
• Folders dynamically generated by search results
– From a controlled vocabulary
– Similar to library cataloging
– Not fixed like subject directories
• Click on any folder to refine or further focus search
• Sub-folders allow you to further “zero in”
Four Types of Folders
• Subjects (baseball, desserts)
• Source descriptors (commercial, personal, magazines, databases)
• Types of documents (press releases, product review, maps)
• Languages (major Romanized languages only)
Approaches to Searching
• Basic Search
• Power Search
• Industry Search
• Investext Search
• News
Basic Search
• http://www.northernlight.com
• From Home Page
• Allows Boolean logic
• Phrase in “ ”
• Truncation (* for many characters or % for 1 character)
• + requires, - excludes
Power Search
• http://www.northernlight.com/power.html
• Combines ALL basic search features in one search
• Limits to major language or country
• Can select subject or document in advance
Industry Search
• http://www.northernlight.com/business.html
• All features of basic search
• Can limit by date range or industry-based subject category
• Default is ALL industries
Investext Search
• http://www.northernlight.com/investext.html
• Search or browse thousands of investment research reports written by expert analysts.
News Search
• http://www.northernlight.com/news.html
• Allows on-line news searches
“Meta” Search Tools
• Multi-threaded search engines
• Allow access to multiple databases simultaneously or via a single interface
• (-) Do not offer the same level of control over search interface and logic as individual engines
• (+) Fast
• (+) Improvements
– Results sorted by site used for search, or location of Website
– Able to select search engines to include
– Ability to modify results
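The idea behind a meta-search tool can be sketched as one function that forwards a query to several engines and merges what comes back. Everything named here is hypothetical: engine_a and engine_b stand in for real search engines and simply return fixed lists, and the engines are queried one after another rather than in parallel threads to keep the sketch short.

```python
def engine_a(query):
    """Stand-in for one search engine; returns made-up result addresses."""
    return ["http://example.org/1", "http://example.org/2"]

def engine_b(query):
    """Stand-in for a second search engine with one overlapping result."""
    return ["http://example.org/2", "http://example.org/3"]

def meta_search(query, engines):
    """Send the query to every selected engine and merge the results.

    Results stay grouped by the engine that supplied them, and an
    address already returned by an earlier engine is not repeated.
    """
    seen = set()
    merged = []
    for name, search in engines.items():
        for url in search(query):
            if url not in seen:
                seen.add(url)
                merged.append((name, url))
    return merged

results = meta_search("oral history", {"A": engine_a, "B": engine_b})
for source, url in results:
    print(source, url)
```

Choosing which engines to include corresponds to choosing which entries go into the dictionary passed to meta_search.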
Popular Meta-Search Engines
• Dogpile
• Metacrawler
• Profusion
• SavvySearch
Subject-Specific Search Engines
• Do not index entire web
• Focus within specific Websites/pages within defined subject area, geographical area, type of resource
• Specialized search - depth rather than breadth
Selected Subject-Specific Engines
Companies
• Companies Online (http://www.companiesonline.com/)
• Hoover's Online (http://www.hoovers.com/)
• Wall Street Research Net (http://www.wsrn.com/)
People (E-mail and Phone)
• Bigfoot (http://bigfoot.com/)
• WhoWhere? (http://www.whowhere.lycos.com)
• Yahoo! People Search (http://people.yahoo.com/)
• Switchboard.Com (http://www.switchboard.com)
Selected Subject-Specific Engines
Images
• The Amazing Picture Machine (http://www.ncrtec.org/picture.htm)
• Lycos Image Gallery (http://www.lycos.com/picturethis/)
• WebSeek (http://disney.ctr.columbia.edu/webseek/)
• Yahoo! Image Surfer (http://ipix.yahoo.com/)
Selected Subject-Specific Engines
Jobs
• Hotjobs.com (http://www.hotjobs.com/)
• Monster.com (http://www.monster.com/)
• The Riley Guide (http://www.rileyguide.com/)
Games
• CNET Gamecenter.com (http://www.gamecenter.com/)
• Games Domain (http://www.gamesdomain.com/)
• Gamesmania (http://www.gamesmania.com/)
• GameSpot (http://www.gamespot.com/)
Selected Subject-Specific Engines
Software
• Jumbo (http://www.jumbo.com)
• Shareware.com (http://www.shareware.com)
• ZDNet Downloads (http://www.zdnet.com/downloads/)
Health/Medicine
• Achoo (http://www.achoo.com/)
• BioMedNet (http://www.bmn.com/)
• Combined Health Information Database (http://chid.nih.gov/)
• Mayo Clinic Health Oasis (http://www.mayohealth.org/)
• Medical World Search (http://www.mwsearch.com/)
• OnHealth (http://www.onhealth.com)
Selected Subject-Specific Engines
Education/Children's Sites
• AOL NetFind Kids Only (http://www.aol.com/netfind/kids/)
• Blue Web'n (http://www.kn.pacbell.com/wired/bluewebn/)
• Education World (http://www.education-world.com/)
• Kid Info (http://www.kidinfo.com/)
• Kids Domain (http://www.kidsdomain.com)
• KidsClick! (http://sunsite.berkeley.edu/KidsClick!/)
• Yahooligans! (http://www.yahooligans.com)
Subject Directories
• Hierarchically organized indexes of subject categories
• User can browse through lists of Websites by subject in search of relevant information
• Maintained by humans
• May include a search engine for searching their own database
Examples of Subject Directories
• INFOMINE (Academic Scholarly Subject Directory - http://infomine.ucr.edu/)
• LookSmart
• Lycos
• Magellan (http://www.magellan.excite.com/)
• Open Directory (http://www.dmoz.org/)
• Yahoo
Many of these have aspects of both search and directory
Specialized Subject Directory
• Guide compiled by subject specialist
• Lists important resources in his/her area of expertise
• More comprehensive than general guide
• Examples
– Film: Internet Movie Database (http://www.imdb.com/)
• Includes Clearinghouses
– Argus Clearinghouse (http://clearinghouse.net/)
– About.com
– WWW Virtual Library (http://www.vlib.org/)
Summary
• Search Engines
• The Big Guys
– Altavista
– Yahoo
• Meta-Search Tools
– Dogpile
– MetaCrawler
• Subject-Specific
– The BigHub.com
– Search Engine Colossus
• Subject Directory
– LookSmart
– Lycos
• Specialized Subject Directory
– WWW Virtual Library
– About.com
Preparing to Search
• What’s the topic, question, area of interest?
• Identify search terms to describe your topic of interest
• Consider synonyms (echinoderm OR echinoidea OR "sea urchin")
• Consider variations of terms (restaurants, dining, gourmet)
See Handout: Practical Steps
Search tips
• Enclosing a multiword phrase in quotation marks tells the search engine to list only sites that contain that exact phrase
– Example: “heart disease”
Boolean Logic
• Combines search terms in many databases
• AND, OR, and NOT or (+) and (-)
• Must check to see if search engines use Boolean logic
Boolean Logic: AND
• Limits your search
• “Oral History” AND Women
• Only returns pages with both of these terms on them
Boolean Logic: OR
• Broadens your search
• “Oral History” OR Women
• Returns every page with either of these terms on it
Boolean Logic: NOT
• Limits your search
• “Oral History” NOT Women
• Only returns pages that contain the first term but not the second
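The three Boolean operators correspond directly to set operations on the pages that contain each term. The sketch below assumes a tiny, made-up index (the page names are placeholders) and uses Python sets to show the effect of AND, OR, and NOT.

```python
# Hypothetical index: search term -> set of pages that contain it.
index = {
    "oral history": {"page1", "page2", "page3"},
    "women": {"page2", "page3", "page4"},
}

oral = index["oral history"]
women = index["women"]

both = oral & women        # AND: pages containing both terms (limits)
either = oral | women      # OR: pages containing either term (broadens)
only_oral = oral - women   # NOT: pages with the first term but not the second

print("AND:", sorted(both))
print("OR: ", sorted(either))
print("NOT:", sorted(only_oral))
```

AND and NOT can only shrink the result list, while OR can only grow it, which matches the “limits” and “broadens” labels on the slides.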
Wildcards
• Special character that can be appended to the root of a word so you can search for all possible endings to that root
• Good for variant spellings and common root words
• Examples
– rocket* will yield rocket, rockets, rocketry
– psycholog* = psychology, psychological, psychologist
– colo*r = color and colour
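Wildcard matching can be tried out with Python's standard fnmatch module, where * stands for any run of characters. This is a stand-in for illustration: each search engine defines its own wildcard rules, and the word list below is made up.

```python
from fnmatch import fnmatchcase

def wildcard_filter(pattern, words):
    """Return the words that match a wildcard pattern, ignoring case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

# Made-up vocabulary standing in for the words found in an index.
vocabulary = ["rocket", "rockets", "rocketry", "rock", "color", "colour", "colony"]

rocket_hits = wildcard_filter("rocket*", vocabulary)
color_hits = wildcard_filter("colo*r", vocabulary)
print(rocket_hits)
print(color_hits)
```

Note that "rock" and "colony" are not returned: the first is shorter than the root, and the second does not end in r.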
Ctrl-F
• When you follow a link to a document retrieved by a search engine, you may not know how relevant it is
• Ctrl-F finds the relevant words in the current document
• Example: women +“El Salvador” +“Oral History”
– Pick one link, then Ctrl-F
Searching Summary
• Choose a search engine
– Personal preference
– Different engines for different purposes
• Syntax - quotations, Boolean logic, wildcards
• Ctrl-F to find search words
• Try to stay focused on your task