Chapter 3 - 2
Contents
• Background
• Web Search Agents
• Information Filtering Agents
• Notification Agents
• Other Service Agents
Chapter 3 - 3
Background
• The dominant form of Web usage is the direct manipulation method (surfing).
• The characteristics of the Web dictate why we need Internet agents for information brokering.
– The volume of information on the Internet is huge.
– The type of information on the Internet varies widely.
– The quality of information varies greatly.
– The depth-first surfing inherently encouraged by Web browsers causes most users to get lost in Web hyperspace.
• Internet agents are computer programs that reside on servers and access the distributed on-line information on the Internet to perform tasks on behalf of users without direct user interaction.
Chapter 3 - 4
Background
• Categorization
– Web search agents
– Information filtering agents
– Off-line delivery agents
– Notification agents
– Service agents
– Web site agents
– Mobile agents
Chapter 3 - 5
Web Search Agents
[Figure: search engine architecture. Labeled components: User, Web browser, Query server, Index database, Web robot, Search engine, Web.]
• Web robots are also known as softbots, spiders, wanderers, or crawlers.
Chapter 3 - 6
Web Search Agents
• The performance of a search engine can be measured by its precision and recall.
• Precision
Precision = (number of returned documents relevant to the query) / (total number of documents returned)
• Recall
Recall = (number of returned documents relevant to the query) / (total number of relevant documents)
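As a minimal sketch, both measures can be computed as set ratios once the returned documents and the relevant documents are known. The function name `precision_recall` and the document ids are hypothetical, chosen only for illustration.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query.

    returned: document ids the search engine returned
    relevant: document ids that are actually relevant to the query
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant documents that were returned
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 of them relevant,
# out of 6 relevant documents in the whole collection.
p, r = precision_recall(["d1", "d2", "d3", "d4"],
                        ["d1", "d2", "d3", "d7", "d8", "d9"])
print(p, r)  # 0.75 0.5
```

A query that returns few but accurate results scores high on precision and low on recall; returning everything does the opposite.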
Chapter 3 - 7
Web Robots
• The Web robot is an autonomous agent that communicates with the Web using, for example, the HTTP protocol.
• The software robots have different strategies for traversing the Web graph.
• Robots usually use a strategy to traverse the Web graph in a prioritized manner.
– Lycos uses a queue to store all the pointers in the page.
– WebCrawler uses a breadth-first traversal.
• Robots may exclude different types of documents such as pictures and binary files.
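The traversal ideas above can be sketched as a queue-driven, breadth-first walk that skips pictures and binary files. This is an illustration under stated assumptions, not how any particular robot is implemented: the in-memory `WEB` graph, the `.example` URLs, the suffix list, and the `fetch_links` callback are all hypothetical stand-ins for real HTTP retrieval and link extraction.

```python
from collections import deque

# Hypothetical in-memory Web graph: page URL -> hyperlinks found on that page.
WEB = {
    "http://a.example/": ["http://a.example/news.html", "http://a.example/logo.gif"],
    "http://a.example/news.html": ["http://b.example/", "http://a.example/"],
    "http://b.example/": ["http://b.example/tool.exe", "http://a.example/news.html"],
}

# Assumed exclusion list: pictures and binary files.
SKIPPED_SUFFIXES = (".gif", ".jpg", ".png", ".exe", ".zip")


def crawl_breadth_first(start, fetch_links, max_pages=100):
    """Visit pages in breadth-first order and return the visit order."""
    queue = deque([start])   # pages discovered but not yet visited
    seen = {start}           # avoids queuing the same page twice
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link in seen or link.lower().endswith(SKIPPED_SUFFIXES):
                continue
            seen.add(link)
            queue.append(link)
    return visited


order = crawl_breadth_first("http://a.example/", lambda url: WEB.get(url, []))
print(order)
```

Replacing the FIFO queue with a priority queue would turn the same loop into a prioritized traversal.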
Chapter 3 - 8
Web Robots
Element | Description
Environment | Internet
Task skills | Hyperlink discovery, document retrieval, and indexing
Knowledge | Web, Usenet newsgroups, FTP, and Gopher sites
Communication skills | HTTP, FTP, Gopher, query server
Chapter 3 - 9
Web Robots
Feature | Advantage | Disadvantage
Keyword query | Ease of use | Lost productivity due to poor precision
Instant response | Increased productivity if user knows what he is looking for | Decreased productivity due to chasing links in query returns
Hierarchical subject categories | Increased productivity due to high precision | Low recall in response to user needs
Information discovery via robots | Reduced user workload | Lack of scalability and bandwidth inefficiency due to duplicative and uncooperative model
Chapter 3 - 10
Information Filtering Agents
• While search agents are useful in finding Web sites of particular interest to a user, information filtering agents find the content of particular interest to a user using different information sources.
[Figure: filtering agent architecture. Labeled components: Web browser, News server, Index, Articles, User profile, Indexing engine, Filtering agent, Web.]
Chapter 3 - 11
Information Filtering Agents
• The indexing engine binds keywords to each article.
• Most information retrieval systems model documents as terms and term frequency counts.
• Document model representations can be roughly divided into two groups:
– Vector space models
– Tree structures
• Most information retrieval systems also group synonyms into thesaurus classes and index words by their word stems.
• The similarity between two documents can be determined by a suitable distance metric.
– Term Frequency * Inverse Document Frequency, or TFIDF
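A minimal sketch of the vector space idea follows: each document becomes a vector of TFIDF weights, and two vectors are compared with cosine similarity. The slides name only TFIDF, so the cosine measure, the logarithmic IDF formula, the function names, and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map each tokenized document to a {term: tf * idf} vector."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (assumed metric)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical articles, already split into terms.
docs = [
    "stock market news".split(),
    "stock market prices fall".split(),
    "football match report".split(),
]
vecs = tfidf_vectors(docs)
print(cosine_similarity(vecs[0], vecs[1]) > cosine_similarity(vecs[0], vecs[2]))
```

A filtering agent could treat the user profile as one more document and rank incoming articles by their similarity to it.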
Chapter 3 - 12
NewsHound: Personalized Newspaper
• Searches the stories in the San Jose Mercury News as well as several other newspapers to find articles that match a user’s profile.
• Uses the Verity Topic indexing engine with an email and Web form style interface.
• See Fig. 3.5, p. 61
Chapter 3 - 13
Benefits of Information Filtering Agents
• Benefits: see Table 3.4, p. 62
• What can information filtering agents do for your organization?
– Bring the latest HW configuration and pricing information to a purchasing manager
– Deliver the international, financial, political, and economic news that impacts a financial investment
– Track news related to an ongoing investigation for law enforcement agency personnel
– Gather news about job market conditions for a special employment category for a human resources professional
Chapter 3 - 14
Off-line Delivery Agents
• Information filtering agents that deliver personalized information in a locally viewable format without requiring a direct Internet connection.
• When does an information filtering agent that delivers customized information via an email message become an off-line delivery agent?
– When the information agent has its own information delivery software on the desktop for automatic information delivery and management of delivered information.
Chapter 3 - 15
Off-line Delivery Agents
Feature | Advantage | Benefit
Direct delivery | Transparent delivery to desktop | User does not need to visit sites
Automates delivery | Delivery according to user-specified schedule | Avoidance of peak traffic hours
Local visiting | HTML links are locally resolved | Avoids the need to get on-line
Disk management | New information replaces out-of-date information | Relieves user from disk management
Chapter 3 - 16
Notification Agents
• A notification agent notifies users of events of significance to them, where an event is a change in the state of information such as:
– Content change in a particular Web page
– Search engine additions for specified keyword queries
– User-specified reminders for special events such as birthdays
• Internet notification agents are typically server-based programs that poll user-specified sites.
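The polling behaviour can be sketched as a single pass over the watched sites that compares each page against the fingerprint stored on the previous pass. This is an illustration only: `poll_once`, `fingerprint`, and the injected `fetch` callable are hypothetical names, and the page store is an in-memory dictionary so that the sketch runs without a network connection.

```python
import hashlib


def fingerprint(content):
    """Hash page content so snapshots can be compared cheaply."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def poll_once(urls, fetch, last_seen):
    """Check each watched URL once; return the URLs whose content changed.

    fetch: callable url -> page text (hypothetical, injected by the caller)
    last_seen: dict url -> fingerprint from the previous poll (updated in place)
    """
    changed = []
    for url in urls:
        current = fingerprint(fetch(url))
        previous = last_seen.get(url)
        if previous is not None and previous != current:
            changed.append(url)
        last_seen[url] = current
    return changed


pages = {"http://example.com/prices": "Widget: $10"}
state = {}
print(poll_once(pages, pages.get, state))  # first poll only records a baseline
pages["http://example.com/prices"] = "Widget: $9"
print(poll_once(pages, pages.get, state))  # the changed page is reported
```

A server-based agent would run such a pass on a schedule and send a notification for each URL returned.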
Chapter 3 - 17
Notification Agents
• Methods employed
– HTTP "If-Modified-Since" request: This is a conditional GET request that returns a document only if the page has been modified since the specified date.
– Text-only retrieval: Notification agents retrieve only the text of a page, without the graphics and hyperlinks, and parse the retrieved text to determine any change in the published information.
– Embedded HTML extensions: These are directions to notification agents embedded in HTML documents by publishers.
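The first two methods can be sketched with the Python standard library. The helper names (`conditional_request`, `is_modified`, `page_text`) are hypothetical, and the request is only constructed, never sent, so the example runs off-line. In this sketch link targets and images are discarded while the visible text of a link is kept, which is one possible reading of "text only".

```python
import urllib.request
from email.utils import formatdate
from html.parser import HTMLParser


def conditional_request(url, last_check_epoch):
    """Build a GET request carrying an If-Modified-Since header."""
    request = urllib.request.Request(url)
    request.add_header("If-Modified-Since",
                       formatdate(last_check_epoch, usegmt=True))
    return request


def is_modified(status_code):
    """304 Not Modified means the page is unchanged since the given date."""
    return status_code != 304


class TextOnly(HTMLParser):
    """Collect visible text, ignoring tags (and so images and link targets)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>, whose text is not content

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    """Return the text of an HTML page as one space-joined string."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


req = conditional_request("http://example.com/", 0)
print(req.get_header("If-modified-since"))  # Thu, 01 Jan 1970 00:00:00 GMT

old = page_text("<p>Price: <b>10</b> <img src='a.gif'></p>")
new = page_text("<p>Price: <b>9</b> <img src='b.gif'></p>")
print(old != new)  # True: the published text changed
```

Note that `urllib.request.urlopen` reports a 304 response by raising `HTTPError`, so a real agent would catch that exception and inspect its `code` attribute.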
Chapter 3 - 18
Other Service Agents
• Announcement agents:
– Remind users of important occasions that are customized for personal needs.
• Book agents:
– Track newly released books that match a user's reading interests.
• Business information monitoring agents:
– Monitor the exchange of information on the Internet relating to services, products, industry, and companies.
• Classified agents:
– Search a database of classified ads daily to find a user-specified item, and notify the user via mail.
Chapter 3 - 19
Other Service Agents
• Direct mail agents:
– Bring personalized direct mail advertising that matches the user's stated personal background, activities, and lifestyle.
• Financial service agents:
– Deliver email messages containing price and financial news for a personalized portfolio of securities and mutual funds.
• Food and wine agents:
– Remember each user's previous purchases and tasting notes to make a customized presentation of inventory during the next visit.
• Job agents:
– Serve as virtual recruiters to find employees that match employer job profiles.
Chapter 3 - 20
Other Service Agents
• Entertainment agents:
– Find communities with similar interests to those of the user, and recommend albums, movies, and so on based on group evaluations.
• Shopping agents:
– Perform comparison shopping for user-specified items at virtual stores.
• Site agents:
– Function as a virtual host at 3D and client sites.
Chapter 3 - 21
Other Service Agents
• Grouping based on their internal architectures:
– Agents that perform intelligent database queries and notify users
– Agents that use a parallel search algorithm to query Web resources and integrate query results on behalf of the user
– Agents that use collaborative filtering to find user clusters for recommendations based on social communities
– Agents that use natural language techniques to engage in conversations with users
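Of these architectures, collaborative filtering is the easiest to illustrate briefly. The sketch below is a simple user-based variant, assumed for illustration rather than taken from any specific agent: it finds the users whose ratings most resemble the target user's and recommends items those neighbours rated that the user has not seen. All names and ratings are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, neighbours=2):
    """Rank unseen items by similarity-weighted ratings of nearby users."""
    others = [(similarity(ratings[user], r), name)
              for name, r in ratings.items() if name != user]
    nearest = sorted(others, reverse=True)[:neighbours]
    scores = {}
    for sim, name in nearest:
        for item, rating in ratings[name].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"album_a": 5, "album_b": 1},
    "bob": {"album_a": 5, "album_b": 2, "movie_c": 5},
    "cat": {"album_a": 1, "album_b": 5, "movie_d": 5},
}
print(recommend("ann", ratings, neighbours=1))  # ['movie_c']
```

Here "bob" rates the shared albums much as "ann" does, so his unseen item is recommended ahead of anything from "cat".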