INFO624 -- Week 10: Futures of IR Systems
Dr. Xia Lin
Assistant Professor
College of Information Science and Technology
Drexel University
The Seven Ages of Information Retrieval
• Bush’s prediction: Vannevar Bush's 1945 article set a goal of fast access to the contents of the world's libraries, which looks like it will be achieved by 2010, sixty-five years later.
IR Childhood (1945-1955)
• Ideas conceived
• Information explosion after World War II
• Possibility of an information processing machine
• Memex
• The hardware seems mostly out of date: Bush pictured a user inserting 5,000 pages per day into a personal repository and taking hundreds of years to fill it up.
• The software goals have not been achieved.
The Schoolboy (1960s)
• Many, many experiments
• Use of precision and recall
• Use of relevance feedback
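Precision and recall, named above, can be computed for a single query as in the sketch below. It assumes binary relevance judgments; the function name and the document IDs are hypothetical and are not taken from the lecture.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document IDs returned by the system
    relevant:  iterable of document IDs judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant overall, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)
```

Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that were retrieved.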
Adulthood (1970s)
• The invention of word processing systems and time-sharing systems
• The beginning of the information industry: OCLC, DIALOG, BRS
Mid-Life Crisis (1990s)
• Internet put IR to the test.
• Better understanding of the limits of IR.
• Large scale evaluations
• Digital Libraries projects
IR research
[Diagram labels: IR System, Retrieval algorithms, Interface, Evaluation, User Satisfaction, System prototyping, Contents, Interaction, User]
Top Ten Research Issues
10. Relevance Feedback.
9. Information Extraction.
8. Multimedia Retrieval.
7. Effective Retrieval.
6. Routing and Filtering.
5. Interfaces and Browsing.
4. “Magic” (Vocabulary Mapping).
3. Efficient, Flexible Indexing and Retrieval.
2. Distributed IR.
1. Integrated Solutions.
A New Industry – Content Management
Examples of Advanced IR Systems
• Intelligent retrieval
• Natural language dialog
• Text mining
• Knowledge-based indexing
Intelligent Retrieval
• Expert systems
• Case-based reasoning
• Natural language understanding
• Automatic abstracting/summarizing
Natural Dialog with the Computer
• Use natural language
• Let the computer keep a profile of your life-long information seeking patterns
• Understand information received
• Understand user’s information needs
Text Mining
• Information summation: automatic summary; automatic translation; automatic clustering and classification
• Knowledge discovery: associative query expansion; use of domain knowledge; explore document association; generate visualization maps
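Associative query expansion can be illustrated with term co-occurrence counts. This is only a sketch under stated assumptions: the lecture does not say which association measure it has in mind, so raw document co-occurrence is used here, and the function names and toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count how often each unordered pair of terms shares a document."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def expand_query(query, docs, top_k=2):
    """Append up to top_k terms that co-occur most often with the query terms."""
    counts = cooccurrence(docs)
    q_terms = query.lower().split()
    scores = Counter()
    for (a, b), n in counts.items():
        if a in q_terms and b not in q_terms:
            scores[b] += n
        elif b in q_terms and a not in q_terms:
            scores[a] += n
    # Sort by score, then alphabetically, so ties are broken deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return q_terms + [term for term, _ in ranked[:top_k]]


# Hypothetical toy collection.
DOCS = [
    "retrieval precision recall",
    "retrieval relevance feedback",
    "retrieval precision evaluation",
    "neural network learning",
]
expanded = expand_query("retrieval", DOCS, top_k=1)
print(expanded)
```

In this toy collection "precision" shares two documents with "retrieval", more than any other term, so it is the term added to the query.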
IR & Text Mining
• IR helps users find information that is already known and that exists in a document.
• TM attempts to examine a collection of documents and discover information not contained in any individual document in the collection.
Neural Network Computing
• Learn how the human brain works
• Create massive associative networks where each node has limited processing capability (intelligence)
• Make the network learn from its own processing patterns
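One small, concrete instance of an associative network built from simple nodes is a Hopfield-style network trained with the Hebbian rule. The lecture does not name a specific model, so this choice is an assumption made for illustration; the function names and the stored pattern are hypothetical.

```python
def train(patterns):
    """Hebbian rule: strengthen the link between units that are active together."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def settle(w, state, sweeps=5):
    """Each node repeatedly takes the sign of the weighted sum of the other nodes."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            total = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if total >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train([stored])
noisy = [1, -1, 1, -1, 1, 1]  # last unit flipped
restored = settle(weights, noisy)
print(restored)
```

Each node only sums its inputs and applies a threshold, yet the network as a whole recovers the stored pattern from a corrupted copy.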
Knowledge-based Indexing
• Automatic indexing + concept indexing.
• Index not only documents but also ideas.
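The idea of combining automatic indexing with concept indexing can be sketched as two inverted indexes, one keyed by the words in the text and one keyed by concepts. The sketch assumes a small hand-made thesaurus stands in for the knowledge base; the thesaurus, function name, and documents are all hypothetical.

```python
from collections import defaultdict

# Hypothetical thesaurus mapping surface terms to a concept label.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "physician": "doctor",
    "doctor": "doctor",
}


def build_indexes(docs):
    """Return (term_index, concept_index), each mapping a key to a set of doc IDs."""
    term_index = defaultdict(set)
    concept_index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            term_index[term].add(doc_id)  # automatic (word-level) indexing
            if term in CONCEPTS:
                concept_index[CONCEPTS[term]].add(doc_id)  # concept indexing
    return term_index, concept_index


# Hypothetical toy collection.
DOCS = {
    1: "used car prices",
    2: "automobile safety ratings",
    3: "physician salary survey",
}
terms, concepts = build_indexes(DOCS)
print(sorted(terms["car"]))         # documents containing the word "car"
print(sorted(concepts["vehicle"]))  # documents about the concept "vehicle"
```

A word-level lookup for "car" misses the document that says "automobile", while the concept-level lookup for "vehicle" finds both.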
From Internet to Semantic Web
• Internet: digital files are connected together
• WWW: web pages are connected
• Digital Library: information collections are connected
• Knowledge Net: ideas are connected
Personal Information Agent?
• Find me anything I ask it to find.
• Help me clarify what I really need.
• Let me know what information is available.
• Help me convert information into knowledge.
Human-Computer Collaboration
• The nature of information seeking requires the dialog between the user and the computer to be deeper than what HCI describes.
• Interaction -- to act upon one another
• Collaboration -- to work together purposefully in an intellectual endeavor
• Let the computer amplify and augment the human intellect.
Search Engines in the News
• "Searching the Web gets easier with engines that try to read your mind", US News & World Report, April 16, 2001. Now Google saves lives! Someone wondering if they were having a heart attack did a search on Google, found a page explaining the symptoms, then got to a hospital for help. Without Google, "I'd be dead today," he's quoted as saying. A look at various search engines, search technology and tips.
• "Desperately Seeking Search Technology" (Commentary), BusinessWeek Online, September 24, 2001, by Robert D. Hof. After e-mail, people spend the most time online searching for stuff. Market researcher Jupiter Media Metrix Inc. found that 80% of online users will abandon a site if the search function doesn't work well. Says Martha M. Frey, an analyst at market researcher Patricia Seybold Group: "You could make a case that the main reason e-commerce is unprofitable is that the power of search has been overlooked."
• "Building Web Sites With Depth", Web Techniques, February 2001, by Jakob Nielsen and Marie Tahir. Search is one of the most common, and one of the least successful, ways that users look for things on the Web. Search is often as bad as the worst salesperson or customer service representative.
Google vs. SearchKing
• An ad marketer took Google to court over changes the popular search engine made to its page-ranking system.
• When Google changed its PageRank, SearchKing lost business.
• Should the power of Google be regulated? Like a public utility or highway?
• Google vs. SearchKing: a United States District Court in Oklahoma gave Google First Amendment protection in SearchKing's suit against Google for harm to its business from changes in Google's page ranking algorithms. SearchKing is a link farm for its customers. Last year its position in Google results dropped dramatically as Google adjusted its ranking rules. Google argued that PageRanks are opinions and protected by free speech. It won that point, but the Court did say that Google manually lowered SearchKing's PageRank. -- Jan. 27, 2003
Google vs. Evil
• Daniel Brandt, who runs google-watch.org, attacked PageRank, accusing the company of being unfair and undemocratic. Brandt urged the FTC to investigate Google and regulate it as a public utility - as a company that, in effect, controls access to the Internet's natural resources.