Indexing & retrieval
Approaches to indexing
- Key word indexing
- Concept indexing
- Social indexing
- Non-text indexing
Keyword Indexing
Keyword indexing (1)
Advantages:
- Quick
- Entity-oriented - draw terms from entity itself
  Example title: "How to succeed in graduate school"
- Inexpensive
- No vocabulary lag
- Multiple access points
- Accuracy
- No intellectual effort needed
Keyword indexing (2)
Disadvantages:
- No control over synonyms, near synonyms
- No control over homographs
Keyword indexing (3)
Disadvantages:
- Dependent on authors for informative and accurate titles
  Example titles:
  "Artificial metalloenzymes based on the biotin-avidin technology: enantioselective catalysis and beyond"
  "The golden peaches of Samarkhand"
Keyword indexing (4)
Disadvantages:
- No control over word forms
  "Communicating in the library" or "Communications in libraries"
Keyword indexing (5)
Disadvantages:
- No cross reference structure
Historical key word indexing methodologies
- Uniterm cards
- Edge-notched cards
- Optical coincidence cards
- Key word in context (KWIC)
- Spatial indexing
Pre- versus post-coordinate indexing
Mortimer Taube
Pre-coordinate headings (12 terms):
China--Folklore, China--History, China--Politics
France--Folklore, France--History, France--Politics
Germany--Folklore, Germany--History, Germany--Politics
Russia--Folklore, Russia--History, Russia--Politics
Post-coordinate terms (7 terms):
China, France, Germany, Russia, Folklore, History, Politics
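The term counts on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the double-hyphen separator and the variable names are assumptions, not part of Taube's system.

```python
from itertools import product

places = ["China", "France", "Germany", "Russia"]
topics = ["Folklore", "History", "Politics"]

# Pre-coordinate: every place/topic combination is stored as its own heading.
pre_coordinate = [f"{place}--{topic}" for place, topic in product(places, topics)]

# Post-coordinate: only the single terms are stored; they are combined at search time.
post_coordinate = places + topics

print(len(pre_coordinate))   # 12
print(len(post_coordinate))  # 7
```

Adding a fifth country would add three pre-coordinate headings but only one post-coordinate term.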
Post-coordinate index searching
History of France: France + History
Two sets of documents
Boolean AND search yields intersection of the two sets
[Figure: two overlapping sets, France and History; the overlap is France AND History]
Advantages to Taube's system
- No need to develop a list of authorized terms (pulling terms from documents themselves)
- No need to articulate rules of punctuation for representing complex concepts (France--History)
- No need to delineate citation order (France--History v. History--France)
- No need to formulate rules for subheadings ("May subdivide geog.")
Uniterm cards
One card per term
Document no. 102: "Arrest statistics of the Arizona State Police"
Card "state": 31 102 53 24 75 96 107 68 49 70 34 95 117 59 115 147 109
Card "police": 11 102 23 85 96 87 68 49 60 91 115 107 79
Searching with uniterm cards
Query: looking for documents about state police
Card "state": 31 102 53 24 75 96 107 68 49 70 34 95 117 59 115 147 109
Card "police": 11 102 23 85 96 87 68 49 60 91 115 107 79
Documents found on both cards include:
102  Arrest statistics of the Arizona State Police.
107  A short history of the Wisconsin State Police.
115  The modern police state.
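A uniterm search amounts to intersecting the document numbers on the cards. The sketch below uses the two posting lists from the slide; the `search` helper is a hypothetical name introduced for illustration.

```python
# Posting lists copied from the two uniterm cards on the slide.
cards = {
    "state": {31, 102, 53, 24, 75, 96, 107, 68, 49, 70, 34, 95, 117, 59, 115, 147, 109},
    "police": {11, 102, 23, 85, 96, 87, 68, 49, 60, 91, 115, 107, 79},
}

def search(cards, *terms):
    """Return the document numbers that appear on every requested card."""
    postings = [cards[term] for term in terms]
    return sorted(set.intersection(*postings))

hits = search(cards, "state", "police")
print(hits)  # [49, 68, 96, 102, 107, 115]
```

Document 115, "The modern police state", is retrieved even though it is not about state police. The cards record only that both words occur, not how they relate to each other.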
Edge-notched cards
One card per bibliographic item
Card: Whirdeaux, Ima. Caring for your pet pterodactyl / by Ima Whirdeaux. Call no. Q54321 .W45
Card: Turner, Paige. Caring for your pet grizzly / by Paige Turner. Call no. Q12345 .T8
Notch labels: bears, pet-care, pterodactyls
Pyramid coding for edge-notched cards
Coding the year 1947*
[Figure: direct coding uses 20 dots (two rows of digits 0 to 9); pyramid coding uses 10 dots (two pyramids with the digits arranged 9 5 2 0 / 8 4 1 / 7 3 / 6)]
*They hadn't heard of the Y2K problem yet.
Optical coincidence cards
Pre-printed cards with numbers for entire database
[Figure: card for the term "fleas", pre-printed with document numbers 0 to 99]
Key Word in Context (KWIC) Index
Doc 15 title: "A comparison of OCLC and WLN hit rates for monographs and an analysis of the types of records retrieved"

CONTEXT                     KEY WORDS                       POINTER
 ttems of remote users: an  analysis of the types of        15
 hit rates for monograph/A  comparison of OCLC and WLN      15
comparison of OCLC and WLN  hit rates for monographs and /  15
OCLC and WLN hit rates for  monographs and an analysi/      15
onographs/ A comparison of  OCLC and WLN hit rates for      15
arison of OCLC and WLN hit  rates for monographs and /      15
n analysis of the types of  records retrieved. A com/       15
 s of the types of records  retrieved. A comparison /       15
phs and an analysis of the  types of records retrieve/      15
  A comparison of OCLC and  WLN hit rates for monogra/      15

[Slide callouts mark two words in the title as stopwords.]
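A KWIC index like this one can be generated mechanically from a title. The following is a minimal sketch, not the method behind the slide: the stopword list and the 26-character window are assumptions, and unlike the printed index it does not wrap the end of the title around to the beginning.

```python
import re

# Assumed stopword list for this sketch; a real index would use a longer one.
STOPWORDS = {"a", "an", "and", "for", "of", "the"}

def kwic(title, doc_id, width=26):
    """Return (left context, key word plus right context, pointer) rows."""
    words = re.findall(r"\w+", title)
    rows = []
    for i, word in enumerate(words):
        if word.lower() in STOPWORDS:
            continue  # stopwords never become key words
        left = " ".join(words[:i])[-width:]
        right = " ".join(words[i:])[:width]
        rows.append((left, right, doc_id))
    # Sort alphabetically by key word, ignoring case.
    return sorted(rows, key=lambda row: row[1].lower())

title = ("A comparison of OCLC and WLN hit rates for monographs "
         "and an analysis of the types of records retrieved")
for left, right, pointer in kwic(title, 15):
    print(f"{left:>26}  {right:<26}  {pointer}")
```

With this stopword list the title yields ten entries, one per remaining word, in the same alphabetical order as the slide.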
Key Word Out of Context (KWOC) Index

aardvark      101
baggage       123
banyan        128, 159, 179
coconut       955, 654
driving       196, 488, 788
elementary    455, 785
elephant      128, 465, 783
garage        678, 398
hardware      849, 483, 399
meter         768
nadir         877
noxious       112
opium         289
opus          985, 159, 849
people        629, 458
quark         137, 492
radar         968, 295
radio         430, 206, 749
stereo        294, 837, 873
television    745, 727, 883
ultraviolet   958, 774
zebra         276
Vector space model (VSM)
Each document represented by a vector
[Figure: axes labeled technology, libraries, assistive; vector for document entitled "Assistive technology for libraries"]
Vector space model matching
Similarity between query and document vectors
[Figure: axes labeled technology, libraries, assistive; vectors for document 1, document 2, and the query]
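The slide does not say which similarity measure is used. A common choice is the cosine of the angle between the query vector and each document vector, sketched below with made-up weights on the three axes from the figure.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_u * norm_v)

# Axes: (technology, libraries, assistive). All weights are invented.
query = (1, 0, 1)
doc1 = (1, 1, 1)
doc2 = (0, 1, 0)

print(cosine(query, doc1))
print(cosine(query, doc2))
```

A value near 1 means the vectors point in nearly the same direction; 0 means the query and the document share no terms.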
VSM term weighting
Assign high weights to terms that appear frequently in the document but infrequently in the database
Query: "I'm looking for articles about assistive technology for the blind."

Term          Freq. within document   No. of documents with term
conclusion    low                     high
information   high                    high
blind         high                    low
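One widely used weighting that behaves this way is tf-idf (term frequency times inverse document frequency). The slides do not name a formula, so the sketch below is an assumption, and every count in it is invented to follow the low/high pattern in the table.

```python
import math

def tf_idf(term_freq, docs_with_term, total_docs):
    """Weight = frequency in the document x log(total docs / docs with term)."""
    return term_freq * math.log(total_docs / docs_with_term)

TOTAL_DOCS = 1000  # hypothetical database size

# term -> (frequency within document, number of documents with term)
terms = {
    "conclusion": (1, 900),    # low frequency, appears in many documents
    "information": (12, 800),  # high frequency, appears in many documents
    "blind": (12, 20),         # high frequency, appears in few documents
}

weights = {t: tf_idf(tf, df, TOTAL_DOCS) for t, (tf, df) in terms.items()}
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} {weight:.2f}")
```

Under these assumed counts "blind" receives by far the highest weight, which is the behavior the slide asks for.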
VSM refinements
Adding semantic and syntactical parsing.
- Bill is going to the store to make a purchase.
- Bill is going to purchase the store.
- Bill is going to store his purchase.
Concept indexing
Concept indexing
- Rather than pulling terms from documents, assign a concept identifier (e.g. France--History) to documents dealing with the history of France
- Requires intellectual effort
- Takes more time than key word indexing, so less economical
- Avoids problems of false coordination and synonymy through use of vocabulary control
Vocabulary control (1)
- One indexing term or phrase to represent a concept
  Unidentified flying objects not flying saucers
- Point user to correct term with "use" reference
- Reduces number of searches needed to find items about a particular topic
Vocabulary control (2)
- One form of a word to represent the concept
  Dictionaries not dictionary
Vocabulary control (3)
- One usage of a homographic term
  Fault (geologic) not fault (responsibility for error)
- Usage identified through scope note
- Consistency among indexers as well as one indexer over time
- Helps user to avoid false drops
Vocabulary control (4)
Syndetic structure
- Broader terms
- Narrower terms
- Related terms (see also)
User can negotiate structure to find most appropriate term, as well as identify additional related terms of potential use in finding relevant documents
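The "use" reference and the syndetic structure can be pictured as two lookup tables. In this hypothetical sketch only the flying-saucers reference comes from the slides; the broader, narrower, and related terms are invented placeholders, as are the function names.

```python
# "Use" references: non-preferred term -> authorized term.
USE = {"flying saucers": "Unidentified flying objects"}

# Syndetic structure for each authorized term (all three links are invented).
THESAURUS = {
    "Unidentified flying objects": {
        "BT": ["Aerial phenomena"],          # broader terms
        "NT": ["UFO sightings"],             # narrower terms
        "RT": ["Extraterrestrial beings"],   # related terms (see also)
    },
}

def resolve(term):
    """Follow a "use" reference to the authorized term, if one exists."""
    for entry, preferred in USE.items():
        if entry.lower() == term.lower():
            return preferred
    return term

def expand(term):
    """Return the authorized term and its broader, narrower, related terms."""
    preferred = resolve(term)
    return preferred, THESAURUS.get(preferred, {})

print(expand("Flying saucers"))
```

A searcher who starts from "flying saucers" is pointed to the authorized heading and can then move up, down, or sideways through the linked terms.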
Social network indexing
- Tags
- Tag clouds
- User-created tags providing access to library resources
flickr
http://www.flickr.com/
Tags
architecture Bohemian South Country Czech Republic Europe European historical medieval old Old Town Other Keywords River Snow town Vltava
(177,583 photos)
Tag clouds
Geotagging
Librarian tagging
Library using flickr
Peace Palace Library (PPL)
Social bookmarking: http://www.delicious.com
http://www.delicious.com/mauicclibrary
The economic case for open access in academic publishing
technology
Portable software for USB drives
CU Researcher Finds 10,000-Year-Old Hunting Weapon in Melting Ice Patch
University of Pennsylvania
http://www.library.upenn.edu/
PennTags
Item list with PennTags
Adding a PennTag
Add to PennTags
Non-text indexing
Indexing Music
Indexing music - transcription
1 1 5 5 6 6 5
Indexing Music - melodic contour
*RURURD
*-/-/-\
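A contour string of this kind records only whether each note repeats (R), rises (U), or falls (D) relative to the previous one, with * marking the first note. A minimal sketch, using the note numbers from the transcription slide (the function name is an assumption):

```python
def contour(pitches):
    """Encode a pitch sequence as *, then U (up), D (down) or R (repeat)."""
    code = "*"
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            code += "U"
        elif current < previous:
            code += "D"
        else:
            code += "R"
    return code

print(contour([1, 1, 5, 5, 6, 6, 5]))  # *RURURD
```

Because only the direction of each step is kept, the same string results whatever key the melody is sung in.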
Query by humming
Query by humming (2)
[Diagram: Hummed queries -> Digital audio -> Pitch tracker -> Melodic contour -> Query engine -> Ranked list of matching melodies; MIDI songs -> Melody database -> Query engine]
Source: Ghias, Asif; Logan, Jonathan; Chamberlin, David; and Smith, Brian C. 1995. Query by humming--musical information retrieval in an audio database. ACM Multimedia 95 - Electronic Proceedings. http://www.cs.cornell.edu/Info/Faculty/bsmith/query-by-humming.html
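The query engine has to tolerate humming errors, so it needs approximate rather than exact matching of contour strings. The sketch below uses plain edit distance as a simple stand-in; the slide does not specify the matching algorithm, and the melody database here is entirely hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two contour strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

# Hypothetical melody database: title -> contour string.
melodies = {
    "melody A": "*RURURD",
    "melody B": "*UUDDUD",
    "melody C": "*RDRDRU",
}

def rank(hummed, database):
    """Return titles ordered from closest to farthest contour."""
    return sorted(database, key=lambda title: edit_distance(hummed, database[title]))

# A hummed query with one wrong step still ranks melody A first.
print(rank("*RURUUD", melodies))
```

Ranking by distance, rather than requiring an exact match, is what lets the system return a list of candidate melodies.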
Indexing Music - melodic contour
*RURURD
http://www.musipedia.org/
Indexing images
Source: Trust Territory archives.
Indexing images - chair (1)
Indexing images - ?
Indexing images - chair (2)
Biometrics - face
Biometrics - differences
Biometrics - similarities
Look at ratios of distances between marker points
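Ratios are used because they stay the same when a photo is enlarged or reduced. The sketch below illustrates the idea only: the marker points, the tolerance, and the function names are all hypothetical, and it assumes the markers are listed in the same order for both faces.

```python
import math
from itertools import combinations

def distance_ratios(points):
    """Pairwise distances between marker points, divided by the largest one."""
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    longest = max(distances)
    return [d / longest for d in distances]

def same_proportions(face_a, face_b, tolerance=0.05):
    """True when every corresponding ratio differs by at most `tolerance`."""
    ratios_a = distance_ratios(face_a)
    ratios_b = distance_ratios(face_b)
    return all(abs(a - b) <= tolerance for a, b in zip(ratios_a, ratios_b))

# Hypothetical marker points (x, y): left eye, right eye, nose tip, chin.
face = [(30, 40), (70, 40), (50, 60), (50, 95)]
enlarged = [(2 * x, 2 * y) for x, y in face]       # same face, larger photo
other = [(30, 40), (70, 40), (50, 75), (50, 95)]   # nose tip in a different place

print(same_proportions(face, enlarged))
print(same_proportions(face, other))
```

Comparing raw distances would treat the enlarged photo as a different face; comparing ratios does not.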
Indexing images
- Color
- Layout
- Shape
Indexing images by color
http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicColor.mac/qbic?selLang=English
Indexing images by colorhttp://w