Introduction to Indexing
Prepared by: Daryl L. Superio
Central Philippine University, Iloilo City, Philippines
Historical Development
Sumer, 3000 B.C.- where the first systematic organization of written records was found
2000 B.C., China and India- when record keeping became part of society
◦ an orderly society is paralleled by an orderly record of what has occurred
◦ laws were passed requiring that all business transactions be recorded and authorized
900 A.D.- when an encyclopedia was arranged in alphabetical order
early indexes were either limited to personal names or were concordances (indexes to the occurrence of words in the text)
marginal summaries were in use as early as the 9th century
indexes took a major step forward with the development of the codex
blank pages bound at the back of the book were used for handwritten references
◦ known as do-it-yourself indexes
◦ indexes were usually at the front of the book, lifted verbatim from the text, simple but not easy to use
1850s- W.F. Poole published an index that cut across many journals
◦ the beginning of a single publication indexing numerous issues of many journals
1900- H.W. Wilson first published the Readers’ Guide to Periodical Literature
◦ notable for the emphasis it placed on subject access and cross-referencing
◦ each article was indexed under its author and its specific subject
19th Century- when Paul Otlet and Henri La Fontaine founded the International Institute of Bibliography
◦ one of its purposes was to improve indexing approaches to scholarly literature
◦ title-word indexing was proposed, which led to keyword and free indexing
book indexing continued to improve; indexes began to have subdivisions of terms, and cross-references slowly began to appear
1950s- when computers were first used in indexing and abstracting
◦ Hans Peter Luhn of IBM introduced a mechanized form of derived title-word indexing
1960s- brought third-generation computers; indexes and abstracts began to be published by computer, using batch-processing methods
1990s- when keyword searching of computer-stored indexes had been perfected
20th Century- greater progress in the development of indexing methods: from indexes to individual works, through indexes to several volumes, to cooperative and massive indexes and, currently, web indexes
Definition of terms
Index
◦ a systematic arrangement of entries designed to enable users to locate information in a document
◦ an alphabetically arranged list of headings consisting of the personal names, places, and subjects treated in a written work, with page numbers to refer the reader to the point in the text at which information pertaining to the heading is found
◦ in single-volume works of reference and nonfiction, any indexes appear at the end of the back matter
◦ in a multivolume work, they are found at the end of the last volume
◦ in very large multivolume reference works, the last volume may be devoted entirely to indexes
Index also refers to an open-end finding guide to:
◦ the literature of an academic field or discipline, ex. Philosopher's Index
◦ works of a specific literary form, ex. Biography Index
◦ works published in a specific format, ex. Readers' Guide to Periodical Literature
◦ the analyzed contents of a serial publication, ex. New York Times Index
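The narrower sense of "index" defined above (an alphabetical list of headings with page numbers) can be illustrated with a minimal Python sketch. The function name `build_index`, the sample pages, and the list of headings are hypothetical and chosen only for illustration; a real indexer selects headings through intellectual analysis rather than simple string matching.

```python
from collections import defaultdict


def build_index(pages, headings):
    """Map each heading to the sorted page numbers on which it occurs.

    pages: dict of page number -> page text (hypothetical sample data).
    headings: headings chosen in advance by the indexer.
    """
    index = defaultdict(list)
    for page_number, text in pages.items():
        lowered = text.lower()
        for heading in headings:
            # Naive matching: a heading is "treated" on a page if it appears there.
            if heading.lower() in lowered:
                index[heading].append(page_number)
    # Arrange headings alphabetically, ignoring case.
    return {h: sorted(index[h]) for h in sorted(index, key=str.lower)}


pages = {
    1: "Sumer kept written records.",
    2: "Poole indexed many journals.",
    3: "Wilson and Poole both stressed subject access.",
}
headings = ["Poole", "Sumer", "Wilson", "subject access"]
book_index = build_index(pages, headings)
```

Each key in `book_index` plays the role of a heading and each list of numbers the role of its page locators.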
Indexing
◦ the operation of creating an index for information retrieval
◦ the process of compiling one or more indexes for a single publication, such as a monograph or multivolume reference work, or of adding entries for new documents to an open-end index covering a particular publication format (example: newspapers), works of a specific literary form (biography, book reviews, etc.), or the literature of an academic field, discipline, or group of disciplines
Indexer
◦ a person who does indexing
Indexable matter
◦ the portions of documents that are actually analyzed and indexed
Indexing language
◦ in a broad sense, any vocabulary, including uncontrolled vocabulary, used for indexing, and the rules of syntax for its application
◦ in a narrower sense, a controlled vocabulary or classification system and the rules of syntax for its application
Purposes and Uses of Indexes
minimize the time and effort in finding information and maximize the searching success of users
identify potentially relevant information in the document or collection being indexed
analyze concepts treated in a document to produce appropriate index headings based on the indexing language assigned
indicate relationships among terms
group together related topics scattered by the arrangement used in a document or collection
direct users seeking information under terms not chosen as index headings to the headings that have been chosen, by means of See references
suggest related topics by means of See also references
serve as tools for current awareness services
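How See and See also references guide a user can be sketched in a few lines of Python. The dictionaries `SEE`, `SEE_ALSO`, and `ENTRIES`, the function `look_up`, and all sample terms and page numbers are assumptions made for illustration, not part of any standard.

```python
# Hypothetical sample data: one See reference, one See also reference.
SEE = {"cars": "automobiles"}             # term not chosen -> chosen heading
SEE_ALSO = {"automobiles": ["trucks"]}    # chosen heading -> related headings
ENTRIES = {"automobiles": [12, 48], "trucks": [50]}  # heading -> page locators


def look_up(term):
    """Resolve a user's term to a chosen heading, its locators, and related headings."""
    heading = SEE.get(term, term)  # follow a See reference if one exists
    return {
        "heading": heading,
        "locators": ENTRIES.get(heading, []),
        "see_also": SEE_ALSO.get(heading, []),
    }


result = look_up("cars")
```

Here a search under "cars" is redirected to the chosen heading "automobiles", and "trucks" is suggested as a related topic.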
Indexing Standards
Anderson, James D. 1997. NISO-TR02, Guidelines for indexes and related information retrieval devices.
◦ provides guidelines for the content, organization, and presentation of indexes used for the retrieval of documents and parts of documents
◦ deals with the principles of indexing, regardless of the type of material indexed, the indexing method used (intellectual analysis, machine algorithm, or both), the medium of the index, or the method of presentation for searching
◦ emphasizes three processes essential for all indexes: comprehensive design, vocabulary management, and the provision of syntax
Wellisch, Hans 1999. NISO-TR03, Guidelines for alphabetical arrangement of letters and sorting of numerals and other symbols.
◦ provides rules for the alphabetical arrangement of headings in lists of all kinds, such as bibliographies, indexes, dictionaries, directories, inventories, etc.
◦ it also covers the sorting of Arabic or Roman numbers, and other symbols
◦ it consists of seven rules that cover problems which may arise in alphanumeric arrangement of headings
◦ is based on the traditional order of letters in the English alphabet and that of numerals in ascending arithmetical order
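The two principles named above (letters in English alphabetical order, numerals in ascending arithmetical order) can be sketched as a Python sort key. This is a simplified illustration, not an implementation of the seven rules of NISO-TR03: the function `sort_key` is hypothetical, and the choices to ignore case, to skip spaces and punctuation, and to place numerals before letters are assumptions of this sketch.

```python
import re


def sort_key(heading):
    """Split a heading into runs of numerals and letters and build a comparable key."""
    key = []
    for token in re.findall(r"[0-9]+|[a-z]+", heading.lower()):
        if token.isdigit():
            key.append((0, int(token), ""))  # numerals: compared by arithmetic value
        else:
            key.append((1, 0, token))        # letters: compared alphabetically
    return key


headings = ["Section 10", "section 9", "1984", "Zebra", "apple", "42nd Street"]
ordered = sorted(headings, key=sort_key)
```

Under these assumptions "42nd Street" files before "1984" because 42 is less than 1984, and "section 9" files before "Section 10", which a plain character-by-character sort would not do.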
ISO 999:1996, Information and documentation—guidelines for the content, organization and presentation of indexes
◦ gives guidelines for the content, arrangement and presentation of indexes to books, periodicals, reports, patent documents and other written documents, and also to non-print materials such as electronic documents, films, sound and video recordings
◦ concerned with basic indexing principles and practice rather than with the detailed procedures of indexing that vary according to type of matter indexed and the users for whom the index is intended
◦ covers the choice, form and arrangement of headings and subheadings used in index entries once the subjects to be indexed have been determined
ISO 25964-1: 2011 Information and documentation – Thesauri and interoperability with other vocabularies – Part 1: Thesauri for information retrieval
◦ gives recommendations for the development and maintenance of thesauri intended for information retrieval applications
◦ applicable to vocabularies used for retrieving information about all types of information resources, irrespective of the media used (text, sound, still or moving image, physical object or multimedia) including knowledge bases and portals, bibliographic databases, text, museum or multimedia collections, and the items within them
◦ provides a data model and recommended format for the import and export of thesaurus data
◦ applicable to monolingual and multilingual thesauri
◦ not applicable to the preparation of back-of-the-book indexes, although many of its recommendations could be useful for that purpose
◦ not applicable to the databases or software used directly in search or indexing applications, but does anticipate the needs of such applications among its recommendations for thesaurus management
Indexing Awards
ASI/H.W. Wilson Award
◦ was established in 1978 to honor excellence in the indexing of an English-language monograph or other non-serial work published in the United States during the previous calendar year
◦ its purpose is two-fold: for indexers, to provide and publicize models of excellence in indexing; for publishers, to encourage greater recognition of the importance of quality in book indexing
The Theodore C. Hines Award (Hines Award)
◦ was established in 1993 to honor members who have provided exceptional service to the American Society for Indexing (ASI)
◦ ASI’s highest honor to its own members, named for Ted Hines, who played a large part in the establishment of the Society
Web Indexing Awards
to encourage high quality web site indexes and to promote the web indexing work of professional indexers, the Web & Electronic Indexing Special Interest Group of the American Society for Indexing awards a deserving indexer the annual Web & Electronic Indexing SIG Award for excellence in web site indexing
Types of Indexes (NISO-TR02-1997)
Indexes by type of object referred to
a. authors: all types of document creators, such as writers, composers, illustrators, translators, editors, choreographers, artists, sculptors, painters, inventors
b. subjects (topics or features): topics treated in documents and/or features of documentary units (for example, genre, format, methodological approach). Separate indexes are often devoted to special types of topics, such as persons, places, or corporate bodies; features, such as genres (for example, poetry, drama); or notations, such as International Standard Book Numbers (ISBN).
Indexes by type of term used for headings
a. names: proper nouns, such as names of persons, places, corporate bodies
b. numbers or notations: numerical or coded designations, such as classification notation, patent number, ISBN, date
c. words and phrases: common words and phrases (as opposed to names or proper nouns)
Indexes by type or extent of indexable matter on which an index is based
a. full text of document
b. abstracts
c. titles only
d. first lines only (for example, first lines of poems)
e. citations (reference citations to other documents)
Indexes by arrangement of entries
a. alphabetical or alphanumeric
b. classified: headings arranged on the basis of relations among concepts represented by headings, for example, hierarchy, inclusion, chronology, or other association. Classified indexes are often based on existing classification schemes, such as the Dewey Decimal Classification.
c. alphabetico-classed: broad headings arranged alphabetically. Narrower headings are grouped under broad headings and arranged alphanumerically or relationally on the basis of hierarchy, inclusion, chronology, or other association.
Indexes by method of document analysis
a. human intellectual analysis and identification of topics and concepts expressed and/or features manifested
b. computer algorithms designed to identify useful terms, phrases, or features
c. combination of computer-based and human analysis
Indexes by method of term selection
a. assignment of terms to represent topics and features (whether or not the term is in the documentary unit being indexed)
b. extraction of terms from the documentary unit
c. a combination of assignment and extraction methods
Indexes by method of term coordination
a. pre-coordinate combination: such as subject heading indexes, string indexes, chain indexes, keyword indexes (including KWIC, KWOC, KWAC indexes), rotated, and permuted indexes
b. post-coordinate combination: includes the use of Boolean operators, proximity measures, and the combination of weighted terms
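The difference between the two methods of coordination can be sketched in Python. The first function produces a simple rotated keyword listing in the spirit of a KWIC index, where terms are combined when the index is made; the second builds an inverted index whose terms are combined only at search time with Boolean operators. The function names, the stop-word list, and the sample titles are assumptions for illustration, and the rotated layout is a simplification of printed KWIC formats.

```python
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}  # hypothetical stop list


def kwic(titles):
    """Pre-coordinate: one rotated line per significant title word, sorted by keyword."""
    lines = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            # Rotate the title so the keyword leads; "/" marks the original end.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            lines.append((word.lower(), rotated))
    return sorted(lines)


def inverted_index(titles):
    """Post-coordinate: map each significant word to the set of titles containing it."""
    postings = {}
    for doc_id, title in enumerate(titles):
        for word in title.lower().split():
            if word not in STOP_WORDS:
                postings.setdefault(word, set()).add(doc_id)
    return postings


titles = ["Indexing of Journals", "Abstracting and Indexing"]
kwic_lines = kwic(titles)
postings = inverted_index(titles)
both = postings["indexing"] & postings["abstracting"]   # Boolean AND
either = postings["journals"] | postings["abstracting"]  # Boolean OR
```

In the rotated listing the combinations are fixed in advance by the index maker, whereas with the inverted index the searcher decides which terms to combine.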
Indexes by type, periodicity, format, genre, or medium of document(s) being indexed
◦ Examples are: books, monographs, periodicals, serials, poetry, fiction, short stories, films, videos, illustrations, pictures, paintings, artifacts, software, computer-readable texts, maps, and sound recordings
Indexes by medium of index
a. printed or written
b. microform
c. electronic media, including online, CD-ROM
d. braille
Indexes by periodicity of the index
a. one-time, closed-end indexes
b. continuing, open-end indexes
Indexes by authorship
a. authored: a separately authored document, distinct from the document(s) being indexed. It is created independently by one or more persons through intellectual analysis of text, as distinguished from indexes that are created solely through algorithmic analysis of text carried out electronically.
b. automatically generated