Revisions to ICAT’s “Suggested Priorities for Bibliographic, Holdings, and Item Record Maintenance” Document

Casey Sutherland

CARLI Office, on behalf of ICAT

ICAT Fall Forum, December 2 and 3, 2009

2

I-Share Cataloging and Authority Control Team (ICAT)

Current ICAT roster:
Daren Callahan (SIC)
Kristin Martin (UIC)
Priscilla Matthews (ISU), Chair
Gayle Porter (CSU)
Emily Prather-Rodgers (NCC)
Mary Rose (SIE)
Cason Snow (NIU)
Pamela Thomas (ICC)
Cheryl Wegner (NBY)

3

I-Share Cataloging and Authority Control Team (ICAT)

Complete ICAT charge available from: http://www.carli.illinois.edu/comms/iug/iug-cat.html

One of ICAT’s specific charges is: “Suggest database clean-up and other projects to improve overall bibliographic quality in the I-Share catalogs.”

4

Summary of today’s presentation

Background of “Priorities” document
New “conceptual priorities”
What’s New: the Details
Future plans
Questions? Discussion?

We should have plenty of time today, so ask questions as they arise, by raising your virtual hand or sending your questions via chat.

5

Background of “Priorities” document

The document entitled Suggested Priorities for Bibliographic, Holdings, and Item Record Maintenance was originally authored by ICAT’s predecessor body (CCAC) in 2005.
Only minor revisions in 2006, and none since then.
Feedback on original document has been positive, so a revision was desirable.
Major revision undertaken in fall 2009:
http://www.carli.illinois.edu/mem-prod/I-Share/cat/maintpriority.pdf

6

Background of “Priorities” document (cont.)

Impetuses for fall 2009 revision:
Many new Shared SQL queries and Shared Macros added to CARLI website since 2006
New public catalog interfaces in use and in planning stages since 2006:
WebVoyáge upgrades
o In summer 2006, we had just upgraded from Voyager version 2001.2 to 6.1 (including the Unicode conversion)
VuFind (in production summer 2008)
The eXtensible Catalog (XC) project (in 2010?)
“New views of dirty data”

7

Background of “Priorities” document (cont.)

As with original version, these are suggested priorities for database maintenance.
Every library is different, so it is expected that local needs may vary from the priorities outlined in this document.
<CASEY: insert link to ISU chart here>
Frequency categories remain the same:
Frequently
Occasionally
Once

8

Background of “Priorities” document (cont.)

Most of the projects included in the document are based on queries posted to the I-Share Shared SQL page, to help libraries find “dirty data”:
http://www.carli.illinois.edu/mem-prod/I-Share/secure/sql.html

Some of the projects included in the document have corresponding macros posted to the I-Share Shared Macro page, to help libraries fix the “dirty data” in quasi-batch mode:
http://www.carli.illinois.edu/mem-prod/I-Share/secure/macros.html

9

Background of “Priorities” document (cont.)

Another potential technique for fixing some kinds of dirty data is to download “fresh” copies of bibs (based on query results) from OCLC, using Connexion client batch functions.
http://www.carli.illinois.edu/mem-serv/mem-train/090430cat/090430JW_RepairorTrade.pdf

Some projects can be fixed using Voyager’s Pick and Scan functionality.
Available in both the cataloging and circulation clients.
See chapter 7 of Voyager Cataloging User’s Guide.

10

Background of “Priorities” document (cont.)

Deciding which fix is best is up to the library, based on the particular project, local cataloging policies and procedures, and staff resources and expertise.

The “Priorities” document is designed to give you information for making these determinations.
Alas, some projects can only be fixed with manual editing in the cataloging client.

11

New “conceptual priorities” in the document

A major change in the revision to this document was adoption of a set of “conceptual priorities” to help determine if Project A is more important than Project B.

Conceptual priorities used “Reducing Patron Annoyance” as a guiding principle.

New conceptual priorities resulted in a re-ordering of priorities for many existing projects.
Having these conceptual priorities will also enable easier modification of the document, when new maintenance tasks are discovered/determined.

12

New “conceptual priorities” in the document (cont.)

#1 priority = Delivery
These projects correct data problems that prevent efficient charging at the circulation desk, the placing of requests, or the efficient processing of requested items.
The patron has already identified the item she needs, and wants to get out of the door with the materials in hand with minimal difficulty at the circ desk.
o For example, “Eliminate duplicate item barcodes”
These projects are in the “Frequently” category, and at the top of that category.
Individual libraries determine what “Frequently” means.

13

New “conceptual priorities” in the document (cont.)

#2 priority = Locating
These projects correct data problems that prevent the patron from finding the known/desired item within the library or on the web (in the case of e-resources).
o For example, “Items with perm locs different than MFHD locs”
These projects are in the “Frequently” category, but follow the Delivery-type projects.
Individual libraries determine what “Frequently” means.

14

New “conceptual priorities” in the document (cont.)

#3 priority = Discovery
These projects correct data problems that will help the patron’s searching activities.
These projects are in the “Occasionally” category, in priority order by the subcategories that follow.
Individual libraries determine what “Occasionally” means.

15

New “conceptual priorities” in the document (cont.)

#3 priority = Discovery, subcategories:
#3A = Data problems that prevent overall access in the local Voyager database, the Universal Catalog, and/or other public catalog interfaces (such as VuFind).
o For example, “Individual bibs with more than one OCLC number”

16

New “conceptual priorities” in the document (cont.)

#3 priority = Discovery, subcategories (cont.):
#3B = Adding/correcting access points to bibs, in this order:
Titles
Subjects
Authors
Control numbers
“Other,” including data elements such as language and format used for sorting/limiting
o For example, “Ampersands in Titles” and “Bibs with MeSH but no LCSH” and “Serial bibs without ISSN.”

17

New “conceptual priorities” in the document (cont.)

#3 priority = Discovery, subcategories (cont.):
#3C = Eliminating duplicate records in local databases and/or in the Universal Catalog
o For example, “Duplicate OCLC#s”
#3D = Enhancements to description, or projects to bring records up to MARC standard coding, when the MARC coding errors don’t cause problems with a higher priority above
o For example, “Identifying Cataloging in Publication level bibliographic records”

18

New “conceptual priorities” in the document (cont.)

#4 = Legacy System data problems
Some LCS/FBR system limitations had work-arounds that now conflict with MARC standards.
A few problems from the DRA environment.
Not all I-Share libraries will have these problems.
In theory, once fixed, the problem should not recur.
These projects are in the “Once” category, and use the same conceptual priorities as above.
o For example, “Item barcodes that do not belong to your library”

19

What’s New: the Details

These are the highlights.
We don’t have time to cover each and every project today, but we did want to mention much of what is “new” since the last revision of the document (in 2006).
Within the document, new projects are marked as “<NEW>” next to project name.
Within the project description, new shared macros are marked as “<NEW>” next to the macro name.

20

What’s New: the Details (cont.)

These are the highlights (cont.).
They are not specifically marked in the document, but many shared SQL queries have been revised, usually to make their output easier to use to fix the records.
o For example, the query “Items with perm locs different than MFHD locs” was revised to include the item barcode number in the query results, so more of the records could be fixed with Pick and Scan.
A new CARLI_reports_2009.mdb file, containing the current version of all shared SQL queries (as of 11/16/2009), is now available from the URL below:
http://www.carli.illinois.edu/mem-prod/I-Share/secure/sql.html

21

What’s New: the Details (cont.)

New category: “General good practice”
“Projects” listed there were under “Frequently” category previously, but they seemed less like a project than good practice, so we changed the category name.

Projects given alpha-numeric designations to help distinguish them from each other.

Shared SQL and Shared Macro information now includes categories on those web pages, to help make them easier to find.

22

What’s New: the Details (cont.)

In the Frequently category:
F1. Top priority project remains “Eliminate duplicate item barcodes” (no change from 2006 document).
Reminder added to document about cataloging and circ client preference to check for duplicate item barcodes, to help prevent future instances of this problem.
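The duplicate-barcode check can be pictured with a small Python sketch. This is not the shared SQL query itself: the `(item_id, barcode)` pairs, the sample barcodes, and the function name are hypothetical and only illustrate grouping items by barcode and reporting any barcode used more than once.

```python
from collections import defaultdict


def find_duplicate_barcodes(items):
    """Return {barcode: [item_id, ...]} for barcodes attached to more than one item.

    `items` is an iterable of (item_id, barcode) pairs; this layout is a
    hypothetical stand-in for query output, not the Voyager schema.
    """
    ids_by_barcode = defaultdict(list)
    for item_id, barcode in items:
        barcode = (barcode or "").strip()
        if barcode:  # items without a barcode are a separate project (F2)
            ids_by_barcode[barcode].append(item_id)
    return {b: ids for b, ids in ids_by_barcode.items() if len(ids) > 1}


# Hypothetical sample data
sample_items = [
    (101, "30112000000017"),
    (102, "30112000000025"),
    (103, "30112000000017"),
    (104, None),
]
duplicates = find_duplicate_barcodes(sample_items)
print(duplicates)
```

Each barcode reported this way would still need a staff decision about which item record keeps it.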

23

What’s New: the Details (cont.)

In the Frequently category:
F2. New addition (and slightly higher priority) to project dealing with incorrect item barcodes: Add item barcodes to records that lack them.
o Of course, items in some locations legitimately don’t have barcodes (e.g., rare books).
o But items that are eligible for Universal Borrowing transactions require a barcode for processing at the other I-Share library.
o New SQL query will help identify barcode-less items. It includes location and item type, so legitimate records without barcodes can be easily ignored, or omitted from query criteria.
o Query may also reveal other “errors” (e.g., item records for materials in e-resource locations).

24

What’s New: the Details (cont.)

In the Frequently category:
F4. New project to deal with item records that contain the double quote character in variable fields.
o Newly-discovered WebVoyáge 6 bug results in bibs linked to items with double quote in the item record fields below being non-requestable in the UC:
o Enumeration, Chronology, Year, Caption or Free text
o Use single quote character or text such as “inch” instead of double quote
o New SQL query to find new instances of the data
o New macro to fix the records, if too many to correct manually
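As a rough illustration of the fix described above (not the shared macro itself), the Python sketch below replaces the double quote character in the five item fields named on the slide. The dictionary keys and the sample item are hypothetical; the replacement text is a parameter so a library could choose a single quote or a word such as "inch".

```python
# Item record fields named on the slide; the key names are hypothetical.
ITEM_TEXT_FIELDS = ("enumeration", "chronology", "year", "caption", "free_text")


def has_double_quote(item):
    """True if any of the affected item fields contains a double quote."""
    return any('"' in (item.get(field) or "") for field in ITEM_TEXT_FIELDS)


def replace_double_quotes(item, replacement="'"):
    """Return a copy of `item` with double quotes replaced in the affected fields."""
    fixed = dict(item)
    for field in ITEM_TEXT_FIELDS:
        value = fixed.get(field)
        if value and '"' in value:
            fixed[field] = value.replace('"', replacement)
    return fixed


# Hypothetical item record
item = {"enumeration": 'v.1 (12" disc)', "chronology": "1999", "year": None}
fixed_item = replace_double_quotes(item, replacement=" inch")
print(fixed_item["enumeration"])
```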

25

What’s New: the Details (cont.)

In the Frequently category:
F6. Much higher priority for project to deal with discrepancies between item permanent location and MFHD location.
o WebVoyáge 7 reveals these discrepancies to patrons much more obviously than previous versions:
o Results page (hit list) displays Item permanent location
o Single record page displays MFHD location
o Revised SQL query to find these discrepancies now includes item’s barcode, so many records can be fixed by using Pick and Scan
o Reminder added to change both item perm loc and holdings loc when running Pick and Scan for location changes.
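The general shape of a location-discrepancy query can be sketched with Python's built-in sqlite3 module. The two tables and their columns below are simplified, hypothetical stand-ins rather than the Voyager schema or the shared CARLI query; the point is only the join of items to their holdings and the comparison of the two location values, with the barcode included in the output for Pick and Scan.

```python
import sqlite3

# In-memory database with hypothetical, simplified tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mfhd (mfhd_id INTEGER PRIMARY KEY, loc TEXT);
CREATE TABLE item (item_id INTEGER PRIMARY KEY, barcode TEXT,
                   perm_loc TEXT, mfhd_id INTEGER);
INSERT INTO mfhd VALUES (1, 'stacks'), (2, 'ref');
INSERT INTO item VALUES (10, 'A001', 'stacks', 1),
                        (11, 'A002', 'stacks', 2),
                        (12, NULL,   'ref',    2);
""")

# Report items whose permanent location differs from the holdings location
mismatches = conn.execute("""
SELECT i.item_id, i.barcode, i.perm_loc, m.loc
FROM item AS i
JOIN mfhd AS m ON m.mfhd_id = i.mfhd_id
WHERE i.perm_loc <> m.loc
ORDER BY i.item_id
""").fetchall()
print(mismatches)
```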

26

What’s New: the Details (cont.)

In the Frequently category:
F9. New addition and slightly higher priority (“locating”) for project to perform link checking and maintenance on URLs, for electronic resource records.

o New SQL query added to find common typos at the beginning of URLs in both bibs and MFHDs.

o Recommendation on using Xenu or other tools for more extensive link checking remains unchanged.
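One way to picture a check for typos at the beginning of URLs is a pattern test like the Python sketch below. The list of accepted schemes and the sample URLs are assumptions for illustration; the slide does not say which typos the shared SQL query looks for.

```python
import re

# A URL is treated as well-formed at the start if it begins with one of these
# schemes followed by "://" and a host character. The scheme list is an assumption.
VALID_START = re.compile(r"^(https?|ftp)://[^\s/]")


def suspicious_urls(urls):
    """Return the URLs whose beginning does not match the expected pattern."""
    return [url for url in urls if not VALID_START.match(url)]


# Hypothetical sample data
sample_urls = [
    "http://www.carli.illinois.edu/",
    "htp://example.org",
    "http:/example.org",
    "http//example.org",
    " http://example.org",
    "https://example.org/a",
]
suspicious = suspicious_urls(sample_urls)
print(suspicious)
```

A check like this only looks at the start of the string; it does not test whether the link resolves, which is what a link checker such as Xenu is for.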

27

What’s New: the Details (cont.)

In the Frequently category:
F11. New project to correct typos or miscoded data in call number prefixes.
o New suite of SQL queries to find the problematic records: “Odd Prefixes” queries 1-4.
o Queries 1 and 2 are make-table queries designed to be used by Queries 3 and 4. Run 1 and 2, but you don’t need to do anything with their results.
o Query 3 designed to find typographical errors in call number prefixes. Run the query and use the results to fix MFHDs.
o New macro can change the MFHD 852 $k to a new value, if there are too many records to fix manually.
o Query 4 designed to find prefixes in 852 $h instead of coded as separate $k.

28

What’s New: the Details (cont.)

In the Frequently category:
F12. New project to add missing call number suffix.
o Only applicable to libraries that use suffixes for locating purposes.
o New SQL query that prompts for location code and correct suffix for that location. Will report MFHDs that are missing the desired suffix.
o New macro that will add a new 852 $m to MFHDs.

29

What’s New: the Details (cont.)

In the Frequently category:
F13. New project to evaluate notes fields in MFHDs.
o Some libraries use public notes in holding records to help patrons locate materials in the collection.
o Revised SQL queries that prompt for location code, then output any 852 $z or 852 $x for records in that location.
o New SQL queries that prompt for location code, then output any MFHDs that have no 852 $z in that location. Output also includes 852 $x, in case the note was miscoded.
o New macro that will add a new 852 $z to MFHDs.

30

What’s New: the Details (cont.)

In the Occasionally category:
OC1. New project to correct bibs that are unsuccessfully indexed by Voyager. Three options to find various problematic records:

o New SQL query to find bibs without indexed titles (limited to Title index – no 245 $a, errors in subfield coding, or diacritics in the non-filing indicator character positions)

o New SQL query to find bibs with empty index entries (looks at all indexed fields in the bibs, not just title – various coding errors)

o Requestable via WRO: a re-run of the “Bad tag list” that will identify bibs with invalid field tags. This server-side report was originally distributed to all I-Share libraries in April 2009, and discussed at ICAT Spring 2009 forum.

31

What’s New: the Details (cont.)

In the Occasionally category:
OC2. Slightly lower priority for project to evaluate suppressed bibs with items attached.
o Revised SQL query to find these records now includes item status, item barcode, and location/call number data, to make processing the query results more efficient.
o Reminder added that Pick and Scan in Voyager version 7 can be used to unsuppress bibs in batch, when linked item has a barcode.

32

What’s New: the Details (cont.)

In the Occasionally category:
OC3. Slightly lower priority for project to evaluate suppressed MFHDs with items attached.
o Revised SQL query to find these records now includes item status, item barcode, and location/call number data, to make processing the query results more efficient.
o Reminder added that Pick and Scan in Voyager version 7 can be used to unsuppress MFHDs in batch, when linked item has a barcode.

33

What’s New: the Details (cont.)

In the Occasionally category:
OC4. Much higher priority and expanded focus for project to correct invalid indicators in bibs.
o Records with invalid indicators are (by default) rejected from the process to extract/convert MARC records to MARCXML, for projects such as the eXtensible Catalog.
o No changes to SQL queries that look at 245 field for invalid or missing filing indicators.
o Requestable via WRO: a re-run of “Bad indicator count” and “Bad indicator list” that will identify bibs with invalid characters in indicators in all bib record fields. These server-side reports were originally distributed to all I-Share libraries in April 2009, and discussed at ICAT Spring 2009 forum.
o New macros that will change indicators 1 and/or 2 in designated bib record fields.

34

What’s New: the Details (cont.)

In the Occasionally category:
OC6. New project to delete 035 $a data in the format: (XXXdb)NNNNNNN.
o This 035 $a is added automatically to bibs copied from the Universal Catalog to the local databases, and represents the database code and Voyager bib ID from the library that contributed the bib to the UC. It should be deleted when the new bib is saved to database, but that doesn’t always happen.
o This particular 035 $a data contributes to “discards” from the daily feed of records into the Universal Catalog.
o New SQL query will find instances of this data.
o New macro that will delete the 035 $a beginning with (XXXdb), while not touching valid OCLC numbers in 035 $a.
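The distinction between the unwanted 035 $a values and valid OCLC numbers can be illustrated with a pattern match. In this Python sketch the assumption is that "XXX" stands for exactly three letters followed by the literal "db" and that the bib ID is all digits; the sample values are hypothetical and this is not the shared macro.

```python
import re

# Assumed shape of the UC-copy value: "(" + three letters + "db)" + digits
UC_COPY_035 = re.compile(r"^\([A-Za-z]{3}db\)\d+$")


def drop_uc_copy_035(values):
    """Return the 035 $a values with the (XXXdb)NNNNNNN entries removed."""
    return [value for value in values if not UC_COPY_035.match(value.strip())]


# Hypothetical 035 $a values from one bib record
values_035a = ["(OCoLC)12345678", "(UIUdb)1234567"]
kept = drop_uc_copy_035(values_035a)
print(kept)
```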

35

What’s New: the Details (cont.)

In the Occasionally category:
OC8. Slightly higher priority for the project to look for common typographical errors in bib records.
o Revised URL for Terry Ballard’s list of common typos.
o List is organized by probability of the appearance of the misspelled words in library catalogs.

36

What’s New: the Details (cont.)

In the Occasionally category:
OC9. New project to add alternate title fields to bibs.
o New SQL query that looks for records with an ampersand in the 245 field but no 246, 247, or 740 field containing the word “and”.
o New macro that will copy the 245 $a (only) into a new 246, and change the “&” to “and”.
o Libraries may wish to re-run the query after the macro processes the ampersands in 245 $a, to determine the need for alternate titles when the ampersand is in 245 $b, or other subfields.
o New SQL query that prompts for a particular symbol found in the title field (e.g., “+”), and prompts for the corresponding spelled-out version of the symbol (e.g., “plus”) and looks for records lacking that translation in a 246, 247, or 740 field.
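The ampersand logic described above can be sketched in Python as a test plus a transformation. The function names and sample titles are hypothetical, and stripping trailing ISBD punctuation from the copied title is an assumption of this sketch rather than documented macro behavior.

```python
import re


def needs_and_title(title_245a, alternate_titles):
    """True if 245 $a has an ampersand and no alternate title contains the word 'and'.

    `alternate_titles` stands for the text of any 246, 247, or 740 fields.
    """
    if "&" not in title_245a:
        return False
    return not any(re.search(r"\band\b", t, re.IGNORECASE) for t in alternate_titles)


def make_246a(title_245a):
    """Build a candidate 246 $a by spelling out '&' as 'and'.

    Collapsing whitespace and stripping trailing ISBD punctuation are
    assumptions made for this illustration.
    """
    spelled_out = " ".join(title_245a.replace("&", " and ").split())
    return spelled_out.rstrip(" /:;=")


# Hypothetical title
title = "Pride & prejudice /"
print(needs_and_title(title, []), make_246a(title))
```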

37

What’s New: the Details (cont.)

In the Occasionally category:
OC12. New project to identify bibs that have MeSH subject headings, but lack LC subject headings.
o New suite of SQL queries to find the problematic records: “Missing LC subject” queries 1-4.
o Queries 1, 2, and 3 are make-table queries designed to be used by Query 4. Run queries 1 through 3, but you don’t need to do anything with their results.
o Query 4 designed to present the results. Run the query and use the results to manually add LCSH as needed to the bibs.
o Query 2 can be modified to look for subject thesauri other than MeSH.

38

What’s New: the Details (cont.)

In the Occasionally category:
OC13. New project to update name headings that lack an author’s death date.
o In February 2006, Library of Congress revised their policy about adding death dates to personal name headings.
o URL provided to LC’s weekly “Closed Dates in Authority Records” lists; also available as an RSS feed.
o Often, review of the death dates reveals other changes needed to name headings, to conform with authority record.

39

What’s New: the Details (cont.)

In the Occasionally category:
OC14. New project to correct MFHDs that have unindexed call numbers.

o The 852 $h will display to the user, but the call number is not searchable because it is not indexed when indicator 1 is missing.

o New SQL query to find MFHDs that contain a call number in 852 $h, but lack an 852 indicator 1.

o New macro that will add a new 852 indicator 1 value, if there are too many records to fix manually.

40

What’s New: the Details (cont.)

In the Occasionally category:
OC18. New project to correct errors in bib record fixed field “format” codes.
o Fixing these bibs will improve limiting by format in the public catalog(s).
o New suite of SQL queries to find bibs with problematic bib level/record type combinations (“format” in Voyager tables).
o Two queries start with GMD in 245 field and compare with format. These are usually errors in the format.
o New macro that changes the bib level and record type, if there are too many records to fix manually.
o Two queries start with format and compare with GMD in 245. These are usually records with a missing or invalid GMD, but some format errors can be detected as well.
o Libraries that use VuFind as their public interface may want to place a higher priority on this project.

41

What’s New: the Details (cont.)

In the Occasionally category:
OC19. New project to correct errors in bib record fixed field Language codes.
o Fixing these bibs will improve limiting by language in the public catalog(s).
o New suite of SQL queries to find bibs with problematic or obsolete Language codes in the fixed field (008).
o New macro that changes the 008 Language code, when there are too many records to fix manually.
o Libraries that use VuFind as their public interface may want to place a higher priority on this project.

42

What’s New: the Details (cont.)

In the Occasionally category:
OC20. New project to correct bibs that lack publication date in 260 $c.
o Fixing these records will improve display of publication date to patrons, as well as improve results of other queries that compare dates in 260 $c with dates in fixed field (008).
o New suite of SQL queries to find bibs that lack a 260 $c but have the publication date in other fields/subfields.
o New macro that will parse date information from 260 $a or $b into a new 260 $c.
o New macro that will change obsolete 260 $d with date to 260 $c.

43

What’s New: the Details (cont.)

In the Occasionally category:
OC21. New project to correct bibs that contain letter EL instead of digit 1 in 260 $c.
o Fixing these records may improve display of publication date to patrons, and will improve results of other queries that compare dates in 260 $c with dates in fixed field (008).
o Two new SQL queries to find bibs with a 260 $c that either begins with or contains dates such as l970, rather than 1970.
o New macro that will change 260 $c beginning with letter L to digit 1.
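The letter-for-digit error lends itself to a small pattern-based illustration. The Python sketch below treats a lowercase "l" followed by exactly three digits as a mistyped year; that rule, the function names, and the sample values are assumptions for illustration, and the shared macro is described as handling only values that begin with the letter.

```python
import re

# A lowercase letter "l" followed by exactly three digits, e.g. "l970"
LETTER_L_YEAR = re.compile(r"l(?=\d{3}(?!\d))")


def has_letter_l_date(subfield_260c):
    """True if the 260 $c text appears to contain a year typed with letter l."""
    return bool(LETTER_L_YEAR.search(subfield_260c))


def fix_letter_l_dates(subfield_260c):
    """Replace the mistyped letter l with digit 1 in year-like strings."""
    return LETTER_L_YEAR.sub("1", subfield_260c)


# Hypothetical 260 $c values
for value in ["l970.", "cl985", "1970."]:
    print(repr(value), "->", repr(fix_letter_l_dates(value)))
```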

44

What’s New: the Details (cont.)

In the Occasionally category:
OC22. New project to correct errors in bib record fixed field Date 1 value.
o Fixing these bibs will improve sorting and limiting by date in the public catalog(s).
o New suite of SQL queries to find bibs with incomplete, invalid, or missing Date 1 in fixed field (008).
o Several queries compare 008 Date 1 with 260 $c, so fixing the bibs in projects OC20 and OC21 should be completed before this project is begun.
o New macro that will add or change the 008 Date 1 value, when there are too many records to fix manually.
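To make the comparison of 008 Date 1 with 260 $c concrete, the Python sketch below reads Date 1 from character positions 07-10 of the 008 field (per the MARC 21 bibliographic format) and compares it with the first four-digit run in the 260 $c text. The function names and sample record values are hypothetical, and taking the first four digits of 260 $c is a simplifying assumption that ignores ranges and multiple dates.

```python
import re


def date1_from_008(field_008):
    """Return 008 Date 1 (character positions 07-10)."""
    return field_008[7:11]


def year_from_260c(subfield_260c):
    """Return the first four-digit run in 260 $c, or None if there is none."""
    match = re.search(r"\d{4}", subfield_260c)
    return match.group() if match else None


def date1_matches_260c(field_008, subfield_260c):
    """True if 008 Date 1 equals the year found in 260 $c."""
    year = year_from_260c(subfield_260c)
    return year is not None and year == date1_from_008(field_008)


# Hypothetical 40-character 008: date entered, date type, Date 1, Date 2,
# place, 17 format-specific positions, language, modified, source
field_008 = "091202" + "s" + "2009" + "    " + "ilu" + " " * 17 + "eng" + " " + "d"
print(date1_from_008(field_008), date1_matches_260c(field_008, "c2009."))
```

This also shows why the slide recommends finishing OC20 and OC21 first: a missing or mistyped 260 $c would make the comparison fail for reasons unrelated to the fixed field.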

45

What’s New: the Details (cont.)

In the Occasionally category:
OC23. Much lower priority for project to evaluate item records with copy number zero.
o When item record copy number is zero, no copy number displays to patrons with item details, and no copy number is included in overdue or other patron notices.
o Revised SQL query to find these records now includes MFHD 852 $t value.
o Existing macro that changes item record copy zero to (by default) 1 is unchanged.
o New macro that will add an 852 $t can be used to correct missing copy number at the MFHD level.

46

What’s New: the Details (cont.)

In the Occasionally category:
OC24. New project to evaluate MFHDs that lack an 852 $t.
o WebVoyáge defaults to display of “Copy 1” when MFHD lacks an 852 $t; VuFind does not currently have this default.
o New SQL query to find these records; output includes item record copy number, for analysis.
o New macro that will add an 852 $t.
o Libraries that use VuFind as their public interface may want to place a higher priority on this project.

47

What’s New: the Details (cont.)

In the Occasionally category:
OC29. New additions (and slightly lower priority) for project to correct errors in MFHD record type codes.
o No current Voyager functionality uses MFHD record type, but DRA did display holdings differently based on this value.
o Future implementations may use this data for display or other functions.
o Three new SQL queries to find additional scenarios where MFHD record type may be in error, based on data in bibs and/or linked item records.
o No changes to existing macro to change MFHD record type.

48

What’s New: the Details (cont.)

In the Once category:
ON5. New project to evaluate bib 028 field for miscoded corporate authors.

o Libraries that were not ILCSO members during the FBR days should not have this problem.

o New SQL query that looks for the presence of an 028 field in bib records for print monographs or serials.

o Query results should be evaluated and data placed in the correct field (as appropriate) or possibly deleted.

49

Future Plans for this document

Update it more regularly as new maintenance projects are discovered/determined.

Convert it from print document to a web-page, so connections between project descriptions, shared SQL and shared macros are easier to navigate.

Your suggestions?

Online Evaluation Form

ICAT would appreciate your evaluation of this presentation.

Please go to the URL below for an online evaluation form.

<CASEY INSERT URL>

50

51

THANK YOU!

Thank you for your attention!
Reminder that there is much more information about many of the issues discussed today available from the Cataloging Documents page of the CARLI website:
http://www.carli.illinois.edu/mem-prod/I-Share/cat.html

Questions? Send email to: [email protected]