A Unit of the University System of Georgia

Bibliographic Database Integrity

DESCRIPTION

A presentation by Elaine Hardy and Bin Lin of Georgia PINES for Evergreen International Conference 2009.

Page 1: Bibliographic Database Integrity

A Unit of the University System of Georgia

Page 2: Bibliographic Database Integrity

Bibliographic database integrity in a consortial environment

Evergreen International Conference, May 21, 2009

• Elaine Hardy
• PINES Bibliographic Projects and Metadata Manager

Page 3: Bibliographic Database Integrity
Page 4: Bibliographic Database Integrity
Page 5: Bibliographic Database Integrity

Twentieth Century Literary Criticism: illustration of single record for each serial volume

Page 6: Bibliographic Database Integrity

GPLS Intern’s statistics

                            Before   After
Alexander McCall Smith         245     172
Grace Livingston Hill         1119     549
Mary Higgins Clark             771     386
Magic School Bus (print)       554     218
Danielle Steel                1235     718
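
The size of each reduction is easier to see as a percentage. The sketch below is a minimal illustration, assuming the Before and After figures are record counts taken before and after duplicate merging (the slide does not label the unit). The function names `reduction` and `summarize` are made up for this example.

```python
# Before/after figures copied from the slide above.
# Assumption: each pair is (count before clean-up, count after clean-up).
counts = {
    "Alexander McCall Smith": (245, 172),
    "Grace Livingston Hill": (1119, 549),
    "Mary Higgins Clark": (771, 386),
    "Magic School Bus (print)": (554, 218),
    "Danielle Steel": (1235, 718),
}


def reduction(before, after):
    """Return (number removed, percent reduction rounded to one decimal)."""
    removed = before - after
    return removed, round(100 * removed / before, 1)


def summarize(data):
    """Compute the reduction for every row plus an overall total."""
    rows = {name: reduction(b, a) for name, (b, a) in data.items()}
    total_before = sum(b for b, _ in data.values())
    total_after = sum(a for _, a in data.values())
    rows["Total"] = reduction(total_before, total_after)
    return rows


for name, (removed, pct) in summarize(counts).items():
    print(f"{name:26} {removed:5d} fewer ({pct}%)")
```

Across the five samples the count drops from 3924 to 2043, a reduction of a little under half.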

Page 7: Bibliographic Database Integrity

Duplicate records cause
– “User information overload”
– “Reduced system efficiency”
– “Low cataloging productivity”
– “Increased cost for database maintenance”

Sitas and Kapidakis, 2008

“There is no question that merging such records is vital to effective user services in a cooperative environment.”

Tennant, 2002

Page 8: Bibliographic Database Integrity

What patrons think ---

• wish that you would list the most current book first and have only one entry for each book instead of showing multiple entries. Sometimes I have to look through 50 - 100 entries to see 20 books and the newest book by the author is entry 80. There should be a way to stream line this procedure.

• Consolidate entries for the same title. There are numerous entries on some titles beyond the breakdown of hard cover, PB, large print, audio, etc.

• Why so many listings for the same books--that's confusing

• When I look up a book, many times I get two pages all of the same title with the same cover. It confuses me because I see that my library system doesn't have it, but if I scroll down...Whoops! We do have it. What is that all about? It sucks.

• Creating a standard for the way an items information is entered. Some books only have half the title entered and this can create problems when searching for specific materials

Page 9: Bibliographic Database Integrity

Why?

Page 10: Bibliographic Database Integrity
Page 11: Bibliographic Database Integrity
Page 12: Bibliographic Database Integrity
Page 13: Bibliographic Database Integrity
Page 14: Bibliographic Database Integrity

• Big library does not equal good data
• A large library does not always follow rules and adhere to standards
• Size can mean they cut corners for “efficiency”
• Local notes don’t belong in subject fields
• Make the time to check your data
• Publishers are not catalogers’ friends

Page 15: Bibliographic Database Integrity

Examples of problem reference library records

Page 16: Bibliographic Database Integrity

http://www-03.ibm.com/ibm/history/exhibits/mainframe/mainframe_2423PH3090.html

Page 17: Bibliographic Database Integrity

Legacy system characteristics
• All were IBM-based systems
• No tags, thus no definition of fields
• All fields fixed length
  – allotted so many characters for each field
• No standards
  – Not required to enter pagination or publisher
• Extraction of data a problem
  – had to count in to find beginning of next field
  – In many cases, had to supply a pub date. One lib has 1901 as a pub date on most of their extracted records
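
"Counting in" to find each field can be sketched as slicing a line at fixed offsets. The example below is only an illustration: the slides do not give the real legacy layouts, so the field names and widths in `LAYOUT` and the sample line are hypothetical. It also flags records carrying the supplied 1901 date mentioned above, so they can be reviewed rather than trusted.

```python
# Hypothetical layout: (field name, width in characters).
# The real legacy layouts are not given in the slides.
LAYOUT = [("title", 30), ("author", 20), ("pub_date", 4)]

# The supplied date mentioned on the slide; treated here as a placeholder.
PLACEHOLDER_DATES = {"1901"}


def parse_fixed(line, layout=LAYOUT):
    """Slice one fixed-length line into named fields by counting characters."""
    record = {}
    pos = 0
    for name, width in layout:
        # Each field starts where the previous one ended.
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record


def has_placeholder_date(record):
    """True when the pub date looks supplied rather than transcribed."""
    return record.get("pub_date") in PLACEHOLDER_DATES


# Made-up sample line, padded to the hypothetical widths.
sample = "Example title".ljust(30) + "Doe, Jane".ljust(20) + "1901"

record = parse_fixed(sample)
print(record, has_placeholder_date(record))
```

With no tags in the data, a single wrong width shifts every later field, which is one reason extraction from such systems is error-prone.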

Page 18: Bibliographic Database Integrity

Records from a non-MARC system

Page 19: Bibliographic Database Integrity
Page 20: Bibliographic Database Integrity

Phase II

http://commons.wikimedia.org/wiki/Template:Potd/2007-01

Page 21: Bibliographic Database Integrity

Records with corrupted headings

Page 22: Bibliographic Database Integrity
Page 23: Bibliographic Database Integrity
Page 24: Bibliographic Database Integrity
Page 25: Bibliographic Database Integrity
Page 26: Bibliographic Database Integrity
Page 27: Bibliographic Database Integrity
Page 28: Bibliographic Database Integrity
Page 29: Bibliographic Database Integrity
Page 30: Bibliographic Database Integrity

Lessons learned
• Big library does not equal good data
• Make the time to check your data
• Publishers are not catalogers’ friends
• Be careful about CIPs with no description and records with multiple ISBNs
• Come up with realistic match criteria for when records are the same but information differs
• One library will not have the same good records across all their collections
  – may have good print but bad AV
• LOTS of programming if multiple sources of records
• Nothing -- budget, personnel, time -- is as important as concentrating on clean-up prior to migration
• Be as specific as possible with vendors, test and have a penalty phase
• Have the right people in place from day one
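
One way to approach matching records that are the same but differ in detail is to compare a normalized key instead of the raw fields. The sketch below is an assumption-laden illustration, not the matching used by PINES or Evergreen: the record fields (`id`, `title`, `author`, `format`), the function names, and the sample records are all made up. Format stays in the key so that print and large print editions are not merged.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Hypothetical match key: normalized title and author, plus format."""
    return (normalize(record["title"]), normalize(record["author"]), record["format"])


def group_candidates(records):
    """Group record ids by match key; keep only keys shared by 2+ records."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up records: 1 and 2 differ only in case and punctuation.
records = [
    {"id": 1, "title": "The Example Book.", "author": "Doe, Jane", "format": "book"},
    {"id": 2, "title": "the example book", "author": "DOE, JANE.", "format": "book"},
    {"id": 3, "title": "The Example Book", "author": "Doe, Jane", "format": "large print"},
]

print(group_candidates(records))
```

The groups are candidates for review, not automatic merges. A key this loose would also group different editions of the same title, which is why the slide asks for match criteria that are realistic.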

Page 31: Bibliographic Database Integrity

Enable discovery

Page 32: Bibliographic Database Integrity

Goodbye