Envisioning E-Resource Holdings Management
Marlene van Ballegooie
University of Toronto Libraries
NASIG 2015
Outline
• Flashback to the dawn of NASIG – What were we thinking about e-resource holdings management then?
• Current state of ERM
• OCLC’s automated holdings management services
• The study results
• Benefits/challenges of the service
• A look to the future of e-resource holdings management
The Hits…
“In ten years, the library that we know today will be augmented by virtual libraries... Resources that seem to be locally available will actually be held at remote locations…A library’s holdings will be defined by access, not by possession.”
Lucy Seifert Wegner, “The Research Library and Emerging Information Technology.” (1992)
“Staff will need to change from pointers and retrievers to organizers and facilitators. They must accept that the library must change from a fortress to a pipeline and realize that the collections must be dealt with “en masse” rather than one at a time.”
Kenneth E. Dowlin, “The Neographic Library: A 30-Year Perspective on Public Libraries.” (1993)
“As in-house technical processing recedes into the afterglow of shared-cataloging nirvana, catalogers and other technical processing staff will move toward being managers – rather than producers – of online records.”
Richard D. Hacken, “Tomorrow’s research library: vigor or rigor mortis?” (1988)
“Providing cataloging descriptions for ‘moving targets’ will soon become a familiar problem.”
Karen L. Horny, “New Turns for a New Century: Library Services in the Information Age.” (1987)
“Cataloging may not take place entirely within libraries. Publishers of electronic manuscripts may have their own staffs provide standardized bibliographic records with a variety of subject access points.”
And the Misses…
“Few of these new kinds of journals will come from existing journal publishers, at least not if the new journals would compete with existing products.”
“Librarians’ favorite media after print will continue to be microform…”
Brett Butler, “Scholarly Journals, Electronic Publishing, and Library Networks: From 1986 to 2000.” (1986)
“Primary research – journals articles, proceedings, reports, and other published literature – that is the province of today’s research library does not have a good channel for distribution of electronic information.”
“It would be a mistake, however, to believe that electronic journals are going to replace present printed journals, anymore than television replaced motion pictures … While a few new electronic journals have appeared, they are being created at the very margins of scholarship.”
Harold Billings, “Romancing the information flow: solving the information crisis.” (1991)
“If one assumed that the number of electronic journals would grow to 100 by 1995 and 1,000 by the year 2000, they will still account for only a small proportion of the estimated 7,000 to 15,000 scholarly journals in existence. This is not something … that is going to inundate us anytime soon.”
Martin J. Dillon cited by Kim McDonald. “Despite benefits, electronic journals will not replace print, experts say.” (1991)
Proliferation of E-Content in Libraries
• University of Toronto Libraries
– $29 million acquisition budget
– $17.5 million devoted to electronic resources (60% of total acquisition budget)
– Ongoing electronic subscriptions (serials, databases, etc.): $15 million (86% of e-resource budget)
• Libraries are making substantial investments in electronic resources
A Changing Environment
• Several players in providing access to e-resources
– Libraries
– Content providers
– Knowledgebase vendors / Link resolver vendors
– Subscription agents
• More interdependencies than ever…all based on…
E-Resource Data Supply Chain
Parties: Content Provider, Knowledgebase Provider, Library
• Content provider supplies knowledgebase provider with metadata for all electronic content available for purchase
• Library purchases electronic resources. Content provider supplies library with title list of purchased materials. (hopefully!)
• Library activates purchased content in KB to make content available for discovery
Manual Processing
• Holdings maintenance is a time consuming and manual process
• Constant ‘tweaking’ of metadata in ERM
– Serial coverage dates
– Individual title purchases
– Non-standard packages
Problematic Metadata
Metadata supplied by content providers is often incomplete or erroneous
• Title changes
• Title transfers
• Ceased titles
Time Lags
• Getting content provider metadata into knowledgebase
• Getting title list from content provider
• Getting holdings registered in ERM
• The more time goes by, the greater chance it will get neglected
Too Many Intermediaries
Electronic resources exist in remote locations, yet we rely on people in libraries to pass around information about their holdings.
Metadata is passed through many hands…Sometimes, the baton gets dropped…
To overcome current shortcomings in ERM, we need to change the way the data flows.
How should data travel?
…As the crow flies
Automated Holdings Management
Parties: Content Provider, Knowledgebase Provider, Library
• Content provider supplies knowledgebase provider with metadata for all electronic content available for purchase
• Content provider supplies knowledgebase provider with metadata for institution-specific holdings.
• Knowledgebase provider activates institution-specific holdings in content packages.
• Electronic resources are available for discovery without library intervention.
Behind the Curtain
• To enable autoload, providers supply OCLC with the following files:
– Collections File: KBART format file for each collection/package offered by the content provider
– If applicable, KBART format file for PDA e-books
– Collections Description File: Listing of all collections being transferred
– Holdings Data File: Includes the institution holdings by collection/title with customer identifier
– Customer Map: Includes the provider’s customer identifier and the corresponding OCLC cataloging symbol
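How the Holdings Data File and the Customer Map fit together can be sketched roughly as below. This is only an illustration: the column names (customer_id, oclc_symbol, collection, title_id), the sample rows, and the symbols SYM1/SYM2 are made up for the example and are not the actual file layouts exchanged between providers and OCLC.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample files; real provider files use their own layouts.
CUSTOMER_MAP = (
    "customer_id\toclc_symbol\n"
    "C001\tSYM1\n"
    "C002\tSYM2\n"
)
HOLDINGS = (
    "customer_id\tcollection\ttitle_id\n"
    "C001\tAll Purchased\tt1\n"
    "C001\tAll Purchased\tt2\n"
    "C002\tAll Purchased\tt2\n"
    "C999\tAll Purchased\tt3\n"
)

def read_tsv(text):
    """Parse tab-delimited text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def holdings_by_symbol(customer_map_rows, holdings_rows):
    """Group (collection, title) pairs by library symbol.

    Rows whose customer identifier has no entry in the customer map
    are returned separately so they can be reported, not dropped.
    """
    symbol_for = {r["customer_id"]: r["oclc_symbol"] for r in customer_map_rows}
    activated = defaultdict(set)
    unmapped = []
    for row in holdings_rows:
        symbol = symbol_for.get(row["customer_id"])
        if symbol is None:
            unmapped.append(row)
            continue
        activated[symbol].add((row["collection"], row["title_id"]))
    return dict(activated), unmapped

activated, unmapped = holdings_by_symbol(read_tsv(CUSTOMER_MAP), read_tsv(HOLDINGS))
print(activated["SYM1"])
print(len(unmapped), "holdings row(s) with no customer mapping")
```

Keeping the unmapped rows visible, rather than discarding them, reflects the reporting concern raised later in the deck: libraries need to know what could not be loaded.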
Research Questions
• How well do automated loads reflect the library’s purchased electronic content?
• What types of collections are ideal for automated holdings maintenance?
• How quickly do titles get in the system using the automated service?
• How is the loaded content organized in relation to the library’s licensing agreements?
• Does the service provide adequate reporting to enable libraries to monitor their collections?
The Study
• Study duration: September 2014 – May 2015
• Signed up for as many automated feeds as possible, no matter how big or small
• Each time a file was uploaded in WorldCat knowledge base, a corresponding access report was retrieved from the content provider site
• Data uploaded to a MySQL database and manipulated to make it suitable for comparison
• Custom scripting to determine matched and non-matched titles
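The matching step in the last bullet can be sketched roughly as follows. This is an illustrative version using in-memory Python sets rather than the MySQL database used in the study; the normalization rules, the function names, and the example.org URLs are assumptions, not the study's actual scripts.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to a comparable key.

    Assumed rules: ignore the scheme (http vs https), lowercase the host,
    and drop a trailing slash from the path. Query strings are kept
    because many platforms carry the title identifier there.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"

def compare(access_report_urls, kb_urls):
    """Count provider-report URLs that are, and are not, present in the KB."""
    report = {normalize(u) for u in access_report_urls if u.strip()}
    kb = {normalize(u) for u in kb_urls if u.strip()}
    return {
        "all": len(report),
        "matched": len(report & kb),
        "unmatched": sorted(report - kb),
    }

# Made-up example data
report_urls = [
    "http://Example.org/book/1",
    "http://example.org/book/2/",
    "http://example.org/book/3",
]
kb_urls = [
    "https://example.org/book/1",
    "http://example.org/book/2",
]
result = compare(report_urls, kb_urls)
print(result)
```

The three values returned correspond to the three series reported for each provider in the study: all URLs, matched URLs, and unmatched URLs.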
ebrary
• Service Profile
– Collection in KB: ebrary All Purchased
– Frequency: Every two weeks
– OCLC number coverage: 95%
– Available for PDA: Yes
ebrary Results
Upload date   All ebrary URLs   Matched URLs   Unmatched URLs
9/11/2014     12418             12120          298
10/1/2014     12417             12121          296
10/31/2014    12417             12116          301
11/11/2014    12424             12110          314
11/26/2014    12427             12117          310
12/26/2014    12435             12127          308
1/23/2015     12447             12435          12
2/4/2015      14524             12441          2083
2/26/2015     14571             14533          38
3/3/2015      14571             14571          0
ebrary Observations
• Irregular frequency (between Sept 2014 and May 2015, only 10 uploads)
• Single title orders are often the most anxiously awaited…monthly load too long to wait
• Majority of missing titles showed up in the next upload
• KB initially represented a fraction of our ebrary titles…later additional collections were added to the knowledgebase
MyiLibrary
• Service Profile
– Collection in KB: MyiLibrary Collection
– Frequency: Weekly
– OCLC number coverage: 96%
– Available for PDA: No
MyiLibrary Results
Upload date   All MyiLibrary URLs   Matched URLs   Unmatched URLs
9/24/2014     30037                 30034          3
10/31/2014    30037                 30035          2
11/5/2014     30037                 30036          1
MyiLibrary Observations
• Load frequency does not live up to expectations (between Sept 2014 and May 2015 there were 3 uploads)
• List provided by content provider missing a large number of purchased titles (approximately 30,000 titles uploaded; 39,636 titles available on website)
• All MyiLibrary content in one collection. Does not account for separately licensed content.
Postscript to MyiLibrary Story
• After contacting MyiLibrary about the missing titles, a list was produced containing ALL 39,636 titles we subscribe to on the platform.
…for the MyiLibrary collection to be updated in the WorldCat Knowledge Base…
EBL Ebook Library
• Service Profile
– Collection in KB: Ebook Library Catalogue
– Frequency: Once a week
– OCLC number coverage: 99.8%
– Available for PDA: Yes
EBL Ebook Library Results
Upload date   All EBL URLs   Matched URLs   Unmatched URLs
2/28/2015     8              8              0
3/8/2015      10             10             0
3/24/2015     12             12             0
EBL Ebook Library Observations
• New content provider for University of Toronto Libraries
• Perfect results, though sample was extremely small
• Close to weekly uploads (three loads in a one month span, though nothing since end of March)
Elsevier ScienceDirect
• Service Profile
– Collections in KB:
  • Elsevier ScienceDirect Journals
  • ScienceDirect Book Series
  • ScienceDirect All Books
– Frequency: Weekly
– OCLC number coverage:
  • Elsevier ScienceDirect Journals – 91.6%
  • ScienceDirect Book Series – 96.7%
  • ScienceDirect All Books – 98.9%
– Available for PDA: No
ScienceDirect Access Report
• The ScienceDirect access report includes:
– Subscribed titles
– Complimentary titles
– Free-to-read titles
– Non-Subscribed titles
• Much duplication in report, mainly attributed to differing access types.
• All categories, except for the non-subscribed titles, are represented in the data feed to OCLC.
Six Publication Types – Three Collections
Publication types:
• Journal
• Book
• Book Series
• Book Series Volume
• Reference Work
• Handbooks Series
Collections: Books, Book Series, Journals
ScienceDirect Analysis
A Game of Hide and Seek
• Over the course of the study, some content was missing or moved from one collection to another.
– Many book series volumes missing from collections
– Handbook series moved from book series collection to serials collection
– E-books were often contained in more than one package
Changing Directions
• Due to difficulties in data matching through time, a new approach was needed
• Treat ScienceDirect as a single collection and compare distinct URLs
• Led to a more accurate picture of the uploaded content
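Why the single-collection view gives a clearer picture can be shown with a small sketch. The collection names follow the slides, but the URLs and the two "loads" are invented for illustration; in this toy data a handbook moves from the book series collection to the journals collection, and a duplicated series title is removed from one of the two collections that held it.

```python
def distinct_urls(collections):
    """Flatten a {collection name: set of URLs} mapping into one set of URLs."""
    urls = set()
    for titles in collections.values():
        urls.update(titles)
    return urls

# Hypothetical snapshots of two successive loads
load_1 = {
    "Book Series": {"u/handbook-1", "u/series-1"},
    "Journals": {"u/journal-1"},
    "All Books": {"u/book-1", "u/series-1"},
}
load_2 = {
    "Book Series": {"u/series-1"},
    "Journals": {"u/journal-1", "u/handbook-1"},
    "All Books": {"u/book-1"},
}

# Collection-by-collection comparison reports "missing" titles
# that have only moved or were duplicated elsewhere.
per_collection_missing = {
    name: load_1[name] - load_2[name] for name in load_1
}

# Comparing distinct URLs across the whole platform shows nothing was lost.
overall_missing = distinct_urls(load_1) - distinct_urls(load_2)

print(per_collection_missing)
print(overall_missing)
```

Comparing collection by collection flags two apparent losses here, while the distinct-URL comparison flags none, which mirrors the reason given above for changing the approach.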
Elsevier ScienceDirect Results
Upload date   All Elsevier URLs   Matched URLs   Unmatched URLs
9/17/2014     14178               13049          1129
10/25/2014    14299               13091          1208
11/2/2014     14327               13093          1234
11/16/2014    14390               14321          69
11/30/2014    14437               14321          116
12/10/2014    14506               14324          182
12/14/2014    14538               14449          89
12/22/2014    14557               14449          108
1/11/2015     14535               14489          46
1/19/2015     14568               14496          72
1/25/2015     14578               14419          159
2/8/2015      14615               14569          46
2/15/2015     14626               14607          19
3/2/2015      14859               14766          93
3/8/2015      14876               14766          110
3/16/2015     14888               14765          123
3/22/2015     14935               14869          66
3/29/2015     14954               14883          71
4/5/2015      14978               14927          51
4/12/2015     15043               14946          97
4/18/2015     15052               15010          42
4/29/2015     15097               15044          53
Elsevier ScienceDirect Results
[Chart: unmatched URLs per upload, 9/17/2014 to 4/29/2015. Counts fall from between 1,129 and 1,234 in the first three loads to between 19 and 182 in every later load.]
What we really want to know is how many titles DID NOT get into the knowledgebase.
ScienceDirect Observations
• In early uploads, many book series volumes did not get loaded into the knowledgebase
• Change in definition of ‘ScienceDirect Book Series’ collection largely resolved missing title issue
• In most cases, e-resources that were missing in one load showed up in the subsequent load
• Frequency is generally consistent, with a few minor hiccups
Of all the titles NOT matched throughout the study…
…there were only 20 titles not represented in the KB…
…That’s only 0.1% of all titles in our Elsevier account…
Autoload vs. ‘Traditional’ ERM Techniques
• Comparison between UTL’s ‘subscribed’ ScienceDirect titles in ERM and Elsevier entitlements
• Misalignment between selected packages and actual purchases
– 879 titles we are entitled to were not represented in subscribed content packages
– 247 titles in the subscribed packages were titles we did not have access to
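The two-way comparison behind these figures amounts to two set differences. The sketch below uses invented title identifiers and a hypothetical function name; it shows the shape of the check, not the study's actual data or tooling.

```python
def audit(entitled, activated):
    """Compare provider entitlements with titles activated in the ERM.

    Returns titles the library is entitled to but has not activated,
    and titles activated in packages without an entitlement.
    """
    return {
        "entitled_not_activated": sorted(entitled - activated),
        "activated_not_entitled": sorted(activated - entitled),
    }

# Made-up identifiers for illustration
entitled = {"j-001", "j-002", "j-003", "b-010"}
activated = {"j-001", "j-002", "j-099"}

report = audit(entitled, activated)
print(report)
```

In the study's terms, the first list corresponds to the 879 entitled titles missing from the subscribed packages, and the second to the 247 packaged titles with no access.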
An ERM Promise Fulfilled?
• Time saving for librarians
• Well suited for “cherry-picked” collections where manual selection is necessary (e.g. aggregator platforms)
• Increased accuracy
• Excellent compatibility with PDA programs
Some Remaining Challenges
• Completely reliant on accuracy of content provider metadata
– Any problems need to be addressed by the content provider
– Manual corrections will be overwritten each time data is reloaded
• Length of time between uploads can be long (monthly or more)
• Difficult to spot when things do go wrong and content does not get loaded.
Seamless Updates
• Will there ever be a time when activation on content provider site and knowledgebase is synched daily?
Better Reporting Capabilities
• Increased reporting capabilities
– Alerts/notifications when uploads occur
– Libraries need to know what content could not be loaded
• Feedback loop
– Ability to analyze data and report inconsistencies leads to better product development
Help With Single Journal Subscriptions
• Managing single e-journals is like trying to herd cats
– Consolidation of registration/activation
  • Do I really need to activate a title on the vendor site AND in the ERM system?
– New opportunity for subscription agents?
Concurrent Users
• Ability to determine concurrent user limit
• Particularly important for aggregator packages that have multiple purchasing options
– e.g. ebrary MUPO and SUPO collections
Greater Participation
• This is only the tip of the iceberg
• Libraries need to advocate for autoloaded collections …LOUDLY!
How Do We Get There From Here?
• Standardization
• Technological Sophistication
• Co-operation
• Customer Feedback
• Progressive Licensing Terms
• Data Integrity
A Common Purpose
Above all, perhaps, librarians and publishers [and knowledgebase providers] should sit down at a table of common purpose and join again in what has always been a necessary partnership: to publish and make available the ideas and creative works of authors.
Harold Billings, “Supping with the devil: new library alliances in the information age.” (1993)