From Digitization to Discoverability: Accomplishments and New Challenges

A Case Study of the JDC Archives

Linda G. Levi, Director of JDC Global Archives

Jeffrey Edelstein, Digitization Project Manager

November 2015


The JDC Archives Online

Main site:

http://archives.jdc.org

Collections database:

http://search.archives.jdc.org

Digitization of Text Collections

• Nearly 2.75 million pages digitized to date

• All collections from 1914 through 1954, plus some 1955-1989 collections

• Digitization, by period:

– World War I Era: 100,089 pages

– Interwar Period: 155,973 pages

– World War II Era and Aftermath: 2,048,783 pages

– Israel Collection: 87,809 pages

– More Contemporary Collections: 352,954 pages
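As a quick arithmetic check, the per-period counts above can be summed to confirm the "nearly 2.75 million pages" figure. The sketch below uses only the numbers listed on this slide; the variable names are illustrative.

```python
# Page counts by period, copied from the list above.
pages_by_period = {
    "World War I Era": 100_089,
    "Interwar Period": 155_973,
    "World War II Era and Aftermath": 2_048_783,
    "Israel Collection": 87_809,
    "More Contemporary Collections": 352_954,
}

total_pages = sum(pages_by_period.values())
print(f"Total: {total_pages:,} pages")  # Total: 2,745,608 pages

# Share of the total contributed by each period.
for period, pages in pages_by_period.items():
    print(f"{period}: {pages / total_pages:.1%}")
```

The total, 2,745,608 pages, is consistent with the rounded figure quoted above.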

Projects

• Judaica Europeana: Shared file-level XML for 1914-1918 collection

• Yad Vashem: Sharing complete XML to item level and digital assets for Geneva 1945-1954 collection

• European Holocaust Research Infrastructure (EHRI): Shared descriptions (with finding aid links) of 7 Holocaust-era collections

• CENDARI: Shared file-level XML for 1914-1918 and 1919-1921 collections

• World Digital Library: Provided descriptive metadata (via spreadsheet) for 36 selected images

• Empire State Digital Network: Provided descriptive metadata (via XML output) for selected photos

• Digital Library of the Caribbean: Shared file-level XML for Dominican Republic Settlement Association (DORSA) collection

• Atlit: Shared lists of names of detainees for indexing and entry into Atlit’s database.

• Beit Hatfutsot: Goal is to provide access to Names Index API so that names in our database are returned as search results in their interface

Dissemination of Digitized Text Collections

• World War I Era

• World War II Era and Aftermath

• Israel Collections

Digitization & Dissemination of Other Collections

• Photo Collection: Over 65,000 photos digitized

• Names Index: 500,000 names from lists and index cards in the text collections

• Oral History Collection: AV recordings and transcripts of interviews with 155 JDC staff and lay leaders, 1961-2010

What We Have Learned

1. Digitization is step #1. Discoverability is step #2.

2. Steps to drive traffic to our site

– Quarterly JDC Archives eNewsletter

– Social media: Facebook, Instagram

– Linking to other sites

– Google search engine optimization

– JDC website to drive traffic

3. Curated material is popular

– Online exhibits

– Topic guides

– Photo galleries

– Names Index

Data-Sharing Collaborations

Impetus:

• Completion of initial digitization grant: 1.8 million pages online

• Desire to increase awareness of online availability and use of the material/site traffic (donor mandate)

• Successful pilot project with Judaica Europeana

Types of Collaboration

1. Collection descriptions: Shared descriptive information from our finding aids, with link to full finding aid on the JDC Archives site (EHRI)

2. File-level XML: Shared XML for file records for each collection (Judaica Europeana, CENDARI, Digital Library of the Caribbean)

3. Groups of images: Shared selected images with complete metadata (World Digital Library, Empire State Digital Network/DPLA)

Timeline of Project Phases

Issues

• Technical issues

• Staff time/resources

• Legal matters

• Project management

Technical Issues

• XML output

– Need to map our fields to partner’s schema

– Work with our database provider to modify export

• Vocabulary

– In-house subject terms may not be from a standard authority (e.g., Library of Congress Subject Headings)

– Vocabulary required by partner (e.g., Dewey Decimal Classification (DDC) codes) may be difficult to apply and may not fully describe JDC items (WDL project)

– May need to add broader terms for general audience (WDL; ESDN)
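The field-mapping step noted under "XML output" can be sketched roughly as follows. This is a minimal illustration only: the element names in `FIELD_MAP`, the `<File>` and `<record>` wrappers, and the sample record are all hypothetical, and do not reflect the JDC database export or any partner's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from in-house export field names to a partner's
# element names. None of these names come from the JDC database or from
# any real partner schema.
FIELD_MAP = {
    "FileTitle": "title",
    "DateRange": "date",
    "ScopeNote": "description",
    "FindingAidURL": "identifier",
}

def map_record(source: ET.Element) -> ET.Element:
    """Build a partner-style record from one in-house file record."""
    target = ET.Element("record")
    for child in source:
        partner_name = FIELD_MAP.get(child.tag)
        if partner_name is None:
            continue  # field has no counterpart in the partner schema
        text = (child.text or "").strip()
        if not text:
            continue  # skip empty fields rather than emit empty elements
        ET.SubElement(target, partner_name).text = text
    return target

# Made-up sample record, for illustration only.
sample = ET.fromstring(
    "<File><FileTitle>Sample file title</FileTitle>"
    "<DateRange>1914-1918</DateRange>"
    "<InternalNote>not shared</InternalNote></File>"
)
mapped = map_record(sample)
print(ET.tostring(mapped, encoding="unicode"))
```

In practice the mapping table is where most of the effort goes: fields without a counterpart are dropped here, but a real project would have to decide case by case whether to drop, merge, or re-describe them.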

• Display

– How will the records look? Image-based projects will display a thumbnail, but document-based projects may not accommodate a logo or icon at file level

– Even after your data has been published, there may be follow-up questions and issues

• Usage

– Will the portal/partner be able to provide statistics on use of your material? If so, how frequent will the reporting be?

Staff Time/Resources

• Research to identify portals/projects, determine their suitability, and establish initial contact

• Image-sharing projects require individual selection of items

• Descriptions/captions need to be rewritten or expanded to reflect project context/audience

• As noted, descriptive metadata (subject terms) may need to be added/revised

• Submission format: project-supplied spreadsheets are time-consuming to complete

Legal Matters

• Data-sharing agreements require review/approval by legal staff

– Special collaborations may require drafting individual agreement

• Proposed modifications to standard agreements generally accepted without difficult negotiations, but response time may be slow

– Some projects require formal application to participate; review and approval performed only when partner’s panel meets

• Copyright concerns

– Where and how will credits/acknowledgments appear?

– Will we lose control of our assets? How much should we share?

– Photographs: we have so far limited sharing to public-domain items

Project Management

• Response time at each step can be slow

• Complexity: some projects involve many data providers; some projects are developing new technical tools

• Some projects are better than others about issuing general updates to all participants

Findings

• Except where providing general descriptions only (e.g., EHRI), data-sharing projects will take longer than expected

• Preparing your output takes more than just “pushing a button”

• Although it is too soon to have solid evidence that traffic is coming to us from these sites, we believe that there is value to participating in data-sharing projects