Enabling access to the David Scott Mitchell digital collection for digital humanities research 21 Oct 2019 Euwe Ermita State Library of New South Wales


Page 1: Enabling access to the David Scott Mitchell digital

Enabling access to the David Scott Mitchell digital collection for digital humanities research

21 Oct 2019

Euwe Ermita

State Library of New South Wales

Page 2: Enabling access to the David Scott Mitchell digital

Project overview

The David Scott Mitchell (DSM) collection is the Library’s most renowned collection. Formats include books, maps, photos, coins and sound recordings.

The objective is to make this collection more accessible to researchers by understanding their needs and their use of eResearch tools and platforms.

Pilot the transformation and delivery of DSM digital datasets (metadata, full text and digitised pages of books only) onto selected researcher platforms.

Page 3: Enabling access to the David Scott Mitchell digital

Key issues

1. Three months to deliver outcomes

2. Quickly understanding the context of cultural collections within eResearch

3. Restrictions on access, including Indigenous Cultural and Intellectual Property (ICIP)

4. Data is currently Findable, but lacks easy Accessibility by machines

5. Data is currently Interoperable and is well structured

6. Datasets are not currently licensed, making Reusability difficult

Page 4: Enabling access to the David Scott Mitchell digital

Approaches

1. Establishment of multi-disciplinary project team – completed

2. Workshops with partners and stakeholders – completed

3. Pilot datasets on select research tools/platforms – completed

4. Post-pilot review – underway

Page 5: Enabling access to the David Scott Mitchell digital

1. Jupyter Notebooks for interacting with Library APIs

Page 6: Enabling access to the David Scott Mitchell digital

Retrieving book metadata via ALMA (catalogue) API

1. Access ALMA records with an authorised API key

2. Load the metadata for multiple books in a dataframe
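The two steps above can be sketched in Python as follows. This is a minimal sketch, not the Library's notebook: it assumes the Ex Libris Alma "Retrieve Bibs" REST endpoint, the Asia Pacific gateway host, and a JSON response whose records sit under a `bib` key with `mms_id`, `title`, `author` and `date_of_publication` fields. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Asia Pacific Alma API gateway; other regions use a different host.
ALMA_BIBS_URL = "https://api-ap.hosted.exlibrisgroup.com/almaws/v1/bibs"


def build_bibs_url(mms_ids, api_key):
    """Build a request URL for several catalogue records at once."""
    query = urllib.parse.urlencode({
        "mms_id": ",".join(mms_ids),
        "format": "json",
        "apikey": api_key,
    })
    return f"{ALMA_BIBS_URL}?{query}"


def fetch_bibs(mms_ids, api_key):
    """Call the API (needs network access and an authorised key)."""
    with urllib.request.urlopen(build_bibs_url(mms_ids, api_key)) as resp:
        return json.load(resp)


def bibs_to_rows(payload):
    """Flatten the response into one dict per book."""
    # Assumed field names; missing fields become None.
    fields = ("mms_id", "title", "author", "date_of_publication")
    return [{f: bib.get(f) for f in fields} for bib in payload.get("bib", [])]
```

With pandas installed, `pandas.DataFrame(bibs_to_rows(payload))` turns the rows into a dataframe with one row per book.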

Page 7: Enabling access to the David Scott Mitchell digital

Retrieving book metadata through ALMA API

Retrieve specific book metadata and cover page for context and research

Page 8: Enabling access to the David Scott Mitchell digital

Retrieving Book Data Through Rosetta (DAM) API

Rosetta API login credentials are used with an in-house Python API to access METS metadata
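Once a METS document has been retrieved, its file listing can be read with the Python standard library. This is a minimal sketch, assuming standard METS `fileSec` markup. The sample XML, file IDs and paths are invented for illustration, and the in-house Python API and the Rosetta login step are not shown because the slides do not describe them.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical sample; a real Rosetta METS file carries far more detail.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                 xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="VIEW">
      <mets:file ID="FL001" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="page_0001.jpg"/>
      </mets:file>
      <mets:file ID="FL002" MIMETYPE="text/xml">
        <mets:FLocat LOCTYPE="URL" xlink:href="page_0001.xml"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def list_files(mets_xml):
    """Return (file ID, MIME type, location) for every file in the METS."""
    root = ET.fromstring(mets_xml)
    href = f"{{{NS['xlink']}}}href"  # namespaced attribute name
    files = []
    for f in root.findall(".//mets:file", NS):
        locat = f.find("mets:FLocat", NS)
        files.append((
            f.get("ID"),
            f.get("MIMETYPE"),
            locat.get(href) if locat is not None else None,
        ))
    return files


for file_id, mime_type, location in list_files(SAMPLE_METS):
    print(file_id, mime_type, location)
```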

Page 9: Enabling access to the David Scott Mitchell digital

2. Jupyter Notebook integration with Voyant-Tools

Page 10: Enabling access to the David Scott Mitchell digital

Visualising and analysing book contents with Voyant-Tools
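One way to connect a notebook to Voyant-Tools is to build a URL that asks Voyant to load a text, then embed that URL in the notebook. This is a minimal sketch, assuming the public `voyant-tools.org` server, its `input` and `view` query parameters, and a hypothetical full-text URL. The slides do not say how the integration was actually built.

```python
import urllib.parse

VOYANT_URL = "https://voyant-tools.org/"  # assumption: public Voyant server


def voyant_url(text_url, view=None):
    """Build a Voyant link that loads the document found at text_url."""
    params = {"input": text_url}
    if view:
        params["view"] = view  # e.g. "Cirrus" for a word cloud
    return VOYANT_URL + "?" + urllib.parse.urlencode(params)


# Hypothetical location of one book's full text.
url = voyant_url("https://example.org/dsm/book_0001.txt", view="Cirrus")
print(url)
```

In Jupyter, `IPython.display.IFrame(url, width=900, height=600)` would display the resulting page inline.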

Page 11: Enabling access to the David Scott Mitchell digital

Named Entity Recognition Within Book Text

Whether in Voyant-Tools or in Jupyter, named entity recognition provides more insightful visualisations of the corpus.
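To show the idea of finding and counting entities in book text, here is a deliberately simple sketch. It uses dictionary (gazetteer) lookup rather than a trained NER model, which a real notebook would more likely use. The gazetteer entries, labels and sample text are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical gazetteer: surface form -> entity label.
GAZETTEER = {
    "Sydney": "PLACE",
    "New South Wales": "PLACE",
    "David Scott Mitchell": "PERSON",
}


def find_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, label) pairs in order of appearance."""
    # Longest names first so a long name wins over a shorter overlapping one.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
    return [(m.group(0), gazetteer[m.group(0)]) for m in pattern.finditer(text)]


sample = ("David Scott Mitchell lived in Sydney. "
          "His books stayed in Sydney, New South Wales.")
counts = Counter(find_entities(sample))
for (name, label), n in counts.most_common():
    print(f"{name} [{label}]: {n}")
```

The resulting counts could feed a frequency chart or a filtered word cloud.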

Page 12: Enabling access to the David Scott Mitchell digital

3. UTS Collaboration: RO-Crates for Researchers

Page 13: Enabling access to the David Scott Mitchell digital

RO-Crates for researchers

ALTOs and scanned images

● Research Object Crate (RO-Crate) is a community effort to establish a lightweight approach to packaging research data with their metadata.

● Based on schema.org annotations in JSON-LD.
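As an illustration of the packaging idea, the sketch below writes the JSON-LD metadata for a crate holding one ALTO file and one scanned page. It assumes the RO-Crate 1.1 layout (`ro-crate-metadata.json` and the 1.1 context URL), which postdates this talk. The book name and file names are hypothetical, and a conformant crate needs further properties, such as a licence and a publication date, that are omitted here.

```python
import json


def build_ro_crate(name, description, files):
    """Return RO-Crate metadata describing a dataset and its files."""
    graph = [
        {   # Descriptor entity that identifies this file as RO-Crate metadata.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # Root dataset: the book being packaged.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": path} for path, _ in files],
        },
    ]
    # One File entity per packaged file.
    for path, encoding_format in files:
        graph.append({"@id": path, "@type": "File",
                      "encodingFormat": encoding_format})
    return {"@context": "https://w3id.org/ro/crate/1.1/context",
            "@graph": graph}


crate = build_ro_crate(
    "Example digitised book",
    "ALTO full text and scanned page images for one book.",
    [("page_0001.xml", "application/xml"), ("page_0001.jpg", "image/jpeg")],
)
metadata_json = json.dumps(crate, indent=2)
print(metadata_json)
```

Saving `metadata_json` as `ro-crate-metadata.json` beside the listed files would give a directory that RO-Crate-aware tools can read.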

Page 14: Enabling access to the David Scott Mitchell digital

Lessons

1. Timeframe and scope – more time is needed to research the information and data needs of researchers

2. Organisational change management – appreciation of the benefits and value of undertaking and investing in similar initiatives is yet to be determined

3. Technology and capabilities – developers with capabilities across data, information management and ETL processes are still emerging.

Page 15: Enabling access to the David Scott Mitchell digital

Acknowledgements

• UTS eResearch Team

• Peter S., Moises S., Michael L.

• Macquarie Uni – Steve Cassidy

• ARDC – Rowan Brownlee

• State Library of New South Wales

• Salek A., Peter B., Robin P., Richard N., Maggie P., Brendan S.