Mirror Mirror on the wall does your repository reflect it all?

Peter West and Timothy Miles-Board

EPrints Services

University of Southampton

Southampton, UK

pjw@ecs.soton.ac.uk tmb@ecs.soton.ac.uk

Introduction

How can we help repository administrators validate the completeness and accuracy of their repository holdings?

Enquiry, Development and Support.

Understand the problems faced by repository owners.

Community Engagement

Concerns:

1. Does our repository content accurately reflect the published output of our institution?

2. Is our bibliographic metadata accurate and complete?

3. Are our publications correctly and unambiguously associated with the right authors, editors, contributors?


Case Studies

We have been involved in investigating solutions to specific instances of the three concerns.

Does content accurately reflect published output? → Publication Matching

Is bibliographic metadata accurate? → Authority Lists

Are publications associated with the right authors? → Author Disambiguation

Case Study 1 – Publication Matching

Does our repository content accurately reflect the published output of our institution?

The repository is used to drive an internal approval workflow.

Approval is required before submission to publishers.

Very early deposit.

Problems:

Are published items approved?

Are approved items published?

Publication Matching

Collate lists of known published work.

Import the list into the repository and run a publication match process.

Generate a report for the administrator.

Provide tools to act on the data generated.
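
The match step above can be sketched as follows. This is a minimal illustration rather than the EPrints implementation: it assumes both lists are reduced to titles, and the function names and report keys (`normalise`, `match_publications`, `published_not_approved`, `approved_not_published`) are hypothetical. The two "unmatched" lists correspond to the two problems raised earlier: published items that were never approved, and approved items that were never published.

```python
import re


def normalise(title):
    """Lower-case a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_publications(published, deposited):
    """Compare known published titles with titles held in the repository.

    Returns a report dict with matched titles and the two kinds of mismatch.
    """
    published_by_key = {normalise(t): t for t in published}
    deposited_by_key = {normalise(t): t for t in deposited}

    both = published_by_key.keys() & deposited_by_key.keys()
    only_published = published_by_key.keys() - deposited_by_key.keys()
    only_deposited = deposited_by_key.keys() - published_by_key.keys()

    return {
        "matched": sorted(published_by_key[k] for k in both),
        # Published, but no approved record found in the repository.
        "published_not_approved": sorted(published_by_key[k] for k in only_published),
        # Approved in the repository, but not on the list of published work.
        "approved_not_published": sorted(deposited_by_key[k] for k in only_deposited),
    }
```

An administrator-facing report would then list each of the three groups, with the two mismatch groups being the ones that need action.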


Publication Matching

Future work:

Generalise the framework to support the requirements of another organisation.

Merge with the concepts behind the metadata update script.

End Goal:

Validation tool framework that will allow for matching a dataset using a comparison function.

Plugin support for custom lists (held in a reference manager database), existing services (using DOIs or PubMed IDs) and emerging sources (ORCiD).

Integrate the reporting with IRStats2.
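
One way to picture "matching a dataset using a comparison function" is a validator that takes the comparison as a parameter, so that each source can supply its own. The sketch below is an assumption about what such a framework might look like, not a description of the planned tool: records are plain dicts, and `same_doi`, `same_title` and `validate` are hypothetical names. The DOIs in the usage note are made-up examples.

```python
def same_doi(a, b):
    """Match when both records carry the same DOI, ignoring case."""
    doi_a, doi_b = a.get("doi"), b.get("doi")
    return bool(doi_a) and bool(doi_b) and doi_a.lower() == doi_b.lower()


def same_title(a, b):
    """Match on a non-empty title, ignoring case and surrounding whitespace."""
    title_a = a.get("title", "").strip().lower()
    title_b = b.get("title", "").strip().lower()
    return title_a != "" and title_a == title_b


def validate(external, repository, compare):
    """Match external records against repository records using `compare`.

    Returns (matched, missing): matched is a list of (external, repository)
    pairs, missing is the external records with no repository counterpart.
    """
    matched, missing = [], []
    for ext in external:
        hit = next((rec for rec in repository if compare(ext, rec)), None)
        if hit is None:
            missing.append(ext)
        else:
            matched.append((ext, hit))
    return matched, missing
```

A DOI-based service would call `validate(records, repository, same_doi)`, while a custom list without identifiers could fall back to `same_title`. Supporting a new source then means supplying a list of records and a comparison function.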


Case Study 2 – Authority Lists

Is our bibliographic metadata accurate and complete?

Accuracy of journal and publisher information was affecting the efficiency of both the repository's editorial team and its submitters.

Direct impact on funding allocation.

The data collected by the editorial team could be utilised more effectively if it were integrated into the submission process.

Authority Lists

A database of journal information (JDB) was developed.

Retrieve journal and publisher data for an item via an interactive dialog.

Users can search other external databases.

In the worst case the user can manually enter the data.
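
The lookup order described above (JDB first, then external databases, then manual entry) amounts to a fallback chain. The following is a rough sketch under stated assumptions: the two dictionaries are in-memory stand-ins for the JDB and an external database, the journal names and publishers are invented, and `lookup_journal` is a hypothetical name.

```python
def lookup_journal(title, sources):
    """Try each source in order and return (source_name, record) for the first hit.

    `sources` is an ordered list of (name, lookup_function) pairs; each lookup
    function returns a record dict or None. If nothing is found the result is
    ("manual", None), signalling that the user must enter the data by hand.
    """
    for name, lookup in sources:
        record = lookup(title)
        if record is not None:
            return name, record
    return "manual", None


# Hypothetical stand-ins for the JDB and one external database.
jdb = {"journal of examples": {"title": "Journal of Examples", "publisher": "Example Press"}}
external = {"annals of samples": {"title": "Annals of Samples", "publisher": "Sample House"}}

sources = [
    ("jdb", lambda t: jdb.get(t.strip().lower())),
    ("external", lambda t: external.get(t.strip().lower())),
]
```

Keeping the sources in an ordered list means a further external database can be added without changing the lookup logic.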


Issues

Data Integrity: duplicate entries, broken links.

Search Performance: reduce similar/duplicate entries, "Did you mean?" suggestions, order results based on popularity.

Future work: multi-user support.
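
The search issues listed above can be illustrated with a small sketch. It is not how the JDB works internally; it assumes journal names arrive as free-text strings, merges entries that differ only in case or spacing, counts how often each was selected, and uses Python's standard `difflib.get_close_matches` for the "Did you mean?" fallback. The names `build_index` and `search` are hypothetical.

```python
import difflib
from collections import Counter


def normalise(name):
    """Collapse case and whitespace so near-identical entries share one key."""
    return " ".join(name.lower().split())


def build_index(entries):
    """Merge duplicate entries and count how often each one was selected.

    Returns (names, popularity): the display name and selection count per key.
    """
    names, popularity = {}, Counter()
    for entry in entries:
        key = normalise(entry)
        names.setdefault(key, entry.strip())  # keep the first spelling seen
        popularity[key] += 1
    return names, popularity


def search(query, names, popularity, limit=5):
    """Return matching names, most popular first, with a close-spelling fallback."""
    q = normalise(query)
    hits = [key for key in names if q in key]
    if not hits:
        # "Did you mean?": suggest the closest known keys instead.
        hits = difflib.get_close_matches(q, list(names), n=limit, cutoff=0.6)
    hits.sort(key=lambda key: (-popularity[key], key))
    return [names[key] for key in hits[:limit]]
```

Ordering by selection count is only one possible popularity measure; the cutoff of 0.6 is the `difflib` default and would need tuning against real journal titles.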


Authority Lists

Diagram: a shared Journal Database, comprising Publisher Data, Core Journal Data and per-institution data (Uni A Data, Uni B Data), serving clients at Uni A, Uni B and Uni C.
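
One reading of the layout above is that each client sees the shared core data overlaid with its own institution's data. The sketch below illustrates that reading only; the record structure, the journal identifier `J001`, and the `local_rating` field are all hypothetical, and the real multi-user design may differ.

```python
from collections import ChainMap

# Shared layer, maintained centrally (illustrative values only).
core_journal_data = {
    "J001": {"title": "Journal of Examples", "publisher": "Example Press"},
}

# Institution-specific layers hold only local additions or overrides.
uni_a_data = {"J001": {"local_rating": "A"}}
uni_b_data = {}


def journal_view(journal_id, institution_data):
    """Return the record a client sees: core fields overlaid with local fields."""
    local = institution_data.get(journal_id, {})
    core = core_journal_data.get(journal_id, {})
    # ChainMap looks in `local` first, so local values win over core values.
    return dict(ChainMap(local, core))
```

Because local data lives in its own layer, one institution's edits never alter the core record that the other institutions share.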


Case Study 3 – Author Disambiguation

Are our publications correctly and unambiguously associated with the right authors, editors, contributors?

Common problem.

Leverage “single sign-on data”.

Replace free-text input fields.

Possibility of utilising contributor data in other ways: subjects, affiliations.


Temporal nature of roles.
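
To make the point about roles concrete: a person's affiliation depends on the date in question, so a publication should be linked to the role held when it was produced rather than to the current one. The sketch below assumes roles are stored with start and end dates; the `Role` class, its fields and `role_on` are hypothetical and not taken from any staff directory schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Role:
    staff_id: str
    department: str
    start: date
    end: Optional[date] = None  # None means the role is still current


def role_on(roles, staff_id, when):
    """Return the role a person held on a given date, or None if there was none."""
    for role in roles:
        if role.staff_id != staff_id:
            continue
        if role.start <= when and (role.end is None or when <= role.end):
            return role
    return None
```

Looking up the role by publication date, rather than by the date of deposit, keeps older items attached to the department the author belonged to at the time.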

Staff Identifiers vs ORCiD.

Road Map

Our goal is to produce a set of tools and procedures for the repository community.

Q3: Publication Matching, Author Disambiguation

Q4: Authority Lists, Consolidation, Reflection, Generalisation
