How dinosaurs broke our system: challenges in building national researcher identifier services

Presentation on the Names Project given at the Open Repositories conference, 11 July 2012. Video of this talk on YouTube.

Page 1: How dinosaurs broke our system: challenges in building national researcher identifier services

How dinosaurs broke our system

Challenges in building national researcher identifier services

JISC Conference, 2010

Amanda Hill

Names Project

Page 2: How dinosaurs broke our system: challenges in building national researcher identifier services

Hoping that…

…Simeon has explained all about the name authority problem

I’d like to talk about some of the work that we’ve done as part of the Names Project recently…

…and how that fits into today’s researcher identification landscape

Page 3: How dinosaurs broke our system: challenges in building national researcher identifier services

Gross generalisation about past approaches to author identifiers

Libraries

Book-level data

Labour intensive: disambiguation first

Authors not involved

Open

Publishers

Article-level data

Automatically generated: disambiguation later

Authors can edit

Proprietary

Page 4: How dinosaurs broke our system: challenges in building national researcher identifier services

Current international activity

ISNI

Library-instigated

Disambiguation first

Authors not involved

Broad scope

ORCID

Publisher-instigated

Disambiguation later

Authors can submit/edit

Current researchers

Page 5: How dinosaurs broke our system: challenges in building national researcher identifier services

Signs of convergence?

Knowledge Exchange meeting on Digital Author Identifiers in March 2012 encouraged alignment of ISNI and ORCID approaches

ISNI has reserved a block of identifiers for use by ORCID

Page 6: How dinosaurs broke our system: challenges in building national researcher identifier services

Sources of information

Both ORCID and ISNI will use existing pools of information to populate their systems

ISNI: “Leveraging high confidence data from different domains”

ORCID: “ORCID will link to other name identifier systems”

Page 7: How dinosaurs broke our system: challenges in building national researcher identifier services

National author ID systems

2011: JISC-funded survey and report on national author/researcher identifier systems around the world

Report published November 2011

http://ie-repository.jisc.ac.uk/567/

Page 8: How dinosaurs broke our system: challenges in building national researcher identifier services

Maturity of systems (late 2011)

System | In development since | Number of identities
Lattes (Brazil) | 1999 | 1,600,000
Frida/Cristin (Norway) | 2003 | 31,000 researchers at 160 institutions
VIVO | 2003 | 24,400 faculty with profiles; 150,000 total IDs including undisambiguated co-authors
Digital Author Identifier (Netherlands) | 2005 (1980s for National Thesaurus of Author Names) | 40,000 in the NTA; 15,000 researchers with Digital Author IDs
Names Project (UK) | 2007 | 46,000
New Zealand Electronic Text Centre | 2007 | 2,000
Trove People and Organisations/NLA Party Infrastructure (Australia) | 2007 | 900,000 people and organisations
AuthorClaim | 2008 | 200
Researcher Name Resolver (Japan) | 2008 | 190,000

Page 9: How dinosaurs broke our system: challenges in building national researcher identifier services

Populating identifier systems

System | Records created by cataloguers | Records imported from other systems | Records generated by data subjects

AuthorClaim
Digital Author Identifier (Netherlands)
Frida/Cristin (Norway)
Lattes (Brazil)
Names Project (UK)
New Zealand Electronic Text Centre
Researcher Name Resolver (Japan)
Trove People and Organisations/NLA Party Infrastructure (Australia)
VIVO

Page 10: How dinosaurs broke our system: challenges in building national researcher identifier services

Good sources of data for some nations

National system | Existing unique identifiers
Japan | Researcher identifiers from national researcher databases
Netherlands | Number from National Thesaurus of Author Names is converted into Digital Author Identifier
Norway | Human resources data: social security numbers

Other national systems assign new identifiers as new identities are established.

Page 11: How dinosaurs broke our system: challenges in building national researcher identifier services

Features of mature national identifier systems

With more mature systems:

A national organisation generally has oversight, e.g. in Brazil, Norway, Netherlands

Integration with research funders, reporting agencies and institutional repositories

Individual institutions also have defined roles relating to managing information about their own staff

Page 12: How dinosaurs broke our system: challenges in building national researcher identifier services

SITUATION IN UK

Page 13: How dinosaurs broke our system: challenges in building national researcher identifier services

Work to investigate unique IDs for UK researchers

Identified in 2006 as part of the call for proposals for the JISC-funded Repositories and Preservation Programme

Mimas and the British Library proposed a two-year project to:

Investigate requirements for a UK name authority service

Build a pilot system to demonstrate potential

Page 14: How dinosaurs broke our system: challenges in building national researcher identifier services

The Names Project

‘From the Annals of the Onomastic Society’

Ian Watson (1990)

The Chang Project

Page 15: How dinosaurs broke our system: challenges in building national researcher identifier services

Names (not an acronym…)

Name Authorities Make Everything Simpler

Names: Ambiguous, Meaningful (or Meaningless?), Essential, Symbolic

…nearly everyone has a name-related story

Page 16: How dinosaurs broke our system: challenges in building national researcher identifier services

Rhyming couples

Page 17: How dinosaurs broke our system: challenges in building national researcher identifier services

Original plan

Use data from British Library’s Zetoc service to create author IDs:

Journal article information from 1993 onwards

Last names, initials, paper titles, subject classifications

But…

International in scope

Lack of information on affiliations and first names to help with making matches

Huge dataset -> processing issues
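
The matching problem described on this slide can be illustrated with a minimal sketch. It assumes hypothetical records that hold only a last name, initials and a paper title, similar in shape to the article-level data listed above. The sample values and the `name_key` and `group_by_name` helpers are invented for illustration and are not taken from the Names system.

```python
from collections import defaultdict

# Hypothetical article-level records: (last name, initials, paper title).
# These values are invented for illustration only.
records = [
    ("Smith", "J.", "Coastal erosion models"),
    ("Smith", "J", "Medieval trade routes"),
    ("smith", "J.A.", "Coastal erosion in Norfolk"),
    ("Jones", "A.", "Protein folding"),
]


def name_key(last_name, initials):
    """Normalise a name to 'lastname f', keeping only the first initial."""
    letters = [c for c in initials if c.isalpha()]
    first = letters[0].lower() if letters else ""
    return f"{last_name.strip().lower()} {first}".strip()


def group_by_name(recs):
    """Group paper titles under each normalised name key."""
    groups = defaultdict(list)
    for last_name, initials, title in recs:
        groups[name_key(last_name, initials)].append(title)
    return dict(groups)


groups = group_by_name(records)
for key, titles in groups.items():
    print(key, len(titles))
```

In this sketch all three Smith records fall under one key, although they could belong to two or three different people. With no affiliation or first name in the data, nothing in the record separates them.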

Page 18: How dinosaurs broke our system: challenges in building national researcher identifier services

Revised plan

Used 2008 Research Assessment Exercise data (as cleaned up by JISC Merit project) to pre-populate the Names system

Identify unique individuals and assign identifiers

Data quality good, included institutional information: high accuracy, despite only having initials, not full first names

Except for…

Page 19: How dinosaurs broke our system: challenges in building national researcher identifier services
Page 20: How dinosaurs broke our system: challenges in building national researcher identifier services

Page 21: How dinosaurs broke our system: challenges in building national researcher identifier services

Building on Merit…

Merit data covers around 20% of active UK researchers

Working to enhance records and create new ones with information from other sources:

Institutional repositories

British Library data sets (Zetoc)

Direct input from researchers

Page 22: How dinosaurs broke our system: challenges in building national researcher identifier services

Submission form

Page 23: How dinosaurs broke our system: challenges in building national researcher identifier services

http://separatedbyacommonlanguage.blogspot.com/2009/08/initials-and-names.html

Page 24: How dinosaurs broke our system: challenges in building national researcher identifier services

Quality matters

Automatic matching can only achieve so much

Dependent on data source

British Library team perform manual check of results of matching new data sources

Allows for separation/merging of records

Plan to allow people to update their own information
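
The split between automatic matching and manual checking could be sketched as follows. This is an illustration under stated assumptions, not the project's matching algorithm: the `Identity` class, the `match_record` function, the three outcomes ("matched", "review", "new") and the rule that a name match at a different institution goes to a human reviewer are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """A hypothetical identity record; not the real Names data model."""
    identifier: int
    last_name: str
    initials: str
    institutions: set = field(default_factory=set)


def same_name(identity, last_name, initials):
    """Compare last name and initials, ignoring case and full stops."""
    return (identity.last_name.lower() == last_name.lower()
            and identity.initials.replace(".", "").upper()
            == initials.replace(".", "").upper())


def match_record(identities, last_name, initials, institution):
    """Return (status, identity), where status is 'matched', 'review' or 'new'."""
    candidates = [i for i in identities if same_name(i, last_name, initials)]
    for cand in candidates:
        if institution in cand.institutions:
            # Same name and same institution: accept automatically.
            return "matched", cand
    if candidates:
        # Same name, different institution: leave for a manual check.
        return "review", candidates[0]
    new = Identity(len(identities) + 1, last_name, initials, {institution})
    identities.append(new)
    return "new", new


identities = []
results = [
    match_record(identities, "Smith", "J.", "University A"),
    match_record(identities, "Smith", "J", "University A"),
    match_record(identities, "Smith", "J.", "University B"),
]
print([status for status, _ in results])
```

Records flagged for review are where a human decision to merge or separate would be made. The automatic step only settles the cases where the available data agrees.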

Page 25: How dinosaurs broke our system: challenges in building national researcher identifier services

Ultimate aim

High-quality set of unique identifiers for UK researchers and research institutions

Available to other systems (national and international)

e.g. Names records exported to ISNI in 2011

Possible additional services:

Disambiguation of existing data sets

Identification of external researchers

Page 26: How dinosaurs broke our system: challenges in building national researcher identifier services

Access to Names

API allows for flexible searching of Names data

EPrints plugin released in 2011:

allows repository users to choose from a list of Names identities

…and to create a Names record if none exists

Page 27: How dinosaurs broke our system: challenges in building national researcher identifier services

Page 28: How dinosaurs broke our system: challenges in building national researcher identifier services

Page 29: How dinosaurs broke our system: challenges in building national researcher identifier services

Next steps…

JISC-convened Researcher ID group – final meeting in September > recommendations

Options Appraisal Report for UK national researcher identifier service > December

Improving data and adding new records

Page 30: How dinosaurs broke our system: challenges in building national researcher identifier services

Summing up

Names is a hybrid of library/publisher approaches:

Automated matching/disambiguation

Human quality checks

Data immediately available for re-use in other systems

Researchers can supply information

Page 31: How dinosaurs broke our system: challenges in building national researcher identifier services

An evolving area

Main challenges are cultural and political rather than technical

National author/researcher ID services can be important parts of research infrastructure

Getting agreement and co-ordination at national level is vital

Page 32: How dinosaurs broke our system: challenges in building national researcher identifier services

Project updates

Names: http://names.mimas.ac.uk

Blog: http://namesproject.wordpress.com

Twitter: @NamesProject
