
www.luomus.fi

Introducing ICEDIG

Innovation and consolidation for large-scale digitisation of natural heritage

Hannu Saarenmaa, Kari Lahti & Leif Schulman

ICEDIG Opening Conference, 6 March 2018

Helsinki, Finland

12.3.2018, Professor Leif Schulman, Director


ICEDIG is for

• Pan-European Research Infrastructure, in the making.
• DiSSCo will be developed through a series of projects.
• ICEDIG is a Design Study for DiSSCo, in the area of digitisation of scientific collections.


ICEDIG – paving the way for large-scale digitisation of natural heritage


Just a group?

ANA CASINO, CETAF

ICEDIG challenge

• Premise: Only around 10% of natural sciences collections have been digitally catalogued and 1-2% imaged.
• Scale and complexity of digitising and providing access → technological, socio-cultural, and organisational capacity enhancement required.
• Challenges of efficiently digitising and seamlessly accessing the collections → new research and technological innovation required.

ICEDIG will design all necessary technical, financial, policy and governance aspects for developing and operating DiSSCo (prototypes, blueprints, novel workflows, new industry partnerships, and citizen involvement models).

Tray of mayflies (Ephemeroptera) with bounding boxes from the Inselect programme at NHM

LEIF SCHULMAN, FI


Vision

• 1.5 billion specimens in European scientific collections
• DiSSCo needs to digitise a significant part of them (such as 50%), in a foreseeable future (such as 20 years)...
• This requires digitising 20-40 million specimens each year, which is 10 times the current rate.
• Doing that requires innovating new ways of digitising, and consolidating and scaling up what already works.
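
The yearly rate on this slide can be checked with a few lines of arithmetic. The sketch below uses the slide's own example figures (1.5 billion specimens, 50%, 20 years); the function name `required_annual_rate` is only illustrative.

```python
def required_annual_rate(total_specimens: float, share: float, years: float) -> float:
    """Specimens per year needed to digitise `share` of `total_specimens` in `years`."""
    return total_specimens * share / years


TOTAL = 1.5e9  # specimens in European scientific collections (from the slide)

# The slide's example targets: 50% of the specimens within 20 years.
rate = required_annual_rate(TOTAL, share=0.5, years=20)
print(f"{rate / 1e6:.1f} million specimens per year")  # prints "37.5 million specimens per year"
```

With these example targets the result falls in the upper part of the 20-40 million range quoted above; a smaller share or a longer time span moves it towards the lower end.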


Innovation and Consolidation

• The best available mass-digitisation systems for herbarium collections can process 5,000 specimens/day. That is 1 million/year.
‒ Herbaria contain about ¼ of all specimens. 2D objects.
→ Consolidation and scaling up the existing technology is the approach.
• The best available mass-digitisation systems for pinned insects can process 500 specimens/day...
‒ Insect collections contain more than ½ of all specimens.
‒ Most insects are 3D objects.
→ Innovation of new digitisation technologies is required.
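
To see why the two collection types call for different approaches, the throughput figures can be turned into a rough count of digitisation lines. This is a back-of-the-envelope sketch, not a project estimate. It assumes 200 working days per year (implied by 5,000/day giving 1 million/year), reuses the Vision slide's example targets (50% in 20 years), and treats "more than ½" as exactly ½, which gives a lower bound. All function and constant names are hypothetical.

```python
import math

WORKING_DAYS = 200   # assumption: implied by 5,000/day -> 1 million/year
TOTAL = 1.5e9        # specimens in European collections (Vision slide)
TARGET_SHARE = 0.5   # example target from the Vision slide
YEARS = 20           # example time span from the Vision slide


def annual_capacity(per_day: int, working_days: int = WORKING_DAYS) -> int:
    """Specimens one digitisation line can process per year."""
    return per_day * working_days


def lines_needed(collection_share: float, per_day: int) -> int:
    """Lines needed to digitise the target share of one collection type."""
    per_year_target = TOTAL * collection_share * TARGET_SHARE / YEARS
    return math.ceil(per_year_target / annual_capacity(per_day))


herbarium_lines = lines_needed(0.25, 5_000)  # herbaria: about 1/4 of all specimens
insect_lines = lines_needed(0.5, 500)        # insects: more than 1/2 (lower bound)

print(herbarium_lines)  # prints 10
print(insect_lines)     # prints 188
```

Under these assumptions about ten herbarium lines would be enough, which fits the consolidation approach. Pinned insects would need close to two hundred lines at today's speed, which is why faster technology is needed rather than more copies of the existing one.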


• Call: H2020-INFRADEV-1-2017 (Development and long-term sustainability of new pan-European research infrastructures)
• Submission: 29 March 2017, as a Research and Innovation Action (RIA)
• Approval: 14.5/15 p., 1st of 39 proposals; GA Nov 2017 (777483)
• Term: Jan 1st 2018 to March 31st 2020 (27 months)
• Budget: 2.99 M€
• Objective: To promote new research and technological innovation required to solve the challenges of digitising and seamlessly accessing over one billion objects, which will form a data pool of petabyte scale. Design Study for DiSSCo.
• Coordinator: Finnish Museum of Natural History LUOMUS
• Partners: Coordinator + 11 partners from 6 other European countries, including CETAF and industry


ICEDIG – Innovation and Consolidation for largE scale DIGitisation of natural heritage

ANA CASINO, CETAF

ICEDIG work packages and their linkages

• Technology stream: Innovations to digitise a significant part of the 1.5 billion objects in a foreseeable future, at acceptable cost, and to manage petabyte-size data.
• Consolidation stream: Develop a shared governance model to support all aspects of service unification (implementation of open access principles, incentive schemes, planning and prioritisation, capacity development, etc.)
• Support stream: Ensure embedding the project into the infrastructure conceptual and governance design; administration; communication.

LEIF SCHULMAN, FI

DiSSCo preparatory/implementation phase


• Mass digitisation techniques (WP3)
‒ Herbarium sheets: review of state-of-the-art, gather lessons learned, give recommendations
‒ Pinned insects: possibilities to speed up techniques 10–100-fold; test new approaches (robotics, multispectral imaging, X-ray)
‒ Liquid samples and skins and other vertebrate material: investigate suitability of available techniques; probe researcher needs
‒ Microscope slides: possibilities to speed up techniques
‒ 3D: possibilities for automation


Examples of technical studies to be carried out (1/3)


• Data capture & quality (WP4)
‒ Methods for automated text digitisation: Optical Character Recognition
‒ Crowdsourcing
‒ Commercial outsourcing
‒ Interoperability with collection management systems
• Data mobilisation (WP5)
‒ Transcription sites on the internet: review of available source codes
‒ Small collections: specification of work for citizen associations; pilots


Examples of technical studies to be carried out (2/3)


• Data infrastructure (WP6)
‒ Digital Objects as surrogates for the physical collection items
  ‒ An architecture is needed
‒ Data Management Plans
  ‒ For ICEDIG
  ‒ DiSSCo Services
‒ Evaluation of EU-level data infrastructures
  ‒ EUDAT, EGI
  ‒ National cloud infrastructures
  ‒ Commercial cloud infrastructures
  ‒ Zenodo


Examples of technical studies to be carried out (3/3)


Data flow and use cases: different infrastructure requirements in different phases of the data life cycle?

[Diagram. Stages: Conveyor-driven imaging station; Citizen & OCR transcription portal; Collection management system; Long-term preservation; Science users world-wide. Five boxes are labelled "Storage". Caption: Digital Objects in the European (Global) Open Science Cloud]


Digital Objects (DO) – surrogates for the physical collection items
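
One way to picture a Digital Object is as a small record that stands in for one physical item and travels through the phases of the data life cycle. The sketch below is purely illustrative and is not the DiSSCo architecture, which ICEDIG is still to design. The class names, field names, identifier format, and the example.org URL are all hypothetical. The stage names are borrowed from the data-flow slide.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Life-cycle phases, named after the boxes on the data-flow slide."""
    IMAGED = 1        # conveyor-driven imaging station
    TRANSCRIBED = 2   # citizen & OCR transcription portal
    CATALOGUED = 3    # collection management system
    PRESERVED = 4     # long-term preservation


@dataclass
class DigitalObject:
    """Hypothetical surrogate record for one physical collection item."""
    identifier: str                      # hypothetical persistent identifier
    physical_item: str                   # hypothetical catalogue number
    image_uris: List[str] = field(default_factory=list)
    label_text: Optional[str] = None     # filled in by transcription or OCR
    stage: Stage = Stage.IMAGED

    def advance(self) -> None:
        """Move to the next life-cycle stage; stay put at the last one."""
        if self.stage is not Stage.PRESERVED:
            self.stage = Stage(self.stage.value + 1)


# Usage: one herbarium sheet moving from imaging to transcription.
do = DigitalObject(identifier="do:example/0001", physical_item="H1234567")
do.image_uris.append("https://example.org/images/H1234567.tif")
do.label_text = "label text as transcribed"
do.advance()
print(do.stage.name)  # prints "TRANSCRIBED"
```

Keeping the stage on the object itself makes it easy to ask which storage and services an item needs at a given moment, which is the question the data-flow slide raises.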


Examples of policy / legal issues to be covered

• Prioritisation in digitisation: criteria
• Management of digitisation progress
• European network of digitisation factories, and/or in-house facilities?
• Data management and distribution: IPR, RRI, open access policies, privacy, ABS
• Technical standards: research object identifiers, data attribution/citation, cloud hosting legislation
• Common digital research agenda


How does ICEDIG interface with the community?

• Through CETAF, SPNHC, TDWG, GBIF, ...
• Linking with parallel initiatives (ALA, iDigBio, ...)
• Round Tables every 3 months (to gain insights from industry and stakeholders)
• All-Hands Meetings every 6 months (open to stakeholders)
• Opening Conference and Final Conference


To conclude...

• ICEDIG is a Design Study – it will not implement anything operational, but will test ideas, technologies and approaches.
• ICEDIG will create plans for the construction of DiSSCo.
• Input and advice from the scientific community, other users, industry, policy makers, and citizens will be needed throughout the project.


Demonstrations during 5-7 March

• Conveyor-driven mass-digitisation of pinned insects
‒ During lunch break
• Conveyor-driven mass-digitisation of herbarium sheets
‒ In the basement of the Topelia building, Unioninkatu 38, Wednesday at 8:00-8:45


