An introduction to data publications Kirsten Elger Deutsches GeoForschungsZentrum GFZ, Potsdam, [email protected]

Research Data

• Research data are essential for scientific research

• Many datasets, e.g. observational data, are irreplaceable

• With the advent of the internet, the way research datasets are collected, managed, and archived has changed significantly

So extensive and dangerous a work

Eleven nations established 14 principal research stations across the Polar Regions. 12 were in the Arctic, along with at least 13 auxiliary stations. Over 700 men incurred the dangers of Arctic service to establish and relieve these stations between 1881 and 1884.

Observations on: meteorology, geomagnetism, auroral phenomena, ocean currents, tides, structure and motion of ice, and atmospheric electricity

Geological field work in 1995…

GPS values

Data “publication” in 1995…

…and after the end of the project?

• The bad case: the PhD student/postdoc takes the data with him or her (on a floppy disk/CD) and, years later, throws everything away

• Slightly better: data submission (in digital or analogue form) to a computer of the department, with or without data description (depending on the time and motivation of the respective scientist)

What happens when the professor or lab PI retires?

• Who takes care of the hard drives with the old data?

• Who takes care of paper copies of maps or other datasets?

• How long may rock samples be archived after the scientist has left?

Research Data Today

Thanks to the internet …

• many datasets are available online

• very fast data access, even to large datasets

• online access to journal articles

• online-only journals are coming of age

• real-time data

Real-time data

Example: climate station in Alaska (air, surface, shallow ground temperatures)

Source: Permafrost Lab, UAF, Fairbanks, http://permafrost.gi.alaska.edu/

GEOFON earthquake information service

GEOFON Live Seismograms

NOAA (National Oceanic and Atmospheric Administration):

• Synoptic meteorological records of the first International Polar Year (IPY) in digital form (surface air temperature, sea level pressure, 1-year time series)

• Extensive documentary image collection

• Overview of IPY reports

• Posters

• Available online for download:

www.arctic.noaa.gov/aro/ipy-1

… as a consequence

• With the advent of the digital era and the internet, data sets increasingly grow in size and complexity

• Data reuse and data mining are becoming more and more important

• Metadata portals (with automatically generated, standardised metadata) are becoming more and more important for data discovery

• There is an increasing number of data repositories for all types of research data

• There is an increasing expectation from the scientific community, funding agencies and the public that publicly funded research results and data be made freely and openly accessible, without any constraints.

Politics…

Priority Initiative “Digital Information” of the Alliance of German Science Organisations (Schwerpunktinitiative “Digitale Information” der Allianz der deutschen Wissenschaftsorganisationen): the availability and reuse of digital information includes access to research data that is open and, as far as possible, free of charge.

2003 Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities: “Open access contributions include original scientific research results, raw data and metadata, …”

Digital Agenda of the German Federal Government (Digitale Agenda der Bundesregierung) 2013-2017

Helmholtz Open Science

• Open science, the unrestricted access to scientific publications and cultural heritage, is an ongoing and future trend in the scientific landscape worldwide. Research publications and other digital objects such as research data and scientific software will thus be publicly available on the internet.

• The Helmholtz Association was one of the initial signatories of the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities” in 2003. This commitment towards open access was then formally approved by its Assembly of Members (assembly of the directors of the Helmholtz Centres): “Publications from the Helmholtz Association shall in future, without exception, be available free of charge, as far as no conflicting agreement with publishers or others exists.” (Resolution of the Assembly of Members, 27 September 2004).

Obstacles to sharing

• too much work with no benefit

• data publications were deleted from reference lists by journal editors

• “they misinterpret or misuse my data”

• “someone will publish MY data before me”

• Do I have to share ALL my data?

© www.aukeherrema.nl

Domains of research data

• PRIVATE DOMAIN

• SHARED DOMAIN

• PERMANENT DOMAIN

• PUBLIC DOMAIN

Think about data sharing from the very beginning!

Intelligent Openness (Royal Soc. London 2012)

The practice of science: Open inquiry is at the heart of the scientific enterprise. Publication of scientific theories - and of the experimental and observational data on which they are based - permits others to identify errors, to support, reject or refine theories and to reuse data for further understanding and knowledge. Science’s powerful capacity for self-correction comes from this openness to scrutiny and challenge.

How to make intelligent openness standard?

• Data must be accessible and readily located

• Data must be intelligible to those who wish to scrutinize them

• They must be assessable so that judgments can be made about their reliability and the competence of those who created them

• They must be usable by others

• For data to meet these requirements, they must be supported by explanatory metadata (data about data)

Science as an open enterprise (2012), The Royal Society Science Policy Centre report 02/12, ISBN: 978-0-85403-962-3

There is a need for…

• Researchers’ willingness to publish their data

• Technical solutions to facilitate data availability, access and reuse

• Recognition and credit for data producers

Data publication with DOI

• persistent

• citable

• with metadata

DataCite and Digital Object Identifiers (DOI) for Data

STD DOI “Publikation und Zitierbarkeit von Primärdaten” (“Publication and Citability of Primary Data”; DFG project 2004-2009, partners: TIB, DKRZ, PANGAEA, DLR, GFZ)

DOI for research data

DataCite

What is a DOI?

• Digital Object Identifier

• A unique and permanent identifier for digital objects

• “Signpost” to the URL with the dataset and its description = landing page

• Persistent = long term data access guaranteed by the publisher

• With metadata
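The "signpost" idea above can be sketched in a few lines of Python. This is only an illustration, assuming the public resolver address `https://doi.org/`; the DOI itself (`10.1234/example.dataset.2015`) and the function names `split_doi` and `resolver_url` are hypothetical and not taken from the slides.

```python
from typing import Tuple
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver address


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into its prefix (registrant) and suffix (object name)."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the persistent link that the resolver redirects to the landing page."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters in the suffix that are unsafe in a URL path.
    return f"{DOI_RESOLVER}{prefix}/{quote(suffix, safe='/')}"


# Hypothetical DOI, used only for illustration.
EXAMPLE_DOI = "10.1234/example.dataset.2015"

if __name__ == "__main__":
    print(resolver_url(EXAMPLE_DOI))
```

The point of the indirection is that the DOI string stays the same even if the publisher later moves the landing page to a different URL; only the target registered with the resolver changes.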

Metadata and Metadata

Metadata for data discovery: example DOI landing page

Elements of the landing page: title, citation, description/abstract, keywords, spatial coverage, related work, download data files, standardised metadata

Metadata and Metadata

Metadata for data discovery: author, title, description, keywords, spatial/temporal domain, ...

Structural metadata (for reuse): formats, methodology, sources…

Definition of data labels

Metadata and Metadata

Metadata for data discovery: author, title, description, keywords, spatial/temporal domain, ...

Structural metadata (for reuse): formats, methodology, sources, processing steps, …

Administrative metadata: metadata related to the use, management, and encoding processes of digital objects over a period of time. Includes technical metadata: versions, checksum, timestamp, …
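As a rough illustration of how the three kinds of metadata might sit side by side in one record, here is a minimal Python sketch. The field names, the example values and the helper names `technical_metadata` and `build_record` are assumptions made for this example; they do not follow any particular metadata schema.

```python
import hashlib
from datetime import datetime, timezone


def technical_metadata(data: bytes, version: str) -> dict:
    """Technical metadata: version, checksum and timestamp of the data file."""
    return {
        "version": version,
        "checksum_sha256": hashlib.sha256(data).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def build_record(data: bytes, discovery: dict, structural: dict,
                 version: str = "1.0") -> dict:
    """Group discovery, structural and administrative metadata in one record."""
    return {
        "discovery": discovery,
        "structural": structural,
        "administrative": {"technical": technical_metadata(data, version)},
    }


# Hypothetical dataset and descriptions, for illustration only.
data = b"station,temperature_degC\nA,-12.5\n"
record = build_record(
    data,
    discovery={
        "author": "Doe, J.",
        "title": "Example temperature dataset",
        "keywords": ["temperature", "example"],
    },
    structural={
        "format": "text/csv",
        "processing_steps": ["unit conversion to degrees Celsius"],
    },
)

if __name__ == "__main__":
    print(record["administrative"]["technical"]["checksum_sha256"])
```

The checksum lets a later user verify that a downloaded file is identical to the one that was described and published.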

A comprehensive data description is essential for data reuse and should always be available before a DOI registration

There are different possibilities for data publication

Examples of data publication 1: data supplements to scientific articles

Links to datasets

Link to original article with data description

Examples of data publication 2: Data Journals

Peer-reviewed articles with the description of datasets or collections, etc.

3. Data Reports – GFZ examples

Institutional report series have a long tradition as important sources of information. Today: persistently accessible online and citable with DOI…

GFZ: Data Reports

• Flexible format – “enhanced data description”

• Standardised templates for each discipline, internal review

• Project-specific design if required

Coalition on Publishing Data in the Earth and Space Sciences

GOAL: OPEN DATA in the EARTH and SPACE SCIENCES

STATEMENT OF COMMITMENT

• To promote metadata information and domain standards, […], to help simplify and standardize deposition and reuse.

• To promote referencing of data sets using the Joint Declaration of Data Citation Principles, in which citations of data sets should be included within reference lists.

• To include in research papers concise statements indicating where data reside and clarifying availability.

• To promote and implement links to data sets in publications and corresponding links to journals in data facilities via persistent identifiers. (January 2015)
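To make the idea of listing datasets in reference lists concrete, the following Python sketch formats a dataset citation from a few metadata fields. The creator, year, title, publisher and identifier pattern is a common one for data citations, but the exact punctuation, the function name `format_data_citation` and all example values are assumptions for illustration only.

```python
from typing import Sequence


def format_data_citation(creators: Sequence[str], year: int, title: str,
                         publisher: str, doi: str) -> str:
    """Format a dataset citation string for use in a reference list."""
    if not creators:
        raise ValueError("at least one creator is required")
    authors = "; ".join(creators)
    # The DOI is written as a resolvable link so readers can reach the data.
    return f"{authors} ({year}): {title}. {publisher}. https://doi.org/{doi}"


# Hypothetical dataset, for illustration only.
citation = format_data_citation(
    creators=["Doe, J.", "Roe, R."],
    year=2015,
    title="Example temperature dataset",
    publisher="Example Data Repository",
    doi="10.1234/example.dataset.2015",
)

if __name__ == "__main__":
    print(citation)
```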

SIGNATURES (Nov 2015)

additional signatures welcome

Conclusions

• Data are increasingly recognized as part of the scholarly record; data citation is coming of age.

• Data publications with assigned DOI provide citable and persistent access to research data.

• There is a growing number of data repositories to store and access data (institutional, domain specific, general).

• Data description is essential for reuse

Next step

International Geo Sample Number (IGSN) – unique identifier for physical objects