Publishing and Cataloguing Datasets
It’s time everyone got involved
UKSG Conference 2009
BUT FIRST, LET’S GET SERIOUS
Part 1
OECD likes being cool with data.
TIB in Hannover say they have archived and added DOIs to 500,000 datasets.
Yet their most-cited dataset has been cited just 3 times.
Some data to start with
Either no-one wants to cite data
OR, having a DOI by itself isn’t enough
So …
Let us imagine for a moment . . .
If an article is . . .
“A piece of data that is presented in a static, two-dimensional form.”
Geoffrey Bilder, CrossRef, 2007
• HTML
• PDF
Static, two-dimensional objects
http://dx.doi.org/10.1787.280675838368
Active, two-dimensional object
Another active, two-dimensional object?
It’s a view on a datacube
In fact, it’s a view on a collection of datacubes
Active, multi-dimensional object!
OECD Article: static, two-dimensional object
OECD Excel Table: active, two-dimensional object
OECD Database: active, multi-dimensional object
http://dx.doi.org/10.1787.280675838368
So, instead of imagining, let’s say we built this.
We’d get something like . . .
Dataset: OECD Regional Database
Excel – Active two-dimensional object
Dataset – Active multi-dimensional object
PDF – Passive two-dimensional object
. . . this.
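The idea of one published work offered in several forms can be sketched as a small data model. This is only an illustration, not OECD's publishing system: the class names, the `most_interactive` helper and the placeholder DOI are all hypothetical, and the PDF is labelled "static" here to match the earlier slides.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One form in which a published object can be delivered."""
    label: str         # e.g. "PDF", "Excel", "Dataset"
    active: bool       # True if the reader can manipulate it
    dimensions: str    # "two-dimensional" or "multi-dimensional"

    def describe(self) -> str:
        kind = "active" if self.active else "static"
        return f"{self.label}: {kind}, {self.dimensions} object"


@dataclass(frozen=True)
class PublishedObject:
    """A single citable work that groups all of its representations."""
    title: str
    doi: str           # hypothetical placeholder, not a real DOI
    representations: tuple[Representation, ...]

    def most_interactive(self) -> Representation:
        # Prefer active over static, then multi-dimensional over two-dimensional.
        return max(
            self.representations,
            key=lambda r: (r.active, r.dimensions == "multi-dimensional"),
        )


regional = PublishedObject(
    title="OECD Regional Database",
    doi="10.1787/example-dataset",  # assumption for illustration only
    representations=(
        Representation("PDF", active=False, dimensions="two-dimensional"),
        Representation("Excel", active=True, dimensions="two-dimensional"),
        Representation("Dataset", active=True, dimensions="multi-dimensional"),
    ),
)

for rep in regional.representations:
    print(rep.describe())
```

Grouping the representations under one identifier is the design choice that matters here: a citation can point at the work, and the reader then picks the form that suits them.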
OK – that’s cool and OECD can do this because we have all the objects in our publishing system.
But how are other publishers, authors and librarians coping with data?
Source: OECD
Chart from The Economist
Source: Acemoglu et al (2001), based on Curtin, 1989, Philip D. Curtin, Death by migration: Europe’s encounter with the tropical world in the nineteenth century, Cambridge University Press, New York
(1989).Curtin 1989 and other sources.
Tertiary school enrollment: School enrollment, tertiary (% of gross).
Source: Barro and Lee (2000) and their databases.
Taken from an appendix to an article published in Elsevier’s World Development
You can’t fault the author for trying . . . but it’s not a lot of help for a reader.
And Librarians,
How many are cataloguing datasets in their OPACs in ways which are compatible with search systems for books and journals?
Conclusion: Datasets: Scholarly Publishing’s Black Sheep?
[Diagram: Scholarly Publishing Network, sites for journals and books]
• A&I & subject portals: EconLit, RePEc
• Publishers: ScienceDirect
• Library portals: OPACs
• Content aggregators: Ingenta
Using metadata for: Datasets
In the same industry standard formats as . . .
Book chapters and journal articles
Authors will be able to cite . . .
Publishers will be able to link . . .
Discovery systems will be able to find . . .
Librarians will be able to catalogue . . .
Datasets alongside published outputs . . .
. . . to the benefit of Everyone
A proposed example of a dataset using standard bibliographic and citation metadata.
Bibliography of books that cite this database
Citation tool compatible with EndNote et al.
Dataset title with ISSN, DOI (& MARC) record
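As a rough sketch of what an EndNote-compatible citation record for a dataset could look like, the snippet below writes a record in RIS, a plain-text tagged format that most reference managers import. The field values are placeholders: the DOI suffix, the URL and the year are assumptions for illustration, and `dataset_to_ris` is a hypothetical helper, not an OECD tool.

```python
# Mapping from our (hypothetical) record keys to standard RIS tags.
RIS_TAGS = [
    ("title", "TI"),
    ("publisher", "PB"),
    ("year", "PY"),
    ("issn", "SN"),
    ("doi", "DO"),
    ("url", "UR"),
]


def dataset_to_ris(record: dict) -> str:
    """Render a dataset description as one RIS record."""
    lines = ["TY  - DATA"]              # DATA is the RIS reference type for datasets
    for author in record.get("authors", []):
        lines.append(f"AU  - {author}")
    for key, tag in RIS_TAGS:
        value = record.get(key)
        if value:                       # skip fields we do not know
            lines.append(f"{tag}  - {value}")
    lines.append("ER  - ")              # end-of-record marker
    return "\n".join(lines)


example = {
    "authors": ["OECD"],
    "title": "OECD Regional Database",
    "publisher": "OECD",
    "year": "2009",                                    # assumed for illustration
    "doi": "10.1787/example-dataset",                  # hypothetical placeholder
    "url": "https://doi.org/10.1787/example-dataset",  # hypothetical placeholder
}

print(dataset_to_ris(example))
```

The same record shape would serve a book chapter or a journal article by changing only the `TY` line, which is the point of using the formats the industry already has.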
There are still challenges:
- Dynamic data
- Versioning
- Preservation
But, let’s round the sheep up first.
OECD is:
• issuing a white paper on Publishing Standards for Datasets
• speaking with CrossRef about citation standards for dynamic objects
• publishing OECD datasets with ‘sheepdogs’ from mid-2009: MARC records, ONIX records, citation records
End of Part One
QUESTIONS?
DISCUSSION?
BEING COOL WITH DATA: OECD’S FIRST STEPS
Part 2
March 2007
• Print edition
• Web-book on SourceOECD
• USB Key Edition
• OECD Factbook on eXplorer (new for 2009)
• OECD Factbook on iPhone (new for 2009)
April 2009
OECD Regional Statistics data using the eXplorer tool
OECD Regional Statistics using NCVA’s eXplorer tool
October 2008
March 2009
http://stats.oecd.org/OECDregionalstatistics/
Other cool visualisation stuff
• IMF Datamapper on www.imf.org. See also www.mappingworlds.com, which provided the technology.
• See Gapcasts and Trendalyzer on www.gapminder.org
• The New York Times uses a lot of dynamic graphics
• USA Today built their reputation on graphics – now they’re doing it online. We like “How much is $700bn?”
• The Economist’s Chart Gallery generates a lot of comment.
• Data sharing sites include www.swivel.com, www.many-eyes.com and newcomers www.icharts.net and www.widgenie.com.
• There are many blogs on charts or visualization such as www.flowingdata.com or www.eagereyes.org