SSP November 12, 2007 Washington, D.C. Tom Garnett Biodiversity Heritage Library 1 Biodiversity Heritage Library A Knowledge Domain Repository Community-Driven Open Access Tom Garnett Biodiversity Heritage Library


Page 1: 241 06 garnett


Biodiversity Heritage Library: A Knowledge Domain Repository

Community-Driven Open Access

Tom Garnett, Biodiversity Heritage Library

Page 2: 241 06 garnett

In any well-appointed Natural History Library there should be found every book and every edition of every book dealing in the remotest way with the subjects concerned. One never knows wherein one edition differs from or supplements the other and unless these are on the same table at the same time it is not possible to collate them properly. Moreover for accurate work it is necessary for the student to verify every reference he may find; it is not enough to copy from a previous author; he must verify each reference itself from the original.

Charles Davies Sherborn, Epilogue to Index Animalium, March 1922

Charles Davies Sherborn (1861-1942)

Page 3: 241 06 garnett

Yet another physical difficulty is the task of assembling the library and indexes which will enable the student to work under proper conditions…. the beginner must now be prepared to spend liberally, or else must establish himself in an institution where a large library exists; if he work by himself with only a few books, he will have to confine himself to a very narrow specialty indeed.

'The Limitations of Taxonomy' by J.M. Aldrich, Science, April 22, 1927, vol. LXV, no. 1686, p. 381

Insecta. Diptera. Volume I (1886-1901)

Page 4: 241 06 garnett

The cited half-life of publications in taxonomy is longer than in any other scientific discipline.

- Macro-economic case for open access, Tom Moritz

- Current taxonomic literature often relies on texts and specimens more than 100 years old.

Levinus Vincent

Elenchus tabularum, pinacothecarum, 1719

Page 5: 241 06 garnett


The Taxonomic Impediment

“The taxonomic impediment is a term that describes the gaps of knowledge in our taxonomic system”

- Darwin Declaration, 1998

Georges Louis Leclerc, comte de Buffon, Histoire naturelle : générale et particulière (Oiseaux), 1799-1808

Page 6: 241 06 garnett


The essential requirements for accessing and utilising this global information are:

• that there is access to information held in national/regional/global collections

• that electronic data is efficiently captured and provided in useable form

• that existing information held in literature and by current experts is made available electronically

• that stability of scientific names of organisms, used to access this information, is promoted

- Darwin Declaration, 1998

Thylacine from Philip Lutley Sclater, Guide to the Gardens of the Zoological Society of London, 1891

Page 7: 241 06 garnett


Henry Bates, Insecta. Coleoptera, 1881-1884

Convention on Biological Diversity, Article 17: … exchange of information shall include exchange of results of technical, scientific and socio-economic research … It shall also, where feasible, include repatriation of information.

Page 8: 241 06 garnett


[Bar chart: complete sets held, by region (US & Canada; Europe; Mexico & C. America; South America); vertical axis ticks 0 to 8]

Biologia Centrali-Americana. Edited by Frederick Ducane Godman and Osbert Salvin. London : Pub. for the editors by R. H. Porter, 1879-1915

Chart showing distribution in public collections of the complete 63-volume sets held worldwide. 2 complete copies in Central America, held at the Smithsonian Tropical Research Institute Library.

Page 9: 241 06 garnett


Henry Walter Bates, The Naturalist on the River Amazons, 1863

Vishwas Chavan travels a lot. An informatician based at the National Chemical Laboratory in Pune, India, he collects data on what types of animal live where in India to enter into a biodiversity database … Much of the information Chavan seeks is in old, out-of-print tomes … To find them, Chavan has spent years trailing around libraries. He dreams of the day when books such as these are scanned and made available as digital files on the Internet.

“Science in the Web Age: The Real Death of Print” by Andreas von Bubnoff, Nature 438, 550-552, 1 December 2005

Page 10: 241 06 garnett

Library and Laboratory: the Marriage of Research, Data and Taxonomic Literature

London, February 2005

Eighty participants from 22 countries gathered to discuss the status and future of access to the taxonomic literature and to propose an agenda for actions that would improve the research environment for taxonomy. The participants were biologists; librarians; conservation workers; publishers; representatives of learned and professional societies, private foundations and government agencies; and specialists in information technology.

Progne subis - Purple Martin, Illustrations of the nest and eggs of birds of Ohio, 1879-1886

Page 11: 241 06 garnett


Ernest Ingersoll, Hand-book to the National Museum … Smithsonian Institution, 1886

Based on the clear priorities of the London conference, representatives of a number of major natural history and botanical libraries met at the Smithsonian National Museum of Natural History in Washington, D.C. to lay the groundwork for the Biodiversity Heritage Library, a response to a global community need. They developed a strategy and operational plan to digitize the published literature of biodiversity held in their respective collections and to make that literature available for open access and responsible use as a part of a global “biodiversity commons.”

Page 12: 241 06 garnett

• Museums
– Natural History Museum (London)
– Smithsonian Institution
– American Museum of Natural History

• Botanical Gardens
– Missouri Botanical Garden
– New York Botanical Garden
– Royal Botanic Gardens, Kew

• University Libraries
– Botany Libraries, Harvard University
– Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University

• Research Institute Library
– Marine Biological Laboratory / Woods Hole Oceanographic Institution Library (MBL/WHOI)

Page 13: 241 06 garnett

Collaborators:

Internet Archive
International Commission on Zoological Nomenclature
Open Content Alliance
European Distributed Institute of Taxonomy
Global Biodiversity Information Facility (GBIF)
Many more under negotiation

Page 14: 241 06 garnett

Internet Archive

Set up scanning centers in London, New York, Washington, Boston, etc.
High-quality, non-destructive scanning.
Image files and text derived from OCR.

Page 15: 241 06 garnett

“Guano diggers among the albatrosses. Laysan Island”

At this stage, we would have page images, “dirty OCR”, and some metadata, but what good are they? Will researchers be left like these guano diggers in Hawaii?

Lionel Walter Rothschild, The avifauna of Laysan and the neighboring islands, 1893-1900

Page 16: 241 06 garnett


Reptilia and Batrachia. (1885-1902) by Albert C.L.G. Günther

Mandates:

Open Access: all content can be reused, repurposed, reformatted.

Congruent: must fit in to and contribute to a healthy knowledge ecology.

Page 17: 241 06 garnett


Jacob Christian Schäffer, Elementa entomologica . . . 1766.

BHL Portal http://www.biodiversitylibrary.org

Serve image and text files; create volume, part, and piece metadata; ingest page-level metadata at scanning level for the creation of page-level Globally Unique Identifiers (GUIDs) for linking to other taxonomic services.
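
One way to picture a page-level identifier is a deterministic ID derived from a volume identifier plus the page's position in the scan. The sketch below is illustrative only: the slides do not say how BHL mints its GUIDs, and the namespace URL, the Page class, and the sample volume identifier are all hypothetical.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace; the slides do not describe BHL's actual GUID scheme.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/pages")


@dataclass(frozen=True)
class Page:
    volume_id: str  # identifier of the scanned volume (assumed format)
    sequence: int   # 1-based position of the page image within the scan

    @property
    def guid(self) -> uuid.UUID:
        # Name-based UUID: the same volume and sequence always give the same ID.
        return uuid.uuid5(EXAMPLE_NAMESPACE, f"{self.volume_id}/{self.sequence}")


page = Page("example-volume", 85)
print(page.guid)
```

Because the identifier is computed from stable inputs, another service could link to a page without first asking a central registry for its ID. A registry-issued identifier would be an equally plausible design.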

Page 18: 241 06 garnett

Classes of texts

Each class presents a unique set of issues to resolve:

Public Domain – pre-1923
Post-1923 monographs
    some with copyright renewals
    some without copyright renewals
Non-profit learned society journals with permissions
Commercial journals
Gray literature
Archival material

Page 19: 241 06 garnett


BHL Seeks Permissions from Copyright Holders

Opt-in Copyright Model: The BHL will actively work with professional societies and associations to integrate their publications into the BHL in a way that serves the societies’ missions and goals.

BHL will digitize learned society backfiles and mount them through the BHL Portal at no cost.

Will provide a set of files to the publishers for reuse as they see fit.

Will index the articles using Taxonomic Intelligence, thereby vastly increasing their usability.

Page 20: 241 06 garnett


What Does the BHL Offer to Publishers?

Use of the articles will increase as evidenced by citation upsurge.

Long-term management of the digital assets is provided by the BHL at no cost so it’s “SEP.”

Publishers’ content is embedded in the emerging knowledge ecology that is sweeping biology in this century.

Structural markup of backfiles into conformance with NLM DTD (working on it).

Thirteen Permission Agreements to date. More under negotiation.

Integration with gray literature in later phases of project.

Page 21: 241 06 garnett


Embedding Content in the Knowledge Ecology

The BHL is primarily funded as a component of the Encyclopedia of Life, an international effort to create an authoritative website for every species of the earth’s biota.

Page 22: 241 06 garnett


Page 23: 241 06 garnett

Embedding Content in the Knowledge Ecology

Species names, taxon concepts, and the classification of living organisms are the basis for linking multiple disciplines such as evolutionary biology, taxonomy, genomics, agriculture, conservation, etc.

Taxonomic intelligence algorithms are being developed to mine the BHL content to link species names with other biological resources.
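
The slides do not describe the taxonomic intelligence algorithms themselves. As a rough illustration of the idea, the sketch below scans text for candidate Latin binomials (a capitalized genus followed by a lowercase epithet) and keeps only those found in a list of known names. The function name, the regular expression, and the tiny name list are assumptions made for this example. Exact matching like this would miss names damaged by OCR errors, which a production system would need to handle.

```python
import re

# Candidate binomial: capitalized genus followed by a lowercase epithet.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def find_names(text, known_names):
    """Return known binomials found in text, in order of first appearance."""
    found = []
    for match in BINOMIAL.finditer(text):
        name = f"{match.group(1)} {match.group(2)}"
        if name in known_names and name not in found:
            found.append(name)
    return found


# Hypothetical name list; a real service would consult a names catalog.
KNOWN = {"Progne subis"}
text = "The Purple Martin, Progne subis, nests in Ohio. Progne subis is common."
names = find_names(text, KNOWN)
print(names)
```

Each name found this way could then serve as a key for linking a scanned page to other biological resources that are indexed by species name.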

Page 24: 241 06 garnett

Page 25: 241 06 garnett

Page 26: 241 06 garnett

Page 27: 241 06 garnett

Page 28: 241 06 garnett

Page 29: 241 06 garnett

Co-evolving bioinformatics resources produce a rich information ecology:

Consortium for the Barcode of Life (CBOL) with gene sequences deposited in GenBank.
GBIF’s Electronic Catalog of Taxonomic Names
Herbaria and museum specimen databases
Electronic Gazetteers and GPS.
Additional services – you’re invited to help

Page 30: 241 06 garnett

• Scientific and Scholarly Support Strategy
– Make it too useful not to support.
– Embed it in current and developing workflows for the identification, tracking, documenting, and researching of the biota. BHL is building on many documented use cases.
– Network with many professional societies.
– Automated structural markup of journal literature to bring the digitized OCR into conformance with the NLM DTD.

Page 31: 241 06 garnett

<article>
  <title>A BRIEF CONSIDERATION OF CERTAIN POINTS IN THE MORPHOLOGY OF THE FAMILY CHALCIDID^E.*.</title>
  <author>L. O. HOWARD.</author>
  <volume>1</volume>
  <issue>2</issue>
  <start_page>65</start_page>
  <end_page>86</end_page>
  <start_count_page>85</start_count_page>
  <end_count_page>106</end_count_page>
  <start_page_image_file>3908800908001101smthrich_0085.djvu</start_page_image_file>
  <end_page_image_file>3908800908001101smthrich_0106.djvu</end_page_image_file>
</article>
<article>
  <title>FURTHER NOTES ON PHENGODES AND ZARHIPIS.</title>
  <author>DR. C. V. RILEY.</author>
  <volume>1</volume>
  <issue>2</issue>
  <start_page>86</start_page>
  <end_page>96</end_page>
  <start_count_page>106</start_count_page>
  <end_count_page>116</end_count_page>
  <start_page_image_file>3908800908001101smthrich_0106.djvu</start_page_image_file>
  <end_page_image_file>3908800908001101smthrich_0116.djvu</end_page_image_file>
</article>
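
Records like the ones on this slide can be read with a standard XML parser. The sketch below uses Python's xml.etree.ElementTree on one record copied from the slide. The enclosing articles root element and the read_articles function are assumptions added so the fragment parses as a single document; they are not part of the BHL output shown. In the sample, start_count_page matches the numeric suffix of the page image file name, so the code treats it as the position in the scan and reports its difference from the printed page number.

```python
import xml.etree.ElementTree as ET

# One record copied from the slide, wrapped in a hypothetical <articles> root
# so that the fragment parses as a single document.
SAMPLE = """<articles>
  <article>
    <title>FURTHER NOTES ON PHENGODES AND ZARHIPIS.</title>
    <author>DR. C. V. RILEY.</author>
    <volume>1</volume>
    <issue>2</issue>
    <start_page>86</start_page>
    <end_page>96</end_page>
    <start_count_page>106</start_count_page>
    <end_count_page>116</end_count_page>
    <start_page_image_file>3908800908001101smthrich_0106.djvu</start_page_image_file>
    <end_page_image_file>3908800908001101smthrich_0116.djvu</end_page_image_file>
  </article>
</articles>"""


def read_articles(xml_text):
    """Return one dict per <article>, with page numbers converted to int."""
    records = []
    for node in ET.fromstring(xml_text).iter("article"):
        printed_start = int(node.findtext("start_page"))
        scan_start = int(node.findtext("start_count_page"))
        records.append({
            "title": node.findtext("title"),
            "author": node.findtext("author"),
            "printed_pages": (printed_start, int(node.findtext("end_page"))),
            "scan_pages": (scan_start, int(node.findtext("end_count_page"))),
            # Difference between scan position and printed page number.
            "scan_offset": scan_start - printed_start,
            "first_image": node.findtext("start_page_image_file"),
        })
    return records


articles = read_articles(SAMPLE)
print(articles[0]["title"], articles[0]["scan_offset"])
```

Keeping both the printed page range and the scan position lets a citation such as "vol. 1, no. 2, p. 86" be resolved to a specific page image.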

Page 32: 241 06 garnett

• Legal Sustainability Strategy
– Avoid legal conflicts.
– Keep copyright infringement risk low. It is impossible to eliminate it altogether.
– Obtain permissions where feasible.
– Where it isn’t feasible, move on.

Page 33: 241 06 garnett

• Financial Sustainability Strategy

– Quick ramp-up with high early costs (development, mass scanning, etc.). Drive long-term costs down asymptotically toward zero.

– Derive some long-term costs from the operating budgets of the member institutions. (examples under consideration: acquisitions budget, staff positions, etc.)

– Integrate functions/tasks with wider efforts where appropriate, e.g. mass storage.

– Clear roles for staff who wear multiple hats. Currently two full-time grant-funded positions, but more than 15 staff make substantive contributions.

– Make the BHL absolutely essential.

Page 34: 241 06 garnett

• The Long Now Strategy

– Institutions that are creating the BHL exist to persist through time. That’s an important part of their business. Use them.

– The future is uncertain, the technology landscape changes, people pass on. So create consortial structures that are low-overhead, flexible, and can respond quickly. F2F interaction is surprisingly necessary to create this.

Page 35: 241 06 garnett

• The Long Now Strategy (cont.)
– Take Risks. Why?
– “We must, indeed, all hang together, or most assuredly we shall all hang separately.”
– Interoperability is the key. Repository islands will sink.
– Interested in helping? Contact me.
– [email protected]

Page 36: 241 06 garnett
