Biodiversity Informatics: small pieces, loosely joined. Dave Roberts Natural History Museum, London [email protected] Contemporary Issues in Biodiversity U. Oxford MSc in Conservation, Biodiversity and Management 20 Feb 2013

Page 1: Roberts u oxf_msc_200213

Biodiversity Informatics: small pieces, loosely joined.

Dave Roberts
Natural History Museum, London
[email protected]

Contemporary Issues in Biodiversity
U. Oxford MSc in Conservation, Biodiversity and Management

20 Feb 2013

Page 2: Roberts u oxf_msc_200213

ViBRANT: Virtual Biodiversity e-infrastructure, Seventh Framework Programme

Addressing the challenges of taxonomy

Goal ...
• Inventory the Earth’s species
• Document their relationships
• “Publish” & apply these data

Data set ...
• 1.8 M described spp. (17M names)
• 300M pages (over last 250 years)
• 1.5-3B specimens

People ...
• 4-6,000 taxonomists
• 30-40,000 “pro-amateurs”
• Many more citizen scientists?

Page 3: Roberts u oxf_msc_200213

Page 4: Roberts u oxf_msc_200213

Biodiversity informatics landscape

Key problems
• Landscape is complex, fragmented & hard to navigate
• Many audiences (policy makers, scientists, amateurs, citizen scientists)
• Many scales (global solutions to local problems)

[Figure adapted from Peterson et al 2010. Layer labels: Products, Data, Systems. Column labels: Genotype, Phenotype, Biotic Interactions, Environment, Human Effects. Other labels: Niche & Pop. Ecology; Biodiversity Loss; Phylogenetic Trees; Taxonomy; Geographic Distributions; Range Maps; Forecasts of Change; Conservation & management; GenBank; MorphBank; Interactions; Geospatial; Census; IUCN; TreeBase; IPNI, Zoobank; Pop. data; GBIF; Extent of Occurrence; AquaMaps.]

Page 5: Roberts u oxf_msc_200213

Addressing the challenges of biodiversity informatics

“…the field [of biodiversity informatics] appears to be growing in a void of overarching, motivating questions, effectively making it a set of technologies in search of questions to address.”

Peterson et al, Syst. & Biodiv. 2010

Page 6: Roberts u oxf_msc_200213

Small pieces loosely joined

Has many potential meanings:

Joining contributors together to form communities

Joining the data together that go towards forming a Scratchpad

Joining Scratchpad content with the landscape of biodiversity informatics data on the web

Page 7: Roberts u oxf_msc_200213

Can technology help?

I. The technology must largely embody the cause–effect relationship connecting problem to solution.

II. The effects of the technological fix must be assessable using relatively unambiguous or uncontroversial criteria.

III. Research and development is most likely to contribute decisively to solving a social problem when it focuses on improving a standardized technical core that already exists.

Sarewitz and Nelson (2008) Three rules for technological fixes. Nature, 456: 871-872

Page 8: Roberts u oxf_msc_200213

Identifiers

A key to find something in a database.

Page 9: Roberts u oxf_msc_200213

10.4289/0013-8797.115.1.75

Page 10: Roberts u oxf_msc_200213

http://dx.doi.org/10.4289/0013-8797.115.1.75

Page 11: Roberts u oxf_msc_200213

http://hdl.handle.net/10.4289/0013-8797.115.1.75

Page 12: Roberts u oxf_msc_200213

http://www.google.co.uk/search?q=10.4289/0013-8797.115.1.75

Page 13: Roberts u oxf_msc_200213

http://zoobank.org/10.4289/0013-8797.115.1.75
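
The preceding slides pair one identifier with several different services. As a rough sketch of that idea, the same string can be appended to each prefix shown on the slides. The short labels and the function name `candidate_urls` are illustrative assumptions, not part of any service's API, and nothing here checks whether a given service actually resolves the identifier.

```python
from urllib.parse import quote

# Prefixes as they appear on the slides; the short labels are assumptions.
PREFIXES = {
    "doi": "http://dx.doi.org/",
    "handle": "http://hdl.handle.net/",
    "search": "http://www.google.co.uk/search?q=",
    "zoobank": "http://zoobank.org/",
}


def candidate_urls(identifier: str) -> dict:
    """Append one identifier to every prefix; no network request is made."""
    urls = {}
    for label, prefix in PREFIXES.items():
        if prefix.endswith("="):
            # Query-string value: percent-encode reserved characters such as "/".
            urls[label] = prefix + quote(identifier, safe="")
        else:
            urls[label] = prefix + identifier
    return urls


if __name__ == "__main__":
    for label, url in candidate_urls("10.4289/0013-8797.115.1.75").items():
        print(f"{label:8} {url}")
```

The point of the sketch is that the identifier stays constant while the service that interprets it changes.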

Page 14: Roberts u oxf_msc_200213

Page 15: Roberts u oxf_msc_200213

Page 16: Roberts u oxf_msc_200213

Purves, D., Scharlemann, J. P. W., Harfoot, M., Newbold, T., Tittensor, D. P., Hutton, J. & Emmott, S. (2013). Ecosystems: Time to model all life on Earth. Nature 493: 295–297. DOI: 10.1038/493295a

Variation in biomass across the world simulated by the Madingley model for terrestrial and marine ecosystems. Fundamental ecological processes, encoded into simple computational forms, determine the abundance and body mass of organisms (grouped into cohorts for simplicity) and so indicate the state of ecosystems.

[Figure labels: biomass scale from Low biomass to High biomass; axes Abundance and Body mass; Ecosystem state; Ecological processes (Reproduction, Eating, Metabolism, Mortality, Dispersal, Other); Herbivore cohort, Carnivore cohort, Omnivore cohort.]

Page 17: Roberts u oxf_msc_200213

SHRINKING FISH
For Northeast Arctic cod, the age, size and […] spawners have fallen dramatically.

[Figure values, listed from largest to smallest; the only period label that survives is 2000s.
Maturation: 9 years, 7.7 years, 7 years
Weight: 5.1 kg, 4.6 kg, 3.2 kg
Length: 85 cm, 82 cm, 73 cm]

Borrell, B. (2013). Ocean conservation: A big fight over little fish. Nature 493: 597–598. DOI: 10.1038/493597a

Page 18: Roberts u oxf_msc_200213

Data mining. The abundant microorganisms in Earth’s soils perform myriad ecosystem services, many of which are still poorly understood or remain unrecognized. The best ways of identifying and studying these processes is a topic of debate in the ecology community.

Jansson, J. K. & Prosser, J. I. (2013). Nature 494: 40–41. doi: 10.1038/494040a

Page 19: Roberts u oxf_msc_200213

Number of microbial species
Number of genes

Modified from GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

Relman, D. (2012). Nature 486: 194–195. doi:10.1038/486194a

8 June 2012
14 June 2012

Page 20: Roberts u oxf_msc_200213

Modified from GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

If the goal of biodiversity studies is to understand all of the diversity in the Earth’s biosphere…

Then the notion that we can accomplish that goal only by looking here is just plain wrong.

[Figure labels: BACTERIA, ARCHAEA, EUKARYA; FUNGI, ANIMALS, PLANTS]

Page 21: Roberts u oxf_msc_200213

DY Wu et al. Nature 462, 1056-1060 (2009) doi:10.1038/nature08656

Maximum-likelihood phylogenetic tree of the bacterial domain based on a concatenated alignment of 31 broadly conserved protein-coding genes. Phyla are distinguished by colour of the branch and ‘Genomic Encyclopedia of Bacteria and Archaea’ genomes are indicated in red in the outer circle of species names.

[Figure: circular phylogenetic tree. Phylum labels: Gammaproteobacteria, Betaproteobacteria, Alphaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Acidobacteria, Aquificae, Chlorobi, Bacteroidetes, Chlamydiae/Verrucomicrobia, Planctomycetes, Spirochaetes, Actinobacteria, Chloroflexi, Cyanobacteria, Firmicutes, Tenericutes, Fusobacteria, Synergistetes, Thermotogae, Deinococcus/Thermus. The outer ring of species and strain names is not reproduced here.]

Page 22: Roberts u oxf_msc_200213

Modified from GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

For more than 80% of the time life has been evolving on Earth, multicellular “individuals” did not exist.

Even now, they occur in only a handful of top-level taxa.

Thus, making the “individual” the centerpiece for understanding functional interactions and evolution, and for classifying life on Earth, seems problematic.

Put simply, does Ecology have the right tools?

http://evolution.unibas.ch/teaching/evol_fort/pdf/Buss1987.pdf

Page 23: Roberts u oxf_msc_200213

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

Old Joke:

A drunk is crawling around a lamp post on his hands and knees.

A cop comes along …

Cop: What are you doing?

Drunk: Looking for my car keys.

Cop: Are you sure you dropped them here?

Drunk: No, I dropped them in the alley.

Cop: So why are you looking here?

Drunk: Because the light’s better.

Page 24: Roberts u oxf_msc_200213

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

Science is a ‘light’s better’ endeavor in that research effort is not directed at areas where the work is technically infeasible. Research is directed where real, interpretable results may be obtained.

We do, in fact, conduct research where the light’s better.

But, when the light changes, so does science.

With better illumination, we look in new areas.

We find new things…

Page 25: Roberts u oxf_msc_200213

Most textbooks will tell you that, in 1610, Galileo Galilei became the first person to observe Saturn's rings.

But what did he really see?

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

Page 26: Roberts u oxf_msc_200213

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

This?

Page 27: Roberts u oxf_msc_200213

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

Or this?

Page 28: Roberts u oxf_msc_200213

GBIF/GBIC – 2-4 Jul 2012 – Copenhagen, © 2012, R. J. Robbins

The generation of important new insights while handicapped with limited technology, indirect measurement, and fuzzy data is the mark of scientific greatness.

Page 29: Roberts u oxf_msc_200213

Modified from ‘Linking Global Names and Pro-iBiosphere’, 2013, D. J. Patterson

Hour-glass motif for big data infrastructure

Data re-use

Data generation

Data pool

Page 30: Roberts u oxf_msc_200213

Modified from ‘Linking Global Names and Pro-iBiosphere’, 2013, D. J. Patterson

Big data world with re-use data

Visualisation, Analysis, Aggregation, Manipulation

Observations, Experiments, Models, Processed

• Re-use
• Quality enhancement

• Distribute
• Make discoverable and actionable
• Atomise
• Standardize (metadata, ontology)
• Use stable UUIDs to identify content
• Preserve
• Federate

• Register
• Make accessible
• Normalize data
• Structure data
• Make data digital

Data re-use

Data generation

Data pool
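
The steps on this slide include "Use stable UUIDs to identify content". One hedged reading of "stable" is a name-based UUID, which comes out the same every time it is computed from the same input. The slide does not prescribe a scheme: the `stable_id` helper, the `<source>#<local key>` naming convention and the example.org address below are all assumptions made for illustration.

```python
import uuid


def stable_id(source_url: str, local_key: str) -> uuid.UUID:
    """Derive a version-5 (name-based, SHA-1) UUID from a source and a local key."""
    # Hypothetical naming convention: "<source>#<local key>".
    return uuid.uuid5(uuid.NAMESPACE_URL, f"{source_url}#{local_key}")


if __name__ == "__main__":
    a = stable_id("http://example.org/specimens", "record-0001")
    b = stable_id("http://example.org/specimens", "record-0001")
    c = stable_id("http://example.org/specimens", "record-0002")
    print(a == b)  # same input, same identifier
    print(a == c)  # different record, different identifier
```

A random UUID (`uuid.uuid4()`) would also be unique, but it would have to be stored alongside the record, whereas a name-based one can be recomputed by any node that knows the convention.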

Page 31: Roberts u oxf_msc_200213

Page 32: Roberts u oxf_msc_200213

Modified from ‘Linking Global Names and Pro-iBiosphere’, 2013, D. J. Patterson

Nodes interconnected

• Dynamically interconnected
• Nodes with sub-discipline specific responsibilities
• Standard exchange formats
• Using UUIDs to identify content
• Ontology

Nodes are the essence of infrastructure

Page 33: Roberts u oxf_msc_200213

From Rod Page: http://t.co/WvMbD6IP

Page 34: Roberts u oxf_msc_200213

Our informatics grand challenge…

“Link together evolutionary data… by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses”

Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi:10.1016/j.tree.2011.11.001

Page 35: Roberts u oxf_msc_200213

This requires data, information & knowledge to be…

• Digital, not printed paper
• Openly accessible, not behind barriers
• Linked-up, not in silos

Page 36: Roberts u oxf_msc_200213

• 15-20k new spp. described annually (2M total) [1]
• 30k nomenclatural acts (12M total) [1]
• 20k phylogenies (750k total) [2]
• 31k taxa sequenced (360k taxa total) [3]
• 800k BioMed papers (40M total pp. of taxonomy) [4]
• Countless specimens, images, maps, keys…

Most of our output is not digital, open or linked

Typically generated by small communities for “local” research projects

Figures from [1] Zhang, Zootaxa 2011 4, 1-4; [2] Web-of-Science; [3] Genbank and [4] PubMed.

Page 37: Roberts u oxf_msc_200213

Magic

Your data Your web site

A website for you & your community

Page 38: Roberts u oxf_msc_200213

What are Scratchpads?

• Hosted websites for biodiversity data
• Virtual research & publication platform
• Completely open access & open source
• Modular & flexible

Page 39: Roberts u oxf_msc_200213

What Scratchpads are not!

• A single biodiversity database
• Restricted thematically, geographically or taxonomically
• A tool just for taxonomists
• Owned or controlled by anyone other than the data creator

Page 40: Roberts u oxf_msc_200213

How are Scratchpads funded?

[Funding timeline figure: 2007, 2011, 2014; the funder logos include ViBRANT (Virtual Biodiversity).]

Page 41: Roberts u oxf_msc_200213

Scratchpads: biodiversity online

Taxonomy & Literature
Lice, mosquitos, freeloader flies, ... (rapid upload and management of names, synonyms & bibliographic data)

Taxon descriptions & Publications
Freeloader flies, fungus gnats, ... (publication of Scratchpad data in the ZooKeys journal and export to Encyclopedia of Life)

eJournals
European Mosquito Bulletin, Phasmid Studies, ... (submission, review & dissemination of articles)

Characters, Phylogeny & Specimens
Termites, bryozoa, ... (character matrices exporting to SDD and Nexus format, phylogenies, specimen records & maps)

Image Galleries
Dragon trees, nanno fossils, cockroaches, fungi, polychaetes, ... (rapid upload, annotation & display of images)

Societies, Organisations & Projects
ICZN, GBIF, Sampled Red List Index for Plants, Global Plants Initiative ... (space for data collection, services, discussion & organisation)

[Chart: Sites and Active Users, 2007 to 2012, with ViBRANT and Scratchpads 2 marked. Users axis: 500 to 7000. Sites axis: 20 to 400.]

Page 42: Roberts u oxf_msc_200213

ViBRANT Goals

Vision
Connecting the people, data & science of biodiversity

Position
Open & sustainable development of a federated network of biodiversity informatics infrastructures

Mission
Facilitate the mobilisation, sharing, reuse and publication of biodiversity data

http://vbrant.eu

[Diagram labels: Scratchpads Virtual Research Environment; Bioclimatic modelling; Manuscript publishing; Sustainability; Data mining; Citizen science; Field recording; Sociology; Support services; Training & outreach; Data standards; Visualisation; Controlled vocabulary; Data aggregation; GBIF integration; Scratchpad hosting; Software integration; Matrix data editor; Data publishing; Communal literature; Literature mark up; Phylogeny tools; Identification tools. Further labels: Networking, Training, Standards, Mobilisation, Service, Data, Publishing, Research, Architecture, Literature.]

Page 43: Roberts u oxf_msc_200213

Nexus

DwCA

CSV/tab

Newick

EoL Transfer schema (SPM) XML

SDD, Lucid, Nexus

RDF

Taxonomic Concept Schema XML

Excel file

CSV, XLS, Microsoft Word .DOC, TXT
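
Of the formats listed on this slide, Newick is compact enough to sketch by hand. The following is an illustration only: it assumes a tree held as nested Python tuples (a representation chosen here for brevity, not one Scratchpads is known to use), and it leaves out branch lengths, internal node names and label quoting.

```python
def to_newick(node) -> str:
    """Serialise a nested-tuple tree to Newick text, without the trailing ';'."""
    if isinstance(node, str):
        return node  # leaf: a taxon label
    # Internal node: comma-separated children inside parentheses.
    return "(" + ",".join(to_newick(child) for child in node) + ")"


if __name__ == "__main__":
    tree = (("A", "B"), "C")  # placeholder labels, not real taxa
    print(to_newick(tree) + ";")
```

Nexus files wrap trees of this kind in a block structure with extra metadata, which is one reason the simpler Newick string is a common exchange format for topology alone.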

Page 44: Roberts u oxf_msc_200213

What can Scratchpads do?

• Taxon pages (generated from tagged content)
• Distribution maps (from specimens and TDWG regional distributions - Brummitt, 2001)
• Specimen records
• Bibliography management
• Images, video and sound (bulk import)
• Excel spreadsheet import
• Tabular data editing & character matrices
• Custom content
• User management
• Custom webforms
• Analytics
• Darwin Core Archive export (links to eMonocot Portal and EOL)
• EOL data import (taxonomy, species information)
• GBIF Map integration
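
Several of the features on this slide (specimen records, distribution maps, Darwin Core Archive export) rest on tabular occurrence data. As a hedged sketch, the snippet below writes a tab-separated table whose column names are Darwin Core terms. The `occurrences_to_tsv` helper, the choice of columns and the demo record are invented for illustration, and a real archive would also need its descriptor and metadata files, which are omitted here.

```python
import csv
import io

# Column names are Darwin Core terms; which ones to include is an assumption.
COLUMNS = ["occurrenceID", "basisOfRecord", "scientificName",
           "decimalLatitude", "decimalLongitude"]


def occurrences_to_tsv(records) -> str:
    """Render a list of dicts as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


if __name__ == "__main__":
    demo = [{  # invented record, not real data
        "occurrenceID": "demo-1",
        "basisOfRecord": "PreservedSpecimen",
        "scientificName": "Gadus morhua",
        "decimalLatitude": 70.0,
        "decimalLongitude": 20.0,
    }]
    print(occurrences_to_tsv(demo), end="")
```

Using shared column names is what lets a downstream aggregator read the table without site-specific mapping.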

Page 45: Roberts u oxf_msc_200213

http://www.comber.hcmr.gr

Page 46: Roberts u oxf_msc_200213

Oxford Batch Operations Engine
https://oboe.oerc.ox.ac.uk/

Page 47: Roberts u oxf_msc_200213

BDJ: The Biodiversity Data Journal

Making small data big!

Page 48: Roberts u oxf_msc_200213

Biodiversity Data Journal

A peer-reviewed open-access journal

ISSN 1314-2828 (online) ISSN 1314-2836 (print)

Launched to accelerate biodiversity data journal

http://www.pensoft.net/biodiversitydata

Editor-in-Chief: Vincent Smith, Natural History Museum, London, UK

Plazi, IPNI

1. Define the publication
2. Enter metadata
3. Select taxa & content
4. Organise manuscript
5. Submit to journal

Articles, Bibliographies, Occurrence, Taxon treatments, Taxon names

Page 49: Roberts u oxf_msc_200213

Acknowledgements

• Scratchpad technical development - Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton & Katherine Boulton
• Scratchpad outreach - Laurence Livermore, Dimitris Koureas & Isa Van de Velde
• E-Monocot - Paul Wilkin & the Kew team, Charles Godfray & the Oxford team
• ViBRANT - Vince Smith, Dave Roberts & Lucy Reeve
• Our 7,000+ users

Page 50: Roberts u oxf_msc_200213

Thank you for your attention.

Any questions
e-mail: [email protected]
e-mail: [email protected]

http://vbrant.eu
http://scratchpads.eu
http://www.slideshare.net/vibrantmanager/roberts-u-oxfmsc200213