21
Introduction to the Sweetpotato (SP) Database Infrastructure

Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Introduction to the Sweetpotato (SP) Database Infrastructure

Page 2: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Purpose & Goals

• Efficient storage of all SP breeding data

• Trial data, phenotypic and genotypic information, metadata

• Incorporate analysis tools

• Easy querying: Slice data by year, location, program, etc.

• Enable new, genome-based breeding methods

Page 3: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Rony Swennen’s banana breeding notebooks on their way to Belgium

Page 4: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

1980s: The Personal Computer Revolution

Page 5: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Excel-based data management

DB

Experiment planning & Data Capture

Data Analysis Storage

Page 6: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is
Page 7: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is
Page 8: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Data Management with Excel

Plusses

Easy to use

User has complete control

Minuses

Sharing Data

Combining Data

Data Analysis difficult across sheets

Data Integrity

Becomes difficult to manage for large datasets

Difficult to manage genotyping information

Page 9: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is
Page 10: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Data management using the “cloud”

• Most data is managed using the web

– Youtube for videos

– Google for documents

– Flickr, iCloud etc. for photos

– Twitter for status updates

– etc.

• What about breeding data?

Page 11: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Data Infrastructure

Main Datastore

• https://sweetpotatobase.org/

Analysis (integrated with sweetpotatobase)

• http://hidap.sgn.cornell.edu/

Data Collection

• Android FieldBook App

• AccuDataLogger

Page 12: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

SWEETPOTATO BASE

ACCESSIONS

FIELD LAYOUT

LOCATIONS

DATA EXPORT, ANALYSES IN R,

HIDAP

GENOTYPING

CROSSES

PHENOTYPING

Page 13: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

BrAPI

• The Breeding API

• Application Programming Interface

• Standardized way to exchange breeding data between applications over the internet

• http://brapi.org/

SWEETPOTATO BASE

HIDAP

Page 14: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Database Requirements

Standardization of

• Trait dictionary and measurement procedures

• Ontologies (Reinhard)

• Naming of plant accessions

Page 15: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Genotyping data

• Can be extremely voluminous data

• (GBS, etc)

• Must be integrated with phenotypic data

• For Cassavabase, already 1.5 billion data points

• Hard to manage without a database

Page 16: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Data Security on the Web

• Data protected by logins

• Different levels of user access privileges

• Regular on-site and off-site backups

Page 17: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Open Data Policy

• Data on Sweetpotatobase are open

• Need a user account to download

• Downloads are tracked by account

Page 18: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is
Page 19: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

https://solgenomics.net

https://cassavabase.org

https://github.com/solgenomics

https://citrusgreening.org

https://yambase.org https://sweetpotatobase.org https://musabase.org

Page 20: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

“Cloud” approach

Plusses

Easy to use through a web browser

No software installation necessary

Software is continuously updated

All data automatically integrated

Query over several years, locations, etc. possible

Integrates phenotypic with genotypic data

Minuses

Internet needs to work

Page 21: Introduction to the Sweetpotato (SP) Database Infrastructure · ³&ORXG´DSSURDFK µ Plusses µ Easy to use through a web browser µ No software installation necessary Software is

Asante!