@nataliestanford
SEEKing our way to better presentation of data and models
from scientific investigations.
Carole Goble
Stuart Owen
Jacky Snoep
Wolfgang Mueller
Olga Krebs
Quyen Nguyen
Natalie Stanford
Katy Wolstencroft
Peter Kunszt
Bernd Rinn
also contributing: VLN SEEK team
also contributing: UK SEEK team
Systems biology projects produce complex and heterogeneous datasets.
The data is stored in convenient but non-standard formats.
This is the case for each researcher within groups across large consortia
projects.
Consortia
Grp 3
Grp 1
Grp 2
The data contained within the files can be very ambiguous.
Sharing within labs, across projects, and publicly becomes difficult.
The availability and reusability of the data in the long-term is compromised.
This all leads to issues with conveying what a project has achieved to
funders.
• Papers?
• Data produced?
• Discoveries?
• Presentations?
• Workshops?
• Tutorials?
Defining the success and impact of a
project.
We need better ways of formatting, storing, and sharing
data and models.
SEEK is a commons originally designed for centralizing information and assets for large
consortia projects.
Each user has their own profile.
…and their data and models are uploaded to projects within the SEEK database.
SEEK has varied functionality.
Yellow pages, manage SOPs and
link to investigations, studies, assays, specimens and
samples.
Find my peers.
Creating and sharing SOPs
across projects.
Track my specimens.
Track different
versions of my model.
Data viewing functionality; ISA
framework for linking studies to
data, models, SOPs, samples,
publications.
Browse experimental data
without downloading
them.
How data, models and SOPs fit
together.
Which data belong with
which publication.
It works as an aggregated asset manager, allowing assets to be stored on SEEK or linked
from disparate databases.
It allows published work and all associated data and files to be organised in an ISA (Investigation, Study, Assay) format.
[Figure: ISA hierarchy of Investigations, Studies, and Assays. Labels: Construction, Validation, Metabolomics, Mass Spec, Transcriptomics, Proteomics, Fluxomics. Source: Towards Interoperable Bioscience Data, Nature Genetics, 2012]
The ISA structure reflects an intuitive structure and storage of scientific findings.
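The nesting described above can be sketched as a small data structure. This is only an illustrative sketch: the class names, field names, and example file names are assumptions made for this example and are not SEEK's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    # An assay groups the assets (data files, models, SOPs) tied to one
    # experimental or modelling step. Assets are plain strings in this sketch.
    title: str
    assets: List[str] = field(default_factory=list)


@dataclass
class Study:
    # A study groups related assays.
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    # An investigation is the top level and groups related studies.
    title: str
    studies: List[Study] = field(default_factory=list)

    def all_assets(self) -> List[str]:
        """Collect every asset reachable from this investigation, in order."""
        return [
            asset
            for study in self.studies
            for assay in study.assays
            for asset in assay.assets
        ]


# Hypothetical example content, for illustration only.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study("Model construction", [Assay("Metabolomics", ["metabolites.xlsx"])]),
        Study("Model validation", [Assay("Proteomics", ["proteins.xlsx", "sop.pdf"])]),
    ],
)

print(investigation.all_assets())
```

Walking the tree from the investigation downwards is what makes it possible to answer questions such as which data belong with which study.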
SEEK also integrates with other tools.
We have now set up FAIRdom to further develop SEEK as an open platform where all assets can be uploaded and linked to
with a DOI.
“There is no greater impediment to the advancement of knowledge than
the ambiguity of words.”
-Thomas Reid + Natalie Stanford
Data + Models.
The data contained within the files can be very ambiguous.
There are many Systems Biology standards available.
[Figure: grid of standards (Nicolas Le Novere): Minimal Information, Standard Formats, and Ontologies across Data, Models, Simulation, and Results. Labels: MAGE-TAB, RDF annotations]
...But the barrier to researchers using standard formats and annotation can seem great.
There are tools available to assist users.
We develop RightField, a semantic annotation tool for data files.
We use it to generate templates for different types of assay data.
Excel workbook loaded into RightField with multiple worksheets
Suitable ontologies are selected and used to annotate cells for associated data input.
Selected parent term from the ontology
Methods for specifying ontology terms
Term lists for selected cells
Value Type and Property
Scientists can use the templates in Excel, where the annotations take the form of drop-down menus or data entry
cells.
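The idea of restricting some cells to a fixed list of terms while leaving others open for data entry can be sketched in a few lines. This is not RightField's implementation or file format: the template layout, column names, and term lists below are hypothetical and chosen only to illustrate the principle.

```python
from typing import Dict, List, Optional, Set

# Hypothetical template: column name -> permitted terms.
# None marks a free data entry column with no term restriction.
TEMPLATE: Dict[str, Optional[Set[str]]] = {
    "organism": {"Saccharomyces cerevisiae", "Escherichia coli"},
    "unit": {"millimolar", "micromolar"},
    "value": None,
}


def validate_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the row fits the template."""
    problems = []
    for column, allowed in TEMPLATE.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif allowed is not None and row[column] not in allowed:
            problems.append(f"{column}: '{row[column]}' is not a permitted term")
    return problems


# A row using only permitted terms, and one using an ambiguous abbreviation.
good = {"organism": "Escherichia coli", "unit": "millimolar", "value": "0.5"}
bad = {"organism": "E. coli", "unit": "millimolar", "value": "0.5"}

print(validate_row(good))
print(validate_row(bad))
```

A drop-down menu enforces the same constraint at entry time, so ambiguous labels such as an abbreviated organism name never reach the file in the first place.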
Tools like RightField are lowering the barriers to uptake for generating formatted and annotated data and models.
“Ruin is the destination toward which all men rush, each pursuing his own best interest in a society that believes in the freedom
of the commons.”
- Garrett Hardin, The Tragedy of the Commons.