Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Taverna, Biocatalogue, and myExperiment: a three-legged foundation for effective collaboration in E-science

Page 1: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

GeoClouds workshop, Indianapolis, IN, Sept. 17, 2009 - P. Missier

A collaborative talk by Paolo Missier, Information Management Group, School of Computer Science, University of Manchester, UK

with additional material kindly shared by:
Prof. Dave DeRoure and David Newman, University of Southampton
Prof. Carole Goble and the e-Labs design group, University of Manchester

Scientific Workflow Management System

Taverna, Biocatalogue, and myExperiment: a three-legged foundation for effective collaboration in E-science

Sunday, 13 March 2011

Page 2: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

ESIP meeting, Santa Barbara, CA, July 2009 - P. Missier

What is the myGrid Project?

• UK e-Science pilot project since 2001.
• Centred at Manchester, Southampton and the EMBL-EBI
• Part of Open Middleware Infrastructure Institute UK, http://www.omii.ac.uk

• Mixture of developers, bioinformaticians and researchers
• An alliance of contributing projects and partners
• Open source development and content (LGPL or BSD)
• Infrastructure
• We don’t own any resources (apart from catalogues)
• Or a Grid.

Page 3: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Graphical Workbench
For Professionals

• Plug-in architecture
• Nested Workflows
• Drag and Drop
• Wiring together

Taverna

• Rapidly incorporate new services without coding.
• Not restricted to predetermined services
• Access to local and remote resources and analysis tools
• 3500+ service operations available at start-up

Page 4: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

What do Scientists use Taverna for?

• Systems biology model building
• Proteomics
• Sequence analysis
• Protein structure prediction
• Gene/protein annotation
• Microarray data analysis
• QTL studies
• QSAR studies
• Medical image analysis
• Public Health care epidemiology
• Heart model simulations
• High throughput screening
• Phenotypical studies
• Phylogeny
• Statistical analysis
• Text mining

Astronomy, Music, Meteorology

• Netherlands Bioinformatics Centre
• Genome Canada Bioinformatics Platform
• BioMOBY
• US FLOSS social science program
• RENCI
• SysMO Consortium
• French SIGENAE farm animals project
• ThaiGrid
• CARMEN Neuroscience project
• SPINE consortium
• EU Enfin, EMBRACE, BioSapiens, Casimir
• EU SysMO Consortium
• NERC Centre for Ecology and Hydrology
• Bergen Centre for Computational Biology
• Max-Planck institute for Plant Breeding Research
• Genoa Cancer Research Centre
• AstroGrid

30 USA academic and research institutions

Page 5: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Who else is in this space?

Kepler

Triana

BPEL

Ptolemy II

Taverna

Trident

BioExtract

Page 6: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Socially share, discover and reuse workflows and other methods.

Cooperative bazaar.

• Sunday 10th May: 1748 registered users, 143 groups, 669 workflows, 197 files, 52 packs
• 56 different countries. Top 4: UK, US, The Netherlands, Germany

www.myexperiment.org

Page 7: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Page 8: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Page 9: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Why data provenance matters, if done right

• To establish quality, relevance, trust

• To track information attribution through complex transformations

• To describe one’s experiment to others, for understanding / reuse

• To provide evidence in support of scientific claims

• To enable post hoc process analysis for improvement, re-design

The W3C Incubator on Provenance has been collecting numerous use cases: http://www.w3.org/2005/Incubator/prov/wiki/Use_Cases#

Linköping, Sweden -- January 2010

Page 10: Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Goals, expected contributions

• Established technology provider - open-source
– traditionally active in the bioinf space
– but also involved in the e-Lico EU project (data mining portal)
– large community base, established production environment

• Main goal:
– to offer our workflow and workflow repository technology, put it to the test on the challenges of data preservation pipelines

• Challenges:
– expect new requirements on our current technology
    • robust, high-volume data pipelines
    • workflow provenance -- process evolution
    • data provenance
