Beyond Transparency: Success & Lessons From tambisBoston2003



Invited talk about TAMBIS at a conference in Boston, 2003


1. Beyond Transparency: Success & Lessons From TAMBIS
    Robert Stevens
    bioHealth Informatics Group, University of Manchester, UK
    funded by: EPSRC/BBSRC/AstraZeneca Pharmaceuticals

2. Introduction
    What is the problem?
    What does TAMBIS do?
    How does it work?
        o Middleware
        o Metadata
        o Ontologies
    Outcomes and Lessons
    Next Steps, GRIDology and the Semantic Web

3. Take Homes
    TAMBIS aims to provide the illusion of:
        o A single query language.
        o A single data model.
        o A single location for distributed bio information sources.
    The illusion is called transparency.
    Interoperating resources (by people or systems) requires descriptions of their information (metadata) and a consistent shared understanding of what the metadata means (an ontology).
    Biologists pose a conceptual question against an ontology; the question gets rewritten into a coordinated plan of requests to multiple information sources and tool invocations ~ middleware
    The illusion is high pain, high gain in a highly autonomous and changeable environment where the sources hinder rather than help.
    Transparency -> semi-transparency
    The ontologies and advanced knowledge representation techniques turn out to be our greatest outcome!
    An example of classic GRIDology ~ metadata, middleware, ontology

4. What is the problem?
    Bioinformatics is the use of computational techniques for the consolidation and analysis of experimental data in biology
    The bio community is distributed and shares data and tools
    The global bioinformatics infrastructure is piecemeal
    The sources and tools are poorly integrated and difficult to use together

5. What is the problem?
    (Diagram labels: SQL; appropriate questions to each tool; file searches; databases; on-line services; files)
    There are over 500 biological information sources world-wide
    But this data is only useful with sensible access mechanisms
    The biologist must:
        o phrase the question for each information system
        o interpret the answers received from the different sources

6. The biologist's query task
    Identify sources and their locations
    Identify the content/function of sources
    Recognise components of a query and target them to appropriate sources in the optimal order
    Communicate with sources
    Transform data between source formats
    Express syntactically complex queries appropriate to each source
    Merge and link results from different sources

7. A solution...
    A common interface to all these information sources

8. SRS

9. BioNavigator - graphical workflow

10. BioKleisli multidatabase queries
        o A syntactically consistent view
        o The multidatabase query language, CPL
    usescript "cpl_libs/tambislib.cpl";
    htmlout("tambis.html", "html", {m | p