Michigan Life Sciences Corridor
Bioinformatics, University of Michigan
March 14, 2001

Building Analysis Environments Beyond the Genome and the Web

Bruce R. Schatz
CANIS Laboratory
School of Library & Information Science
School of Biomedical & Health Information Sciences
University of Illinois at Urbana-Champaign
[email protected] , www.canis.uiuc.edu

Page 1: Building Analysis Environments Beyond the Genome and the Web

Michigan Life Sciences Corridor
Bioinformatics, University of Michigan

March 14, 2001

Building Analysis Environments Beyond the Genome and the Web

Bruce R. Schatz
CANIS Laboratory
School of Library & Information Science
School of Biomedical & Health Information Sciences
University of Illinois at Urbana-Champaign
[email protected] , www.canis.uiuc.edu

Page 2: Building Analysis Environments Beyond the Genome and the Web

Technological Progress

In the past decade, technology has created the Genome and the Web
  In 1991, these ideas were only plans
  In 2001, they have already progressed from research systems to commercial products

In the next decade, the revolution will actually begin and the world will be completely different!

Page 3: Building Analysis Environments Beyond the Genome and the Web

Paradigm Shift (Pre)

Towards Dry-Lab Biology, Walter Gilbert (Jan 1991)

“The new paradigm, now emerging, is that all the 'genes' will be known (in the sense of being resident in databases available electronically), and that the starting point of a biological investigation will be theoretical. An individual scientist will begin with a theoretical conjecture, only then turning to experiment to follow or test that hypothesis. ...

To use this flood of knowledge [the total sequence of the human and model organisms], which will pour across the computer networks of the world, biologists not only must become computer-literate, but also change their approach to the problem of understanding life. ...”

The Coming of Informational Science
Correlation of Information across Sources

Page 4: Building Analysis Environments Beyond the Genome and the Web

Paradigm Shift (Post)

Dissecting Human Disease, Victor McKusick (Feb 2001)

Structural genomics -> Functional genomics
Genomics -> Proteomics
Map-based gene discovery -> Sequence-based gene discovery
Monogenic disorders -> Multifactorial disorders
Specific DNA diagnosis -> Monitoring susceptibility
Analysis of one gene -> Analysis of multi-gene pathways
Gene action -> Gene regulation
Etiology (mutation) -> Pathogenesis (mechanism)
One species -> Several species

Page 5: Building Analysis Environments Beyond the Genome and the Web

Analysis Environments I

The Present -- Year 2001

Search Central Archives

Locating a Generic (average) solution
  mining sequences from the Genome
  diagnosing diseases from the Clinical Trial

some Problems may have point Solutions
  find the cystic fibrosis gene
  find the diabetes treatment

Page 6: Building Analysis Environments Beyond the Genome and the Web

Analysis Environments II

The Future -- Year 2011

Navigate Distributed Repositories

Locating a Specific (situational) solution
  correlating sequences, genes, expressions
  correlating diagnoses, treatments, lifestyles

most Problems have cluster Solutions
  find genes for Heart Disease
  find treatments for Arthritis

Page 7: Building Analysis Environments Beyond the Genome and the Web

Testbeds of the Future

WCS -- a testbed for the world of 2001
  community repositories before the Web
  in 1991, a distributed analysis environment

MCS -- a testbed for the world of 2011
  concept navigation before the Interspace
  in 2001, a biomedical analysis environment
  to enable Michigan Corridor faculty and students to live in the world of the future (information space)

Page 8: Building Analysis Environments Beyond the Genome and the Web

Community Systems

browse and share all the knowledge of a community

Formal                                Informal
data (database management)            results (electronic mail)
literature (information retrieval)    news (bulletin boards)

knowledge (hypertext annotations)

Page 9: Building Analysis Environments Beyond the Genome and the Web

Worm Community System

WCS Information:
  Literature: BIOSIS, MEDLINE, newsletters, meetings
  Data: Genes, Maps, Sequences, strains, cells

WCS Functionality:
  Browsing: search, navigation
  Filtering: selection, analysis
  Sharing: linking, publishing

WCS: 250 users at 50 labs across Internet (1991)

Page 10: Building Analysis Environments Beyond the Genome and the Web

WCS Molecular

Page 11: Building Analysis Environments Beyond the Genome and the Web

WCS Cellular

Page 12: Building Analysis Environments Beyond the Genome and the Web

WCS Publishing

Page 13: Building Analysis Environments Beyond the Genome and the Web

WCS Linking

Page 14: Building Analysis Environments Beyond the Genome and the Web

WCS invokes

gm

Page 15: Building Analysis Environments Beyond the Genome and the Web

WCS vis-à-vis

acedb

Page 16: Building Analysis Environments Beyond the Genome and the Web

WCS PPCS

demo

Page 17: Building Analysis Environments Beyond the Genome and the Web

A Model Community

1984-1988 Telesophy (Bellcore)
  prototype to federate objects

1989-1994 WCS (Arizona)
  testbed in molecular biology

National Model for Biomedical Informatics
  NAS National Collaboratories report
  NIH Human Brain project

Translational Results
  NCSA Mosaic into Web browsers
  acedb (worm) into Genome databases
  Biology Workbench, 10K users across Web

Page 18: Building Analysis Environments Beyond the Genome and the Web

THE THIRD WAVE OF NET EVOLUTION

PACKETS

OBJECTS

CONCEPTS

Page 19: Building Analysis Environments Beyond the Genome and the Web

Towards the Interspace

from Objects to Concepts

from Syntax to Semantics

Infrastructure is Interaction with Abstraction

Internet is packet transmission across computers

Interspace is concept navigation across repositories

Page 20: Building Analysis Environments Beyond the Genome and the Web

COMPUTING CONCEPTS

'92: 4,000 (molecular biology)

'93: 40,000 (molecular biology)

'95: 400,000 (electrical engineering)

'96: 4,000,000 (engineering)

'98: 40,000,000 (medicine)

Page 21: Building Analysis Environments Beyond the Genome and the Web

Simulating a New World

Obtain discipline-scale collection
  MEDLINE from NLM, 10M bibliographic abstracts
  human classification: Medical Subject Headings

Partition discipline into Community Repositories
  4 core terms per abstract for MeSH classification
  32K nodes with core terms (classification tree)

Community is all abstracts classified by core term
  40M abstracts containing 280M concepts
  concept spaces took 2 days on NCSA Origin 2000

Simulating World of Medical Communities
  10K repositories with > 1K abstracts (1K w/ > 10K)
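
The partitioning described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the CANIS implementation: the record layout, the function name `partition_by_core_term`, and the three toy abstracts are hypothetical. It shows why an abstract with several core terms lands in several repositories, which is consistent with 10M abstracts at 4 core terms each giving 40M repository entries.

```python
from collections import defaultdict

# Hypothetical toy records: (abstract_id, core_terms).
# In the simulation each MEDLINE abstract carries about 4 core MeSH terms.
abstracts = [
    ("a1", ["Arthritis, Rheumatoid", "Analgesics"]),
    ("a2", ["Analgesics", "Gastrointestinal Hemorrhage"]),
    ("a3", ["Arthritis, Rheumatoid"]),
]


def partition_by_core_term(records):
    """Map each core term to the ids of all abstracts classified under it."""
    repositories = defaultdict(list)
    for abstract_id, core_terms in records:
        for term in core_terms:
            repositories[term].append(abstract_id)
    return dict(repositories)


repositories = partition_by_core_term(abstracts)

# Each abstract is counted once per core term, so entries exceed abstracts.
total_entries = sum(len(ids) for ids in repositories.values())

# Scale check with the slide's figures: 10M abstracts x 4 core terms each.
simulated_entries = 10_000_000 * 4
```

Each key of `repositories` plays the role of one community repository; at full scale the keys would be nodes of the MeSH classification tree.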

Page 22: Building Analysis Environments Beyond the Genome and the Web

Concept Navigation

Semantic Indexes for Community Repositories

Navigating Abstractions within Repository
  concept space
  category map

Interactive browsing by Community experts

Page 23: Building Analysis Environments Beyond the Genome and the Web

Interspace Remote Access Client

Page 24: Building Analysis Environments Beyond the Genome and the Web

Navigation in MEDSPACE

For a patient with Rheumatoid Arthritis
  Find a drug that reduces the pain (analgesic)
  but does not cause stomach (gastrointestinal) bleeding

Choose Domain

Page 25: Building Analysis Environments Beyond the Genome and the Web

Concept Search

Page 26: Building Analysis Environments Beyond the Genome and the Web

Concept Navigation

Page 27: Building Analysis Environments Beyond the Genome and the Web

Retrieve Document

Page 28: Building Analysis Environments Beyond the Genome and the Web

Navigate Document

Page 29: Building Analysis Environments Beyond the Genome and the Web

Retrieve Document

Page 30: Building Analysis Environments Beyond the Genome and the Web

Concept Switching

In the Interspace…

each Community maintains its own repository

Switching is navigating Across repositories

use your specialty vocabulary to search another specialty
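
One way to picture switching is the sketch below. The slides do not say how switching is computed, so everything here is an assumption: each repository's concept space is modelled as a table of term co-occurrence counts, and the function names `build_concept_space` and `switch_concept`, along with the two toy repositories, are hypothetical. A term from your own specialty is expanded to its co-occurring neighbours, and the neighbours that the other specialty also indexes become the bridge terms for searching there.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(documents):
    """Count how often each pair of terms co-occurs in the same document."""
    space = {}
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            space.setdefault(a, Counter())[b] += 1
            space.setdefault(b, Counter())[a] += 1
    return space


def switch_concept(term, source_space, target_space):
    """Rank source-space neighbours of `term` that the target space also indexes."""
    neighbours = source_space.get(term, Counter())
    shared = {t: w for t, w in neighbours.items() if t in target_space}
    # Strongest co-occurrence first; ties broken alphabetically.
    return sorted(shared, key=lambda t: (-shared[t], t))


# Hypothetical toy repositories: one list of index terms per document.
rheumatology = [
    ["arthritis", "analgesic", "inflammation"],
    ["arthritis", "analgesic"],
    ["arthritis", "joint pain"],
]
gastroenterology = [
    ["analgesic", "gastrointestinal bleeding"],
    ["inflammation", "ulcer"],
]

source = build_concept_space(rheumatology)
target = build_concept_space(gastroenterology)
bridges = switch_concept("arthritis", source, target)
```

Here `bridges` holds the rheumatology terms that can be carried into the gastroenterology repository; "joint pain" is dropped because the target repository does not index it.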

Page 31: Building Analysis Environments Beyond the Genome and the Web

Biomedical Session

Page 32: Building Analysis Environments Beyond the Genome and the Web

Categories and Concepts

Page 33: Building Analysis Environments Beyond the Genome and the Web

Concept Switching

Page 34: Building Analysis Environments Beyond the Genome and the Web

Document Retrieval

Page 35: Building Analysis Environments Beyond the Genome and the Web

Towards A Model Discipline

1995-1999 Interspace (Illinois, Urbana)
  prototype to federate concepts

2000-2004 MEDSPACE (Illinois, Chicago)
  testbed in clinical medicine (plan, demo)

National Model for Biomedical Informatics
  lead news in Science on MEDLINE dry-run
  Best Paper at AMIA (Medical Informatics)

2001-2005 MCS (Michigan)
  testbed in biomedical research

Page 36: Building Analysis Environments Beyond the Genome and the Web

Michigan Interspace

Gather the Information Sources
  Michigan Corridor System (MCS)
  each (department, institute, lab) has repository

Generate the Community Repositories
  text documents with articles and annotations
  specialty datatypes: databases and motifs

Construct the Analysis Environment
  federated concept navigation across repositories
  type-dependent parsing for text/data interlinks

Page 37: Building Analysis Environments Beyond the Genome and the Web

MCS Sources

Literature
  Journals: MEDLINE, BIOSIS, full-text
  Specialty Conferences (e.g. Neuroscience)
  Community Newsletters, Lab Annotations

Databases
  Sequences: GENBANK, Celera
  Genes and Maps from Model Organisms
  Microarray Expressions, Protein Structures
  Gene Pathways, Cellular Anatomy

Page 38: Building Analysis Environments Beyond the Genome and the Web

Ten Steps from Here to There

1. Determine Users (range of needs)
2. Develop Hardware (networks)
3. Determine Collections (range of types)
4. Develop Software (databases)
5. Interlinks Automatic (name recognition)
6. Interlinks Manual (distributed annotation)
7. Community Literature (journals, conferences)
8. Concept Navigation (indexing, switching)
9. Custom Databases (community datasets)
10. Custom Software (specialized analysis)
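
The "Interlinks Automatic (name recognition)" step could look like the following sketch. It is an assumption about what such a step might involve, not the method used in WCS or MCS: the lookup table `GENE_RECORDS`, its record ids, and the function `find_interlinks` are hypothetical, and the matching is plain dictionary lookup with a regular expression. It scans text for known names and returns the database record each one should link to.

```python
import re

# Hypothetical lookup table from gene names to database record ids.
GENE_RECORDS = {"unc-22": "gene:0001", "lin-12": "gene:0002"}


def find_interlinks(text, records=GENE_RECORDS):
    """Return (name, record_id, start_offset) for each known name in text."""
    if not records:
        return []
    # Longest names first so a longer name wins over any prefix of it.
    names = sorted(records, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookarounds reject matches embedded in a longer token (e.g. unc-222).
    pattern = re.compile(r"(?<![\w-])(" + alternatives + r")(?![\w-])")
    return [
        (m.group(1), records[m.group(1)], m.start(1))
        for m in pattern.finditer(text)
    ]


sample = "Strain notes mention unc-22 and lin-12, but unc-222 is not a listed name."
links = find_interlinks(sample)
```

Each tuple in `links` gives a text offset and a record id, which is enough to turn the matched name into a link from the document to the database entry.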

Page 39: Building Analysis Environments Beyond the Genome and the Web

Bioinformatics Center

Institute for Biological Information Systems
  develop new information systems
  deploy to study biological systems
  integrated analysis for biological information
  analysis environment for community repositories

Interspace technologies support Communities
  Basic Science: Individual Genomes
  Clinical Practice: Individual Patients

Page 40: Building Analysis Environments Beyond the Genome and the Web

IBIS New Glory

Institute for Biological Information Systems
  unique facility for all Michigan laboratories
  interactive systems training for all levels

IBIS reborn
  Thoth, sacred ibis who hatched the world
  inventor of writing, keeper of divine archives
  inventor of arts & sciences, medicine & surgery

First of the magicians, he was called the Elder:

His disciples claimed access to the crypt where he kept his books of magic, so they undertook to decipher and learn “these formulas which commanded all the forces of nature and subdued the very gods themselves”.