
Copyright 2000 Russ Altman Early Challenges in Building an Ontology for Pharmacogenomics Russ B. Altman Stanford Biomedical Informatics Stanford University


Early Challenges in Building an Ontology for Pharmacogenomics

Russ B. Altman
Stanford Biomedical Informatics
Stanford University

[email protected]
http://www.smi.stanford.edu/people/altman/
http://pharmgkb.org/


Outline

1. PharmGKB: challenges

2. Preliminary result: RiboWEB ontology for experimental data.

3. Method for PharmGKB data model

4. A word on infrastructure.


Pharmacogenetics

• Understand how genetic variation leads to variation in responses to drugs.

• One of the promises of the genome project

• Pharmacogenomics = interacting systems of genes determining responses.

• Some high-profile examples derived from dramatic phenotypes, variants found.


[Figure: flow from drug administration to clinical response]

Pharmacokinetics: Drug administered; ABSORPTION; Drug concentration in systemic circulation; DISTRIBUTION; Drug in tissues of distribution; ELIMINATION; Drug metabolized or excreted; Drug concentration at site of action

Pharmacodynamics: Pharmacologic effect; Clinical response (Toxicity, Efficacy)


[Figure: Pharmacogenetics Database at the center, linked to research areas]

Cyto P450; Methyltransferases; Transporters; Steroid receptors; Leukotriene metabolism; Adrenergic receptors; Sulfurtransferases; ???

NIGMS Pharmacogenetic Research Network & Database
http://www.nih.gov/grants/guide/rfa-files/RFA-GM-99-004.html


[Figure: categories of information in the PharmGKB data model]

Main areas: Genomic Information; Molecular and Cellular Phenotype; Clinical Phenotype; Drug Response Systems

Other labels: Molecules; Individuals; Alleles; Variations in genome; Protein products; Observable Phenotypes (shown twice); Role in organism; Genetic makeup; Environment; Nongenetic factors; Coding Relationship; Pharmacologic activities; Physiology; Isolated functional measures; Integrated functional measures; Drugs; Molecular variations; Treatment protocols


[Figure: data acquisition cycle for PharmGKB]

Stages:
1. Data Model for PharmGKB
2. Templates for data acquisition
3. Deployed data acquisition forms
4. New data for PharmGKB
5. Data Stored within PharmGKB

Forward steps:
--Use data model to automatically generate
--Translate into executable HTML forms
--Make available to scientists in research network
--Store fully linked new data into PharmGKB

Feedback steps:
--Templates inadequate, change data model
--Data acquisition forms inadequate
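The step from data model to HTML forms can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the PharmGKB implementation: the template layout, the class name `GenotypeObservation`, the slot names, and the `/submit` action are all hypothetical.

```python
from html import escape

# Hypothetical acquisition template derived from a data model class.
# The class name and slots are illustrative, not PharmGKB's actual schema.
GENOTYPE_TEMPLATE = {
    "class_name": "GenotypeObservation",
    "slots": [
        {"name": "gene", "label": "Gene", "kind": "text"},
        {"name": "allele", "label": "Allele", "kind": "text"},
        {"name": "subject_sex", "label": "Sex", "kind": "choice",
         "choices": ["Male", "Female"]},
    ],
}


def render_form(template, action="/submit"):
    """Translate a template into an HTML form, one control per slot."""
    rows = []
    for slot in template["slots"]:
        name = escape(slot["name"], quote=True)
        label = escape(slot["label"])
        if slot["kind"] == "choice":
            # Enumerated slots become a drop-down list.
            options = "".join(
                f'<option value="{escape(c, quote=True)}">{escape(c)}</option>'
                for c in slot["choices"]
            )
            control = f'<select name="{name}">{options}</select>'
        else:
            # Everything else falls back to a free-text field.
            control = f'<input type="text" name="{name}">'
        rows.append(f"<label>{label} {control}</label>")
    body = "\n".join(rows)
    return (
        f'<form method="post" action="{escape(action, quote=True)}">\n'
        f"<h2>{escape(template['class_name'])}</h2>\n"
        f"{body}\n"
        '<input type="submit" value="Submit">\n'
        "</form>"
    )


if __name__ == "__main__":
    print(render_form(GENOTYPE_TEMPLATE))
```

Because the form is generated rather than hand-written, a change to the template (the "templates inadequate" feedback step) only requires regenerating the form.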


Preliminary work: RiboWEB

Hypothesis: Ontology with three main components can support 3D modeling

1. Experimental data
2. Physical Objects
3. Reference information


RiboWEB Architecture

[Figure: architecture components]

Server; Interface generator; Session manager; Computational modules; Knowledge base


Knowledge Base Summary

171 journal articles

~30 templates for experimental data

15,000 instances of objects, people, data

8000 experimental data items


RiboWEB Control Panel


Avg Error vs. Three experiment types (all data)

[Chart: Average Number of Standard Deviations of Models By Experiment Type for All Data. X axis: Experimental Data (Cleavage Data, Crosslinking Data, Footprinting Data). Y axis: Average Number of Standard Deviations, scale 0 to 12, with one off-scale value labeled 27.7. Series: A, B, C, D, E.]

OH* CLEAVAGE; CROSSLINK; FOOTPRINTS


Two Methods for Building the Ontology for PharmGKB

1. TOP DOWN (borrow from RiboWEB)

--Physical Objects
--Experimental Data
--Reference terminologies

2. BOTTOM UP (read the grants!)

--Particular genes
--Individual small molecules
--Experimental methods


Reference Terminologies

Principle: don’t reinvent things we don’t have to!

1. ICD-9 for diseases
2. SNOMED/RDC codes for symptoms
3. EC and Cytochrome classification system
4. GO
5. Cellular localization vocabulary
6. SMILES for small compounds
7. Others...

These are imported on an “as needed basis” for now. Normally don’t have instances in our KB of terminology classes.
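One way to picture "import as needed, without instances" is to store only a reference (vocabulary plus code) to each external term the first time it is used. The sketch below is an assumption about how that could look, not the PharmGKB design; `TermRef`, `import_term`, and the short vocabulary keys are hypothetical names.

```python
from dataclasses import dataclass

# Vocabularies named on the slide; the short keys are hypothetical labels.
KNOWN_SYSTEMS = {"ICD-9", "SNOMED", "EC", "GO", "SMILES"}


@dataclass(frozen=True)
class TermRef:
    """A pointer to a term in an external vocabulary, not a copy of it."""
    system: str
    code: str

    def __post_init__(self):
        if self.system not in KNOWN_SYSTEMS:
            raise ValueError(f"unknown terminology: {self.system}")
        if not self.code:
            raise ValueError("code must be non-empty")


# Only the references that data items actually use are recorded.
imported = {}


def import_term(system, code):
    """Return the shared reference for (system, code), creating it on first use."""
    ref = TermRef(system, code)
    return imported.setdefault((system, code), ref)


if __name__ == "__main__":
    ethanol = import_term("SMILES", "CCO")  # "CCO" is the SMILES string for ethanol
    print(ethanol, len(imported))
```

Keeping only references means the external vocabularies stay authoritative and the knowledge base never needs its own copy of their class hierarchies.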


Non-ontological issues for PharmGKB

1. Putting patient data on the web…non trivial. Need technologies for data scrubbing and aggregation to protect privacy/confidentiality.

2. Intellectual property protection for research groups.

-We are establishing submission policies for both genotype and phenotype information.

3. Relationships with other databases: dbSNP, GENBANK, PDB and assessing adequacy of our ontology for those communications.
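As a rough illustration of "data scrubbing and aggregation" from point 1, the sketch below drops direct identifiers and reports only group counts, suppressing small groups. The slide does not say which technique is intended, so this is a swapped-in, generic approach; the field names, the threshold of 3, and the sample records are all made up for the example.

```python
from collections import Counter

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address"}
# Hypothetical threshold: groups smaller than this are not reported.
MIN_CELL_SIZE = 3


def scrub(record):
    """Return a copy of the record without direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def aggregate(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Count scrubbed records per combination of keys, hiding small groups."""
    counts = Counter(tuple(r[k] for k in keys) for r in map(scrub, records))
    return {cell: n for cell, n in counts.items() if n >= min_cell_size}


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [
        {"name": f"p{i}", "address": "n/a", "genotype": "wt/wt", "response": "efficacy"}
        for i in range(3)
    ] + [
        {"name": "p9", "address": "n/a", "genotype": "var/var", "response": "toxicity"}
    ]
    print(aggregate(records, keys=("genotype", "response")))
```

The single `var/var` record is withheld because a group of one could identify the patient even after names and addresses are removed.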


Infrastructure for Ontology Modeling

SOPHIA
--academic-level software
--ACCESS97 and SQL Server backends
--uses ASP and Visual Basic for logic
--PERL interface (serialized)
--simple security/access control model
--web-based browser
--web-based editor
--available for distribution

Not the current platform for development
--speed and robustness worries...


Infrastructure for Ontology Modeling

Protégé 2000
--professional-level software
--runs in Java on many platforms
--nice Java class structure for KB
--somewhat simplistic ORACLE backend
--see talk this afternoon

The current platform for development
--add Java servlet for API to app writers
--add access security model for classes/slots
--create major league ORACLE backend
--tune performance


RiboWEB Thanks

Michael Bada: RiboWEB knowledge base contents

Michelle Carrillo: RiboWEB analysis modules

Allison Waugh: RNA XML format

Yali Wang: SOPHIA infrastructure

ALSO: Harry Noller, Madison Syntax Ad Hoc Group...


PharmGKB Thanks

Teri Klein (Project Director)
Daniel Rubin (Ontology design)
Farhad Shafa (Datacenter Director)
Ray Fergerson (Protégé-2000 Architect)
Marty Mayberry (KB infrastructure)
Micheal Hewett (KB infrastructure)
Mildred Cho (Bioethics)
Mark Musen (Co-PI)

Josh Stuart, Soumya Raychaudhuri, Jeff Chang, Diane Oliver, Vincent Liu, Irene Liu, Zhen Lin, Louisa Crosby


Basics of Frame Representations for Data Modeling

[Figure callouts: Classes (concepts, frames); IS-A Links; Instances; Attributes (slots); Attribute values]

Classes

Class: Person
Slots: Name; Address; Sex; Collaborator

Class: Man
Slots: Name; Address; Sex = Male; Collaborator; Y-allele

Class: Woman
Slots: Name; Address; Sex = Female; Collaborator; X-alleles

Instances

Instance: Russ23
Slots: Name = Russ; Address = 251 Campus; Sex = Male; Y-allele = Y234112; Collab. = Kathy123, Mark666

Instance: Mark666
Slots: Name = Mark; Address = 253 Campus; Sex = Male; Y-allele = Y534033; Collab. = Russ23

Instance: Kathy123
Slots: Name = Kathy; Address = 201 Parnassus; Sex = Female; X-alleles = X234, X454; Collab. = Russ23
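The Person/Man/Woman example can be written out as a minimal frame system. This is a sketch of the general idea only, not the Protégé or SOPHIA data structures: the `Frame` and `Instance` classes are hypothetical, it assumes Man and Woman are the frames linked to Person by IS-A, and slot names use underscores (`Y_allele`) so they are valid Python keyword arguments.

```python
class Frame:
    """A class frame: named slots, optional default values, and an IS-A parent."""

    def __init__(self, name, slots=(), defaults=None, is_a=None):
        self.name = name
        self.own_slots = tuple(slots)
        self.defaults = dict(defaults or {})
        self.is_a = is_a

    def all_slots(self):
        """Slots inherited along the IS-A chain, followed by this frame's own."""
        inherited = self.is_a.all_slots() if self.is_a else ()
        return inherited + tuple(s for s in self.own_slots if s not in inherited)

    def default(self, slot):
        """Nearest default value for a slot, searching up the IS-A chain."""
        if slot in self.defaults:
            return self.defaults[slot]
        return self.is_a.default(slot) if self.is_a else None


class Instance:
    """An instance frame: attribute values for the slots of one class frame."""

    def __init__(self, name, frame, **values):
        unknown = set(values) - set(frame.all_slots())
        if unknown:
            raise KeyError(f"slots not defined on {frame.name}: {sorted(unknown)}")
        self.name, self.frame, self.values = name, frame, values

    def get(self, slot):
        """Own value if set, otherwise the class default (or None)."""
        if slot in self.values:
            return self.values[slot]
        return self.frame.default(slot)


person = Frame("Person", slots=("Name", "Address", "Sex", "Collaborator"))
man = Frame("Man", slots=("Y_allele",), defaults={"Sex": "Male"}, is_a=person)
woman = Frame("Woman", slots=("X_alleles",), defaults={"Sex": "Female"}, is_a=person)

russ = Instance("Russ23", man, Name="Russ", Address="251 Campus",
                Y_allele="Y234112", Collaborator=["Kathy123", "Mark666"])
kathy = Instance("Kathy123", woman, Name="Kathy", Address="201 Parnassus",
                 X_alleles=["X234", "X454"], Collaborator=["Russ23"])

if __name__ == "__main__":
    print(man.all_slots())
    print(russ.get("Sex"), kathy.get("Sex"))
```

Collaborators are stored here as instance names rather than object references, which keeps the example short; a fuller system would resolve those names to the corresponding instances.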