Copyright 2000 Russ Altman
Early Challenges in Building an Ontology for Pharmacogenomics
Russ B. Altman
Stanford Biomedical Informatics
Stanford University
[email protected]
http://www.smi.stanford.edu/people/altman/
http://pharmgkb.org/
Outline
1. PharmGKB: challenges
2. Preliminary result: RiboWEB ontology for experimental data.
3. Method for PharmGKB data model
4. A word on infrastructure.
Pharmacogenetics
• Understand how genetic variation leads to variation in responses to drugs.
• One of the promises of the genome project
• Pharmacogenomics = interacting systems of genes determining responses.
• Some high-profile examples derived from dramatic phenotypes, variants found.
[Figure: from drug administration to clinical response]
Stages: Drug administered; Drug in tissues of distribution; Drug concentration in systemic circulation; Drug metabolized or excreted; Drug concentration at site of action; Pharmacologic effect; Clinical response
Processes: ABSORPTION; DISTRIBUTION; ELIMINATION
Outcomes: Toxicity; Efficacy
Side labels: Pharmacokinetics; Pharmacodynamics
[Figure: Pharmacogenetics Database]
Research areas: CytoP450; Methyl-transferases; Transporters; Steroid receptors; Leukotriene metabolism; Adrenergic receptors; Sulfurtransferases; ???
NIGMS Pharmacogenetic Research Network & Database
http://www.nih.gov/grants/guide/rfa-files/RFA-GM-99-004.html
[Figure: labels recovered from the diagram]
Genomic Information
Molecular and Cellular Phenotype
Clinical Phenotype
Drug Response Systems
Molecules
Individuals
Alleles
Variations in genome
Protein products
Observable Phenotypes (appears twice)
Role in organism
Genetic makeup
Environment
Nongenetic factors
Coding Relationship
Pharmacologic activities
Physiology
Isolated functional measures
Integrated functional measures
Drugs
Molecular variations
Treatment protocols
[Figure: data acquisition cycle for PharmGKB]
Stages:
--Data Model for PharmGKB
--Templates for data acquisition
--Deployed data acquisition forms
--New data for PharmGKB
--Data Stored within PharmGKB
Transitions:
--Templates inadequate, change data model
--Use data model to automatically generate
--Data acquisition forms inadequate
--Translate into executable HTML forms
--Make available to scientists in research network
--Store fully linked new data into PharmGKB
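The step from data model to executable HTML forms can be sketched as follows. This is a minimal illustration, not the PharmGKB implementation: the template layout, the class name `GenotypeObservation`, its slot names, and the `/submit` action are all hypothetical.

```python
from html import escape

# Hypothetical data-model template: a class name plus typed slots.
# None of these names come from PharmGKB itself.
GENOTYPE_TEMPLATE = {
    "class": "GenotypeObservation",
    "slots": [
        {"name": "subject_id", "type": "string"},
        {"name": "gene", "type": "string"},
        {"name": "allele", "type": "string"},
        {"name": "drug_dose_mg", "type": "number"},
    ],
}


def render_form(template, action="/submit"):
    """Render an HTML form with one input per slot of the template."""
    rows = []
    for slot in template["slots"]:
        name = escape(slot["name"], quote=True)
        # Numeric slots get a numeric input; everything else is free text.
        input_type = "number" if slot["type"] == "number" else "text"
        rows.append(
            f'  <label>{name} <input type="{input_type}" name="{name}"></label>'
        )
    header = f'<form method="post" action="{escape(action, quote=True)}">'
    title = f'  <h2>{escape(template["class"])}</h2>'
    submit = '  <button type="submit">Submit</button>'
    return "\n".join([header, title, *rows, submit, "</form>"])


if __name__ == "__main__":
    print(render_form(GENOTYPE_TEMPLATE))
```

Because the form is derived from the template, a change to the data model (the "templates inadequate" loop in the figure) only requires regenerating the form rather than editing HTML by hand.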
Preliminary work: RiboWEB
Hypothesis: Ontology with three main components can support 3D modeling
1. Experimental data
2. Physical Objects
3. Reference information
RiboWEB Architecture
[Figure: architecture diagram]
Server
Interface generator
Session manager
Computational modules
Knowledge base
Knowledge Base Summary
171 journal articles
~30 templates for experimental data
15,000 instances of objects, people, data
8000 experimental data items
Avg Error vs. Three experiment types (all data)
[Figure: bar chart, "Average Number of Standard Deviations of Models By Experiment Type for All Data". Y-axis: Average Number of Standard Deviations (ticks 0 to 12; one data label reads 27.7). X-axis: Experimental Data (Cleavage Data, Crosslinking Data, Footprinting Data). Series: A, B, C, D, E. Panel labels: OH* CLEAVAGE, CROSSLINK, FOOTPRINTS.]
Two Methods for Building the Ontology for PharmGKB
1. TOP DOWN (borrow from RiboWEB)
--Physical Objects
--Experimental Data
--Reference terminologies
2. BOTTOM UP (read the grants!)
--Particular genes
--Individual small molecules
--Experimental methods
Reference Terminologies
Principle: don’t reinvent things we don’t have to!
1. ICD-9 for diseases
2. SNOMED/RDC codes for symptoms
3. EC and Cytochrome classification system
4. GO
5. Cellular localization vocabulary
6. SMILES for small compounds
7. Others...
These are imported on an “as needed” basis for now. We normally don’t have instances of terminology classes in our KB.
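One way to read "imported as needed, without instances of terminology classes" is that KB records hold lightweight references (vocabulary name plus code) rather than instantiated term objects. The sketch below illustrates that reading only; `TermRef`, `IMPORTED_VOCABULARIES`, `import_vocabulary`, and the example record are hypothetical and not part of PharmGKB.

```python
from dataclasses import dataclass

# Vocabularies loaded so far. Which ones are loaded at any moment is an
# assumption of this sketch; the slide only says "as needed".
IMPORTED_VOCABULARIES = {"ICD-9", "EC", "SMILES"}


@dataclass(frozen=True)
class TermRef:
    """A pointer to an external term; no term instance is created in the KB."""
    vocabulary: str
    code: str

    def __post_init__(self):
        # Reject references into vocabularies that have not been imported yet.
        if self.vocabulary not in IMPORTED_VOCABULARIES:
            raise ValueError(f"vocabulary not imported: {self.vocabulary}")


def import_vocabulary(name):
    """Make another reference terminology available to the KB."""
    IMPORTED_VOCABULARIES.add(name)


# Hypothetical KB record that points at external terms by code.
record = {
    "drug_structure": TermRef("SMILES", "CCO"),   # SMILES string for ethanol
    "diagnosis": TermRef("ICD-9", "250.00"),      # a diabetes mellitus code
}
```

A reference into a vocabulary that has not been imported, such as `TermRef("GO", "GO:0008152")`, raises `ValueError` until `import_vocabulary("GO")` is called.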
Non-ontological issues for PharmGKB
1. Putting patient data on the web is nontrivial. Need technologies for data scrubbing and aggregation to protect privacy/confidentiality.
2. Intellectual property protection for research groups.
--We are establishing submission policies for both genotype and phenotype information.
3. Relationships with other databases: dbSNP, GENBANK, PDB and assessing adequacy of our ontology for those communications.
Infrastructure for Ontology Modeling
SOPHIA
--academic-level software
--ACCESS97 and SQL Server backends
--uses ASP and Visual Basic for logic
--PERL interface (serialized)
--simple security/access control model
--web-based browser
--web-based editor
--available for distribution
Not the current platform for development
--speed and robustness worries...
Infrastructure for Ontology Modeling
Protégé 2000
--professional-level software
--runs in Java on many platforms
--nice Java class structure for KB
--somewhat simplistic ORACLE backend
--see talk this afternoon
The current platform for development
--add Java servlet for API to app writers
--add access security model for classes/slots
--create major league ORACLE backend
--tune performance
RiboWEB Thanks
Michael Bada: RiboWEB knowledge base contents
Michelle Carrillo: RiboWEB analysis modules
Allison Waugh: RNA XML format
Yali Wang: SOPHIA infrastructure
ALSO: Harry Noller, Madison Syntax Ad Hoc Group...
PharmGKB Thanks
Teri Klein (Project Director)
Daniel Rubin (Ontology design)
Farhad Shafa (Datacenter Director)
Ray Fergerson (Protégé-2000 Architect)
Marty Mayberry (KB infrastructure)
Micheal Hewett (KB infrastructure)
Mildred Cho (Bioethics)
Mark Musen (Co-PI)
Josh Stuart, Soumya Raychaudhuri, Jeff Chang, Diane Oliver, Vincent Liu, Irene Liu, Zhen Lin, Louisa Crosby
Basics of Frame Representations for Data Modeling
[Figure: example frame hierarchy. Diagram labels: Classes (concepts, frames); Instances; Attributes (slots); Attribute values; IS-A Links]
Class: Person
Slots: Name; Address; Sex; Collaborator
Class: Man
Slots: Name; Address; Sex: Male; Collaborator; Y-allele
Class: Woman
Slots: Name; Address; Sex: Female; Collaborator; X-alleles
Instance: Russ23
Slots: Name: Russ; Address: 251 Campus; Sex: Male; Y-allele: Y234112; Collab.: Kathy123, Mark666
Instance: Mark666
Slots: Name: Mark; Address: 253 Campus; Sex: Male; Y-allele: Y534033; Collab.: Russ23
Instance: Kathy123
Slots: Name: Kathy; Address: 201 Parnassus; Sex: Female; X-alleles: X234, X454; Collab.: Russ23
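The frame example on this slide can be sketched in a few lines of Python. This is an illustrative toy, not how Protégé or SOPHIA represent frames: the `Frame` and `Instance` classes and their method names are hypothetical. Only the class names, slot names, and slot values are taken from the slide.

```python
class Frame:
    """A class frame: a name, an optional parent (IS-A link), and slots."""

    def __init__(self, name, parent=None, slots=(), fixed=None):
        self.name = name
        self.parent = parent
        self.own_slots = list(slots)
        self.fixed = dict(fixed or {})  # slot values fixed at class level

    def all_slots(self):
        """Slots inherited along IS-A links, followed by this frame's own."""
        inherited = self.parent.all_slots() if self.parent else []
        return inherited + [s for s in self.own_slots if s not in inherited]

    def fixed_values(self):
        """Class-level slot values, with subclasses overriding ancestors."""
        values = self.parent.fixed_values() if self.parent else {}
        values.update(self.fixed)
        return values

    def is_a(self, other):
        """True if `other` is this frame or one of its ancestors."""
        frame = self
        while frame is not None:
            if frame is other:
                return True
            frame = frame.parent
        return False


class Instance:
    """An instance frame: attribute values for the slots of one class."""

    def __init__(self, ident, frame, values):
        unknown = set(values) - set(frame.all_slots())
        if unknown:
            raise ValueError(f"{frame.name} has no slots {sorted(unknown)}")
        self.ident = ident
        self.frame = frame
        # Class-level values (e.g. Sex for Man) take precedence.
        self.values = {**values, **frame.fixed_values()}


person = Frame("Person", slots=["Name", "Address", "Sex", "Collaborator"])
man = Frame("Man", parent=person, slots=["Y-allele"], fixed={"Sex": "Male"})
woman = Frame("Woman", parent=person, slots=["X-alleles"], fixed={"Sex": "Female"})

russ = Instance("Russ23", man, {
    "Name": "Russ", "Address": "251 Campus", "Y-allele": "Y234112",
    "Collaborator": ["Kathy123", "Mark666"],
})
kathy = Instance("Kathy123", woman, {
    "Name": "Kathy", "Address": "201 Parnassus",
    "X-alleles": ["X234", "X454"], "Collaborator": ["Russ23"],
})
```

Collaborators are stored here as instance identifiers rather than object references, which mirrors how the slide writes them (`Kathy123`, `Mark666`); a fuller KB would resolve them to instances.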