Mining biological knowledge networks for gene-phenotype discovery
Keywan Hassani-Pak
EBI course: Introduction to Omics data integration
March 2017
http://knetminer.rothamsted.ac.uk/
@KnetMiner
About us
Rothamsted is the longest running agricultural research station in the world, providing cutting-edge science and innovation for more than 170 years.
Over 450 staff
Bioinformatics group: Bioinformatics Analysts and Software Developers
Agenda for today
Part I: Keywan Hassani-Pak, Ajit Singh, Monika Mistry
Part II: Kevin Dialdestoro, Stephanie Brunet
Learning Objectives for Part I
To understand why linking genotype to phenotype is complex
To learn which information types are useful for candidate gene prioritization
To understand the concept of knowledge networks/graphs
To use KnetMiner for the interpretation of your RNA-seq, QTL, GWAS results
To learn a little bit about neurodegenerative diseases
The Genotype to Phenotype Challenge
Genotype: QTL and GWAS
Omics: includes any omics
Phenotype: disease, intelligence, flowering, stress tolerance
Biological Knowledge Discovery: data selection, processing, transformation, integration and interpretation
Many phenotypes are complex, polygenic and the result of complex interactions at the cellular level.
Linking genotype and phenotype is one of the greatest challenges in biology.
The approach is generic and works similarly for other species
Ondex Data Integration Platform
www.ondex.org
Free and open source
Data warehousing using a graph database
Platform to integrate public and private datasets in various formats
Provides a GUI, CLI, APIs and workflows for reproducible data integration
Not covered in this training course!
Let's start with some GWAS data
http://plants.ensembl.org/biomart
Example: Arabidopsis
#SNP=66,816 | #Gene=27,502 | #Phenotype=107
SNP-Phenotype relations (122,919 relations) of significant SNPs (as defined by Ensembl, p-value < …
> 3,000,000 edges
Genome-scale knowledge network
http://www.sciencedirect.com/science/article/pii/S2212066116300308
Neurodegenerative Diseases
Progressive loss of structure or function of neurons, including death of neurons
Many types, including Alzheimer's, Parkinson's and Huntington's disease
Many similarities between these diseases on a sub-cellular level
Discovering these similarities offers hope for therapeutic advances that could ameliorate many diseases simultaneously
Use OMIM advanced search
Query: alzheimer parkinson huntington
Tick: Search in Title
Tick: MIM Number Prefix: # phenotype
Download results as Tab-delimited file
Copy MIM ids without the prefix #
Use UniProt Retrieve/ID mapping
Provide your MIM identifiers
Select option: From MIM to UniProtKB
Press Go and download all proteins in XML format (compressed)
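The "Copy MIM ids without the prefix #" step can also be scripted. The following is a minimal sketch, assuming phenotype entries appear in the downloaded file as `#` followed by a six-digit MIM number; the sample text and its column layout are hypothetical, and the function name `extract_mim_ids` is introduced here for illustration only.

```python
import re

# Assumption: phenotype entries look like '#' followed by a six-digit MIM number.
MIM_PATTERN = re.compile(r"#(\d{6})\b")

def extract_mim_ids(text):
    """Return unique MIM numbers (without '#') in order of first appearance."""
    seen = []
    for mim in MIM_PATTERN.findall(text):
        if mim not in seen:
            seen.append(mim)
    return seen

# Hypothetical excerpt of a tab-delimited export; the real layout may differ.
sample = (
    "MIM Number\tTitle\n"
    "#104300\tALZHEIMER DISEASE; AD\n"
    "#168600\tPARKINSON DISEASE, LATE-ONSET; PD\n"
    "#143100\tHUNTINGTON DISEASE; HD\n"
)

# One identifier per line, ready to paste into an ID mapping form.
print("\n".join(extract_mim_ids(sample)))
```

The resulting list of bare identifiers is what the UniProt ID mapping step expects as input.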
Tutorial data based on 33 human genes
Integration of public datasets
Public Databases
Quantitative data
Interaction data
Omics Data
Datasets and workflows: https://github.com/Rothamsted/ondex-knet-builder
Relationships in Biological Knowledge Networks
Genes, Homology, Annotations
Genetics, Interactions, Phenotype
Highlight text-mining. Add gene expression. Mention xrefs and that they can be collapsed.
How to search and interpret too much information?
Scale
Methods needed to evaluate millions of relationships in the knowledge network, prioritize genes and extract relevant subnetworks
Interactive and exploratory tools needed to enable knowledge discovery and decision making
Interpretation should be the task of domain experts, i.e. biologists!
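To make "extract relevant subnetworks" concrete, here is a minimal sketch of finding the shortest evidence path between a gene and a disease in a typed knowledge network. It is a plain breadth-first search over a toy graph, not KnetMiner's or Ondex's own algorithm; all node names, relation types and the `skip_types` parameter are hypothetical.

```python
from collections import deque

# Toy knowledge network as (source, relation type, target). Illustrative only.
EDGES = [
    ("gene:G1", "encodes", "protein:P1"),
    ("protein:P1", "interacts_with", "protein:P2"),
    ("protein:P2", "participates_in", "go:neuron death"),
    ("gene:G2", "encodes", "protein:P2"),
    ("gene:G1", "mentioned_in", "publication:PUB1"),
    ("publication:PUB1", "mentions", "disease:D1"),
    ("go:neuron death", "associated_with", "disease:D1"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> list of (relation, neighbour)."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
        adj.setdefault(dst, []).append((rel, src))
    return adj

def shortest_path(adj, start, goal, skip_types=()):
    """Breadth-first search; skips nodes whose type prefix is in skip_types."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel, nxt in adj.get(node, []):
            if nxt in visited or nxt.split(":")[0] in skip_types:
                continue
            visited.add(nxt)
            queue.append(path + [nxt])
    return None

adj = build_adjacency(EDGES)
print(shortest_path(adj, "gene:G1", "disease:D1"))
print(shortest_path(adj, "gene:G1", "disease:D1", skip_types=("publication",)))
```

Excluding a concept type, as in the second call, forces the search onto other evidence routes such as protein interactions and GO annotations.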
KnetMiner System Overview
Client: Web Browser (DHTML, JavaScript), Views
Server: Apache Tomcat (Servlets and JSP Page), Multithreaded Java Server, connected via Java Socket
Knowledge Graph DB, accessed through the Ondex API
Client and server exchange HTML, JSON, XML and images over HTTP via Ajax
KnetMiner UI Overview
Search, Gene View, Map View, Evidence View, Network View
http://knetminer.rothamsted.ac.uk/HumanDisease
KnetMiner search interface (1)
Ontology-based term suggestions
Concept Types, e.g. GO, Trait
OR, NOT, Replace
KnetMiner search interface (2)
User provided QTL region
Supports gene IDs and names
Exercise 1: Search Interface
Example 1
Search terms: "cell death" OR apoptosis
Open Query Suggestor
Click on cell death tab
Replace with neuron death
Does it change the number of documents and genes that can be found?
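The effect of replacing one search term with another can be sketched in a few lines. This is not KnetMiner's query engine: it is a minimal sketch that assumes double-quoted text is one phrase, `OR` separates alternatives, and a document matches if it contains any phrase. The function names and the three sample documents are hypothetical.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')

def parse_or_query(query):
    """Return the lower-cased phrases of a query, dropping the OR keyword."""
    terms = []
    for phrase, word in TOKEN.findall(query):
        token = phrase or word
        if token.upper() != "OR":
            terms.append(token.lower())
    return terms

def matching_documents(query, documents):
    """Return the documents that contain at least one query phrase."""
    terms = parse_or_query(query)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]

# Hypothetical document titles.
docs = [
    "Programmed cell death in neurons",
    "Apoptosis signalling pathway",
    "Flowering time regulation",
]

print(len(matching_documents('"cell death" OR apoptosis', docs)))
print(len(matching_documents('"neuron death" OR apoptosis', docs)))
```

A narrower phrase usually matches fewer documents, which is the behaviour the exercise asks you to observe.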
Video 1
Gene View - Ranked genes and evidence summaries
Gene Ranking
Uses TF*IDF to rank documents by their relevance to a search term
Uses the properties of gene-evidence networks, such as the specificity of documents to a gene and the frequency of evidence concepts
Calculates a Knet-Score for every gene
Smart pre-indexing of the knowledge network makes the computation of the score very fast
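The slides do not give the Knet-Score formula, so the following only illustrates the TF*IDF ingredient: a minimal sketch of the textbook form, term frequency multiplied by log(N / document frequency), applied to a single search term. The function name and the sample documents are hypothetical, and KnetMiner's actual weighting may differ.

```python
import math
from collections import Counter

def tf_idf_scores(documents, term):
    """Score each document for one term using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing the term.
    df = sum(1 for tokens in tokenized if term in tokens)
    if df == 0:
        return [0.0] * n_docs
    idf = math.log(n_docs / df)
    scores = []
    for tokens in tokenized:
        # Term frequency: share of the document's tokens equal to the term.
        tf = Counter(tokens)[term] / len(tokens) if tokens else 0.0
        scores.append(tf * idf)
    return scores

# Hypothetical evidence documents.
docs = [
    "apoptosis of neurons and apoptosis of glia",
    "flowering time in arabidopsis",
    "neuron death and apoptosis",
]

scores = tf_idf_scores(docs, "apoptosis")
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Documents that mention the term often, relative to their length, rank first, while a term that occurs in every document would score zero everywhere because its idf is log(1) = 0.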
Network View: Interactive network visualization
Enlarge, Show all, Re-layout, Info Box, Add hidden nodes and edges
Exercise 2: Gene View Network
Example 2
Search terms: Alzheimer OR Parkinson OR Heparin OR "cell death"
Gene List: APP, MAPT, PRNP
Click on the APP gene, which loads the Network View
Open the Info Box
Click on different Concept and Relation types
Check their attributes and click on links to external databases
Explore all direct and indirect paths from APP to Alzheimer and Parkinson
Hide Publication concepts
Show all drugs that can target the APP interaction network
Go back to Gene View and select Known targets
Click View Network and find out if APP, MAPT and PRNP interact, are differentially expressed and have GWAS data
Video 2
Exercise 3: Evidence View Network
Example 3
Search terms: Alzheimer "cell death"
Go to Evidence View
Sort table by column GENES
Find GO concept downregulation of neuron death
Find GO concept upregulation of apoptosis
Click on number of genes linked to these terms
In Network View, show labels for Gene and GO concepts
What's the evidence linking genes to selected GO terms?
Video 3
Map View: Interactive map of chromosome, gene, SNP and QTL data
Show network, Enlarge, Reset, Settings, GWAS studies
Exercise 4: Map View Network
Example 4
Search terms: Parkinson "cell death"
QTL: Chromosome 12 :: 35000000 - 44000000
Go to Map View
Toggle Full Screen
Zoom into Chromosome 12 and find your QTL
Find one or several genes that are in close proximity to GWAS SNPs
Select one or more genes, e.g. LRRK2, PRNP and EIF4G1
Launch Network View
Study the network and how the genes are connected
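Restricting candidates to a QTL region amounts to an interval-overlap test. The sketch below parses the region string used in the exercise and keeps genes that overlap it. It is not KnetMiner code: the `Gene` type, the function names and all gene names and coordinates are made up for illustration and are not real annotations.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    chromosome: str
    start: int
    end: int

def parse_region(text):
    """Parse 'Chromosome 12 :: 35000000 - 44000000' into (chromosome, start, end)."""
    chrom_part, range_part = text.split("::")
    chromosome = chrom_part.replace("Chromosome", "").strip()
    start, end = (int(x.strip()) for x in range_part.split("-"))
    return chromosome, start, end

def genes_in_region(genes, region):
    """Return names of genes on the same chromosome that overlap the region."""
    chromosome, start, end = region
    return [
        g.name
        for g in genes
        if g.chromosome == chromosome and g.start <= end and g.end >= start
    ]

# Hypothetical genes with made-up coordinates.
genes = [
    Gene("GENE_A", "12", 36_000_000, 36_050_000),  # inside the region
    Gene("GENE_B", "12", 50_000_000, 50_020_000),  # same chromosome, outside
    Gene("GENE_C", "20", 40_000_000, 40_010_000),  # different chromosome
    Gene("GENE_D", "12", 43_990_000, 44_010_000),  # straddles the region end
]

region = parse_region("Chromosome 12 :: 35000000 - 44000000")
print(genes_in_region(genes, region))
```

The overlap test deliberately keeps genes that only partly fall inside the region, since a QTL boundary is rarely exact.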
Video 4
KnetMiner: Making Gene Discovery Efficient & Fun
http://knetminer.rothamsted.ac.uk/
Web application for very fast search of large genome-scale knowledge graphs
Ranking of candidate genes based on knowledge mining
Interactive visualisation of genome and knowledge maps
Facilitates hypothesis validation and generation
You like KnetMiner but you might be asking:
What if I'm interested in a different disease?
What if I'm interested in a different species?
What if I want to integrate my own private data?
What if I don't have a server to run KnetMiner?
As part of an Innovate UK project we are working with Genestack to address these questions by integrating KnetMiner tools into the Genestack Bioinformatics Platform.
Objectives for Part II
Next: We will teach you how to use Genestack to build your own networks and deploy your own KnetMiner application
Acknowledgements
John Doonan
Sergio Feingold, Martin Castellote
Uwe Scholz, Matthias Lange
Andy Law
Keywan Hassani-Pak, Ajit Singh, Marco Brandizi, Monika Mistry, Lisa Lill, Chris Rawlings
Dave Edwards, Philipp Bayer
Misha Kapushesky, Kevin Dialdestoro
@KnetMiner