Mining biological knowledge networks for gene-phenotype discovery
Keywan Hassani-Pak
EBI course: Introduction to Omics data integration
March 2017
http://knetminer.rothamsted.ac.uk/
@KnetMiner

KnetMiner - EBI Workshop 2017


Rothamsted is the longest running agricultural research station in the world, providing cutting-edge science and innovation for more than 170 years.

Over 450 staff

About us

Bioinformatics group: Bioinformatics Analysts, Software Developers

Agenda for today

Kevin Dialdestoro, Stephanie Brunet

Part I

Part II

Keywan Hassani-Pak, Ajit Singh, Monika Mistry

Learning Objectives for Part I

To understand why linking genotype to phenotype is complex

To learn which information types are useful for candidate gene prioritization

To understand the concept of knowledge networks/graphs

To use KnetMiner for the interpretation of your RNA-seq, QTL, GWAS results

To learn a little bit about neurodegenerative diseases

The Genotype to Phenotype Challenge

Genotype: QTL and GWAS

Omics: includes any omics

Phenotype: Disease, Intelligence, Flowering, Stress tolerance

Biological Knowledge Discovery: data selection, processing, transformation, integration and interpretation

Many phenotypes are complex, polygenic and the result of complex interactions at the cellular level

Linking genotype and phenotype is one of the greatest challenges in biology


The approach is generic and works similarly for other species

Free and open source

Data warehousing using a graph-database

Platform to integrate public and private datasets in various formats

Provides a GUI, CLI, APIs and workflows for reproducible data integration

Ondex Data Integration Platform

Ondex: www.ondex.org

Not covered in this training course!


Let's start with some GWAS data

Example: Arabidopsis, from http://plants.ensembl.org/biomart

#SNP=66,816 | #Gene=27,502 | #Phenotype=107

SNP-Phenotype relations (122,919 relations) of significant SNPs (as defined by Ensembl, p-value …

… 3,000,000 edges

Genome-scale knowledge network
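As a rough illustration of how SNP-Phenotype relations like these could become edges in a network, the sketch below joins SNP-to-gene and SNP-to-phenotype pairs into gene-phenotype links. The identifiers and both input lists are made up for illustration; they are not taken from the Ensembl export, and this is not necessarily how the network in this course was built.

```python
from collections import defaultdict

# Hypothetical toy data, not real Ensembl BioMart output.
snp_to_gene = [("snp1", "geneA"), ("snp2", "geneA"), ("snp3", "geneB")]
snp_to_phenotype = [
    ("snp1", "flowering time"),
    ("snp2", "flowering time"),
    ("snp3", "leaf number"),
]


def link_genes_to_phenotypes(snp_gene, snp_pheno):
    """Return {(gene, phenotype): set of SNPs supporting that link}."""
    genes_by_snp = defaultdict(set)
    for snp, gene in snp_gene:
        genes_by_snp[snp].add(gene)

    edges = defaultdict(set)
    for snp, phenotype in snp_pheno:
        # A SNP with no known gene contributes no edge.
        for gene in genes_by_snp.get(snp, ()):
            edges[(gene, phenotype)].add(snp)
    return dict(edges)


edges = link_genes_to_phenotypes(snp_to_gene, snp_to_phenotype)
for (gene, phenotype), snps in sorted(edges.items()):
    print(gene, "->", phenotype, "supported by", sorted(snps))
```

Keeping the supporting SNPs on each edge preserves the evidence behind every gene-phenotype link.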

http://www.sciencedirect.com/science/article/pii/S2212066116300308

Neurodegenerative Diseases

Progressive loss of structure or function of neurons, including death of neurons

Many types including Alzheimer, Parkinson, Huntington

Many similarities between these diseases on a sub-cellular level

Discovering these similarities offers hope for therapeutic advances that could ameliorate many diseases simultaneously

Use OMIM advanced search
Query: alzheimer parkinson huntington
Tick: Search in Title
Tick: MIM Number Prefix: # phenotype
Download results as Tab-delimited file
Copy MIM ids without the prefix #

Use UniProt Retrieve/ID mapping
Provide your MIM identifiers
Select option: From MIM to UniProtKB
Press Go and download all proteins in XML format (compressed)
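The step of copying MIM ids without the # prefix could also be scripted. The sketch below is one possible way to do it, assuming the tab-delimited download has a column named "MIM Number" whose values look like "#000001". The column name and the example rows are assumptions, and a real export may carry extra header lines that need skipping first.

```python
import csv
import io


def extract_mim_ids(tsv_text, column="MIM Number"):
    """Return the values of `column` that start with '#', without the '#'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ids = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value.startswith("#"):
            ids.append(value.lstrip("#").strip())
    return ids


# Hypothetical two-row export with placeholder identifiers.
example = (
    "MIM Number\tTitle\n"
    "#000001\tExample phenotype one\n"
    "#000002\tExample phenotype two\n"
)
print("\n".join(extract_mim_ids(example)))
```

The printed list can be pasted directly into the UniProt Retrieve/ID mapping form.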

Tutorial data based on 33 human genes

Integration of public datasets

Public Databases

Quantitative data

Interaction data

Omics Data

Datasets and workflows: https://github.com/Rothamsted/ondex-knet-builder

Relationships in Biological Knowledge Networks

Genes, Homology, Annotations

Genetics, Interactions, Phenotype

Highlight text-mining. Add gene expression. Mention xrefs and that they can be collapsed.

How to search and interpret too much information?

Scale

Methods are needed to evaluate millions of relationships in the knowledge network, prioritize genes and extract relevant subnetworks

Interactive and exploratory tools are needed to enable knowledge discovery and decision making

Interpretation should be the task of domain experts, i.e. biologists!
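To make "extract relevant subnetworks" concrete, here is a minimal sketch of one simple approach: a breadth-first walk that keeps every relation within a fixed number of hops of a start gene. This is a generic graph technique, not KnetMiner's own method, and the concept names and relation types in the toy network are hypothetical.

```python
from collections import deque

# Hypothetical toy network: (source, relation type, target).
relations = [
    ("geneA", "encodes", "proteinA"),
    ("proteinA", "interacts_with", "proteinB"),
    ("proteinB", "participates_in", "GO:cell death"),
    ("geneC", "encodes", "proteinC"),
]


def extract_subnetwork(relations, start, max_hops):
    """Return relations reachable from `start` within `max_hops` steps.

    Relations are treated as undirected for the purpose of the walk.
    """
    neighbours = {}
    for rel in relations:
        src, _, dst = rel
        neighbours.setdefault(src, []).append((dst, rel))
        neighbours.setdefault(dst, []).append((src, rel))

    seen = {start}
    kept = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for other, rel in neighbours.get(node, []):
            if rel not in kept:
                kept.append(rel)
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    return kept


for rel in extract_subnetwork(relations, "geneA", max_hops=2):
    print(rel)
```

A hop limit is the bluntest possible relevance filter; in practice one would also filter by relation type or evidence quality.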

KnetMiner System Overview

Client: Web Browser (DHTML, JavaScript), Views

HTML, JSON, XML and images over HTTP via Ajax

Server: Apache Tomcat (Servlets and JSP Page)

Java Socket

Multithreaded Java Server, Ondex API, Knowledge Graph DB

KnetMiner UI Overview

Search, Gene View, Map View, Evidence View, Network View

http://knetminer.rothamsted.ac.uk/HumanDisease

KnetMiner search interface (1)

Ontology-based term suggestions

Concept Types, e.g. GO, Trait

OR, NOT, Replace

KnetMiner search interface (2)

User provided QTL region

Supports gene IDs and names

Exercise 1: Search Interface

Example 1
Search terms: "cell death" OR apoptosis

Open Query Suggestor
Click on cell death tab
Replace with neuron death
Does it change the number of documents and genes that can be found?
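To see why the quotes in the search terms matter, the sketch below splits a query so that a quoted phrase stays together as one term and OR separates the alternatives. It only illustrates the idea: it handles OR alone, uses plain substring matching, and is not KnetMiner's actual query parser.

```python
import shlex


def parse_or_query(query):
    """Split a query into terms; quoted phrases stay whole, OR is dropped."""
    return [term.lower() for term in shlex.split(query) if term.upper() != "OR"]


def matches(query, text):
    """True if any term of the OR query occurs in the text."""
    text = text.lower()
    return any(term in text for term in parse_or_query(query))


query = '"cell death" OR apoptosis'
print(parse_or_query(query))
print(matches(query, "Regulation of apoptosis in neurons"))
```

Without the quotes, cell and death would become two independent terms and match far more documents.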

Video 1

Gene View - Ranked genes and evidence summaries

Gene Ranking

Uses TF*IDF to rank documents by their relevance to a search term

Uses the properties of gene-evidence networks, such as the specificity of documents to a gene and the frequency of evidence concepts

Calculates Knet-Score for every gene

Smart pre-indexing of the knowledge network makes the computation of the score very fast
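For readers unfamiliar with TF*IDF, the sketch below ranks three toy documents for a single search term using a textbook variant (term frequency times the log of inverse document frequency). The documents are invented, and the exact weighting used by KnetMiner, as well as the Knet-Score itself, is not given on these slides and is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy documents.
documents = {
    "doc1": "neuron death in alzheimer disease",
    "doc2": "cell death and apoptosis",
    "doc3": "flowering time in plants",
}


def tf_idf(term, doc_id, documents):
    """Textbook TF*IDF of `term` in one document of the collection."""
    tokens = documents[doc_id].split()
    tf = Counter(tokens)[term] / len(tokens)
    containing = sum(1 for text in documents.values() if term in text.split())
    if containing == 0:
        return 0.0
    idf = math.log(len(documents) / containing)
    return tf * idf


def rank_documents(term, documents):
    """Document ids sorted from most to least relevant for `term`."""
    scores = {doc_id: tf_idf(term, doc_id, documents) for doc_id in documents}
    return sorted(scores, key=scores.get, reverse=True)


print(rank_documents("death", documents))
```

A term scores highest in short documents that mention it, and the score shrinks as more documents in the collection contain the term.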

Network View: Interactive network visualization

Enlarge, Show all, Re-layout, Info Box, Add hidden nodes and edges

Exercise 2: Gene View Network

Example 2
Search terms: Alzheimer OR Parkinson OR Heparin OR "cell death"
Gene List: APP, MAPT, PRNP

Click on the APP gene, which loads the Network View
Open the Info Box
Click on different Concept and Relation types
Check their attributes and click on links to external databases
Explore all direct and indirect paths from APP to Alzheimer and Parkinson
Hide Publication concepts
Show all drugs that can target the APP interaction network
Go back to Gene View and select Known targets
Click View Network and find out if APP, MAPT and PRNP interact, are differentially expressed and have GWAS data

Video 2

Exercise 3: Evidence View Network

Example 3
Search terms: Alzheimer "cell death"

Go to Evidence View
Sort table by column GENES
Find GO concept downregulation of neuron death
Find GO concept upregulation of apoptosis
Click on number of genes linked to these terms
In Network View, show labels for Gene and GO concepts
What's the evidence linking genes to selected GO terms?

Video 3

Map View: Interactive map of chromosome, gene, SNP and QTL data

Show network, Enlarge, Reset, Settings, GWAS studies

Exercise 4: Map View Network

Example 4
Search terms: Parkinson "cell death"
QTL: Chromosome 12 :: 35000000 - 44000000

Go to Map View
Toggle Full Screen
Zoom into Chromosome 12 and find your QTL
Find one or several genes that are in close proximity to GWAS SNPs
Select one or more genes, e.g. LRRK2, PRNP and EIF4G1
Launch Network View
Study the network and how the genes are connected
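The QTL in this exercise is simply a chromosome plus a start and end position, so finding the genes inside it is an interval-overlap check. The sketch below shows that check on invented genes. The gene names and coordinates are hypothetical placeholders, not real annotations, and the Map View may apply its own rules.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: str
    start: int
    end: int


# Hypothetical genes and coordinates, for illustration only.
genes = [
    Gene("geneA", "12", 36_000_000, 36_050_000),
    Gene("geneB", "12", 50_000_000, 50_020_000),
    Gene("geneC", "3", 40_000_000, 40_010_000),
]


def genes_in_region(genes, chromosome, region_start, region_end):
    """Genes on `chromosome` that overlap [region_start, region_end]."""
    return [
        g for g in genes
        if g.chromosome == chromosome
        and g.start <= region_end
        and g.end >= region_start
    ]


# QTL from the exercise: Chromosome 12 :: 35000000 - 44000000
hits = genes_in_region(genes, "12", 35_000_000, 44_000_000)
print([g.name for g in hits])
```

The overlap test also keeps genes that only partly fall inside the region, which is usually what one wants at QTL boundaries.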

Video 4

KnetMiner: Making Gene Discovery Efficient & Fun

Web application for very fast search of large genome-scale knowledge graphs

Ranking of candidate genes based on knowledge mining

Interactive visualisation of genome and knowledge maps

Facilitates hypothesis validation and generation

http://knetminer.rothamsted.ac.uk/

Objectives for Part II

You like KnetMiner but you might be asking:
What if I'm interested in a different disease?
What if I'm interested in a different species?
What if I want to integrate my own private data?
What if I don't have a server to run KnetMiner?

As part of an Innovate UK project we are working with Genestack to address these questions by integrating KnetMiner tools into the Genestack Bioinformatics Platform.

Next: We will teach you how to use Genestack to build your own networks and deploy your own KnetMiner application

Acknowledgements

John Doonan

Sergio Feingold, Martin Castellote

Uwe Scholz, Matthias Lange

Andy Law

Keywan Hassani-Pak, Ajit Singh, Marco Brandizi, Monika Mistry, Lisa Lill, Chris Rawlings

Dave Edwards, Philipp Bayer

Misha Kapushesky, Kevin Dialdestoro

@KnetMiner