
Dr Thomas Schlitt
[email protected]

Lecturer in Bioinformatics
Department of Medical and Molecular Genetics

King’s College London
School of Medicine

Bioinformatics

What this lecture is about
•What is bioinformatics?
•Bioinformatics as science
•Bioinformatics as a toolbox
•Use cases
•Books

What this lecture is not about
•particular commercial software tools such as Vector NTI


http://www.comicspage.com/comicspage/main.jsp?catid=1450&custid=69&file=20070830cplis-a-p.jpg&code=cplis&dir=/loveis


Bioinformatics is …

“… the rapidly developing area of computer science devoted to collecting, organizing, and analyzing DNA and protein sequences.” Lodish et al., 2000

“… the use of computer methods in studies of genomes.” Brown, 2002

“… the science of refining biological information into biological knowledge using computers.” Draghici, 2003

“… complicated(?)” Schlitt, 2007

Brown TA. (2002) Genomes, 2nd ed., John Wiley & Sons, New York.
Draghici S. (2003) Data Analysis Tools for DNA Microarrays, Chapman & Hall/CRC, Boca Raton, FL.
Lodish H, Berk A, Zipursky SL, Matsudaira P, Baltimore D, Darnell JE. (2000) Molecular Cell Biology, 4th ed., W. H. Freeman & Co., New York.


According to … bioinformatics is …

a journal

an institute

a tutorial

a website

another journal

business


Bioinformatics as science

What is an algorithm?

An algorithm is a finite list of well-defined instructions for accomplishing some task that, given an initial state, will terminate in a defined end-state.

Bioinformaticians (at least some) develop algorithms to analyse biological data. These are usually implemented in a particular programming language and run on a computer.
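As a small illustration of the definition of an algorithm, here is a sketch in Python: a finite list of well-defined steps that takes a DNA string (the initial state) and terminates with its reverse complement (the end state). The function name and the example sequence are illustrative assumptions and are not taken from the lecture.

```python
# Complement of each DNA base (upper case only, to keep the sketch short)
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    result = []
    # Step 1: walk through the sequence from the last base to the first
    for base in reversed(dna.upper()):
        # Step 2: replace every base by its complement
        result.append(COMPLEMENT[base])
    # Step 3: join the bases; the algorithm terminates here
    return "".join(result)


print(reverse_complement("ATGGTT"))  # AACCAT
```

The loop runs once per base, so the procedure always terminates for a finite input.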


[Figure: data table with sample labels Exp1_01C04 to Exp1_01D03]


Am I bothered?


BMJ 2007(335), 460-461

de Oliveira Nature 2006(444), 836-837


What is necessary for such an analysis?

•sequence data from suspects, victims, relatives (lab)
•additional sequence data from others (database)
•tools for sequence analysis (software)
•tools for sequence comparison (software)
•phylogenetic tree construction (software)

Bioinformatics


Where do I find Bioinformatics software to do the analyses described before?

- software packages that have to be installed on your computer (some of them are open source (= free), some are commercial)
  + your data does not leave your computer
  + free (if open source)
  - expensive (if not open source)
  - need to be installed
  - updates
  - might not be installed on computers if you move to a new lab

- webpages/online tools
  + free
  + no installation
  + if you change labs you still have access to the same tools
  - data has to be sent elsewhere
  - can be slow if there are lots of other users

Here, I will focus on two main sources of information

- NCBI: National Center for Biotechnology Information, Bethesda, Maryland, USA

- EBI: European Bioinformatics Institute (EMBL outstation), Hinxton, Cambridgeshire, UK


http://www.ncbi.nlm.nih.gov/

NCBI offers many different services, including PubMed (literature database), GenBank (nucleotide database), OMIM (disease gene database), … and tutorials!

www.ebi.ac.uk

http://www.ebi.ac.uk/2can/

The European Bioinformatics Institute (EBI) provides various services, such as a nucleotide database, a literature database, a protein database (UniProt), a genome database (Ensembl), a microarray data repository (ArrayExpress), and tutorials


Use case 1 - You know the accession number and want to get the sequence (sequence retrieval)

Note: if you want to publish an article about a gene you sequenced, you first have to deposit the sequence in one of three public data repositories. They will issue an accession number; without this accession number journals will not accept your manuscript.

The advantage is that everyone has access to the sequences via the internet, even if the original authors are now selling T-shirts, have become city bankers or work in a hospital.


GenBank (NCBI)

DNA Data Bank of Japan (DDBJ)

EMBL Nucleotide Sequence Database (EBI)

Sequence data is frequently exchanged between the three large centres

http://www.ncbi.nlm.nih.gov/sites/entrez

nucleotide search


DNA sequence entry in Genbank format

- Locus: accession number, type of sequence, length, submission date
- Definition: title describing the sequence
- accession number
- version (indicates updates)
- source
- organism
- publication reference(s)
- sequence features: links to other databases, gene start, end, protein sequence (translation)
- actual sequence data

Alternative sequence formats

EMBL format
Same information, different layout


Alternative sequence formats

FASTA format
Loss of meta-information, but simple format; input format of choice for many other programs

>embl|AF000028|AF000028 Argonauta nodosa cytochrome c oxidase…
actttatattttatttttggtatttgatcaggactattaggtacttcttta
agtttaataatccgagcagaacttggacaacctggatcacttcttaatgatgaccaacta
tataatgtaattgttacagctcacgctttcgtaataattttttttttagttataccagta
ataattggaggatttggaaattggcttgttcctcttatattaggagctccagatatagct
tttcctcgtataaataatataagtttttgacttttacctcctgctttaactcttctttta
gcatcagcagcagttgaaagaggagtgggaactggatgaactgtatatccgcctctttca
agaaatttggctcatataggtccttctgttgatttagcaatcttttctcttcatttagcc
ggaatctcctcaattcttggagcaattaactttattactactattattaatatacgatga
gaaggaatattattagaacgtttacctttattcgtttgatctgttcttattacagctgta
ttattacttttatctttaccagttttagctggagcaattactatattacttactgaccga
aattttaacactacattctttgatccaagtggaggaggagaccctattttataccaacat
ttattc

General layout of a FASTA entry:

>sequencename
ATGGTT…

Summary use case 1 – You know the accession number and want to get the sequence (sequence retrieval)

There are three sequence repositories worldwide that collect and store sequence information (NCBI, EBI, DDBJ)

Each sequence to be published in a scientific journal has to be submitted to one of these repositories

Each sequence in these databases has a unique ID – its accession number

Using this accession number it is possible to retrieve the sequence in different formats (EMBL, GenBank, FASTA)
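Sequence retrieval by accession number can also be scripted. The sketch below is not part of the lecture: it assumes NCBI's E-utilities "efetch" service as the retrieval route, builds the request URL for an accession number, and parses FASTA text into a dictionary. The function names efetch_url and parse_fasta are hypothetical helpers introduced for illustration.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities efetch endpoint (not mentioned in the lecture)
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build a retrieval URL; rettype 'fasta' or 'gb' (GenBank flat file)."""
    params = {"db": "nuccore", "id": accession,
              "rettype": rettype, "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # new record starts at every '>' line
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


print(efetch_url("AF000028"))
# Downloading needs network access, for example:
#   from urllib.request import urlopen
#   text = urlopen(efetch_url("AF000028")).read().decode()
example = ">sequencename\nATGGTT\nCCA\n"
print(parse_fasta(example))  # {'sequencename': 'ATGGTTCCA'}
```

The parser joins the sequence lines of each record, which is why the line breaks inside a FASTA entry carry no meaning.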


Use case 2 - You have a sequence and want to find similar sequences

>mysequence1
atggaggatgatttcatgtgcgatgatgaggaggactacgacctggaatac
tctgaagatagtaactccgagccaaatgtggatttggaaaatcagtactataattc
caaagcattaaaagaagatgacccaaaagcggcattaagcagtttccaaaaggttt
tggaacttgaaggtgaaaaaggagaatggggatttaaagcactgaaacaaatgatt
aagattaacttcaagttgacaaactttccagaaatgatgaatagatataagcagct
attgacctatattcggagtgcagtcacaagaaattattctgaaaaatccattaatt
ctattcttgattatatctctacttctaaacagatggatttactgcaggaattctat
gaaacaacactggaagctttgaaagatgctaag

Use Blast - there are different varieties, depending on what kind of sequence you have and what kind of sequence you are looking for

blastn   Search nucleotide database using a nucleotide query
blastp   Search protein database using a protein query
blastx   Search protein database using a translated nucleotide query
tblastn  Search translated nucleotide database using a protein query
tblastx  Search translated nucleotide database using a translated nucleotide query
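The choice between the five BLAST varieties listed above can be written down as a small lookup. This is only a sketch to make the table concrete; the function choose_blast and its arguments are hypothetical and do not belong to any BLAST software.

```python
# (query type, database type) -> program, following the list above
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",    # query is translated
    ("protein", "nucleotide"): "tblastn",   # database is translated
}


def choose_blast(query_type, database_type, translate_both=False):
    """Pick a BLAST program name from the query and database types."""
    if translate_both:
        # tblastx translates a nucleotide query AND a nucleotide database
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]


print(choose_blast("nucleotide", "protein"))                          # blastx
print(choose_blast("nucleotide", "nucleotide", translate_both=True))  # tblastx
```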

http://www.ncbi.nlm.nih.gov/BLAST/


… then wait …

[Screenshot of the BLAST result page, annotated with:]
- algorithm you used
- database you searched
- graphical representation of the results
- individual sequences found in the database


How to read a BLAST result

gene id of hit (accession number), description

score – indicates how similar the query sequence is to the result; a larger number is better
BUT: longer sequences lead to higher scores

e-value – expectation value: how many hits with at least this score you would expect to find in the database by chance (this is particularly relevant if your query sequence is short or contains many repeats, etc.); a smaller number is better
Note: 2e-3 = 2*10^-3 = 0.002

An individual “BLAST hit” in more detail

accession number + description

score, e-value, identical nucleotides, gaps, orientation

your sequence

blast hit

This is a perfect match!
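To see why the e-value matters most for short queries, the sketch below uses the textbook relation between a normalised (bit) score S' and the expected number of chance hits, E = m * n * 2^(-S'), where m is the query length and n the total database length. This formula is not on the slides, it ignores the length corrections that real BLAST applies, and the numbers are made up for illustration.

```python
def expected_hits(bit_score, query_length, database_length):
    """Approximate E-value: E = m * n * 2**(-S') for a bit score S'."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Same hypothetical database (1e9 letters) and query (100 letters):
# a higher bit score gives a smaller expectation value.
weak = expected_hits(40, 100, 1e9)
strong = expected_hits(80, 100, 1e9)
print(f"{weak:.1e} {strong:.1e}")

# Scientific notation as used in BLAST output
print(2e-3 == 0.002)  # True
```

Each additional bit of score halves the expected number of chance hits, while a larger database raises it proportionally.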


An individual “BLAST hit” in more detail

To compare two protein sequences we need to define “similarity”. How similar is a His to a Pro?

The BLOSUM matrices were introduced by Henikoff and Henikoff (1992; PNAS 89:10915-10919). Their idea was to get a better measure of differences between two proteins, specifically for more distantly related proteins. While this bias limits the usefulness of BLOSUM matrices for some purposes, for programs such as FASTA, BLAST, etc. they should do substantially better, because the need for an accurate measure of distance is not as great when peptides are more closely related.

They used the BLOCKS database to search for differences among sequences, but only within the very conserved regions of a protein family; hence the term BLOSUM, from BLOcks SUbstitution Matrix. They first collected all of the sequences in the BLOCKS database and then, for each block, counted the amino acids at each site to get a frequency table of how often different pairs of amino acids are found together in these conserved regions. Based on this comparison they calculated the BLOSUM matrix.

Different levels of the BLOSUM matrix can be created by differentially weighting the degree of similarity between sequences. For example, a BLOSUM62 matrix is calculated from protein blocks such that if two sequences are more than 62% identical, then the contribution of these sequences is weighted to sum to one. In this way the contribution of multiple entries of closely related sequences is reduced.
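The step from a frequency table of amino acid pairs to a score can be sketched as a log-odds calculation: a pair scores positively if it is seen more often in the conserved blocks than expected from the individual amino acid frequencies. The sketch below works on a tiny made-up block, uses half-bit units (2 * log2), and leaves out the clustering and weighting of closely related sequences described above, so it is a simplified illustration and not the published BLOSUM procedure.

```python
import math
from collections import Counter
from itertools import combinations


def pair_counts(block):
    """Count unordered residue pairs column by column in an ungapped block."""
    counts = Counter()
    for column in zip(*block):
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def log_odds_scores(block):
    """Return {(a, b): score} in half-bit units for all observed pairs."""
    counts = pair_counts(block)
    total = sum(counts.values())
    q = {pair: n / total for pair, n in counts.items()}   # observed
    p = Counter()                                         # residue frequency
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2
    scores = {}
    for (a, b), freq in q.items():
        # expected pair frequency if residues paired up independently
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Hypothetical block of four aligned sequences, three columns
block = ["HPH", "HPH", "HPH", "HPP"]
print(log_odds_scores(block))
```

In this toy block His and Pro rarely share a column, so the His/Pro pair receives a negative score while the identical pairs score positively.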

Summary use case 2 – You have a sequence and want to find similar sequences

Use Blast - there are different varieties, depending on what kind of sequence you have and what kind of sequence you are looking for (most often you probably want to use blastn and tblastx)

The result of a Blast search is a list of sequences that are similar to your query sequence – this similarity is defined mathematically and there are parameters to choose (similarity score, gap penalty)

The e-value is the (pseudo-) probability that a search result is obtained by chance (i.e. low e-values are good)


Use case 3 - You want to compare a set of (similar) sequences

You need a multiple alignment tool – e.g. ClustalW

There are a number of other programs you could choose, they all have strengths and weaknesses. Let’s just use ClustalW as an example.

http://www.ebi.ac.uk/Tools/clustalw


Summary use case 3 - You want to compare a set of (similar) sequences

Multiple alignment tools (e.g. ClustalW) allow you to align several sequences

The alignment depends on the similarity measure (just as for the Blast search the computer needs to know exactly what you mean by “similar”)

The most common input format is FASTA
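How much the result depends on the similarity measure can be seen in a pairwise example. The sketch below computes the optimal global alignment score of two sequences by dynamic programming (the Needleman-Wunsch recurrence with a linear gap penalty). It is a pairwise illustration with made-up parameters, not the method ClustalW uses for multiple alignments.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of a and b with a linear gap penalty."""
    # prev[j]: best score for aligning the processed prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "ACGT"))          # 4
print(global_alignment_score("ACGT", "AGT"))           # 1
print(global_alignment_score("ACGT", "AGT", gap=-5))   # -2
```

The second and third calls align the same two sequences; only the gap penalty differs, and with it the score that the computer regards as "best".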


Use case 4 - You want to know what a protein does

http://www.ebi.uniprot.org/uniprot-srv/index.do


Just a remark: Gene Ontology (GO)

•Vocabulary with hierarchy between terms (directed acyclic graph = DAG)
•Terms describe a location, function
•Terms belong to one of three main categories: molecular function, biological process, and cellular component
•If a child term is a valid description of a gene then all parent terms are also valid
•Useful for database searches and comparisons
•www.geneontology.org

root

mol func biol proc cell comp

parent term

child term child term child term

child term child term child term
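The rule that all parent terms of a valid child term are also valid can be sketched as a walk up the DAG. The term names below are placeholders that mirror the diagram (they are not real GO identifiers), and the function all_ancestors is a hypothetical helper.

```python
# Toy DAG: each term lists its direct parents (a term may have several)
PARENTS = {
    "root": [],
    "molecular function": ["root"],
    "biological process": ["root"],
    "cellular component": ["root"],
    "parent term A": ["molecular function"],
    "parent term B": ["molecular function"],
    "child term 1": ["parent term A"],
    "child term 2": ["parent term A", "parent term B"],
}


def all_ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upwards."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


# A gene annotated with "child term 2" is implicitly annotated with these too
print(sorted(all_ancestors("child term 2")))
```

Because a term can have more than one parent, the structure is a graph rather than a tree, which is why visited terms are tracked in a set.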


Summary use case 4 – You want to know what a protein does

Uniprot (which includes the former Swissprot database) provides additional information on proteins and many useful links to further databases (e.g. on protein structures, sequence motifs, …)

In the best case the information is manually curated, i.e. someone somewhere reads journal articles and puts the information in the database

GO is a hierarchical vocabulary; it is used to annotate proteins systematically; this avoids the use of synonyms and enables efficient database searches

Overview: task, example, tool, url

Task: You know the accession number and want to get the sequence (sequence retrieval)
Example: the accession number for a nucleotide sequence is given in an article
Tool: nucleotide databases, e.g. GenBank, EMBL, DDBJ
URL: http://www.ncbi.nlm.nih.gov/ http://www.ebi.ac.uk/embl/index.html

Task: You have a sequence and want to find similar sequences
Example: you sequenced a clone and want to know what gene it is
Tool: blast, fasta, …
URL: http://www.ncbi.nlm.nih.gov/BLAST/

Task: You want to know what a protein does
Example: you sequenced a clone, found out what gene it is, but now want to know what the corresponding protein does
Tool: uniprot
URL: http://www.ebi.uniprot.org/index.shtml

Task: You are interested in a particular genomic region
Example: in your study you found a statistical link between a disease and a particular region of a chromosome
Tool: ensembl
URL: www.ensembl.org

Task: You want to download the data of a microarray experiment
Example: someone has published an article where they describe changes in gene expression in the disease you are studying
Tool: ArrayExpress
URL: www.ebi.ac.uk/arrayexpress

Task: You want to compare a set of sequences encoding the same gene
Example: you found different variants of a gene
Tool: Multiple Alignment (e.g. ClustalW)
URL: http://www.ebi.ac.uk/Tools/clustalw/index.html

Task: You want to know the secondary structure of an RNA
Example: an article claims a particular mutation affects the functionality of an RNA molecule
Tool: RNA secondary structure prediction tools (RNAforester, mfold)
URL: http://bibiserv.techfak.uni-bielefeld.de/rnaforester/

Task: You want to look at the 3d structure of a protein
Example: you found several mutations in the same sequence region of an enzyme and want to know if these could affect the active site
Tool: MSD (molecular structure database), PDB
URL: http://www.rcsb.org/pdb/home/home.do

Task: You want to find out which genes are involved in a particular disease
Example: you are interested in a disease and want to study its molecular biology
Tool: OMIM
URL: http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim

Task: You want to identify the domains in a protein
Example: you want to know if the mutations you identified interrupt the structure of a protein, but the protein structure has not yet been resolved
Tool: interpro, smart
URL: http://www.ebi.ac.uk/interpro/ http://smart.embl-heidelberg.de/

Task: You don't find your task in this list
Example: you are convinced that there should be a particular database containing all the data you need
Tool: have a look at the database issue of NAR (Nucleic Acids Research); it is published every January and contains short articles about a large number of databases
URL: http://nar.oxfordjournals.org/


Summary

•What is bioinformatics?
Bioinformatics is a discipline that uses methods from computer science, mathematics, etc. to address biological questions

•Bioinformatics as science
… usually involves the development of new methods to collect, store and analyse data

•Bioinformatics as a toolbox
Many bioinformatics tools are available online and can be used for free. Most biologists are using bioinformatics tools to analyse their data. Here, I presented four use cases in more detail: sequence retrieval, sequence search, sequence comparison and retrieving information on proteins


Books

Baxevanis & Ouellette: Bioinformatics – A practical guide …, Wiley
This is a very useful book and I recommend it if you have to use bioinformatics tools for your research (a bit expensive)

Claverie & Notredame: Bioinformatics for Dummies, Wiley
This might be an alternative to the first book; it is a lot cheaper. I haven’t used this book, but I have used other books of the Dummies series. They usually provide an easy to read introduction but are less useful as a reference (poor index) – please let me know if you find it useful

The following books cover the science behind the bioinformatics tools
Arthur Lesk: Introduction to Genomics, Oxford Uni Press
Arthur Lesk: Introduction to Bioinformatics, Oxford Uni Press
Moorhouse & Barry: Bioinformatics, Biocomputing and Perl, Wiley
Higgs & Attwood: Bioinformatics and Molecular Evolution, Blackwell


Using bioinformatics to study gene networks and protein networks

Dr Thomas Schlitt
Lecturer in Bioinformatics

Department of Medical and Molecular Genetics
King’s College London School of Medicine

[email protected]

Outline

Gene Network Modelling
- Introduction to Gene Network Modelling
- Four different levels of Gene Network Models
- Network Topology

Ongoing work:
- Finding disease genes in networks


Introduction to Gene Network Modelling

GENE 1 GENE 2 GENE 3 GENE 4

DNA

promoter

coding DNA

transcription factor

An abstract Gene Network


Introduction to Gene Network Modelling
Gene Networks – a more realistic example
Levine & Tjian (2003) Nature 424, 147-151.

Introduction to Gene Network Modelling
Gene Networks – an even more realistic example
Davidson E.H., McClay D.R., Hood L. (2003) PNAS 100(4), 1465-1480


Gene Networks: four different levels

1) Parts list – genes, transcription factors, promoters, binding sites, …

2) Architecture – a graph depicting the connections of the parts

3) Logics – how combinations of regulatory signals interact (e.g., promoter logics)

4) Dynamics – how does it all work in real time
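The difference between the levels can be made concrete with a toy Boolean model: the dictionary keys are the parts list, the dependencies between genes are the architecture, the Boolean expressions are the logics, and repeated updating gives the dynamics. The four genes and all rules below are invented for illustration and do not describe a real network.

```python
def step(state):
    """One synchronous update of a made-up four-gene Boolean network."""
    return {
        "G1": state["G1"],                      # input gene, stays as it is
        "G2": state["G1"],                      # G1 activates G2
        "G3": state["G1"] and not state["G2"],  # promoter logic: G1 AND NOT G2
        "G4": state["G2"] or state["G3"],       # G2 OR G3 activates G4
    }


state = {"G1": True, "G2": False, "G3": False, "G4": False}
for t in range(4):
    # print which genes are "on" at each time step
    print(t, [gene for gene, on in state.items() if on])
    state = step(state)
```

With these rules G3 is switched on only transiently, something that the architecture alone (who regulates whom) would not reveal.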


Gene Networks: four different levels

1) Parts list – genes, transcription factors, promoters, binding sites, …

Parts lists could be thought of as the results from genome sequencing or similar efforts (large scale mutant screens, genome-wide localisation of transcription factors, etc.)


Number of transcription regulators in different organisms. The number of genes and transcriptional regulators (genes annotated with GO term GO:0030528 “transcription regulator activity”) for yeast (Saccharomyces cerevisiae) was taken from SGD (http://db.yeastgenome.org/cgi-bin/SGD/search/featureSearch) and for fly (Drosophila melanogaster, DROM3) and human (Homo sapiens, NCBI 34 dbSNP120) from ENSEMBL (http://www.ensembl.org/Multi/martview)

Organism   number of genes   number of transcription regulators
yeast      6682              312 (4.7%)
fly        13525             492 (3.6%)
human      22287             1034 (4.6%)


Gene Networks: four different levels

2) Architecture – a graph depicting the connections of the parts

[Diagram: gene A (node) and gene D (node) connected by a relation with weight w, drawn either as an arc (directed) or as an edge (undirected)]

Examples of relationships that can be represented by edges or arcs

A activates B
E binds to F
C, D: cocitation
G, H: sequence similarity
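The four example relationships can be stored in a few lines of Python. In this sketch the directed relation ("activates") is kept as an ordered pair, an arc, and the symmetric relations as unordered pairs, edges; treating "binds to" as symmetric is an assumption made here, and the helper name relations is hypothetical.

```python
# Directed relations (arcs): order matters, A activates B but not vice versa
arcs = {("A", "B"): "activates"}

# Undirected relations (edges): order does not matter
edges = {
    frozenset(("E", "F")): "binds to",
    frozenset(("C", "D")): "cocitation",
    frozenset(("G", "H")): "sequence similarity",
}


def relations(x, y):
    """List the relations that hold from node x to node y."""
    found = []
    if (x, y) in arcs:
        found.append(arcs[(x, y)])
    if frozenset((x, y)) in edges:
        found.append(edges[frozenset((x, y))])
    return found


print(relations("A", "B"))  # ['activates']
print(relations("B", "A"))  # []
print(relations("F", "E"))  # ['binds to']
```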


Gene Networks

Why study the network topology?

[Figure: the abstract gene network (GENE 1 to GENE 4 with DNA, promoter, coding DNA and transcription factor) redrawn as a graph with nodes G1, G2, G3 and G4]

- necessary for higher level models
- many algorithms already exist for the analysis of graphs
- interesting results were recently published for a variety of networks: structural features like power-law distribution of edges and small-world behaviour are widely found in many networks and lead to robustness against errors (e.g. Barabasi group)
- complexity of connectivity: examine the number and size of potential feedback/feedforward loops (e.g. Alon group)
- modularity: identifying modules would help in designing experiments (minimizing side-effects from other modules); modules are easier to model in computational models (many groups)
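Two of the topological quantities mentioned here are easy to compute once the network is stored as a list of arcs. The sketch below counts outgoing connections per gene (the raw material for a degree distribution) and finds feed-forward loops, i.e. triples where A regulates B, B regulates C, and A also regulates C directly. The four-gene network is made up for illustration.

```python
from collections import Counter
from itertools import permutations


def out_degrees(arcs):
    """Number of outgoing arcs per node."""
    return Counter(source for source, _target in arcs)


def feed_forward_loops(arcs):
    """Return all (a, b, c) with arcs a->b, b->c and a->c."""
    arc_set = set(arcs)
    nodes = {node for arc in arc_set for node in arc}
    return sorted(
        (a, b, c)
        for a, b, c in permutations(nodes, 3)
        if (a, b) in arc_set and (b, c) in arc_set and (a, c) in arc_set
    )


# Hypothetical regulatory arcs between four genes
toy_arcs = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G3", "G4")]
print(out_degrees(toy_arcs))
print(feed_forward_loops(toy_arcs))  # [('G1', 'G2', 'G3')]
```

Checking every ordered triple is fine for a toy example; for genome-scale networks one would use a dedicated graph library instead.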


Biological data for topological networks

Transcription factor localisation

Chromatin IP experiments on a chip (ChIP on chip) use microarrays to find genomic (intergenic) sequences (a few hundred bp long) where a particular transcription factor is likely to bind

[Diagram: tagged transcription factor (tf) of interest bound to the upstream region of gene A]

[Figure: network diagram whose nodes are labelled with yeast ORF names (for example YIR023W, YBR068C, YBR069C)]

YLL 020C

YLR 110 C

YLR 111W

YL R25 7W

YLR 397C

YM L00 7W

YM L008 C

Y MR 019W

YM R 071C

YM R0 72W

YM R13 7CYM R13 8W

YM R17 2C

-A

YMR 173 W

YMR 173 W-A

Y NL0 14W

YOL 048C

YO L049 W

YO L050 C

YO L052C- A

Y OL1 54W

Y OR 273CYO R27 4W

YO R 275C

YO R27 6W

Y PR0 65W

YD R42 3C

YBL0 08W

YB R10 2C

YBR 103 W

YC L026 C-A

YD L239C

YDR 132 C

YG L157W

YG R2 33C

YHL 039W

YH L040 C

YH R 053C

YHR 055 C

YKL 039W

Y KL04 0C

YK L086 W

YK L087C

YK L101 W

YKL 102C

Y KL15 4W

Y KL1 55C

YKR 052 C

YK R0 71C

YLL 060C

YLL 065W

YLR 108 C

YLR 109W

YM L116W

YM R0 38C

YO L119C

YOR 173W

YPR 048 W

YGL 073W

YAL 003W

YAL00 5C

YB R04 9C

YBR 051W

YBR 082 C

YC L05 0C

YD L02 0C

YD R01 0C

YD R0 11W

YD R06 1W

YD R 171W

YD R21 4W

YGR 142W

YG R1 46C

YGR 192 C

YG R 197C

YG R1 98W

YJL034 W

YJL03 5C

YJR 045C YJR 046W

YK L051 W

YKL 052C

YKL 152C

YLL0 24C

YLL 026W

YLL 037WY LL039 C

YL R21 6C

YN L006W

YNL 007C

YN L063W

YNL 064C

YN L077W

YNL 281W

YO L08 1W

YOR 298 C- A

YO R 299W

Y OR 300W

YP R15 8W

YK L109W

YB L001 C

YB L044 W

Y BL04 5C

YB L099W

YBR 039W

YC L06 5W

YC L06 6W

YC L067 C

YCR 039 C

YC R040 W

YCR 041 W

YC R10 5W

YD L06 6W

YDL 067C

YD L18 1W

YD R29 8C

YDR 299 W

YDR 377 W

Y DR 473C

YD R 529C

YD R5 44C

Y DR 545W

YEL 024W

YEL 025C

YG L18 7C

Y GL1 91W

YG L193 C

YGR 183 C

YH R0 01W

Y HR 001W

- A

YH R05 1W

YH R 193CYH R1 94W

YJR 121 W

YKL 015W

YKL0 16C

YLR 038C

YLR 168C

YLR 169 W

YL R17 1WYLR 294C

YLR 295C

Y LR2 96W

YLR 395 C

YLR 463C

YL R46 5C

YL R46 7W

YML 088W

YM L08 9C

Y ML0 91C

YM R 256C

Y NL0 52W

YN L338W

YN L339C

YO R06 4C

YO R 065W

YP L270 W

YP L271 W

YP R02 0W

YPR 190 C

Y PR 191W

YK L038W

YBL 074C

YD L13 7W

YLR 451 W

Y DL2 28C

Y DR 522C

Y EL05 0C

YGL 009C

YH R20 7C

Y HR 208W YH R2 09W

YKL1 20W

YM R10 8W

YMR 193 W

Y NL1 04C

Y OR 001W

YOR 108 W

YOR 271 C

YN L167C

Y HR 094C

YH R09 5W

Y PL24 8C

YBL1 09W

YBL 111C

YB L112 C

YB L113 C

YB R0 17C

YB R01 8CYBR 019C

YBR 020 W

YB R02 1W

YBR 060 C

YBR 188 C

YBR 189W

YCR 106 W

YC R1 07W

YDR 008 CYD R 009W

YI L174 W

YI L175 W

YLR 081 W

YLR 408 C

Y ML0 51W

YM L132W

YN L336 W

Y OR 318C

YO R31 9W

YPR 202 WYP R2 03W

YM L099 C YBR 099C

YD R08 4C

Y FL034 C

-A

YFL03 4W

YJL08 5W

YO L05 8W

YM R02 1C

YA R00 9C

YC R 025C

YD R 058C

Y DR 075W

YD R20 7C

YDR 208 W

Y ER1 45C

YER 146 W

YG R1 36W

Y GR 137W

YGR 256 W

Y IL1 12W

YJL217 W

YJR 049CY JR0 50W

YJR 100C

YLR 213 C

YL R21 4W

Y LR4 10W- A

Y LR4 10W

- B

YM R 319C

YM R32 0WYNL 250W

Y NL2 51C

Y OL1 52W

YPL 177C

YPR 110 C

YPR 111W

YPR 123 C

Y PR 124W

YPR 125 W

YN L314 W

YB R0 02C

YB R0 03W

YD L038 C

YD L039CY GR 007W

Y IR 027C

Y IR 028W

YLR 437C

YLR 438W

YOL 075C

YP L010 W

YPL0 11C

Y PL07 5W

YAL0 38W YB L107 W-A

YB R19 6C

YD R21 0W

-C

YD R2 40C

YDR 316W

- A

Y DR 316W- B

YDR 472 W

YER 137C- A

YER 138W- A

YG R0 38C- A

YGR 215 W

YKR 027 W

Y NL2 03C

YPR 158 W-A

YPR 158W

- B

YD R 421W

YDR 289 C

YDR 375 C

Y DR 376W

Y DR 378C

YD R3 79W

YD R3 80W

Y DR 381W

YDR 382 W

YD R5 23C

YF L059 W

YFL 060C

YH R13 5C

YH R1 36C

YHR 137 W

Y HR 139CYH R1 39C

- A

YH R1 40W

YH R1 56C YHR 157W

YN L124W

Y NL1 25C

YN L126 W

YP L25 1W

YPL2 53C

YO R0 28C

YA L037W

Y AL06 3C

YBL 030C

YB L043 W

Y BR 007C

YB R00 8C

YBR 054W

Y BR 157C

YC L046 W

YC L047C

YD L03 7C

YDL 129W

YD L168W

Y DL1 69C

YD R07 7W

YD R 259C

YD R3 00C

YD R30 1W

YD R3 38C

YDR 343C

YD R34 5C

YD R5 04C

YD R5 05C

YE L052 W

YEL0 53C

YE R02 8C

Y ER 038C

YER 044C

YER 045C

YE R07 3W

Y FL00 4W

Y GL2 53W

Y GR 035C

YGR 144 W

YG R17 1C

YG R1 78C

Y GR 180C

YG R 182C

YGR 238 C

Y HL0 34C

Y HL0 41W

YH R0 47C

YH R04 9W

Y HR 128W

Y HR 179WYI L119C

YIL 170W

YI L171 W

YI L17 2C

YI R00 1C

YJL028 W

Y JL029 C

Y JL030W

YJL0 31C

YJ L076W

YJL 077C

YJ L171C

YJL2 19W

YJL221 C

YJR 003 C

YJR 011C

YJR 061 W

Y JR09 4C

Y JR10 2C

YJR 103W

YJR 132W

YJR1 45C

YJR 146W

YJR 147W

YK L044 W

YK L107 W

Y KL1 10C

Y KL13 2C

YKL 182W

YKR 039W

YLL0 06W

YL L007C

Y LL052 C

YLL 053C

YLL 054C

YLL 061W

YL L062 C

YLR 001 C

YLR 034C

YLR 121 C

YLR 228 C

YLR 413W

YM R08 1C

YM R 082C

Y MR 177W

YMR 244 C- A

Y MR 246W

YM R28 3C

Y NL0 87W

YN L179 C

Y NL1 80C

YO L059 W

YOL 060C

Y OL1 16W

YOL1 51W

YO L156 W

YO L157C

Y OR 019W

YO R0 47C

YO R0 49C

Y OR 178C

YO R230 W

YO R30 6C

YO R31 1C

YO R 381W

Y PL06 1W

YP L062W

YP L165C

YPR 014 C

Y KL0 43W

YBL 025W

YBR 159W

Y CL0 45C

YC R0 98C

Y CR 099C

YCR 100 C

YD L214 C

YD L245 C

YE L03 8W

YEL0 39C

YEL0 66W

YE L068 C

YE L069C

Y EL07 0W

YE L074 W

Y EL07 5C

YE L07 6C

YEL 076C- A

Y ER 033C

YE R03 4W

YG R1 40W

YG R2 51W

YGR 279 C

Y HR 005C

Y HR 213W

YIL0 10W

Y IL01 2W

YI L013C

YIL 132C

YI L173 W

YJL1 05W

YJL21 8W

YJR 158W

YJR 159W

YK L06 3C

Y KR 102W

Y LL06 3C

YM L115 C

YM R0 11W

Y MR 015C

YM R01 8W

YM R02 0W

YM R1 94C- A

Y MR 195W

YOR 107 W

YO R 077W

YB R21 0W

YG L00 5C

YJL16 2CYL R36 3C

YL R36 4W

YM L045 W

YML 045W

- A

YPR 072 W

YE L009C

YB L076C

YB R043 C

YBR 112 C

Y BR1 14W

Y BR1 15C

YBR 119 W

YB R2 49C

YBR 250 W

YC L030C

YDL 170W

YD L17 1C

YD L19 8C

YDR 035W

YD R1 27W

Y DR 341C

YD R 512C

Y EL06 1C

YEL 062W

YEL0 63C

YE R0 52C

YER 055C

Y ER 089C

YE R090 W

Y GL0 59W

YG L23 1C

Y HR 018C

Y HR 019C

YH R02 0WYH R07 1W

YH R1 61C

YH R16 2W

YJL 071W

YJL0 72C

Y JL086C

YJL08 7C

YJL 200C

Y JR02 5C

Y JR02 6W

YJR 027W

YJR0 28W

YJR0 29W

YJR 111C

YJR 112W

YLR 355 C

YLR 356 W

YLR 358C

YLR 359 W

Y MR 062C

YM R06 3W

YM R06 5W

YOL 056W

Y OL06 4C

YOR 110 W

Y OR 130C

Y OR 221C

YO R22 2W

YO R3 37W

YPL1 11W

YBL0 27W

YBL0 28C

Y DL2 46C

YE L047 C

YI L12 2W

YIR 021 W

YIR 022 W

YI R0 35C

YJL216 C

YJR 094W

- A

YN R 067C

YO L15 8C

YO R0 29W

Y OR 179C

YOR 180 C

YO R 181W

YPR 148 C

YG L181 W

YAL 064C

- A

Y AL065 C

YAL 067CY AL06 8C

Y DL0 98C

Y EL01 8W

YE L019 C

Y LR4 28C

YL R43 0W

YO L08 2W

Y KL03 2C

YAL 064W

Y BL08 7C

Y DL2 47W

YD R08 6C

YG R 100W

YH R0 96C

YJL0 23C

YJR 160 C

YKL1 67C

YKR 075C

YLL 033W

YL L034C

YLL04 1C

YL R16 6C

Y LR1 67W

YLR 287C

YM R260 C

Y OR 253W

YC R09 7W

YD L178 W

YE L002C

YG L230CY HR 052W

YNL 003C

YGL 013C

YAL02 2C

YBR 040 W

Y BR 057C

YDL 082W

YDL 083C

YDL 147WYD L148 C

YD R3 66C

Y DR 368W

YD R47 0C

Y DR 471W

Y EL05 4C

YE R0 74W

YH L049C

YHR 087 W

YH R14 1C Y HR 142W

YH R1 97W

YIL 133C

Y JL188 C

YM L13 3C

YM R24 0C

YM R2 41W

Y NL0 69C

Y NL23 6W

Y OL0 77W- A

YO L078W

YOL 080C

Y OR 268C

YO R2 69W

YO R3 07C

YP L249C- A

YM R 182C

YFL0 62W

YH L048 W

YJR 161 C

Y NR 029C

YN R03 0W

Y HL0 27W

YB R0 97W

Y BR 275C

YD R08 3W

YDR 278 C

YJL022 W

YLR 035C

YM R1 06C

Y OL0 22C

YH R1 29C

YH R0 37W

Y LR1 39C

YLR 142 W

YOR 272 W

YB R1 82C

YBL0 37W

YB R18 3W

Y BR2 95W

YCR 073 C

YC R07 3W- A

YD R 350C

YD R3 51W

Y DR 418W

YD R42 0W

YG L115 W

YGL 259W

YG R28 2C

YG R 283C YH R0 30C

YH R08 0C

YH R 081W

YIR 013C

YI R01 4W

YK R1 05C

Y KR 106W

YLR3 89C

YLR 390W

YL R39 0W- A

YLR 392 C

YLR 393 W

YL R39 4W

Y NL1 13W

YN L11 5C

YPL 059W

YPL0 60C- A

YPL1 75W

YP L176 C

YP R19 8W

YPL 230W

YI L111 W

Y PL04 9C

YB L016 WYB L017C

YB R08 3W

YC L027W

YC L055 W

YC L056 C

YCR 089 W YER 048W- A

Y ER1 38C

YE R15 9C-A

YE R16 0C

YH R 084W

Y HR 086W

YI L036 WYI L037C

YI L08 2WYIL 083C

YK L095 W

Y KL18 9W

YL R03 5C- A

Y LR4 52C

YM R 052W

YNL 279W

YNL 280C

YN R0 28W

YO R34 3C

YPR 060C

YGL 192W

YD R05 2C

YE R08 3C

YER 142C

YER 143W

YJR 135C

Y KL0 85W

Y PR1 15W

YG R 288W

YN L21 6W

YAL0 33W

YAL 034C

YBL0 22C

YBR 084 C- A

YB R0 85W

YB R18 1C

YB R19 0W

YBR 191W

YDL 075W

YDL 076C

YD L133 C-A

YD L13 3W

YDL 136W

YD L184 C

YD L18 6W

Y DL1 88C

YDL 191W

YD R0 24W

YDR 025 W

YD R1 00W

YDR 186 C

YD R18 8W

Y DR 312W

Y DR 393W

YD R4 47C

YD R44 8W

Y DR 449C

YD R4 50W

YEL 022W

Y EL02 3C

YER 031 C

Y ER 032W

YE R0 46W

YER 101C

YER 102W

YER 116C

YE R11 7W

YER 168 C

YER 169W

YFL0 14W

Y FL015 C

Y FL01 6C

YFR 031 C-A

YGL 030W

YG L031C

Y GL0 71W

Y GL0 72C

YGL 074C

YG L075 CYG L10 3W

YGL 104C

YG L12 3W

YG L124 C

YG L135W

YG L13 6C

YG L189C

Y GR 033C

YG R03 4W

YG R1 17C

Y GR 118W

YG R 143W

Y GR 148C

YGR 149 W

Y HL0 01W

Y HR 021C

Y HR 021W

- A

YH R03 3W

YH R2 03C

YHR 204 W

YI L018 W

Y IL1 21W

YI L148 W

YIL 149C

YI L177C

YJL089 W

YJL09 0C

YJL134 W

Y JL135W

Y JL136 C

YJL177 W

YJL 178C

Y JL189 W

YJL1 90C

YJ L191WYJL 192C

YJR 058C

YJR 059W

Y KL00 6C- A

Y KL00 6W

YKL1 80W

YL R04 7C

YLR 048 W

Y LR28 7C- A

YL R32 5C

YLR 326 W

YLR3 33C

YLR3 44W

YL R38 7C

YLR 388W

YLR 441 C

YL R44 7C

YLR 448W

YM L02 4W

YML0 25C

YM L02 6C

YM L073 C

YMR 142 C

Y MR 143W

YM R24 2C

YN L096C

YN L14 4C

YNL 162W

YN L163C

YN L302C

Y NL3 29C

Y OL0 39W

YO L04 0C

YOL 109W

YOL 120C

Y OL1 27W

YO L128C

YO L136 C

YO R09 5C

YO R0 96W

YO R1 00C

YO R1 01W

YO R2 34C

YOR 235 W

YO R29 2C

YO R29 3W

YO R31 2C

YO R31 7W

Y OR 338W

YO R 342C

YOR 359 W

YO R3 65C

YO R 367W

YP L016W

YPL0 17C

YP L079 W

Y PL08 0C

YPL 143W

Y PL14 4W

YPL1 45C

YPR 029 C

YPR 030W

YP R08 0W

YP R10 2C

YPR 103 W

YPR 131 C

YPR 132 W

YD R3 72C

YD R37 3W

YER 007C- A

YIL 015C

- A

YI L102 C

YML 055W

YM L056 C

YJR0 60W

YA L026C

Y BR 089C

- A

YB R09 0C

YBR 222 C

YB R2 24W

YB R2 25W

YD R43 8W

YH R09 8C

YH R09 9W

Y IL0 74C

YI L088 C

YI L126 W

YI L12 7C

YJL1 67W

Y JL168 C

YJL209 W

YJL 211C

YJR 010W

Y KL19 1W

YK L192 C

YLL 008W

YLL 009C

YLL 010C

YLL05 5W

YLL05 6C

YLR 174W

YN L094 W

YN L09 5C

Y NL2 82W

YN L283C

Y KR 099W

YAL 044C

YAL 045C

YD R 019C

YD R4 08C

YD R40 9W

YE R06 1C

YER 091C

YGL 186C

YG R06 1C

Y KL21 7W

YK R0 10C

YK R07 9C

YK R08 0WYLR 058C

YL R06 1W

YM R0 98C

Y MR 120C

YM R18 7C

YMR 188 C

YM R1 89W

YM R19 0C

YM R19 1W

YMR 243C

YM R 244W

YM R30 0C

YN L009 W YO R 128C

Y PL01 8W

YP L019C

YB R07 4W

Y DR 180W

YD R2 28C

YD R 229W

YER 103W

YI L17 6C

YLR 462W

YLR 464W

YLR 466 W

Y PL15 0W

Y PL1 51C

YPR 104C

YBR 033 W

YKL0 33W

YLR 313 C

YLR 098C

YD R05 6C

YDR 057 WYDR 322W

YG L121 C

YP R1 39C

Y PR 140W

YC R01 7C

YDR 179 C

YD R 179W- A

YFL0 26W

YFL0 27C

YGL 032C

YG L06 2W

YGL 067WYG L069C

YGR 038 C- B

YG R21 8W

YH R 082C

YH R0 83W

YI L015 W

YKR 091W

YNL 043C

YO R0 92W

Y OR 343C- A

Y OR 343C- B

YP L156 C

YP R0 02C- A

Y BR2 29C

YER 081W

YE R12 6C

YER 127W

YE R15 2C

Y ER 153C

YER 154 W

Y GL0 28C

YG L22 8W

YG L229 C

YGR 125W

Y HR 195W

YJL075 C

YJR 149 W

YK L084W

YK L15 0W

YKL1 51C

YDL 114WY DL1 15C

YD L22 4C

YDR 252W

Y DR 370C

YDR 371 W

YGL 093W

YG L09 4C

YGR 175C

YG R 176W

YGR 295C

Y HR 036W

YJL 042W

YJL2 25C

YKL 049C

YL L066 C

YL L067C

YP L090C

YOL 067C

Y JL037WY JL038C

YK L071 W

YKR 098C

Y MR 125W

Y MR 126C

Y DR 463W

YDL 146W

YD R 494WY GR 225W

YM R2 58C

YAR 002W

YB R0 09C

YBR 010W

YB R09 5C

YB R096 W

Y EL00 1C

YFR 001 W

YJR0 07W

YK L202W

YK L203 C

YLR 002 C

YOL 144W

YOL 145C

Y PR 195C

YK L112 W

YA L023C

Y AL04 1W

YA L042W

YA L043C

YAL04 3C- A

YAL0 53W

YAL0 54C YBL0 07C

Y BL03 8W

YB L039 C

YB L079W

YB L08 0C

Y BR 029C

Y BR 030W

YBR 080 C

YB R10 1C

YBR 146W

Y BR 211C

Y BR 212W

YB R28 3C

YB R28 4W

Y CL0 01W

Y CL0 01W- A

YCL 004W

YC L00 5W

YC L011C

YC L01 6C

YC L017C

YC L031C

YC R0 01W YC R 003W

Y CR 053W

YC R08 2W

YD L010W

YD L012 C

Y DL1 05W

YD L10 6C

Y DL1 16W

YD L122 W

YD L13 0W- A

Y DL1 45C

YD L15 9W

Y DL1 60C

YDL 189W

YD L190C

YD L193 W

YD L208W

YD L20 9C

YD R0 27C

Y DR 130C

YD R1 94C

YD R 195W

YD R2 33C

YD R2 34W

YD R28 0W

YDR 283C

YD R2 84C

YD R 285W

YDR 295 C

YD R29 6W

YD R3 26C

YD R32 7W

YD R3 29C

YDR 330 W

YD R33 9C

YD R34 0W

Y DR 361C

Y DR 363W- A

Y DR 384C

YD R 385W

Y DR 404C

YD R4 05W

YDR 422 C

YEL 017C- A

YEL 017W

YEL0 37C

Y FL01 7W- A

YFL 018C

Y FL047 W

YFL0 48C

Y GL1 06W

YG L107 C

Y GL1 22C

YG L194 C

YG L195 W

YG L222C

Y GR 056W

Y GR 059W

YG R1 19C

YGR 128 C

YG R12 9W

YG R18 5C

YG R 186W

YGR 231 C

YG R 232W

YGR 252 W

Y GR 253C

Y GR 268C

YG R27 0W

YH R06 4C

YHR 077 C

Y HR 078W

Y HR 090C

YHR 115 C

YH R1 16W

YH R 165C

YH R 199C

YH R20 0W

YIL 031W

YI L032 C

YI L048 W

YIL 135C

YIR 010W

YJL 008C

YJL 062W

YJ L063C

Y JL111W

YJL 174W

Y JL176 C

YJL18 3W

YJR 104 C

YJ R105 W

YJR 116W

YJR 137 C

YJR 138 W

YKL 004W

Y KL00 5C

YKL 014C

YKL 028W

YK L029 C

YK L060C

YK L135C

Y KL14 3W

Y KL14 4C

Y KL17 2W

Y KL17 7W

YKL1 79C

YKL 190W

YK L195W

YKL 196C

YKR 029 C

YKR 030W

YKR 056W

YKR 059 W

YKR 081 C

YKR 082W

Y LL011 W

YLR 024C

YL R02 5W

Y LR0 95C

YLR 096W

YL R22 2C

YL R26 4W

YLR 293 C

YLR 330W

YLR 396 C

YM L012 W

YML 013C- A

Y MR 005W

Y MR 033W

YMR 060 C

YM R 061W

YM R0 78C

YM R07 9W

YMR 092 C

YM R 093W

YMR 129 W

Y MR 200W

YM R29 6C

YM R2 97W

Y NL0 36W

YN L037C

YN L057W

YN L059 C

YN L085W

YNL 116W

Y NL1 17W

YNL 118C

Y NL11 9W

YNL 121C

YN L149C

Y NL1 83C

Y NL1 89W

Y NL2 12W

YNL2 13C

YNL 255C

YN L267 W

YN L28 7W

Y NL3 06W

YNL 307C

YN L31 2WY NL3 13C

Y NL3 21W

YN L32 2C

YN R0 37C

YNR 038 W

YNR 046W

Y OL03 6WYO L068 C

YO L076 WY OL0 77C

YO R04 5W

YO R0 56C

YOR 057 W

YO R116 C

YO R 117W

Y OR 145C

YO R147 W

YOR 205 C

YO R2 06W

YO R2 07C

YO R2 08W

YO R2 09C

YO R21 0W

YO R2 61C

YO R2 62W

YO R3 09C

YO R31 0C

Y OR 322C

YPL 012W

YP L013 C

YP L036 W

YPL 037C

YPL1 59C

YP L228 W

Y PR0 17C

YPR 018W

YPR 128C

YPR 129 W

YPR 176C

YP R1 78W

YP R18 6C

YPR 187W

YG R 044C

YD R1 44C

YD R1 45W

YIL0 14W

Y KR0 62W

YLR 065 C

YLR 066W

YLR 369W

Y NL0 32W

YG L237 C Y BR 025C

YB R24 3C

YBR 244 W

Y ER 174C

YLL0 27W

YLR 220 W

YO R15 4W

YPL 207W

Y PL03 8W

Y EL07 2W

YGL 184C

YJL0 60W

YN L025 CYN R05 0C

YPR 167C

YP R1 68W

Y BR 281C

YB R28 2W

YK L201 C

YK R014 C

YN L168C

Y NR 068C

YD R1 23C

YDR 497 C

YER 043C

YP L23 1W

Y GL0 35C

YAL0 11W

YBR 121C

YG L002 W

Y GL0 03C

YLR 415C

Y LR41 6C

YLR 417W

YM R0 66W YN R0 47W

YP R0 58W

YD R13 3CYDR 134 C

Y DR 535C

Y DR 536W

YH R0 12W

Y IL01 9W

YIL 020C

YI L097W

Y IL0 98C

YN L111 C

Y NR 070W

YAR 023C

YM R2 21C

YP L08 9C

YGL 158W

YI L040 W

YK R10 3W

Y LR1 54C

Y MR 013C

YMR 014 W

YO L15 9C

Y OR 058C

YB R2 62C

YBR 264 C

Y BR2 65W

YD R 514C

Y DR 515W

YK L197CYN L152 W

YNL 153C

Y BL04 1W

YB L042C

YB R0 48W

YN L02 8W

YB L103C

Y CR 096C

YER 002 W

YDR 151 C

YDR 152 W

YDR 541 C

YG L09 8W

YG R18 7C

YH R1 71W

YIL 161W

YK R0 26C

YMR 134 W

YO L105C

YIL 131C

YB L081W

YBL 082C

YBL0 83C

Y BR 104W

YBR 109 C

YB R11 0W

YBR 163W

YCL 063W

YC L064 C

YC R07 5C

YC R 092C

YC R 093W

YD L08 4W

YD R0 03W

YEL0 15W YE L016C

YF R003 C

YFR 004 W

Y GL0 92W

YG R09 8C

YG R 099W

Y KL15 7W

YKR 054C

YK R05 5W

Y LR2 09C

YL R21 0WYL R35 3W

YMR 086 W

YM R1 83C

YM R1 84W

YOL 030W

YOL 031C

YP L032 C

Y PL14 0C

YDL 056W

YA R00 7C

Y AR0 08W

YB L02 3C

Y DL0 03W

YD L101C

YD R 097C

Y DR 528W

YE R071 C

YER 072W

YE R0 87C- A

YER 094 C

YE R09 5W

YFL01 1W

YG R1 10W

YG R18 8C

YH R1 53C

YHR 154 W

YI L026 C

YJL04 5W

YJL0 73W

YJL 074C

YJR 030C

YKL1 13C

Y LR1 03C

YLR 104 W

YN L154 C

YN L27 3W

YN L274 C

YN R 009W

YO R 074C

YO R07 5W

Y PL08 1W

YP L082 C

YPR 075C

YPR 120 C

YDR 310 C

YBR 076W

YBR 148 W

YB R17 9C

YBR 180 W

YCL 048W

YC L049 C

Y DL0 26W

YD L028 C

YD R31 1W

YDR 402 C

Y DR 403W

YFL0 10W- A

YF L040W

YFR 032C

Y GL1 37W

YGL 138C

YGL 170C

YG R25 8C

Y GR 260W

YH R1 85C

YKL1 78C

YK R0 33C

Y KR0 35W- A

YL L004W

YLL00 5C

YLR 082 C

YLR 343 W

YMR 272 C

YN L317 W

YN L318 C

YOL 102C

YO R 213C

YO R 254C

Y OR 255W

YO R31 3C

YO R 314W

YPL1 29W

YB R04 4C

YB R161 W

YB R23 8C

YD R02 3W

YD R3 07W

YG R1 06C

YGR 205 W

YI L076 W

YI L07 7C

YJL 161W

YKL1 06W

YL R15 2C

Y MR 055C

YN L027 W

YN L029 C

YPR 194 C

YBL0 72C

YBL 092W

YB L093C

Y BR1 16C

YB R11 7C

Y BR 118W

YBR 126C

Y DL0 60W

Y DL0 61CYD L130 W

YE L008W

Y ER 056C- A

YE R13 0C

YER 131W

YF R03 2C- A

Y GL1 00W

YGL1 47C

YG R0 27C

YG R08 5C

YGR 213 C

YG R21 4W

YHL 015W

YH L016 C

YH L033 C

YI L069C

YJR 123 W

YK L156W

YK R0 57W

YKR 094 C

YKR 095 W

YLL 043W

YLL0 45C

YLR0 29C

YLR 030W

Y LR0 74C

YL R07 5W

YLR 183C

YLR 184 W

YLR 185W

YLR 337C

YLR 340 W

YL R40 6C

YLR 407W

YM R11 6C

YM R22 9C

YM R2 30W

YN L06 7W

YO R18 2C

Y OR 183W

YO R 355W

YO R36 9C

YP L131 W

YPL 199C

YPR 042C

YPR 043W

YO L108 C

YD R050 C

YDR 129 C

YE R0 26CY GR 157W

YH R 123W

YM R08 4W

YOR 113W

YO R03 2C

YE R1 36W

YG R024 C

Y JR13 9C

YER 088 CYBR 156C

YH R15 5W

YPR 032 W

YFR 005C

YFR 006W

YP L198 W

YD R 216W

YDR 183 W

Y BR2 66C

YBR 268 W

YER 023W

YER 065 C

YH L042 W

YI L110 W

Y NL0 81C

YMR 043 W

YAL0 40C

YD L044 C

YDR 389 W

Y DR 461W

YG R 047C

YGR 048 W

YK L058 W

Y KL05 9C

YKL 209C

YLR 261 C

YL R26 2C

YL R27 3C

YLR 274W

YMR 253 C

Y NL0 53W

Y NR 062C

Y NR 063W

YO R0 66W

YOR 360 C

YP R11 2CYPR 113 W

YLR 403W

YDR 039 C

YG R2 50C

Y JL067W

Y JL068 C

YO R 151C

YPL0 85W

YP L086 C

YP L183 W- A

YP L184 C

Y GL2 54W

Y EL03 0W

Y LR2 77C

YM R13 1C

Y MR 238W

YB R24 0C

YJL20 3W

YJL 204C

YM R16 4C

Y DR 119W

Y HR 118C

Y HR 119W

YJL11 8W

Y JL121 C

Y JR04 1C

YJR 042W

YN L202 W

YN L204 C

YN L20 5C

YNL 206CYP R08 2C YPR 083W

YP R1 33C

YP R13 3W- A

YP R1 34W

YOR 358 W

YD R49 5C

YO L091W

YOR 038 C

YBR 053 C

YD R1 02CYD R10 3W

YG R10 7W

YN L146W

YPL0 60W

YNL 199C

YC R 024C YD R0 48C

YE R01 3W

YJL003 W

YJL00 4C

Y JR00 9C

YE R04 0W

YD R0 30C

Y DR 031W

YE L004W

YEL0 05C

Y NR 006W

YO R10 2W

YLR 182 W

YDL 141W

Y DL1 42C

Y DR 279W

YE R12 8W

YER 139 C

YE R1 40W

Y FL05 5W

YG L239 C

YJL0 92W

Y JL093 C

YK L161C

YKL2 14C

YM L082W

Y ML0 83C

YO L160 W

YOR 106 W

YD R53 3C

YFL0 56C

Y FL057 C

YH R0 48W

YJL 098W

Y ML13 0C

YO L115W

Y BL05 6W

YB L057C

YD L15 5W

Y GL1 41W

Y GL1 42C

YK L211 C

Y BL02 1C

Y BR 216CYBR 217W

YC R0 28C

YG L133 W

Y JL145 WY JR08 2C

Y ML1 19W

YML 120C

YFL 044C

YA L009W

YA L010C

YB L019 W

YB R01 2C

YBR 013C

YC R02 0C-A

YC R02 0W- B

YC R 021C

YCR 044 CYD R 239C

YD R2 41W

YE L013W

Y EL01 4C

YE R00 7W

YE R0 22W

YG R 127W

YLR 016C

YLR 017 W

YLR 409C

Y LR4 10W

YP R05 6W

YC L074W

YI R0 40C

Y IR 041W

YJR 067C

YJR 068 W

YMR 324 C

YM R3 25W

YO L166 C

YFR 034C

Y CL0 34W

YC L035 C

YC R0 66W

YDR 164 C

Y DR 165W

YD R 281C

YDR 294 C

YDR 474C

YER 085 C

YER 086 W

YE R12 9W

YF L009W

Y FL010 C

YG L05 8W

YG L079 W

YG L099WYG R27 3C

Y HR 130C

YHR 131C

YH R13 2W- A

YI L136 W

YI L137C

YJL17 2W

YJL1 73C

YJ R00 4C

YJR 005W

YJR 098C

Y JR09 9W

YJR 150 C

YK R0 84C

YLR 266C

Y LR2 67W

Y MR 267W

YO L148 C

YO R16 2C

Y OR 163W

YO R1 64C

YO R1 65W

YP L132W

YPL1 33C

YD R27 7C

YDL1 24W

YDL 125C

YD L195W

Y GL1 97W

YG L19 9C

YH R1 83W

Y IL0 64W

YI L065C

YJR0 38C

YJR0 40W

YOL1 03W

YLR 176C

YB R04 2C

YDL 229WYD R4 10C

YI L06 6C

YM R2 79C

YO R28 0C

YOR 282 W

YO R284 W

YO R3 78W

Y PL1 88W

YJL 056C

YBR 254 C YB R25 5WYD R0 66C

YG L255W

YH R14 5C

YJL05 5W

YK L175W

YKL1 76C

YLR 130C

YM L065 W

YML 066C

YN L253 W

YN L254 C

YA L030W

YAL03 1C

YBL0 13WYBL 014C

Y BL05 8W

YB L059 C-A

YBR 236C

Y BR 237W

YC R0 32W

YD L212W

YDL 213C

YD R25 7C

YD R 313C

YD R3 22C

- A

Y DR 323C

YDR 465 C

YDR 466 W

YEL0 12W

YER 156 C

YE R1 57W

Y FL005 W

Y GL0 88W

Y GL0 89C

Y GL1 19W

YGL 120C

Y GL1 51W

Y GL1 52C

YGL 182C

YG R1 47C

YGR 184 C

Y GR 241C

Y GR 243W

Y HR 026W

YH R0 39C- A

YH R 040W

YI L022 W

Y IL02 3C

YIL0 75C

Y IL1 08W

YI L109C

Y IR 002C

YIR 003W

YJL0 01W

YJL00 2C

YJR0 90C

YKL 081W

YK L082C

YK L104C

Y KL14 6W

YK L148C

YK R06 8C

YKR 069 W

YLL0 40C

Y LR2 23C

YLR 224W

YLR 378C

YLR 380W

Y ML0 01W

Y ML1 29C

Y MR 039C

YMR 040 W

YM R 121C

YMR 122W

- A

YMR 186 W

YM R 214W

Y MR 280C

Y MR 281W

YN L008 C YN L090W

YN L09 1W

YN L16 1W

YN L169 C

YN L171C

YNL 262W

YN L26 3C

YN L284 C

YN R0 11C

YN R0 12W

Y NR 039C

YN R0 40W

YO L00 4W

YOL 005C

YO L006 C

YO R14 9C

YO R1 50W

Y OR 187W

YO R3 35C

YO R336 W

YPL1 80W

Y PL18 2C

YPL 204W

Y PL20 6C

YP R02 5C

YPR 026 W

YP R0 52C

YPR 053C

YPR 055 W

YP R0 67W

YPR 074 C

YP R07 6W

Y PR 163C

YPR 164 W

Y PR 181C

YPR 182W

YH R00 6W

Y DR 215C

YF L002W-A

YFL00 2W- B

Y FL017 C YGR 199 WYJL024 C

Y LR3 74C

YNL 070W

Y OR 192C

- A

YO R1 92C- B

YO R23 8W

YMR 037C

Data from Harbison et al Nature (2004) 431, 99-104, Lee et al Science (2002), 298, 799-804


Biological data for topological networks

Gene expression in deletion mutants

[Figure: expression of genes A to D in the deletion mutants ∆A, ∆B and ∆C, shown next to a network with nodes A, B, C and D]

Mutant network for Saccharomyces cerevisiae

Data from Hughes et al. (2000) Cell 102, 109–126
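The step from expression measurements in deletion mutants to a network can be sketched as follows. This is a minimal illustration and not the procedure used for the Hughes et al. data: it assumes each mutant comes with one expression change per gene, and it draws an edge from the deleted gene to every other gene whose change reaches a threshold. The function name `mutant_network`, the parameter `threshold` and all numbers are made up for the example.

```python
from typing import Dict, List, Tuple


def mutant_network(changes: Dict[str, Dict[str, float]],
                   threshold: float) -> List[Tuple[str, str]]:
    """Return directed edges (deleted gene -> affected gene).

    changes[d][g] is the expression change of gene g in the mutant
    lacking gene d (hypothetical units, e.g. a normalised log ratio).
    """
    edges = []
    for deleted, profile in changes.items():
        for gene, value in profile.items():
            # The deleted gene itself is ignored; any other gene that
            # changes by at least the threshold becomes a target.
            if gene != deleted and abs(value) >= threshold:
                edges.append((deleted, gene))
    return sorted(edges)


# Made-up values for the mutants ∆A, ∆B and ∆C and the genes A to D.
example = {
    "A": {"A": -5.0, "B": 2.4, "C": 0.1, "D": -3.2},
    "B": {"A": 0.2, "B": -5.0, "C": 2.1, "D": 0.3},
    "C": {"A": -0.1, "B": 0.4, "C": -5.0, "D": 0.0},
}

edges = mutant_network(example, threshold=2.0)
print(edges)  # [('A', 'B'), ('A', 'D'), ('B', 'C')]
```

Raising the threshold keeps fewer edges: with `threshold=3.0` only the edge from A to D remains in this toy data set.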


Gene networks for Saccharomyces cerevisiae: three types of topological networks ...

• ChIP on chip (Harbison et al.): IP + microarray
• Mutant network (Hughes et al.): microarray

Network: source genes / genes / edges / edges per source gene
• mutant network (γ=3.0): 227 / 3959 / 10356 / 45.6
• mutant network (γ=2.5): 236 / 4798 / 17436 / 73.8
• mutant network (γ=2.0): 250 / 5654 / 32017 / 135.7
• ChIP network (p<0.001): 169 / 2930 / 6170 / 36.5
• ChIP network (p<0.01): 202 / 4980 / 18842 / 93.3

Clustering coefficient, components, …

degree distributions

[Figure: three log-log plots, one each for the ChIP network (Lee), the in-silico network and the mutant network. x-axis: connections per gene; y-axis: proportion of genes]

Log-log plot of the node connectivity (degree) in the ChIP network (Lee), the in-silico network and the mutant network

[Figure: example node with degree = 7, indegree = 3, outdegree = 4]
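The quantities plotted here can be computed directly from an edge list. The sketch below is only an illustration: the helper names `degrees` and `degree_distribution` and the toy edge list are assumptions, chosen so that node "g" reproduces the example of indegree 3, outdegree 4 and degree 7.

```python
from collections import Counter


def degrees(edges):
    """Map each node to (indegree, outdegree, degree) for a directed edge list."""
    indeg, outdeg = Counter(), Counter()
    for source, target in edges:
        outdeg[source] += 1
        indeg[target] += 1
    nodes = set(indeg) | set(outdeg)
    return {n: (indeg[n], outdeg[n], indeg[n] + outdeg[n]) for n in nodes}


def degree_distribution(edges):
    """Proportion of nodes having each total degree (the y-axis of the plots)."""
    per_node = degrees(edges)
    counts = Counter(total for _, _, total in per_node.values())
    return {k: counts[k] / len(per_node) for k in sorted(counts)}


# Toy network: three genes point at "g", and "g" points at four genes.
toy_edges = [("a", "g"), ("b", "g"), ("c", "g"),
             ("g", "d"), ("g", "e"), ("g", "f"), ("g", "h")]

print(degrees(toy_edges)["g"])         # (3, 4, 7)
print(degree_distribution(toy_edges))  # {1: 0.875, 7: 0.125}
```

Plotting the returned proportions against the degree on logarithmic axes gives a plot of the kind shown on the slide.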


Gene Networks: four different levels

1) Parts list – genes, transcription factors, promoters, binding sites, …

2) Architecture – a graph depicting the connections of the parts

3) Logics – how combinations of regulatory signals interact (e.g., promoter logics)

4) Dynamics – how does it all work in real time

Yuh, C.H., Bolouri, H. and Davidson, E.H. (1998). Science 279, 1896-902.


Davidson E.H., McClay D.R., Hood, L. (2003) PNAS 100(4), 1465-1480



Gene Networks: Dynamic simulation of gene regulation networks

• Various approaches have been developed
– Boolean models (e.g., Kauffman, Somogyi, Miyano)
– Differential equation based models (e.g., Church et al.)
– Hybrid models (Thieffry and Thomas, Miyano et al.)

… but these models were developed before large-scale data sets were available, and they are of quite limited use for such data sets

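Gene Networks: Dynamic simulation of gene regulation networks - Boolean model example

[Figure, panels A and B: wiring diagram of the three genes X, Y and Z with an AND gate (&), together with the update rules]

Y = X & Z, X = Y, Z = ¬X

C: State at time t and at time t+1

X Y Z (t) → X Y Z (t+1)
0 0 0 → 0 0 1
0 0 1 → 0 0 1
0 1 0 → 1 0 1
0 1 1 → 1 0 1
1 0 0 → 0 0 0
1 0 1 → 0 1 0
1 1 0 → 1 0 0
1 1 1 → 1 1 0

D: State transitions

[Figure: state transition graph over the eight states 000, 001, 010, 011, 100, 101, 110 and 111]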
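The Boolean example can be replayed in a few lines. The sketch below applies the three rules from the slide (Y = X & Z, X = Y, Z = ¬X) synchronously to all eight states and follows each state until it repeats. The function names `step` and `attractor` are illustrative and not taken from the lecture.

```python
from itertools import product


def step(state):
    """One synchronous update: X(t+1) = Y, Y(t+1) = X AND Z, Z(t+1) = NOT X."""
    x, y, z = state
    return (y, x & z, 1 - x)


# Transition table for all eight states (X, Y, Z).
table = {state: step(state) for state in product((0, 1), repeat=3)}


def attractor(state):
    """Follow the transitions until a state repeats; return the repeating cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


for state, successor in table.items():
    print(state, "->", successor)

print(attractor((1, 1, 1)))  # [(0, 0, 1)]
print(attractor((0, 1, 1)))  # [(1, 0, 1), (0, 1, 0)]
```

Under these rules every start state ends either in the fixed point 001 or in the two-state cycle between 101 and 010.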


von Dassow, G., Meir, E., Munro, E.M. and Odell, G.M. (2000). Nature 406, 188-92

Gene Networks: Dynamic simulation of gene regulation networks – Differential equation model example

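Gene Networks: Dynamic simulation of gene regulation networks – the finite state linear model

[Figure: a gene in the finite state linear model, drawn as binding sites, a control function (built from & and ¬) and a substance generator, with substances labelled S1, S2 and S3 and rates ri = (-1.5, 0.5)]

Alvis Brazma and Thomas Schlitt: Genome Biology 2003 4(6):P5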


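Gene Networks: Dynamic simulation of gene regulation networks – the finite state linear model

[Figure: the finite state linear model gene (binding sites, control function built from & and ¬, substance generator, substances S1, S2 and S3) annotated with the example values 1, 1, 0 and 1 and the rates ri = (-1.5, 0.5)]

$[S1^{t}] = \text{time} \times 0.5 + [S1^{t-1}]$

Alvis Brazma and Thomas Schlitt: Genome Biology 2003 4(6):P5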

Gene Networks: Dynamic simulation of gene regulation networks – the finite state linear model

[Figure, panels A and B, built up over several slides: a gene with binding site brep, control function ¬ and substance generator Srep, next to a plot of the concentration of repressor against time. The concentration moves between the thresholds assorep and dissorep and changes direction at the times t1, t2, …]

Successive states of the binding site (b), the control function (cf) and the substance generator (S), where r+ stands for r>0 and r- for r<0:

b cf S
0 1 r+
1 0 r-
0 1 r+
1 0 r-
0 1 r+
1 0 r-
0 1 r+
1 0 r-
0 1 r+
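The alternation between r+ and r- can be reproduced with a small event-driven simulation. This is a sketch under stated assumptions, not the implementation from Brazma and Schlitt: the function name `simulate_repressor` and the threshold values are made up, and only the two rates echo the ri = (-1.5, 0.5) shown on an earlier slide. The binding site becomes occupied when the concentration rises to the association threshold and is released when it falls to the dissociation threshold. In between, the concentration changes linearly.

```python
def simulate_repressor(asso, disso, r_plus, r_minus, t_end, c0=0.0):
    """Return (time, concentration, b, cf) at the start and at every switch.

    b  : 1 if the binding site is occupied by the repressor, else 0
    cf : output of the control function NOT b (1 means rate r_plus applies)
    """
    if not (r_plus > 0 > r_minus and asso > disso and c0 <= asso):
        raise ValueError("need r_plus > 0 > r_minus, asso > disso, c0 <= asso")
    t, c, bound = 0.0, c0, False
    events = [(t, c, int(bound), int(not bound))]
    while True:
        # Unbound site: concentration rises towards the association threshold.
        # Bound site: concentration falls towards the dissociation threshold.
        target, rate = (disso, r_minus) if bound else (asso, r_plus)
        dt = (target - c) / rate  # linear change, so the switch time is exact
        if t + dt > t_end:
            break
        t, c, bound = t + dt, target, not bound
        events.append((t, c, int(bound), int(not bound)))
    return events


# Hypothetical thresholds; the rates follow ri = (-1.5, 0.5).
events = simulate_repressor(asso=4.0, disso=1.0, r_plus=0.5, r_minus=-1.5,
                            t_end=20.0)
for event in events:
    print(event)
# (0.0, 0.0, 0, 1)
# (8.0, 4.0, 1, 0)
# (10.0, 1.0, 0, 1)
# (16.0, 4.0, 1, 0)
# (18.0, 1.0, 0, 1)
```

The b and cf columns of the output alternate in the same way as the table of states, and the switch times play the role of t1, t2 and so on.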


Gene Networks – a simple example? Finite state linear model – lambda phage model

[Figure: finite state linear model of phage λ. Labels in the diagram include N, cI, cII, cIII, cro, int, xis, O, P, Q and Struc; PL, PL1, PL2, PR, PR1, PR2, PR’, PE, PM, Pint and PcI; OR1, OR2 and OR3; and cro/cI and cI/cro. Elements are annotated with state sets ranging from 0,1 to 0,1,2,3,4,5,6]

Gene Networks: finite state linear model – lambda phage model

[Figure: two plots of concentration against time for cI, cII*, xis, cIII, Q, cro, O, N and int, one labelled lysis and one labelled lysogeny]

Simulation of the phage λ model leading to lysogenic or lytic behaviour. In the lysogenic mode the initially active genes are inactivated and the substance concentrations decrease rapidly; only CI is produced. The fluctuations of the CI concentration are due to the negative feedback loop involving the binding site OR3. In the lytic mode, CI and CII are not produced, but the other substance generators are active. The concentrations of Int, N, and Q increase without bound because there is no negative feedback control.


Gene Networks: finite state linear model – reverse engineering problem

[Figure: simulated concentration time courses for xis, cII*, cIII, Q, cro, O, N, int and cI, next to the phage λ network diagram marked with a question mark]

Given experimental data:
• what are the key molecules in a system?
• how are these connected to each other?
• what values do the parameters (growth rates) have?

Outline

Gene Network Modelling
Introduction to Gene Network Modelling
Four different levels of Gene Network Models
Network Topology
Ongoing work: Finding disease genes in networks


Exploring gene networks and gene function

Network Topology

[Figure: yeast gene network in which every node is labelled with a systematic ORF name (e.g. YIR023W, YBR068C). The legend marks a source gene, a target gene, and a target set.]

Genes predicted to be functionally related, because they share many targets

Genes that are functionally related share similar target sets in gene networks. These functional relationships can be utilized to annotate unknown genes or to characterize causative genetic factors of complex diseases.
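The idea of comparing target sets can be sketched in a few lines. The example assumes the network is given as a list of directed (source, target) edges; the gene names and edges are made up for illustration, and `target_sets` and `shared_targets` are hypothetical helper names rather than part of any published tool.

```python
from collections import defaultdict
from itertools import combinations


def target_sets(edges):
    """Map each source gene to the set of genes it points to."""
    targets = defaultdict(set)
    for source, target in edges:
        targets[source].add(target)
    return dict(targets)


def shared_targets(targets):
    """Count the targets shared by every pair of source genes."""
    return {
        (a, b): len(targets[a] & targets[b])
        for a, b in combinations(sorted(targets), 2)
    }


# Hypothetical toy network: s1 and s2 share three targets, s3 shares none.
edges = [
    ("s1", "t1"), ("s1", "t2"), ("s1", "t3"),
    ("s2", "t1"), ("s2", "t2"), ("s2", "t3"), ("s2", "t4"),
    ("s3", "t5"), ("s3", "t6"),
]

overlap = shared_targets(target_sets(edges))
for pair, count in sorted(overlap.items(), key=lambda item: -item[1]):
    print(pair, count)
```

A raw count favours genes with many targets, which is one reason to convert the overlap into a p-value before ranking pairs.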


Hypothesis: target set similarity indicates functional similarity

[Figure: gene pairs with known relationships, labelled "target set overlap small" or "target set overlap large".]


[Figure, next build of the same slide: in addition to the known relationships with small or large target set overlap, a further gene pair with a large target set overlap is labelled "predicted relationship".]


1) s1 - s2 - p-value
2) s1 - s3 - p-value
3) s1 - s4 - p-value
4) s2 - s3 - p-value
5) s2 - s4 - p-value
6) s3 - s4 - p-value

Comparison of gene neighbourhoods in graphs... and then rank the gene pairs according to their p-value from the neighbourhood comparison
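The pairwise comparison and ranking can be sketched as follows. This is a minimal illustration, assuming a hypergeometric test on the target set overlap (the slides do not name the test that was used). The functions `overlap_p_value` and `rank_pairs`, the genes s1 to s4 and their target sets are hypothetical.

```python
from itertools import combinations
from math import comb

def overlap_p_value(targets_a, targets_b, n_genes):
    """P(overlap >= observed) if both target sets were drawn at random
    from n_genes genes (hypergeometric upper tail). Assumed test."""
    k = len(targets_a & targets_b)
    a, b = len(targets_a), len(targets_b)
    total = comb(n_genes, b)
    tail = sum(comb(a, i) * comb(n_genes - a, b - i)
               for i in range(k, min(a, b) + 1))
    return tail / total

def rank_pairs(target_sets, n_genes):
    """Compare all pairs of source genes and sort by ascending p-value."""
    scored = [(overlap_p_value(target_sets[g], target_sets[h], n_genes), g, h)
              for g, h in combinations(sorted(target_sets), 2)]
    return sorted(scored)

# Hypothetical toy data: four source genes and their target sets.
target_sets = {
    "s1": {"t1", "t2", "t3", "t4"},
    "s2": {"t1", "t2", "t3", "t5"},
    "s3": {"t6", "t7"},
    "s4": {"t4", "t8"},
}
ranked = rank_pairs(target_sets, n_genes=20)
for rank, (p, g, h) in enumerate(ranked, start=1):
    print(f"{rank}) {g} - {h} - p-value {p:.4f}")
```

In this toy data s1 and s2 share three of four targets, so that pair ranks first.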

• protein-protein interaction (Y2H, cellzome, etc.)

• MIPS (C. v. Mering “reference set”)

• Co-citation network (PubMed)

Comparison of gene neighbourhoods in graphs... and three more networks ...


choose a p-value cut-off and calculate the

true-positive rate (sensitivity) as tp/(tp+fn)

false-positive rate (1-specificity) as fp/(fp+tn)

Comparison of gene neighbourhoods in graphs... and evaluate the predictions ...

Example: TP 3, FP 1, TN 3, FN 4

true-positive rate 3/7, false-positive rate 1/4
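The two rates for the example counts (TP 3, FP 1, TN 3, FN 4) can be checked with a few lines of Python. The helper name `rates` is illustrative only.

```python
from fractions import Fraction

def rates(tp, fp, tn, fn):
    """Return (true-positive rate, false-positive rate) as exact fractions."""
    tpr = Fraction(tp, tp + fn)   # sensitivity
    fpr = Fraction(fp, fp + tn)   # 1 - specificity
    return tpr, fpr

tpr, fpr = rates(tp=3, fp=1, tn=3, fn=4)
print(tpr, fpr)  # 3/7 1/4
```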


But which cut-off to choose?

Comparison of gene neighbourhoods in graphs... and then evaluate the predictions …

pair          p-value     cut-off 1   cut-off 2   cut-off 3
1) s1 - s2    p-value1    tp          tp          tp
2) s1 - s3    p-value2    tn          fp          fp
3) s1 - s4    p-value3    fn          fn          tp
4) s2 - s3    p-value4    tn          tn          tn
5) s2 - s4    p-value5    fn          fn          fn
6) s3 - s4    p-value6    tn          tn          tn
tp rate                   1/3         1/3         2/3
fp rate                   0/3         1/3         1/3

reminder: true-positive rate as tp/(tp+fn), false-positive rate as fp/(fp+tn)

Do it for all possible cut-offs …
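Sweeping over all cut-offs gives the points of a ROC curve. The sketch below assumes the six pairs are already ranked by ascending p-value without ties, and takes their known status from the slide's example (s1-s2, s1-s4 and s2-s4 are true relationships). The function names `roc_points` and `auc` are illustrative.

```python
from fractions import Fraction

def roc_points(labels_ranked):
    """labels_ranked: True/False for 'known relationship', ordered by
    ascending p-value. Returns the (fp rate, tp rate) point for every
    cut-off, i.e. after calling the top 0, 1, 2, ... pairs positive."""
    n_pos = sum(labels_ranked)
    n_neg = len(labels_ranked) - n_pos
    tp = fp = 0
    points = [(Fraction(0), Fraction(0))]
    for is_true in labels_ranked:
        if is_true:
            tp += 1
        else:
            fp += 1
        points.append((Fraction(fp, n_neg), Fraction(tp, n_pos)))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Known status of the six ranked pairs from the example.
labels = [True, False, True, False, True, False]
points = roc_points(labels)
print(points[1:4])   # the three cut-offs tabulated in the example
print(auc(points))
```

The first three non-trivial points reproduce the tabulated rates (tp rate 1/3, 1/3, 2/3 and fp rate 0/3, 1/3, 1/3).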


Receiver operating characteristics

AUC: area under curve. Measure of discrimination of test.

Diagram labels: Sensitivity, 1-specificity; True group (0, 1) against Hypothesised (0, 1).

Figure: ROC curves, sensitivity (true positive rate) against 1-specificity (false positive rate), both axes from 0 to 1. Panels compare ppi1, ppi2, mips and random, and mi2, mi3, mips and random.


Figure: yeast gene network; node labels are gene names (for example SWI4, SWI6, MBP1, FKH1, FKH2, NDD1, STE12, STE7, STE11). Annotated groups: pheromone response genes, cell cycle genes.

Schlitt et al Genome Research (2003)

“Guilt by association” – assign the function that the majority of neighbours have

Gene function prediction

find “modules” – identify “modules” and assign a function to each module
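A minimal sketch of the "guilt by association" rule, assuming an undirected network stored as adjacency lists and one function label per annotated gene. The helper `predict_function`, the genes `geneX` and `geneY` and the toy edges are hypothetical.

```python
from collections import Counter

def predict_function(gene, neighbours, annotation):
    """Return the most common annotated function among a gene's
    neighbours, or None if no neighbour is annotated."""
    votes = Counter(annotation[n] for n in neighbours.get(gene, ())
                    if n in annotation)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical toy network (adjacency lists) and annotations.
neighbours = {
    "geneX": ["SWI4", "SWI6", "MBP1", "STE12"],
    "geneY": ["geneX"],
}
annotation = {
    "SWI4": "cell cycle",
    "SWI6": "cell cycle",
    "MBP1": "cell cycle",
    "STE12": "pheromone response",
}
print(predict_function("geneX", neighbours, annotation))  # cell cycle
print(predict_function("geneY", neighbours, annotation))  # None
```

A real application would also need a rule for ties and a minimum number of annotated neighbours; both are left out here.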


Network modularity


Figure: size of biggest component, size of second biggest component and number of components, plotted against the number of nodes removed in the ChIP network (0 to 120).

Network modularity in the ChIP-on-chip network
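The kind of curve shown here can be produced by removing nodes step by step and recording the component structure. The sketch below assumes that nodes are removed in order of decreasing degree in the original network; the slides do not state the removal order. The toy network and the function names `component_sizes` and `removal_profile` are hypothetical.

```python
from collections import deque

def component_sizes(adj, removed=frozenset()):
    """Sizes of connected components (largest first), ignoring removed nodes."""
    seen, sizes = set(removed), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)

def removal_profile(adj, n_remove):
    """Remove the n_remove highest-degree nodes one at a time and record
    (nodes removed, biggest component, second biggest, number of components)."""
    # Assumption: removal order is by degree in the original network.
    order = sorted(adj, key=lambda n: len(adj[n]), reverse=True)
    profile = []
    for k in range(n_remove + 1):
        sizes = component_sizes(adj, frozenset(order[:k]))
        second = sizes[1] if len(sizes) > 1 else 0
        profile.append((k, sizes[0] if sizes else 0, second, len(sizes)))
    return profile

# Hypothetical toy network: two hubs (A, B) joined by one edge.
edges = [("A", "a1"), ("A", "a2"), ("A", "a3"), ("A", "B"),
         ("B", "b1"), ("B", "b2")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for row in removal_profile(adj, n_remove=2):
    print(row)
```

A sharp drop in the biggest component together with a rising number of components indicates that the removed nodes were holding otherwise separate modules together.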

“Guilt by association” – assign the function that the majority of neighbours have

Gene function prediction

find “modules” – identify “modules” and assign a function to each module

⇒ need for good graph clustering/cutting algorithm

⇒ a number of these algorithms exist, e.g. MCL, MCode …


Outline

Gene Network Modelling

Introduction to Gene Network Modelling

Four different levels of Gene Network Models

Network Topology

Ongoing work: Finding disease genes in networks


Finding disease genes in networks

Hypothesis: Functionally related genes are close in networks

Finding disease genes in networks

Genome-wide association data

Hybridize DNA with Genotyping Array

Wellcome Trust Case Control Consortium, Nature 2007 (447) 661-678


Finding disease genes in networks

Hypothesis: Networks can provide additional information for the identification of disease genes

Finding disease genes in networks

Approach: Looking for network topological features that enrich for disease genes


Aim: identification of disease genes based on the analysis of GWAS and NGS data in the context of data on molecular networks

Computational Analyses of Complex Diseases at the Gene and Network Levels

Assign SNPs to Genes

Derive Gene-wide p-values

Find Associated Modules

Re-sequence Candidate Genes

Integrate Protein Interactions

Protein-Protein Interaction Databases: Keeping up with Growing Interactomes. Lehne, B., Schlitt T., Human Genomics 2009


Metadatabases: STRING, PINA, APID, UniHI

Assign SNPs to Genes

Exome Localization of Complex Disease Association Signals. Lehne, B., Lewis, C. M., Schlitt T., BMC Genomics 2011

Figure: SNPs and genes along the chromosome


Derive Gene-wide p-values

From SNPs to Genes: Disease Association on the Gene Level. Lehne, B., Lewis, C. M., Schlitt T., PLoS ONE (corrections to be submitted)

Figure: -log10(P) plotted along the chromosome
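The two steps "Assign SNPs to Genes" and "Derive Gene-wide p-values" can be sketched as follows. The slides do not state how the gene-wide p-value is derived, so this example swaps in one simple option as an assumption: the smallest SNP p-value in the gene with a Sidak correction for the number of SNPs. All names, coordinates and p-values are hypothetical.

```python
def assign_snps_to_genes(snps, genes, flank=0):
    """Map each gene to the SNPs whose position lies within the gene
    boundaries (extended by `flank` bp) on the same chromosome."""
    assigned = {name: [] for name in genes}
    for rsid, (chrom, pos, _p) in snps.items():
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - flank <= pos <= end + flank:
                assigned[name].append(rsid)
    return assigned

def gene_wide_p(snp_ids, snps):
    """Assumed method: smallest SNP p-value, Sidak-corrected for the number
    of SNPs in the gene (treats SNPs as independent, i.e. ignores LD)."""
    if not snp_ids:
        return None
    p_min = min(snps[rsid][2] for rsid in snp_ids)
    return 1 - (1 - p_min) ** len(snp_ids)

# Hypothetical data: rsid -> (chromosome, position, GWAS p-value).
snps = {
    "rs_a": ("1", 1_050, 0.01),
    "rs_b": ("1", 1_900, 0.20),
    "rs_c": ("1", 5_000, 0.50),
    "rs_d": ("2", 1_500, 0.03),
}
# Hypothetical data: gene -> (chromosome, start, end).
genes = {
    "geneA": ("1", 1_000, 2_000),
    "geneB": ("2", 1_000, 2_000),
    "geneC": ("3", 1_000, 2_000),
}

assigned = assign_snps_to_genes(snps, genes)
for name, ids in assigned.items():
    print(name, ids, gene_wide_p(ids, snps))
```

Because neighbouring SNPs are usually in linkage disequilibrium, the independence assumption makes this correction conservative; permutation-based approaches avoid that at higher computational cost.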

Find Associated Modules

De-novo Pathway Discovery based on Genome-wide Association Studies. In preparation.

Figure: associated module in the protein interaction network; node labels are gene symbols (for example JAK2, IFNGR1, STAT1, STAT3, STAT5A, FGFR3, MAPT, PRKCD, GSK3B).


Figure: example network of genes/proteins (STAT3, PRKCD, IFNGR1, JAK2, STAT1, FGFR3, MAPT) connected by protein-protein interactions (ppi). Legend: α≤100, α≤300, α≤500, α>500, no rank.

gene specific p-values derived from GWAS

expand regions to neighbours with significant p-values

“jump” over nodes if next node has significant p-value

Associated region for CD within HuPPI3

Nodes represent genes and edges represent physical interactions between the corresponding gene products. Colours of edges and borders represent the threshold α at which a gene was added to the region, whereas background colours indicate the rank of each gene.
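The expansion rule (add neighbours with significant p-values, and "jump" over a node if the next node is significant) can be sketched as below. This is one possible reading of the slide, not the published algorithm: it uses a fixed p-value threshold, allows a jump over a single node only, and reports the jumped-over nodes separately. The network, gene names and p-values are hypothetical.

```python
def expand_region(seed, adj, p_value, alpha):
    """Grow a region from `seed`. Neighbours with p <= alpha are added.
    A non-significant neighbour is jumped over (kept as a bridge) if it
    leads directly to a significant node."""
    region, bridges = {seed}, set()
    stack = [seed]
    while stack:
        node = stack.pop()
        for nb in adj.get(node, ()):
            if nb in region or nb in bridges:
                continue
            if p_value.get(nb, 1.0) <= alpha:
                region.add(nb)
                stack.append(nb)
            else:
                # "jump": look one step beyond the non-significant node
                beyond = [m for m in adj.get(nb, ())
                          if m not in region and p_value.get(m, 1.0) <= alpha]
                if beyond:
                    bridges.add(nb)
                    region.update(beyond)
                    stack.extend(beyond)
    return region, bridges

# Hypothetical toy network and gene-wide p-values.
edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g4"),
         ("g4", "g5"), ("g5", "g6"), ("g6", "g7")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)
p_value = {"g1": 0.001, "g2": 0.01, "g3": 0.5, "g4": 0.6,
           "g5": 0.02, "g6": 0.7, "g7": 0.9}

region, bridges = expand_region("g1", adj, p_value, alpha=0.05)
print(sorted(region), sorted(bridges))
```

Here g5 is reached by jumping over g4, while g3, g6 and g7 stay outside the region.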


Figure: network with gene symbols as node labels (for example GRB2, EGFR, SHC1, STAT3, SRC, STAT1, JAK2, TP53).

Associated region for CD within HuPPI3

Nodes represent genes and edges represent physical interactions between the corresponding gene products. Colours of edges and borders represent the threshold α at which a gene was added to the region, whereas background colours indicate the rank of each gene.

Figure: network with gene symbols as node labels (for example CDK5, APP, GSK3B, JAK2, IFNGR1, STAT1, STAT3, STAT5A).

Associated region for CD within the HuPPI2-d25 network

Nodes represent genes and edges represent physical interactions between the corresponding gene products. Colours of edges and borders represent the threshold α at which a gene was added to the region, whereas background colours indicate the rank of each gene.


Figure: network with gene symbols as node labels (for example BRCA1, AR, STAT5A, EGFR, TP53, JAK1, JAK2, STAT1, STAT3, SRC).

Associated region for CD within HuPPI2

Nodes represent genes and edges represent physical interactions between the corresponding gene products. Colours of edges and borders represent the threshold α at which a gene was added to the region, whereas background colours indicate the rank of each gene.

Re-sequence Candidate Genes

Pooled Sequencing of 500 Candidate Genes in 500 Crohn’s Disease Cases and 500 Controls

Analysis steps: Read Alignment; Quality Control; Variant Calling; Capture/Coverage; Compare WTCCC alleles to read counts; Variant Annotation; Association Testing

Software: Novoalign, SAM Tools, Picard Tools, BED Tools, PLINK, SIFT, SNPClassifier, SYZYGY, R/Unix, python scripts


Target gene selection strategies

GWAS hits
All genes within loci that have been identified and confirmed by GWAS to date in CD and other autoimmune disorders such as celiac disease, UC, psoriasis, RA, SLE and T1D

Suggestive GWAS SNPs
Genes identified as containing potential functional variants that are in linkage disequilibrium with significant GWAS SNPs.

Network analysis
Analysis of public genome-wide and proteome-wide data sets to identify genes that cluster in similar functional pathways to other genes from CD associated regions

Researcher-based pathway analysis
Genes manually identified through literature searches as having a role in functionally relevant pathways such as autophagy and IL23.

541 genes
6,979 exons plus splice sites and proximal promoter
1,300,965 bp of DNA

Figure panels: fragment size; coverage and cumulative coverage against read depth.

pre alignment
  total reads                                      79,902,552
  pre alignment QC                                  6,356,836
  reads considered for mapping                     73,545,716
alignment
  reads mapped                                     72,781,601
  reads mapped as pairs                            59,248,998
  unique alignments                                70,153,021
  mean fragment size                               208 +- 49.4
post alignment
  reads mapped to multiple locations                2,628,580
  pcr duplicates                                    2,553,984    8%
  final number of reads                            63,288,412
capture
  reads mapped to targeted regions                 23,922,022    38%
coverage
  mean coverage of selected exons (per basepair)        1,009
  standard deviation                                      565
  bases with at least 100x                          1,144,871    95%

SEQUENCE ANALYSIS

Flowchart boxes: sequence reads; reference genome; Pre-alignment QC; Alignment; Post alignment QC; Alignment File; Variant Calling; Coverage; Capture; WTCCC genotype; Compare allele frequencies to read counts; Variant analysis


QC by comparison with WTCCC genotypes

Agreement NGS-WTCCC: allele counts per base (A C G T) for the two data sets, WTCCC and NGS, with Fisher test p-value; rows grouped into best, middle and worst agreement.

            rsid         position    chr    A   C   G   T      A    C    G    T    fisher
best        rs16931739   116592012   9      0   0   0  24      0    0    2  311    1.000
            rs7899457    101595493   10     0  24   0   0      1  346    1    4    1.000
            rs7897024    35378679    10     0  24   0   0      1   89    0    0    1.000
            rs1531550    35504784    10     0  19   0   5      2  298    0   78    1.000
            rs3750479    21176931    9      0  22   0   2      0  355    1   41    1.000
            …
middle      rs11715915   49430334    3      0  14   0  10      3  243    0  161    1.000
            rs710100     102672031   14     5   0  19   0    106    8  365   21    0.947
            rs4955420    49183869    3      0  11   0  13      4  250   22  332    0.941
            rs4988957    102334507   2      0  11   0  13      9  310    1  324    0.887
            rs907092     35175785    17    13   0  11   0    345    5  323    2    0.874
            …
worst       rs936227     72919012    15    13   0  11   0     59    0   29    0    0.336
            rs2296409    67271231    16    11   0  11   0    201    0  329    1    0.299
            rs10909625   24171469    14     0   2   0  22      0   13    0  367    0.222
            rs3764147    43355925    13    16   0   6   0    227    0   32    1    0.165
            rs2230427    31298621    16     0   1   0  21      0    1    3  368    0.116
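The comparison of the two sets of allele counts can be illustrated with Fisher's exact test. The sketch below is a simplification under stated assumptions: it uses only the two main alleles of a SNP in a 2x2 table, whereas the p-values in the table presumably use all four bases, so the results need not match the "fisher" column exactly. The function `fisher_exact_2x2` is written for this example.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]:
    sum of the probabilities of all tables with the same margins that
    are no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):  # probability that the top-left cell equals x
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))   # tolerance for float ties
    return min(1.0, total)

# rs1531550: C/T counts 19/5 in the first count group, 298/78 in the second
p_good = fisher_exact_2x2(19, 5, 298, 78)
# rs2230427: C/T counts 1/21 in the first count group, 1/368 in the second
p_poor = fisher_exact_2x2(1, 21, 1, 368)
print(round(p_good, 3), round(p_poor, 3))
```

A p-value close to 1 means that the allele proportions in the two data sets are indistinguishable, which is what good agreement between sequencing and genotyping looks like.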

Pool Estimation

Figure: % DNA (0.00 to 0.14) per individual (0 to 60).

DNA variants at NOD2

Figure: frequency (%) of each variant in CD cases and controls. Variants shown: V65G, D113H, D113Y, D113A, T189M, A211A, Q233K, L248M, L248R, N289S, G299G, P330S, P330Q, R393S, R393H, N414S, R426S, S431*, L485L, Q519*, L535L, L547P, A611V, A611A, A611A, S652S, R702W, R703C, A755V, E778K, R790L, R790Q, V793M, G908R, D925Y, V942G, I995I. Labelled in the plot: R702W, G908R.

Allele frequencies in 306 CD cases and 312 controls based on sequence read counts


BioGranat: Our platform for the analysis, simulation and visualisation of biological networks

www.BioGranat.org

• Collaboration with Computer Science Department at University of Applied Sciences Hanover

• aims to provide a platform for network analysis

• Future plans: publication of BioGranat is planned for this year; currently code review, polishing of user interface, writing of documentation, elimination of bugs

www.dilbert.com


Summary

Many different data sets can be used to build networks

Network representation can be useful

Data integration of various networks is a goal

Ongoing work: Finding disease genes in networks

Thomas Schlitt – [email protected]