“software” of life

“software” of life. Genomes to function Lessons from genome projects Most genes have no known function Most genes w/ known function assigned from sequence-similarity


Genomes to function


Lessons from genome projects

• Most genes have no known function

• Most genes w/ known function assigned from sequence-similarity matches to other organisms

• Need methods to experimentally assay gene activity on a genome-wide scale


Measure expression on genome-wide scale: DNA Microarrays

[Figure: microarray comparing Condition 1 RNA and Condition 2 RNA; spots mark genes enriched in condition 1 or enriched in condition 2]

17,997 genes, 94% of genome


Global Analyses of Gene Expression

• Collect all microarrays from the world

• Gene activity across thousands of conditions

[Figure: expression matrix of genes (20k) by conditions (~5k)]


Digital Age of Biology

• Biologists drowning in data

• Bottleneck now is developing computational resources for discovery

• Think GenBank before BLAST...


Discovering Gene Function on a Global Scale

• Gene Networks

• Search Engines


Gene Networks

Matt Weirauch, Corey Powell, Chad Chen, Charlie Vaske, Alex Williams, Martina Koeva

• link 2 genes together if they are co-activated in multiple organisms

• build networks from all the links

• discover function from a gene’s links

• understand bigger picture of gene regulation
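
The linking rule in the bullets above can be sketched in a few lines. This is a minimal illustration, not the method used to build the networks in this talk: it assumes genes are already keyed by a shared ortholog-group ID, and it approximates "co-activated" with a Pearson correlation above a fixed cutoff. The function names, the `threshold` and `min_organisms` parameters, and the toy data are all hypothetical.

```python
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def coexpression_links(datasets, threshold=0.8, min_organisms=2):
    """datasets: {organism: {gene_id: [expression per condition]}}.

    Returns the gene pairs whose profiles correlate at or above `threshold`
    in at least `min_organisms` organisms (assumed linking rule).
    """
    support = {}
    for profiles in datasets.values():
        # sorted() gives every pair the same (a, b) order in every organism
        for a, b in combinations(sorted(profiles), 2):
            if pearson(profiles[a], profiles[b]) >= threshold:
                support[(a, b)] = support.get((a, b), 0) + 1
    return {pair for pair, n in support.items() if n >= min_organisms}

# Toy data: g1 and g2 track each other in both organisms, g4 only in "worm".
datasets = {
    "worm": {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8.5],
             "g3": [4, 1, 3, 2], "g4": [1, 2, 3, 4.2]},
    "fly":  {"g1": [0, 1, 0, 1], "g2": [0.1, 0.9, 0.2, 1.1],
             "g3": [1, 0, 1, 0], "g4": [1, 1, 0, 0]},
}
links = coexpression_links(datasets)
print(links)
```

Requiring support from more than one organism is what filters out g4 here: it correlates with g1 and g2 in one dataset only, so no link is drawn.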


Principle #1

Gene networks are “scale free”


• Scale free – gene networks may arise from processes like the expansion of the WWW

[Figure: some links on the WWW]
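
One way to see how a WWW-like growth process yields a few heavily linked hubs is to simulate it. The slide does not name a growth model; the sketch below assumes preferential attachment (each new node links to existing nodes with probability proportional to their current number of links), and the function name and parameters are illustrative only.

```python
import random
from collections import Counter

def preferential_attachment(n, m=2, seed=0):
    """Grow a network to n nodes; each new node adds m links (assumed model)."""
    rng = random.Random(seed)
    # Start from a small fully connected seed of m + 1 nodes.
    edges = [(a, b) for a in range(m + 1) for b in range(a + 1, m + 1)]
    # Each node appears in `endpoints` once per link it has, so a uniform
    # draw from this list picks nodes in proportion to their degree.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(2000, m=2)
degree = Counter(v for edge in edges for v in edge)
# Most nodes keep close to m links while a few early nodes become hubs.
print("max degree:", max(degree.values()), "min degree:", min(degree.values()))
```

Plotting the degree counts on log-log axes would show the heavy tail that "scale free" refers to.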


Principle #2

Genes self assemble into modular subcomponents


http://www.cse.ucsc.edu/~jstuart/multispecies


[Figure: percent of cores (y-axis) by core size (x-axis), Network vs. Random]


Principle #3

Coordinated activity is a signature of gene function

[Figure: network map with regions labeled proliferation; transcription; ribosome biogenesis; ribosomal subunits; respiration; protein modification; secretion; fatty acid metab.; tissue growth; neuronal; immune response; development / hox genes; cell polarity, cell structure; "Newly evolved"]


Proteasome “module”

http://www.cse.ucsc.edu/~jstuart/multispecies


Principle #4

Local network topology reports on gene function

[Figure labels: integrator, subunits]


top 3 integrators:


Integrators have more cis-regulatory complexity

[Figure legend: integrators, subunits]


integrators have different phenotypes

[Figure: genes (%) by phenotype class (WT, UNC, LVA, STP, RUP, EMB, PCH, GRO, Other), integrators vs. subunits]


Current Directions for Gene Networks

• Gene isoform networks to capture alternative splicing


• Predict drug targets from synthetic lethal nets (w/ Lokey Lab)


Gene Isoform Networks

• Most human genes (>60%) are alternatively spliced.

• Alternative splicing gives rise to different proteins from the same gene

• The particular variant expressed can be very important (e.g. sex determination in flies)

• The functional implications of alt. splicing in humans are still largely unexplored.

• Provides a higher resolution understanding of gene expression and its relationship to health & disease


Splicing Microarrays

• Measure particular subparts of the gene structure (e.g. exon-exon junctions)

• Data now available for human and mouse tissue compendiums

• Infer isoforms from expression of subparts across the tissues

• Identify isoform modules


A functional network of gene isoforms

[Figure: isoform patterns, isoform network]

• assemble into modules

• functional signatures

• global network design


Search Engines


Search engines to discover gene function


Identify every member of a pathway

[Figure: Retinoblastoma pathway]


(slide from Art Owen)


Gene recommender

[Figure, built up over four slides: a query goes to the gene recommender, which searches for regulating conditions, then searches for new candidates, and returns the query + “hits”]


Gene recommender procedure

Query: Rb, hda-1, lin-36, rba-2, lin-9

1. Score experiments
2. Score genes

Hits: dpl-1, rba-2, K12D12.1, Rb, R06C7.8, hda-1, B0464.6, R06F6.1, T16G12.5, F55A3.7, plk-1, lin-9, lin-36
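
The two steps of the procedure (score experiments, then score genes) can be sketched as follows. The scoring formulas are assumptions made for illustration: an experiment scores highly when the query genes agree with each other and sit far from baseline, and a gene scores highly when its profile over the chosen experiments correlates with the mean query profile. The slides do not give the actual formulas, and the function names and synthetic data below are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def gene_recommender(expr, query, n_experiments):
    """expr: {gene: [value per experiment]}; query: list of gene names."""
    n_total = len(next(iter(expr.values())))

    # Step 1 (assumed score): |mean| of the query genes over their spread.
    def experiment_score(j):
        values = [expr[g][j] for g in query]
        return abs(mean(values)) / (pstdev(values) + 1e-9)

    chosen = sorted(range(n_total), key=experiment_score, reverse=True)[:n_experiments]

    # Step 2 (assumed score): correlation with the mean query profile,
    # computed only over the chosen experiments.
    profile = [mean([expr[g][j] for g in query]) for j in chosen]
    scores = {g: pearson([row[j] for j in chosen], profile) for g, row in expr.items()}
    hits = sorted(scores, key=scores.get, reverse=True)  # query genes included
    return hits, chosen

# Synthetic data: genes g0..g5 share a pattern in the first 8 of 20 experiments.
rng = random.Random(0)
pattern = [2, -2, 2, -2, 2, -2, 2, -2]
expr = {}
for i in range(30):
    row = [rng.gauss(0, 0.2) for _ in range(20)]
    if i < 6:
        for j, shift in enumerate(pattern):
            row[j] += shift
    expr[f"g{i}"] = row

# Query with four of the six module genes; g4 and g5 are left out on purpose.
hits, chosen = gene_recommender(expr, ["g0", "g1", "g2", "g3"], n_experiments=8)
print(hits[:6])
```

As on the slide, the hit list contains the query genes themselves alongside the new candidates, so leaving a known member out of the query and checking where it lands is a natural validation.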


Computational validation

Query (no Rb): hda-1, lin-36, rba-2, lin-9

1. Score experiments
2. Score genes

Hits:
1. rba-2
2. lin-9
3. dpl-1
4. R06C7.8
5. hda-1
6. B0464.6
7. R06F6.1
8. K12D12.1
9. T16G12.5
10. F55A3.7
11. plk-1
12. Rb
13. lin-36


Searching 1 organism

[Figure: bar chart of precision at 50% recall by pathway (Ribosome*, Calcium Channels, Glycolysis*, Electron Transport*, Proteasome*, tRNA Synthetases, Fatty Acid Deg*, TCA Cycle, Translation Factors, Cell cycle, Cholesterol*, Collagen); series: Bacteria, Yeast, Plant, Worm, Fly, Human]
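
The metric on this chart, precision at 50% recall, can be computed from a ranked hit list and a set of known pathway members. The sketch below is one common reading of that metric (walk down the list until half of the known members have been found, then report the fraction of hits so far that are members); the function name and example gene labels are hypothetical.

```python
from math import ceil

def precision_at_recall(ranked, positives, recall=0.5):
    """Precision at the first rank where `recall` of the positives is recovered."""
    positives = set(positives)
    needed = ceil(recall * len(positives))
    if needed == 0:
        raise ValueError("need at least one positive")
    found = 0
    for rank, gene in enumerate(ranked, start=1):
        if gene in positives:
            found += 1
            if found == needed:
                return found / rank
    return 0.0  # the requested recall level is never reached

known = {"a", "b", "c", "d"}
print(precision_at_recall(["a", "b", "x", "c", "y", "d"], known))  # 1.0
print(precision_at_recall(["x", "a", "y", "b", "c", "d"], known))  # 0.5
```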


Multiple Species Search Engine

[Figure labels: H.sap query; Ortholog Map; species H.sap, D.mel, C.ele, S.cer, A.tha, H.pyl, each with its own hits; clade-level hits Ecdy, Anim, Opis, Euk, Cell; H.sap hits]


Orthology Map

H.sap cell cycle query: CDK4, MCM5, MCM7, E2F1, PCNA, HDAC1, …
D.mel: Cdk4, Mcm5, Mcm7, E2f, Mus209, Rpd3, …
C.ele: cdk-4, mcm-5, mcm-7, n/a, pcn-1, hda-1, …

GR in each species:
H.sap hits: MCM3 (8), MCM6 (9), MCM5 (28), HDAC1 (69), RBBP4 (86), RPA1 (428), BUB1 (1866), ...
D.mel hits: Hdac1, Bub1, Mcm6, Rpa1, Mcm3, ...
H.sap BTPs of D.mel hits: HDAC1 (3), BUB1 (21), MCM6 (26), RPA1 (48), MCM3 (60), ...
C.ele hits: mcm-3, rpa-1, mcm-6, bub-1, rba-2, hda-1, ...
H.sap BTPs of C.ele hits: MCM3 (6), RPA1 (9), MCM6 (15), BUB1 (24), RBBP4 (25), HDAC1 (114), ...

Combined:
Ecdy hits: MCM6* (1), BUB1* (2), HDAC1* (3), MCM3* (4), RPA1 (5), ...
Anim hits: MCM3* (1), MCM6* (2), HDAC1* (3), MCM5* (4), RBBP4 (5), ...


Related genes sort to the top of the search lists


Multiple species search is more precise


immunological synapse

Gene product                    Comment
CD8 antigen                     query
unknown tyrosine kinase         lymphocyte specific
T-cell receptor zeta            query
CD2 antigen                     participates in T-cell activation
CD4 antigen (p55)               query
unknown Src-like adaptor        negative regulator of T-cell receptor signaling
CD8 antigen                     query
unknown transcription factor    T-cell specific
paired box gene 8 (PAX8)        new association



Search Engine Directions

• Network Recommender
  – Search gene networks for pathway members
  – Incorporate multiple data sources in search
  – Faster than scanning raw data

• Discriminative search engines
  – E.g. identify genes coregulated with DNA damage genes more so than S-phase genes


Network Recommender


[Figure: networks fed to the Network Recommender: coexpression, synthetic lethal, physical protein interactions]


Iterative Propagation Algorithm

1. Given a set of genes in a pathway A
2. Score gene g based on how connected it is to predicted pathway members in network i
   • $S_i(g) = \sum_h w^i_{gh}\, p(h) \,/\, \sum_h w^i_{gh}$,
   • where h ranges over neighbors of g in network i
3. Compute the posterior that each gene g is in the pathway
   • Construct a positive distribution $P(S_i(g) \mid g \in A)$
   • Construct a negative distribution $P(S_i(g) \mid g \notin A)$
4. Set $p(g) = \prod_i P(g \in A \mid S_i(g))$
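
The arithmetic of one pass through steps 2 to 4 can be sketched as below. The weighted neighbor average and the product over networks follow the slide; how the positive and negative distributions are modeled is not stated, so this sketch simply takes their densities as inputs and applies Bayes' rule with a prior probability of pathway membership. That prior, the function names, and the toy numbers are assumptions.

```python
from math import prod

def neighbor_score(network, p, g):
    """Step 2: weighted average of p(h) over the neighbors h of g in one network.

    network: {gene: {neighbor: weight}}; p: {gene: current membership estimate}.
    """
    neighbors = network.get(g, {})
    total = sum(neighbors.values())
    if total == 0:
        return 0.0  # isolated gene: no evidence from this network
    return sum(w * p.get(h, 0.0) for h, w in neighbors.items()) / total

def posterior(density_in, density_out, prior):
    """Step 3: Bayes' rule from the positive and negative score densities.

    `prior` is an assumed P(g in A) before seeing the score.
    """
    numerator = density_in * prior
    denominator = numerator + density_out * (1.0 - prior)
    return numerator / denominator if denominator > 0 else prior

def combine(posteriors):
    """Step 4: multiply the per-network posteriors."""
    return prod(posteriors)

# Toy example: g has three neighbors with weights 2, 1, 1.
network = {"g": {"h1": 2.0, "h2": 1.0, "h3": 1.0}}
p = {"h1": 1.0, "h2": 0.5, "h3": 0.0}
s = neighbor_score(network, p, "g")   # (2*1 + 1*0.5 + 1*0) / 4
print(s, posterior(0.8, 0.2, 0.5), combine([0.8, 0.5]))
```

Iterating means feeding the updated p(g) values back into step 2 until they stop changing; a stopping rule is not given on the slide.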


Network Recommender Performance

[Figure: precision vs. recall]


Network Recommender Results


Network Recommender for cell cycle

- physical protein interaction

- gene coexpression


Supplemental Material


Genetic interactions

[Figure: percent interactions by network distance (1 to 13, no path), % Synth Leth vs. % Background]