Identifying Candidate Pathways to Explain Phenotypes in Genome-Wide Mutant Screens
Mark Craven
Department of Biostatistics & Medical Informatics
University of Wisconsin-Madison
U.S.A.
joint work with: Deborah Chasman, Paul Ahlquist, Brandi Gancarz, Linhui Hao
Viruses take advantage of host cell genes
Figure from: C. E. Samuel, Journal of Biological Chemistry 285, 2010.
Genome-wide mutant screens
[Figure: genes HOS1, SPE3, MED1, YPR071W, NOT5, LTP1; label: mutant phenotype]
Example: determining which host genes affect viral replication [Kushner et al., PNAS 2003; Gancarz et al., PLoS One 2011]
Genome-wide mutant screens
The outputs of such screens are sets of genes that either inhibit or stimulate viral processes
Characterizing virus-host interactions
given such interaction data, we want to
• identify pathways that provide consistent explanations for the genome-wide measurements
• predict the interfaces to the virus
[Figure panels: before inference, after inference]
Some interactions are deemed not consistent with the measurements
Directions and signs of interactions are specified
Interfaces to the virus are hypothesized
Integer programming approach
1. Collect candidate pathways for each “hit”
2. Use IP to identify a globally consistent subnetwork
Collecting candidate pathways for a hit
generate candidate pathways, up to a specified length, that link a hit to the virus
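The collection step can be sketched as a depth-limited search over the interaction network. This is a minimal illustration, not the authors' implementation: the function name `candidate_pathways`, the explicit set of candidate `interfaces`, and the toy edges are all assumptions made for the example. Interactions are treated as undirected here, since directions are only fixed later by inference.

```python
from collections import defaultdict

def candidate_pathways(edges, hit, interfaces, max_len):
    """Enumerate simple paths from `hit` to any candidate interface node,
    using at most `max_len` edges. All names here are hypothetical."""
    adj = defaultdict(set)
    for u, v in edges:               # undirected at this stage (assumption)
        adj[u].add(v)
        adj[v].add(u)

    paths = []

    def extend(path):
        node = path[-1]
        if len(path) > 1 and node in interfaces:
            paths.append(tuple(path))
            return                   # a pathway ends at its interface
        if len(path) - 1 >= max_len:
            return                   # length limit reached
        for nxt in sorted(adj[node]):
            if nxt not in path:      # keep paths simple (no repeated nodes)
                extend(path + [nxt])

    extend([hit])
    return paths

# Toy network: gene names come from the slides, the edges are invented.
toy_edges = [("HOS1", "MED1"), ("MED1", "NOT5"),
             ("HOS1", "NOT5"), ("NOT5", "LTP1")]
toy_paths = candidate_pathways(toy_edges, "HOS1", {"NOT5"}, max_len=2)
```

Raising `max_len` grows the candidate set quickly, which is presumably why the slides bound pathway length.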
Variables in integer programming approach
[Figure: example pathway annotated with $\sigma_p$; edge variables $x_1$; $x_2, s_2, d_2$; $x_3, s_3$; and node variable $t_g$]
Variable      Description
$x_e$         is edge e active?
$s_e$         sign of edge e (up- or down-regulating)
$d_e$         direction of edge e
$t_g$         phenotype (effect) of knocking out gene g
$\sigma_p$    is pathway p active?
Constraints in integer programming approach
$$\forall n \in \mathrm{hits}: \quad \sum_{p \,:\, n \in \mathrm{nodes}(p)} \sigma_p > 0$$
all significant measurements (hits) are explained by at least one pathway
Constraints in integer programming approach
$$\forall p: \quad \sigma_p = 0 \;\lor\; \mathrm{consistent}(p)$$
$$\mathrm{consistent}(p) = \bigwedge_{\substack{e \in \mathrm{edges}(p) \\ \{i,j\} = \mathrm{nodes}(e)}} \left( x_e = 1 \;\land\; d_e = \mathrm{dir}(p,e) \;\land\; s_e = t_i t_j \right)$$
all active pathways are consistent, with edges directed toward the interface
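A small sketch of the consistency test for a single pathway, assuming signs and phenotypes are encoded as +1/-1 and that a pathway is listed from the hit toward the interface. The dictionary-based encoding and the function name `consistent` are illustrative choices, not taken from the slides. In the IP these quantities are decision variables, whereas here they are a fixed assignment being checked.

```python
def consistent(path, x, d, s, t):
    """Check one pathway against a full assignment of the variables.

    path : tuple of nodes, ordered from the hit toward the interface
    x    : {frozenset({i, j}): 0 or 1}   edge active?
    d    : {frozenset({i, j}): (i, j)}   chosen direction of the edge
    s    : {frozenset({i, j}): +1 or -1} sign of the edge
    t    : {node: +1 or -1}              knockout phenotype of each node
    (This encoding is an assumption made for the example.)
    """
    for i, j in zip(path, path[1:]):
        e = frozenset((i, j))
        if x.get(e, 0) != 1:          # edge must be active
            return False
        if d.get(e) != (i, j):        # edge must point along the pathway
            return False
        if s.get(e) != t[i] * t[j]:   # sign must agree with both phenotypes
            return False
    return True

# Toy assignment on a three-node pathway A -> B -> C.
path = ("A", "B", "C")
x = {frozenset("AB"): 1, frozenset("BC"): 1}
d = {frozenset("AB"): ("A", "B"), frozenset("BC"): ("B", "C")}
s = {frozenset("AB"): -1, frozenset("BC"): +1}
t = {"A": +1, "B": -1, "C": -1}
```

With this assignment the pathway passes all three checks. Reversing the direction of either edge, or flipping one sign, makes it inconsistent.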
Current objective function in integer programming approach
minimize the number of interfaces
$$\min \sum_{n \in \mathrm{interfaces}} I\left( \sum_{p \,:\, n \in \mathrm{nodes}(p)} \sigma_p > 0 \right)$$
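To make the objective concrete, here is a brute-force sketch on a toy instance. It stands in for an IP solver and is only practical for very small inputs. It enforces the coverage constraint (every hit lies on an active pathway) and minimizes the number of interfaces touched by active pathways. The consistency constraints are deliberately left out, and `min_interfaces` plus all toy data are hypothetical.

```python
from itertools import combinations

def min_interfaces(pathways, hits, interfaces):
    """Return (chosen_interfaces, active_pathways) using the fewest interfaces
    such that every hit is on some active pathway, or None if infeasible.
    Exhaustive search; a stand-in for an IP solver on toy inputs only."""
    interfaces = sorted(interfaces)
    for k in range(len(interfaces) + 1):           # try small sets first
        for chosen in combinations(interfaces, k):
            banned = set(interfaces) - set(chosen)
            # A pathway may be active only if it touches no unchosen interface.
            active = [p for p in pathways if not banned.intersection(p)]
            covered = set().union(*active)
            if set(hits) <= covered:
                return chosen, active
    return None

# Toy candidate pathways, each listed from a hit to an interface (invented).
toy_pathways = [("h1", "a", "I1"), ("h1", "I2"),
                ("h2", "b", "I2"), ("h3", "I2"), ("h3", "I1")]
solution = min_interfaces(toy_pathways, {"h1", "h2", "h3"}, {"I1", "I2"})
```

In this toy instance a single interface, `I2`, explains all three hits, so the pathways ending at `I1` are switched off.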
[Figure panels: before inference, after inference]
How to evaluate the IP approach?
• hold a measurement aside
• see if we can correctly predict it using inferred networks
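The hold-one-out protocol can be sketched as a small harness that works with any predictor. The `predict(gene, training)` callable is a hypothetical interface; in the actual study it would rerun inference without the held-out measurement, whereas the `majority` predictor below is only a placeholder so the example runs.

```python
def leave_one_out_accuracy(measurements, predict):
    """Hold each measurement aside in turn and score the predictor.

    measurements : {gene: label}, e.g. +1 / -1 phenotypes
    predict      : callable(gene, training_measurements) -> label
    """
    correct = 0
    for gene, label in measurements.items():
        training = {g: y for g, y in measurements.items() if g != gene}
        if predict(gene, training) == label:
            correct += 1
    return correct / len(measurements)

def majority(gene, training):
    """Placeholder predictor: the most common sign among the other genes."""
    return +1 if sum(training.values()) >= 0 else -1

toy_measurements = {"g1": +1, "g2": +1, "g3": -1}
accuracy = leave_one_out_accuracy(toy_measurements, majority)
```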
Baseline predictors
• neighbor voting
• further neighbor voting
[Figure: query gene "?" predicted from neighbor votes; panel labels: "1 neighbor votes", "2 neighbors vote" and "3 neighbors vote", "2 neighbors vote"]
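A minimal sketch of a voting baseline, assuming each measured neighbor casts one vote for its own phenotype and the majority wins. Reading "further neighbor voting" as voting over nodes within two hops is an assumption (expressed through the hypothetical `radius` parameter), as are the function name and the toy network.

```python
from collections import Counter, deque

def neighbor_vote(query, edges, phenotypes, radius=1):
    """Predict the query gene's phenotype by majority vote of measured genes
    within `radius` hops. Returns None when there are no votes or a tie."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Breadth-first search out to the requested radius.
    dist = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    votes = Counter(phenotypes[g] for g in dist
                    if g != query and g in phenotypes)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None                      # tie: abstain
    return ranked[0][0]

# Toy network around a query gene "q" (invented for illustration).
toy_edges = [("q", "a"), ("q", "b"), ("q", "c"),
             ("c", "d"), ("c", "e"), ("c", "f")]
toy_pheno = {"a": "up", "b": "up", "c": "down",
             "d": "down", "e": "down", "f": "down"}
```

On this toy network the direct neighbors vote 2 to 1 for "up", while widening the radius to two hops flips the prediction to "down".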
Baseline predictors
• consistency neighbor voting
[Figure: query gene "?" predicted from neighbor votes; labels: "1 neighbor votes", "2 neighbors vote"]
This gene votes because it has a repressing interaction with the query gene
Markov network approach
• variables are the same as in the IP
• potential functions represent
  • the constraints
  • uncertainty associated with specific interactions
  • the preference for a small number of interfaces
• inference done using Gibbs sampling
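For readers unfamiliar with Gibbs sampling, the sketch below runs it on a generic binary Markov network with unary and pairwise log-potentials. It is not the model from the talk: the potentials, the parameterization, and the function name `gibbs_marginals` are assumptions chosen to keep the example small. Each variable is resampled in turn from its conditional distribution given its neighbors, and marginals are estimated from the samples after a burn-in period.

```python
import math
import random

def gibbs_marginals(n_vars, unary, pairwise,
                    n_sweeps=2000, burn_in=500, seed=0):
    """Estimate P(x_i = 1) in a binary pairwise Markov network.

    unary[i]         : log-potential added when x_i == 1
    pairwise[(i, j)] : log-potential added when x_i == x_j
    (An illustrative parameterization, not the talk's model.)
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n_vars)]
    nbrs = {i: [] for i in range(n_vars)}
    for (i, j), w in pairwise.items():
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))

    counts = [0] * n_vars
    for sweep in range(n_sweeps):
        for i in range(n_vars):
            # Log-odds of x_i = 1 versus x_i = 0 given the current neighbors.
            logit = unary[i]
            for j, w in nbrs[i]:
                logit += w if x[j] == 1 else -w
            p_one = 1.0 / (1.0 + math.exp(-logit))
            x[i] = 1 if rng.random() < p_one else 0
        if sweep >= burn_in:             # discard early, unmixed samples
            for i in range(n_vars):
                counts[i] += x[i]
    return [c / (n_sweeps - burn_in) for c in counts]

# Variable 0 is strongly pushed toward 1; variable 1 is coupled to it.
marginals = gibbs_marginals(2, unary=[3.0, 0.0], pairwise={(0, 1): 2.0})
```

Hard constraints such as those in the IP could be approximated by very large potentials, though that tends to slow mixing.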
Predictive Accuracy (BMV)
Predictive Accuracy (FHV)
Future work
• taking into account additional sources of information
  • quantitative values from assays
  • genetic interactions
  • interactions automatically extracted from the scientific literature
• adapting approach to RNAi screens in mammalian cells
  • more genes
  • lower density of known interactions
  • more uncertainty in measurements
• devising methods that use these models to determine which follow-up experiments would be most informative