Background · Constraint Programming in Community Networks · Experiments and Results · Conclusions
Constraint Programming in Community-based Gene Regulatory Network Inference
Ferdinando Fioretto, Enrico Pontelli
Dept. Computer Science, New Mexico State University
Sept. 24, 2013
Talk Outline
1 Background
2 Constraint Programming in Community Networks
3 Experiments and Results
4 Conclusions
Gene Regulatory Networks
A cell contains different entities (including proteins and RNA) which interact and perform specific functions.
DNA transcription
mRNA translation
Gene Regulatory Networks
Some proteins, called Transcription Factors (TFs), can regulate the production of other proteins.
This is done by enhancing or inhibiting DNA transcription or mRNA translation.
The units of encapsulation of these interactions are the coding regions of the DNA: the genes.
A Gene Regulatory Network (GRN) is the set of the interactions among genes.
Gene Regulatory Networks: Modeling
A GRN is described by a weighted directed graph G = (V,E).
V is the set of genes of the network.
E ⊆ V × V × [0, 1] is the set of regulatory interactions.
Each regulatory interaction s → t is associated with a confidence value ω_{s→t} ∈ [0, 1].
Example
G1 regulates G2.
G2 regulates G5.
G3 is regulated by G4.
G4 regulates G2 and is regulated by G5.
Gene Regulatory Network Inference: GRN inference from high-throughput data
Motivation:
Key to understanding important genetic diseases, such as cancer.
Crucial to devise effective medical interventions.
Gene Regulatory Network Inference: Current Methods and Challenges
Methods proposed:
Correlation-based.
Information-theoretic.
Boolean Networks.
Bayesian Networks.
Regression-based.
Stochastic.
These methods are based on different assumptions, and each exhibits peculiar limitations.
Solutions proposed:
Integrating heterogeneous data into the inference model.
Meta-approaches using multiple inference models (Community Networks (CN)).
Gene Regulatory Network Inference: Community Networks
[Figure: the predictions G1, G2, ..., GJ of the individual methods are combined by edge ranking into a community network.]
Borda voting score:
ω_{s→t} = (1/|G|) Σ_{j=1}^{|G|} ω^j_{s→t}
where ω^j_{s→t} is the rank-based confidence of the interaction s → t assigned by the j-th method in G.
D. Marbach et al. "Wisdom of crowds for robust gene network inference". Nature Methods, 9(8):796–804, Aug. 2012.
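A minimal Python sketch (not the authors' implementation) of this averaging step: each method's scores are kept in a dictionary keyed by (source, target), and the function and variable names are ours.

```python
# Sketch: combine per-method edge scores into a community score by
# Borda-style averaging, as in the formula above.

def community_scores(method_scores):
    """method_scores: list of dicts {(s, t): score in [0, 1]}, one per method.
    Returns the averaged community score for every edge seen by any method."""
    edges = set()
    for scores in method_scores:
        edges.update(scores)
    J = len(method_scores)
    # Edges a method does not report contribute 0 (a simplifying assumption).
    return {e: sum(m.get(e, 0.0) for m in method_scores) / J for e in edges}

g1 = {("G1", "G2"): 0.9, ("G2", "G5"): 0.4}
g2 = {("G1", "G2"): 0.7, ("G4", "G2"): 0.6}
cn = community_scores([g1, g2])
# cn[("G1", "G2")] is (0.9 + 0.7) / 2 = 0.8
```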
Gene Regulatory Network Inference: Our Approach
CN approach for an "initial analysis" of the GRN: community predictions capture collective agreements.
Integrate additional biological knowledge (when available) and leverage specific GRN properties.
Why CP?
Constraint Programming: Constraint Satisfaction Problem (CSP)
Example: the n-queens problem.
Variables X: x_i = position of the queen in the i-th column.
Domains D: D_{x_i} = {1, ..., n}.
Constraints C: ∀i, ∀j with i < j:
x_i ≠ x_j
x_i + i ≠ x_j + j
x_i − i ≠ x_j − j
Search = Labeling + Constraint Propagation.
Solution = an assignment for X satisfying all c ∈ C.
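The n-queens CSP above can be solved by plain depth-first labeling; a minimal illustrative sketch (eager constraint checking only, without the propagation a real CP solver would add):

```python
# Depth-first labeling for n-queens: assignment[i] is the row of the
# queen in column i; the three constraints above are checked eagerly.

def solve_queens(n):
    def consistent(assignment, col, row):
        # No shared row, diagonal, or anti-diagonal with any placed queen.
        return all(row != r and row + col != r + c and row - col != r - c
                   for c, r in enumerate(assignment))

    def search(assignment):
        if len(assignment) == n:
            return assignment
        for row in range(n):
            if consistent(assignment, len(assignment), row):
                result = search(assignment + [row])
                if result:
                    return result
        return None  # dead end: backtrack

    return search([])

# solve_queens(4) → [1, 3, 0, 2]; solve_queens(3) has no solution.
```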
Gene Regulatory Network Inference: Our Approach
Why CP?
Separation between prediction methods and model.
Declarativeness.
Constraint expressions allow incremental model refinement.
Constrained Community Networks: CSP Modeling
GRN inference (GRNi) problem:
Given a set of n genes, a GRNi is a CSP ⟨X, D, C⟩ where:
X = ⟨x_1, ..., x_{n²−n}⟩ (regulatory relations, excluding self-regulations).
D = ⟨D_1, ..., D_{n²−n}⟩, with each D_k = {0, ..., 100} (possible confidence values).
C is a list of constraints expressing properties of GRNs.
Notation:
x_{s→t}: "s regulates t", with domain D_{s→t}.
d(x_{s→t}): the value assigned to x_{s→t}.
Constrained Community Networks: CSP Modeling
A solution to the GRNi defines a GRN prediction G = (V, E) with:
V = {1, ..., n}
E = {⟨s, t, w⟩ | d(x_{s→t}) > 0}, where w = d(x_{s→t})/100.
Constrained Community Networks: E. coli 2, size 10 (from DREAM3)
[Figure: the gold-standard network over genes G1–G10.]
Constrained Community Networks: E. coli 2, size 10, CN prediction [figure]
Analysis and Domain Reduction: The pre-resolution phase
Leverage the collection of GRN predictions G by:
(i) reducing the size of the solution search space;
(ii) integrating the G_j ∈ G while taking into account their discrepancies.
Set up the domain of each variable x_{s→t} ∈ X such that:
D_{s→t} = D_{s→t} ∩ B_{s→t}
where:
B_{s→t} = {ω_{s→t}}, if σ_{s→t} < θ_d
B_{s→t} = {ω_{s→t} − σ_{s→t}/2, ..., ω_{s→t} + σ_{s→t}/2}, if σ_{s→t} ≥ θ_d ∧ 0.1 < ω_{s→t} < 0.9
σ_{s→t} = (1 / (|G| choose 2)) Σ_{j=1}^{|G|} Σ_{i=j+1}^{|G|} |ω^j_{s→t} − ω^i_{s→t}|
θ_d ∈ [0, 1] is a "disagreement threshold".
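As a sketch (helper names are ours; method scores stay in [0, 1] and the resulting domain is mapped onto {0, ..., 100} as in the model), the pairwise disagreement σ and the reduced domain B for a single edge can be computed as:

```python
from itertools import combinations

def disagreement(scores):
    """scores: per-method confidences ω^j in [0, 1] for one edge s→t."""
    pairs = list(combinations(scores, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def reduced_domain(scores, theta_d):
    omega = sum(scores) / len(scores)        # community (Borda) score
    sigma = disagreement(scores)
    to_unit = lambda v: int(round(100 * v))  # domains are {0, ..., 100}
    if sigma < theta_d:
        return {to_unit(omega)}              # methods agree: a single value
    if 0.1 < omega < 0.9:
        lo, hi = to_unit(omega - sigma / 2), to_unit(omega + sigma / 2)
        return set(range(lo, hi + 1))        # methods disagree: an interval
    return set(range(0, 101))                # no reduction otherwise

# Example: three methods with moderate disagreement around ω = 0.5.
dom = reduced_domain([0.6, 0.7, 0.2], theta_d=0.2)
# σ ≈ 0.33, so dom is the interval {33, ..., 67}
```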
Constraints: Sparseness
Elements of a GRN are considered to be controlled by a small number of genes: GRNs are sparse.
Combining predictions in a CN does not guarantee sparseness.
Enforce a sparseness constraint by:
atleast_k_ge(k_l, X, θ_l) : |{x_i ∈ X | d(x_i) > θ_l}| ≥ k_l
and
atmost_k_ge(k_m, X, θ_m) : |{x_i ∈ X | d(x_i) > θ_m}| ≤ k_m
with k_l, k_m > 0 and 0 ≤ θ_l, θ_m ≤ 100, where d(x_i) indicates the value of an assignment for x_i.
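Checked against a complete assignment, the two constraints amount to simple counting; a sketch with illustrative names:

```python
# Sparseness constraints as checks over assigned values in {0, ..., 100}.

def atleast_k_ge(k, values, theta):
    """At least k values strictly above threshold theta."""
    return sum(1 for v in values if v > theta) >= k

def atmost_k_ge(k, values, theta):
    """At most k values strictly above threshold theta."""
    return sum(1 for v in values if v > theta) <= k

assignment = [90, 70, 66, 40, 10, 0]
ok = atleast_k_ge(2, assignment, 65) and atmost_k_ge(3, assignment, 65)
# Values above 65 are 90, 70, 66 (exactly 3), so both constraints hold.
```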
Constraints: Sparseness (example)
atleast_k_ge(10, X, 65) ∧ atmost_k_ge(25, X, 65)
Constraints: Redundant edge
Several state-of-the-art inference methods rely on techniques that cannot discriminate causality (e.g., mutual information, correlation).
Given a collection of predictions G = {G1, ..., GJ} for a GRN G = (V, E) and a non-empty set of non-causality-based methods H ⊆ G, an edge t → s is redundant if:
∀ G_i ∈ G \ H : ω^i_{s→t} > ω^i_{t→s} + β
If an edge t → s is redundant, we call the edge s → t required.
Let X_R be the set of all the required and redundant variables:
red_edge(x_{s→t}, x_{t→s}, θ_R, θ_r) : x_{s→t} > θ_R ∧ x_{t→s} < θ_r
with θ_R, θ_r ∈ N and 0 ≤ θ_r, θ_R ≤ 100.
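A sketch of the redundancy test (names ours; `causal` holds the per-edge scores of the causality-aware methods in G \ H):

```python
# Flag redundant edges from the causality-aware methods' scores.

def redundant(causal, s, t, beta):
    """True if t→s is redundant, i.e. every causality-aware method scores
    s→t above t→s by more than beta."""
    return all(m.get((s, t), 0.0) > m.get((t, s), 0.0) + beta
               for m in causal)

m1 = {("G1", "G2"): 0.8, ("G2", "G1"): 0.3}
m2 = {("G1", "G2"): 0.7, ("G2", "G1"): 0.4}
# With beta = 0.2, G2→G1 is redundant (and G1→G2 is therefore required):
flag = redundant([m1, m2], "G1", "G2", 0.2)
```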
Constraints: Redundant edge (example)
∀ x_{s→t}, x_{t→s} ∈ X_R : red_edge(x_{s→t}, x_{t→s}, 75, 50)
Constraints: Sparseness + Redundant edge [figure]
Constraints: Transcription Factor
Information about DNA-binding motifs is often available from public sources (e.g., BDB, Gene Ontology).
Existing methods often do not allow integration of such information (it is treated in post-processing).
A gene s ∈ V is a transcription factor (TF) if it regulates the production of other genes.
We express this property on the out-degree of s:
tf(s) : atleast_k_ge(k_s, X_s, θ_s)
where X_s = {x_{s→t} ∈ X | t ∈ V} and k_s is the co-expressing degree (the number of genes targeted by the TF).
Constraints: Transcription Factor (example)
atleast_k_ge(2, N_i, 85), with N_i = {x_{i→s} | (∀ G_j ∈ G) ω^j_{i→s} > 0.10}, for i = 1, 5, 9
Constraints: Co-transcription Factors
Multiple TFs may cooperate to regulate a specific gene (co-regulators).
Let s′, s″ ∈ V be two TFs which are co-regulators; over all x_{s′→t′}, x_{s″→t″} ∈ X:
coregulator(k, X, θ) : |{(s′, s″, t′) | s′ ≠ s″ ∧ t′ = t″ ∧ d(x_{s′→t′}) > θ ∧ d(x_{s″→t″}) > θ}| ≥ k
with k ∈ N and 0 < θ < 100.
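As a sketch, the co-regulator count can be checked against a complete assignment d over edge variables (names ours):

```python
# Count shared targets that both TFs regulate above theta.

def coregulator_holds(d, s1, s2, theta, k):
    """d: dict {(source, target): assigned value in {0, ..., 100}}.
    True if s1 and s2 co-regulate at least k common targets above theta."""
    targets = {t for (s, t), v in d.items() if s == s1 and v > theta}
    shared = {t for (s, t), v in d.items()
              if s == s2 and v > theta and t in targets}
    return s1 != s2 and len(shared) >= k

d = {("G1", "G2"): 80, ("G5", "G2"): 77, ("G5", "G4"): 90}
# G1 and G5 both regulate G2 above 75, so coregulator with k = 1 holds:
ok = coregulator_holds(d, "G1", "G5", 75, 1)
```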
Constraints: Co-transcription Factors (example)
coregulator(1, V, 75), with s′ = 1, s″ = 5
GRN Consensus
We implement two solution strategies: prop-labeling (DFS) and a Monte Carlo (MC) based prop-labeling tree exploration.
There is no consensus on an objective function to drive the solution search.
We propose 3 metrics to generate a GRN consensus, the Constrained Community Network (CCN). Given a set S of m solutions, the consensus value a*_k associated with the variable x_k is computed by:
Max Frequency: a*_k = argmax_{a ∈ S|x_k} freq(a, k)
Average: a*_k = (1/m) Σ_{i=1}^{m} a^i_k
Weighted Average: a*_k = (1 / Σ_{a ∈ S|x_k} freq(a, k)²) Σ_{a ∈ S|x_k} freq(a, k)² a
where S|x_k is the set of values assigned to x_k across S and freq(a, k) is the frequency of value a for x_k.
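The three metrics, sketched in Python over the values a single variable x_k takes across the sampled solutions (helper names ours):

```python
from collections import Counter

def max_frequency(values):
    """Most frequent value of x_k across the solutions."""
    return Counter(values).most_common(1)[0][0]

def average(values):
    """Plain mean of the values of x_k."""
    return sum(values) / len(values)

def weighted_average(values):
    """Mean weighted by squared frequency freq(a, k)^2."""
    freq = Counter(values)
    w = {a: freq[a] ** 2 for a in freq}
    total = sum(w.values())
    return sum(w[a] * a for a in w) / total

samples = [70, 70, 70, 40, 100]  # x_k across 5 sampled solutions
# max_frequency(samples) → 70; average(samples) → 70.0
```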
Experiments: Community Networks
The CN was built from 4 top-ranking methods of the last DREAM competitions:
1 TIGRESS (regression model)
2 GENIE3 (random forest approach)
3 Inferelator (MCZ + tlCLR + linear ODE)
4 CLR (mutual information model)
Experiments: Datasets and validation
Benchmarks: DREAM{3,4} (110 GRNs of various sizes); subnetworks from GRNs of E. coli and S. cerevisiae.
Datasets:
steady-state expressions for wild types;
steady-state expressions measured after gene knockouts;
time-series data.
Validation: AUROC score.
CCNs generated via MC search with 1,000 samplings.
Experiments: Settings
Domains setup:
θ_d = (1/|E_CN|) Σ_{(s,t,w) ∈ E_CN} σ_{s→t}
Experiments: Settings
Sparseness constraint: atleast_k_ge(k_l, X, θ_l) ∧ atmost_k_ge(k_m, X, θ_m)
Ordered E_CN:
1: g1 → g3, 0.998
2: g1 → g8, 0.981
...
n: g4 → g6, 0.856
...
n log(n): g7 → g3, 0.633
...
k_l ≤ |{x_i | x_i ∈ X ∧ max(D_{x_i}) > θ_l}|
k_m ≥ |{x_i | x_i ∈ X ∧ min(D_{x_i}) > θ_m}|
Experiments: Settings
Redundant edge constraint: ∀ G_i ∈ G \ H : ω^i_{s→t} > ω^i_{t→s} + β
β = (1/(|G| |E_RR|)) Σ_{G_i ∈ G\H} (ω^i_{s→t} − ω^i_{t→s})
red_edge(x_{s→t}, x_{t→s}, θ_R, θ_r), with:
θ_R = (1/(|G \ H| |E_REQ|)) Σ_{G_i ∈ G\H} ω^i_{s→t}
θ_r = (1/(|G \ H| |E_RED|)) Σ_{G_i ∈ G\H} ω^i_{t→s}
Results: CCN with sparsity and redundant edge constraints
[Bar plot: AUROC % improvement (0–15) for the consensus metrics b, f, a, w on DREAM3 10, DREAM4 10, DREAM3 50, DREAM3 100, DREAM4 100, with constraint sets {s,r} and {s,r,t}.]
Average AUC score improvements (in percentage) w.r.t. the CN rank.
Experiments: Integrating GRN knowledge: TFs
Transcription Factor constraint: atleast_k_ge(⌊log(n)⌋, X, θ)
Ordered E_CN:
1: g1 → g3, 0.998
2: g1 → g8, 0.981
...
n: g4 → g6, 0.856
...
Results: CCN with additional GRN knowledge integration
[Bar plot: AUROC % improvement (0–15) for the consensus metrics b, f, a, w on DREAM3 10, DREAM4 10, DREAM3 50, DREAM3 100, DREAM4 100, with constraint sets {s,r} and {s,r,t}.]
Average AUC score improvements (in percentage) w.r.t. the CN rank.
Conclusions
A CP-based approach to infer GRNs by integrating several methods in a CN.
It introduces a set of constraints able to:
1 enforce the satisfaction of GRN-specific properties;
2 take into account the community predictions' agreements and the methods' limitations.
No assumptions on the datasets nor on the type of inference methods.
Take-home message:
GRN knowledge integration offers improvements in prediction accuracy.
Constraints are a powerful tool to model and integrate GRN properties.
Thank you!