Mass-action equilibrium and non-specific interactions in
protein interaction networks
Sergei Maslov
Brookhaven National Laboratory
Living cells contain crowded and diverse molecular environments
Proteins constitute ~30% of E. coli and ~5% of yeast cytoplasm by weight
~2000 protein types are co-expressed and co-localized in yeast cytoplasm
If that’s not difficult enough: they are all interconnected
>80% of proteins are connected in one giant cluster of the PPI network
Small-world effect: median network distance is 6 steps
Map of reproducible (>2 publications) protein-protein interactions in yeast
Why might the small-world property cause problems?
Interconnected binding networks could indiscriminately spread perturbations:
• Systematic changes in expression: large changes in concentrations of a small number of proteins. SM, I. Ispolatov, PNAS and NJP (2007)
• Noise: small changes in concentrations of a large number of proteins. K.-K. Yan, D. Walker, SM, PRL (2008)
How can individual pathways be turned on and off without upsetting the whole system?
What about non-specific interactions?
Proteins form transient non-specific bonds with random, non-functional partners
For an organism to function, specific interactions between proteins must dominate over non-specific ones:
How much stronger do the ~N specific interactions between N proteins need to be to overcome the ~N^2 non-specific interactions?
What limits does this impose on the number of protein types and their concentrations?
J. Zhang, SM, E. Shakhnovich, Molecular Systems Biology (2008)
My “spherical cow” assumptions
• Protein concentrations Ci of all yeast proteins (under rich growth medium conditions) and their subcellular localizations are experimentally known (group of Weissman @ UCSF)
• Consider only reproducible, independently confirmed protein-protein interactions for non-catalytic binding (kinase-substrate pairs are ~5%)
• The network: ~4000 heterodimers and ~100 multi-protein complexes (we assume no cooperative binding in complexes) connecting ~1700 proteins
• We know the relevant average of the dissociation constants: Kij ~ 10 nM. It turned out that their distribution around this average DOES NOT MATTER MUCH!
• Use “evolutionary motivated” binding strength: Kij = max(Ci, Cj)/const, which is sufficient to bind a considerable fraction of the two proteins in a heterodimer
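To make "sufficient to bind a considerable fraction" concrete, here is a minimal sketch for a single isolated heterodimer. The function names, the default const=10 (the slides quote both 10 and 20) and the example abundances are illustrative assumptions, not values taken from the talk.

```python
import math

def evolutionary_kd(c_i, c_j, const=10.0):
    """Assumed rule K_ij = max(C_i, C_j) / const, in the same units as C."""
    return max(c_i, c_j) / const

def dimer_concentration(c_a, c_b, k_ab):
    """Equilibrium D_AB of an isolated pair, from D*K_AB = (C_A - D)(C_B - D)."""
    s = c_a + c_b + k_ab
    # Smaller root of D^2 - s*D + C_A*C_B = 0, written to avoid cancellation.
    return 2.0 * c_a * c_b / (s + math.sqrt(s * s - 4.0 * c_a * c_b))

# Hypothetical abundances (arbitrary units): equal partners, then unequal ones.
for c_a, c_b in [(100.0, 100.0), (10.0, 1000.0)]:
    k = evolutionary_kd(c_a, c_b)
    d = dimer_concentration(c_a, c_b, k)
    bound = d / min(c_a, c_b)
    print(f"C_A={c_a:g}, C_B={c_b:g}, K={k:g}: "
          f"{bound:.0%} of the scarcer protein is bound")
```

With this rule the scarcer partner ends up mostly bound in both cases, which is the sense in which the binding strength is "sufficient".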
[Figure: histogram of protein abundance (copies/cell), log-log axes]
Law of Mass Action (LMA)
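$dD_{AB}/dt = r^{(on)}_{AB} F_A F_B - r^{(off)}_{AB} D_{AB}$
In equilibrium, with $K_{AB} = r^{(off)}_{AB} / r^{(on)}_{AB}$:
$D_{AB} = F_A F_B / K_{AB}$; $C_A = F_A + D_{AB}$; $C_B = F_B + D_{AB}$
or $F_A = C_A / (1 + F_B / K_{AB})$ and $F_B = C_B / (1 + F_A / K_{AB})$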
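In a network: a system of ~2000 nonlinear equations for $F_i$ that can be solved only numerically:
$F_i = \dfrac{C_i}{1 + \sum_{j\ \mathrm{nn}\ i} F_j / K_{ij}}$
where the sum runs over the nearest neighbors (nn) j of protein i in the network.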
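A minimal numerical sketch of solving the network equations F_i = C_i / (1 + sum_j F_j / K_ij) by fixed-point iteration starting from F_i = C_i. The function names, the toy three-protein chain and the damping factor are assumptions added for illustration; the damping is a safety choice of this sketch, not something stated in the talk.

```python
def solve_free_concentrations(C, K, damping=0.5, tol=1e-12, max_iter=100000):
    """Damped fixed-point iteration for F_i = C_i / (1 + sum_j F_j / K_ij).

    C: dict protein -> total concentration
    K: dict (i, j) -> dissociation constant of the heterodimer ij
    """
    neighbors = {i: [] for i in C}
    for (i, j), k in K.items():
        neighbors[i].append((j, k))
        neighbors[j].append((i, k))
    F = dict(C)  # start from F_i = C_i (nothing bound)
    scale = max(C.values())
    for _ in range(max_iter):
        F_new = {}
        for i in C:
            update = C[i] / (1.0 + sum(F[j] / k for j, k in neighbors[i]))
            # Mix old and new values; damping=0 gives the plain iteration.
            F_new[i] = damping * F[i] + (1.0 - damping) * update
        if max(abs(F_new[i] - F[i]) for i in C) < tol * scale:
            return F_new
        F = F_new
    raise RuntimeError("fixed-point iteration did not converge")

def dimer_concentrations(F, K):
    """Bound concentrations D_ij = F_i F_j / K_ij from the law of mass action."""
    return {(i, j): F[i] * F[j] / k for (i, j), k in K.items()}

# Toy network (hypothetical numbers): a chain A - B - C.
C = {"A": 1.0, "B": 1.0, "C": 1.0}
K = {("A", "B"): 0.1, ("B", "C"): 0.1}
F = solve_free_concentrations(C, K)
D = dimer_concentrations(F, K)
print(F)
print(D)
```

A quick sanity check on any solution is mass conservation: each C_i should equal F_i plus the sum of the D_ij that involve protein i.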
Propagation of perturbations: the in silico study
• Calculate the unperturbed (wild-type) LMA equilibrium
• Simulate a twofold increase of the concentration, C_A → 2C_A, of just one type of protein and recalculate the equilibrium free concentrations F_i of all other proteins
• Look for cascading perturbations A → B → C → D with sign alternation: A (up), B (down), C (up), D (down)
Cascades of perturbations exponentially decay (and sign-alternate) with network distance
S. Maslov, I. Ispolatov, PNAS (2007)
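The in silico experiment can be sketched on a toy linear chain A - B - C - D. Everything here (function name, unit concentrations, K = 1, the damped solver) is an illustrative assumption and not the genome-wide calculation from the paper.

```python
def lma_equilibrium(C, K, damping=0.5, tol=1e-12, max_iter=200000):
    """Damped fixed-point iteration for F_i = C_i / (1 + sum_j F_j / K_ij)."""
    nbrs = {i: [] for i in C}
    for (i, j), k in K.items():
        nbrs[i].append((j, k))
        nbrs[j].append((i, k))
    F = dict(C)
    for _ in range(max_iter):
        G = {i: C[i] / (1.0 + sum(F[j] / k for j, k in nbrs[i])) for i in C}
        if max(abs(G[i] - F[i]) for i in C) < tol:
            return G
        F = {i: damping * F[i] + (1.0 - damping) * G[i] for i in C}
    raise RuntimeError("no convergence")

chain = ["A", "B", "C", "D"]
K = {(a, b): 1.0 for a, b in zip(chain, chain[1:])}  # hypothetical K_ij
C_wild = {p: 1.0 for p in chain}                      # hypothetical totals
C_pert = dict(C_wild, A=2.0)                          # C_A -> 2 C_A

F_wild = lma_equilibrium(C_wild, K)
F_pert = lma_equilibrium(C_pert, K)

# Relative change of each free concentration after the perturbation.
rel_change = {p: F_pert[p] / F_wild[p] - 1.0 for p in chain}
for distance, p in enumerate(chain):
    print(f"distance {distance}: {p} changes by {rel_change[p]:+.3f}")
```

In this toy chain the printed changes alternate in sign and shrink with distance from A, which is the pattern the slide describes for the full network.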
Mapping to resistor network
• Conductivities σ_ij: heterodimer concentrations D_ij
• Losses to the ground σ_iG: free (unbound) concentrations F_i
• Perturbations spread along linear chains loosely conducting to neighbors and ground
• Mapping is exact for bipartite networks; odd-length loops dampen perturbations
S. Maslov, K. Sneppen, I. Ispolatov, New J. Phys. (2007)
• Perturbations – large changes of few proteins
• Fluctuations – small changes of many proteins
Two types of fluctuations in equilibrium concentrations
• Driven fluctuations: changes in Dij driven by stochastic variations in total concentrations Ci (random protein production/degradation)
• Spontaneous fluctuations: stochastic changes in Dij at fixed Ci, described by equilibrium thermodynamics
• Both types propagate through the network: $\langle \delta D_{ij}^2 \rangle_{\mathrm{network}}$ versus $\langle \delta D_{ij}^2 \rangle_{\mathrm{isolated}}$
Image by Cell Signaling Technology, Inc: www.cellsignal.com
Mitochondrial control of apoptosis
What limits do non-specific interactions impose on robust functioning of
protein networks?
J. Zhang, S. Maslov, E. Shakhnovich, MSB (2008); see talk at 8:48 AM in Room 411 (V39)
The effect of non-specific interactions grows with genome diversity m, the number of co-expressed & co-localized proteins
Compare 3 equilibrium concentrations of a typical protein: free (monomer), specific heterodimer, all non-specific heterodimers
Need to know:
• protein concentrations: Ci
• specific and non-specific dissociation constants: K(s)=K0exp(E(s)/kT), K(ns)=K0exp(E(ns)/kT)
Competition between specific and nonspecific interactions
[Figure: axis labels Kij, Ci and log(C/K0); tick marks at 1 nM and 1 μM; “evolutionary motivated” Kij = max(Ci, Cj)/10]
We estimate the median non-specific energy to be E(ns) = -4kT ± 2.5kT, or K(ns) = 18 mM
Still, thousands of pairs are below the 1 μM (-14kT) detection threshold of Y2H, which is 3.6 std. dev. away
Literature             Species     Interactions  N_bait  N_prey  Fraction     # std
(Ito et al., 2001)     Yeast       4549          6000    6000    2.53×10^-4   3.6
(Li et al., 2004)      C. elegans  4027          1873    10000   2.15×10^-4   3.6
(Giot et al., 2003)    Drosophila  20439         11282   10306   3.52×10^-4   3.5
(Stelzl et al., 2005)  Human       3186          4456    5632    2.54×10^-4   3.6
(Rual et al., 2005)    Human       2754          7200    7200    1.06×10^-4   3.8
J. Zhang, SM, E. Shakhnovich, Molecular Systems Biology (2008)
How to estimate E(ns)?
Use false positives in noisy high-throughput data!
[Figure: distribution of log K(ns), with the 1 μM detection threshold and the 18 mM median marked]
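A back-of-the-envelope sketch of the numbers involved, assuming K = K0 exp(E/kT), a Gaussian distribution of non-specific energies, and a hypothetical all-against-all screen of 6000 x 6000 pairs. The function names are illustrative. Because the energies quoted in the talk are rounded, the computed median K(ns) comes out near, but not exactly at, the quoted 18 mM.

```python
import math

def kd_from_energy(e_kt, kd_ref, e_ref_kt):
    """K = K0 * exp(E/kT), anchored at a reference pair (E_ref, K_ref)."""
    return kd_ref * math.exp(e_kt - e_ref_kt)

def gaussian_tail(n_std):
    """One-sided tail probability of a normal distribution beyond n_std."""
    return 0.5 * math.erfc(n_std / math.sqrt(2.0))

K_THRESHOLD_M = 1e-6     # Y2H detection threshold, 1 uM
E_THRESHOLD_KT = -14.0   # energy quoted for that threshold, in units of kT
E_MEDIAN_KT = -4.0       # quoted median non-specific energy, in units of kT

k_median = kd_from_energy(E_MEDIAN_KT, K_THRESHOLD_M, E_THRESHOLD_KT)
fraction = gaussian_tail(3.6)   # threshold sits 3.6 std. dev. from the median
n_pairs = 6000 * 6000           # assumed screen size (bait x prey)

print(f"median K(ns) ~ {k_median * 1e3:.0f} mM")
print(f"fraction of pairs beyond the threshold ~ {fraction:.2e}")
print(f"expected non-specific detections ~ {fraction * n_pairs:.0f}")
```

Even a tail fraction of order 10^-4 translates into thousands of detectable non-specific pairs once it is multiplied by tens of millions of tested pairs.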
Phase diagram in yeast
[Figure: phase diagram with <C> as an axis; labels cytoplasm, mitochondria, nucleus]
J. Zhang, SM, E. Shakhnovich, Molecular Systems Biology (2008)
Evolution pushes the number of protein types m up for higher functional complexity, while keeping the concentration <C> as low as possible to reduce the waste due to non-specific interactions
Still, on average proteins in yeast cytoplasm spend 20% of time bound in non-specific complexes
Collaborators and support
Koon-Kiu Yan, Dylan Walker, Tin Yau Pang (BNL/Stony Brook)
Iaroslav Ispolatov (Ariadne Genomics/BNL)
Kim Sneppen (Center for Models of Life, Niels Bohr Institute, Denmark)
Eugene Shakhnovich, Jingshan Zhang (Harvard)
DOE DMS DE-AC02-98CH10886 NIH/NIGMS R01 GM068954
Thank you!
Conclusions
• Time to go beyond topology of PPI networks!
• Interconnected networks present a challenge for robustness: perturbations and noise; non-specific interactions
• We were the first to attempt quantifying these effects on a genome-wide scale
Estimates will get better as we get better data on kinetic & equilibrium constants
Collaborators, papers, and support
Koon-Kiu Yan, Dylan Walker, Tin Yau Pang (BNL/Stony Brook)
Iaroslav Ispolatov (Ariadne Genomics/BNL)
Kim Sneppen (Center for Models of Life, Niels Bohr Institute, Denmark)
Eugene Shakhnovich, Jingshan Zhang (Harvard)
DOE Division of Material Science, DE-AC02-98CH10886
NIH/NIGMS, R01 GM068954
1. Propagation of large concentration changes in reversible protein binding networks, S. Maslov, I. Ispolatov, PNAS 104:13655 (2007);
2. Constraints imposed by non-functional protein–protein interactions on gene expression and proteome size, J. Zhang, S. Maslov, E. Shakhnovich, Molecular Systems Biology 4:210 (2008);
3. Fluctuations in Mass-Action Equilibrium of Protein Binding Networks, K.-K. Yan, D. Walker, S. Maslov, Phys. Rev. Lett. 101:268102 (2008);
4. Spreading out of perturbations in reversible reaction networks, S. Maslov, K. Sneppen, I. Ispolatov, New Journal of Physics 9:273 (2007);
5. Topological and dynamical properties of protein interaction networks, S. Maslov, book chapter in "Protein-protein interactions and networks: Identification, Analysis and Prediction", Springer-Verlag (2008);
Collective Effects Amplify Spontaneous Noise
Collective effects significantly amplify (up to a factor of 20) spontaneous noise
Is there an upper bound to this amplification?
Stochastic fluctuations in D*_ij at fixed C_i
Free energy G for a given occupation state {D*_ij}:
[Equation not recoverable from the extraction: G({D*_ij}) contains k_B T log terms in F*_i and D*_ij]
Here F*_i is not independent but related to D*_im via
$F^*_i = C_i - \sum_{m} D^*_{im}$
What limits do non-specific interactions impose on robust functioning of
protein networks?
J. Zhang, S. Maslov, E. Shakhnovich, Molecular Systems Biology (2008)
The effect of non-specific interactions grows with m -- the number of co-expressed & co-localized proteins
Assume a protein is biologically active when bound to its unique specific interaction partner
Compare 3 equilibrium concentrations: free (monomer), specific dimer, all non-specific dimers
Need to know the average and distributions of:
• protein concentrations: C
• specific and non-specific dissociation constants: K(s)=K0exp(E(s)/kT), K(ns)=K0exp(E(ns)/kT)
Dimensionless parameters: log(C/K0), E(s)/kT, E(ns)/kT
Competition between specific and nonspecific interactions
Limits on parameters
For specific dimers to dominate over monomers: C ≥ K(s) = K0exp(E(s)/kT)
For specific interactions to dominate over non-specific: C/K(s) ≥ mC/K(ns), or m ≤ exp[(E(ns)-E(s))/kT]
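The second limit follows from comparing the two bound states of a typical protein under the law of mass action. The sketch below assumes that every potential partner is present at a typical concentration C and that all m non-specific partners share one dissociation constant K(ns). The value E(s) = -14kT in the numerical illustration is a hypothetical stand-in (it is the Y2H threshold energy, not a measured specific binding energy).

```latex
\begin{align}
\frac{D^{(s)}}{F} &\approx \frac{C}{K^{(s)}}, \qquad
\frac{D^{(ns)}}{F} \approx \frac{m\,C}{K^{(ns)}} \\
\frac{D^{(s)}}{D^{(ns)}} &\approx \frac{K^{(ns)}}{m\,K^{(s)}}
  = \frac{1}{m}\exp\!\left[\frac{E^{(ns)}-E^{(s)}}{kT}\right] \ge 1
  \;\Longrightarrow\;
  m \le \exp\!\left[\frac{E^{(ns)}-E^{(s)}}{kT}\right] \\
\text{Illustration: } & E^{(ns)} = -4kT,\; E^{(s)} = -14kT
  \;\Rightarrow\; m \le e^{10} \approx 2.2\times 10^{4}
\end{align}
```

Note that K0 cancels in the ratio, so only the energy gap between specific and non-specific binding enters the bound on m.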
[Figure: axes m and C]
[Equation residue: free energy G as a function of {D*_ij}, with F*_i determined by C_i and D*_ij; not recoverable from the extraction]
Intra-cellular noise
• Noise typically means fluctuations in total concentrations Ci (e.g. cell-to-cell variability measured for all yeast proteins by the Weissman lab @ UCSF)
• It needs to be converted into noise in biologically relevant dimer (Dij) or monomer (Fi) concentrations
• Two types of noise: intrinsic (uncorrelated) and extrinsic (correlated) (M. Elowitz, U. Alon, et al. (2005))
• Intrinsic noise could be amplified by the conversion (sometimes as much as 30 times!)
• Extrinsic noise contributions partially cancel each other
• Essential proteins seem to be more protected from noise and perturbations
PNAS (2007), Phys. Rev. Lett. (2008)
Going beyond topology
We already know a lot about topology of complex networks (scale-free, small-world, clustering, etc.)
Network is just a backbone for complex dynamical processes
Time to put numbers on nodes/edges and study these processes
For binding networks – governed by law of mass action
SM, I. Ispolatov, PNAS (2007)
The total number of cascades is still significant
• The fraction of significantly (> noise level ~ 20%) affected proteins at distance D quickly decays as exp(-αD)
• The total number of neighbors at distance D quickly rises as exp(βD)
• The number of affected proteins at distance D slowly decays as exp(-(α-β)D)
Robustness with respect to assignment of Kij
Spearman rank correlation: 0.89
Pearson linear correlation: 0.98
Bound concentrations: Dij
Spearman rank correlation: 0.89
Pearson linear correlation: 0.997
Free concentrations: Fi
SM, I. Ispolatov, PNAS, 104,13655-13660 (2007)
OK, protein binding networks are robust, but
can cascading changes be used to send
signals?
Robustness: Cascades of perturbations on average
exponentially decay
S.Maslov, K. Sneppen, I. Ispolatov, NJP (2007)
[Figure: average relative change in Fi versus distance from perturbation source (1 to 6); vertical log scale from 10^-14 to 10^0; curves labeled 1 nM, 10 nM, 0.1 μM, 1 μM, 10 μM, 0.1 mM]
How robust is the mass-action equilibrium against perturbations?
[Figure: labels “Less robust” and “More robust”; labeled protein HHT1]
SM, I. Ispolatov, PNAS, 104, 13655-13660 (2007)
Perturbations propagate along dimers with large concentrations
They cascade down the concentration gradient and are thus directional
Free concentrations of intermediate proteins are low
SM, I. Ispolatov, PNAS, 104, 13655-13660 (2007)
Non-specific phase diagram
Three states of a protein
Each protein i has 3 possible states: Ci=[ii’]+[i]+[iR], i.e. bound to its specific partner [ii’], free monomer [i], or bound non-specifically [iR]
Concentrations are related by the Law of Mass Action
Compare the 3 concentrations: [ii’] should dominate
Non-specific binding energies
Assume that the binding energy E_ij for nonspecific interactions scales with the sum of the surface hydrophobicities of the two proteins
Distribution of the fraction of hydrophobic amino acids on a protein’s surface
Distribution of E_i (proportional to hydrophobicity) is Gaussian
Model of nonspecific interactions
E. J. Deeds, O. Ashenberg, and E. I. Shakhnovich, PNAS 103, 311 (2006)
Parameters of non-specific interactions out of high-throughput Y2H experiments
Detection threshold Kd* of Kij in Yeast 2-Hybrid experiments
J. Estojak, R. Brent and E. A. Golemis. Mol. Cell. Biol. 15, 5820 (1995)
If pairwise interactions are detected among N protein types
Interaction detected in Y2H if E_ij < E*
Chemical potential description of
non-specific interactions between proteins
Chemical potential of the system
More hydrophobic surfaces are more likely to bind nonspecifically. The probability to be monomeric follows the Fermi-Dirac distribution:
$[iR]/[i] = \mathrm{const} \cdot e^{-E_i / k_B T} = e^{-(E_i - \mu)/k_B T}$
[i] > [iR] for E_i > μ, and vice versa
Find the chemical potential μ by solving
[Equation not recoverable from the extraction: a condition on [iR]/[i] written as an integral over the distribution f(E) dE of partner energies E_j]
[Figure: histogram of protein abundance (copies/cell), log-log axes; 1000 E. coli proteins and 3868 yeast proteins]
Network Equilibrium
Given a set of total concentrations and the protein interaction network, we can determine the equilibrium bound and unbound concentrations
At equilibrium:
$C_i = F_i + \sum_j F_i F_j / K_{ij}$
This leads to a set of nonlinear equations:
$F_i = \dfrac{C_i}{1 + \sum_j F_j / K_{ij}}$
We can numerically solve these equations by iteration:
$F_i^{(0)} = C_i, \qquad F_i^{(n+1)} = \dfrac{C_i}{1 + \sum_j F_j^{(n)} / K_{ij}}$
Thus, given a set of total concentrations and a set of dissociation constants, equilibrium free and bound concentrations are uniquely determined.
Of course, the network is not always in equilibrium. There are fluctuations away from equilibrium:
$\delta F_i = F_i^* - F_i, \qquad \delta D_{ij} = D_{ij}^* - D_{ij}$
Empirical PPI Network
Curated genome-wide network of protein-protein interactions in Baker’s Yeast (S. cerevisiae)
PPI net
• BIOGRID database
• Interactions independently confirmed in at least two published experiments
• Retain only interactions between proteins of known total concentration
• 1740 proteins involved in 4085 heterodimers and 77 multi-protein complexes
Protein abundance
• Genome-wide set of protein abundances during log-phase growth
Dissociation constants
• Dissociation constants are not presently empirically known
• Evolutionary motivated dissociation: $K_{ij} = \max(C_i, C_j)/20$
• Minimum association necessary to bind a sizable fraction of dimers
• Denominator is chosen to conform to the average association from the PINT database
Driven Fluctuations
Consider a set of total concentrations that are typical in the cell (i.e., when the cell is in log growth phase)
We want to examine small deviations in total concentration that arise as a result of:
1) Upstream noise in genetic regulation
2) Stochastic fluctuations in protein production/degradation mechanisms
$\delta C_i = C_i^* - C_i$
The typical time-scale of these small fluctuations in total concentration is minutes.
We refer to these as driving fluctuations because they propagate through the network and drive fluctuations in dimer concentration:
δC_i (driving fluctuations) → network → δD_ij (driven fluctuations)
Collective effects amplify fluctuations
Significant amplification (up to 20-fold) compared to isolated dimers
Is Collective Amplification Bounded?
[Equation not recoverable from the extraction: an expression for the mean-square fluctuation of D*_ij around D_ij]
To answer this question, let us calculate the noise from the partition function using an alternate formalism
Calculate average statistical quantities in the usual way
We can think of the set of total copy numbers as the size of the system
Notation: $Z(C_i) \equiv Z(C_i, \{C_j\}_{j \ne i})$; suppressed concentrations are unchanged