Mathematical Models of dNTP Supply
Tom Radivoyevitch, PhD
Assistant Professor
Epidemiology and Biostatistics
CCCC Developmental Therapeutics Program
Single-Loop Temperature Control
[Figure: block diagram of a proportional-integral (PI) loop. The setpoint minus the measured water temperature (+/-) feeds the gains Kp and Ki∫, which are summed (Σ) to drive the hot plate; the blocks are labeled controller and process. Four pairs of panels show temperature (C) and control effort versus minutes, from -5 to 20 minutes.]
process output = controller input; process input = controller output (= control effort)
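The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the simulation behind the slide's plots: the first-order process model and the values of `gain`, `tau`, `u_max`, `kp`, and `ki` are assumptions chosen only to show the qualitative behavior (P-only control leaves an offset, PI control removes it).

```python
def simulate_pi(kp, ki, setpoint=45.0, t_ambient=25.0, minutes=20.0, dt=0.01,
                gain=5.0, tau=3.0, u_max=10.0):
    """Simulate PI control of an assumed first-order hot-plate/water process.

    Process (assumed):  tau * dT/dt = -(T - t_ambient) + gain * u
    Controller:         u = kp * e + ki * integral(e), clipped to [0, u_max]
    Returns a list of (time, temperature, control effort) tuples.
    """
    temp = t_ambient
    integral = 0.0
    history = []
    for step in range(int(round(minutes / dt))):
        error = setpoint - temp            # process output is the controller input
        integral += error * dt             # simple rectangle-rule integral, no anti-windup
        u = kp * error + ki * integral     # controller output is the control effort
        u = min(max(u, 0.0), u_max)        # the heater cannot cool and has finite power
        temp += dt * (-(temp - t_ambient) + gain * u) / tau  # forward Euler step
        history.append((step * dt, temp, u))
    return history


if __name__ == "__main__":
    for label, ki in (("P only", 0.0), ("PI", 0.5)):
        final_time, final_temp, final_u = simulate_pi(kp=1.0, ki=ki)[-1]
        print(f"{label}: T = {final_temp:.2f} C, control effort = {final_u:.2f}")
```

With these assumed values the P-only run settles below the setpoint, while the integral term drives the error to zero.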
Power Plant Process & Controls
[Figure: power plant schematic with air/O2 and fuel inputs, gas flow recirculation, stack, two turbines, and condenser; measured variables T, T, P, and θ. PI loops: Megawatts Demanded versus Megawatts Supplied; Reheat Temp Setpoint (1050F: variations break expensive thick pipes); Setpoint (3500 psi).]
Boiler pressure ~ plasma [dN]
Fuel input ~ de novo dNTP supply
Cold water bypass ~ cN-II/dNK cycle ~ police car in idle to catch speeders
Two temp control ~ drugs kill bad and good, rescue saves good and bad
Ultimate Goal
• Better understanding => better control
• Conceptual models help trial designs today
• Computer models train pilots and autopilots
• Safer flying airplanes with autopilots
• Individualized, feedback-based therapies
[Diagram, with a Present-to-Future axis: EXPERIMENTAL BIOLOGY -> data -> MODELS -> simulations -> CONTROLLER-MODEL PAIRS -> proposed controller. The two stages are labeled process model development and control system design.]
dNTP Supply
Many anticancer agents target or traverse this system.
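[Figure: dNTP supply network. De novo: UDP, CDP, GDP, ADP are reduced by RNR (regulated by ATP or dATP) on the way to dTTP, dCTP, dGTP, dATP, which DNA polymerase incorporates into DNA; the dU branch includes dUDP, dUTP, dUTPase, dUMP, TS, and DCTD. Salvage: dT, dC, dG, dA, dU via TK1 and dCK in the cytosol and TK2 and dGK in mitochondria (dTMP, dCMP, dGMP, dAMP -> dTTP, dCTP, dGTP, dATP); 5NT and NT2 return dNMPs to dN. Compartments: cytosol, nucleus, mitochondria. Arrow legend: flux, activation, inhibition.]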
Problem Statement
Metabolism of dNTPs + Analogs
Metabolism of DNA + Drug-DNA
Nucleoside demand is either Damage Driven or S-phase Driven
Focus on cancers caused by DNA repair system failures
DNA repair
Salvage
De novo
[Figure: two panels over x from 0 to 10: y versus x (0.0 to 1.0) and y2 - y versus x (0.0 to 0.7). Curves are labeled MMR+ and MMR-; axes are IU-DNA levels versus Time; annotation: Best time to hit with IR.]
MMR- Treatment Hypothesis
[Diagram: dNTPs + Analogs -> DNA + Drug-DNA. dNTP demand is either Damage Driven or S-phase Driven. Supply routes: Salvage (IUdR) and De novo. DNA repair acts on DNA + Drug-DNA.]
p53- Treatment Hypothesis
• Residual DSBs at 24 hours kill cells
• dNTP supply inhibition retards DSB repair
• p53- cells are slower at DSB repair
• Best dNTP supply inhibition timing post IR is just after p53+ cells complete DSB repair
• Questions: Prolonged RNR↓ => plasma [dN]↓? Compensation by RNR overexpression? Is dCK expression also increased?
[Figure: repair time course marked at 24 h]
R Packages
Model enzymes: Combinatorially Complex Equilibrium Model Selection (ccems, CRAN 2009)
Model networks of enzymes: Systems Biology Markup Language interface to R (SBMLR, BIOC 2004)
[Figure: schematic of R1 and R2 subunits of ribonucleotide reductase]
Enzyme Modeling Overview
• Model enzymes as quasi-equilibria (e.g. E ⇌ ES)
• Combinatorially Complex Equilibria:
• few reactants => many possible complexes
• R package: Combinatorially Complex Equilibrium Model Selection (ccems) implements methods for activity and mass data
• Hypotheses: complete K = ∞ => [Complex] = 0 vs. binary K1 = K2
• Generate a set of possible models, fit them, and select the best
• Model Selection: Akaike Information Criterion (AIC)
• AIC decreases with P (the number of parameters) and then increases
• Billions of models, but only thousands near AIC upturn
• Generate 1P, 2P, 3P model space chunks sequentially
• Use structures to constrain complexity and simplicity of models
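The idea of generating the model space in chunks of equal parameter count can be sketched as follows. This is an illustrative Python sketch, not the ccems implementation: `model_space_chunks` is a hypothetical helper, and the four complexes are the ones used in the dTTP-induced R1 dimerization example later in the deck. A model is the set of complexes given a finite K; every complex left out carries the hypothesis K = ∞, so its concentration is zero.

```python
from itertools import combinations


def model_space_chunks(complexes, max_params):
    """Yield (P, models) for P = 1..max_params.

    Each model is a tuple of the complexes that keep a finite, fitted K;
    complexes not in the tuple are assigned K = infinity ([Complex] = 0).
    """
    for p in range(1, max_params + 1):
        yield p, list(combinations(complexes, p))


if __name__ == "__main__":
    complexes = ["RR", "Rt", "RRt", "RRtt"]
    for p, models in model_space_chunks(complexes, max_params=4):
        print(f"{p}P chunk: {len(models)} models, first = {models[0]}")
```

Because the chunks are generated lazily, a search can stop before the larger chunks are ever built.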
Ribonucleotide Reductase
[Figure: schematic of R1 and R2 subunits, including an R1 hexamer with R2 dimers]
UDP, CDP, GDP, ADP bind to catalytic site
ATP, dATP, dTTP, dGTP bind to selectivity site
dATP inhibits at activity site, ATP activates at activity site?
ATP activates at hexamerization site??
RNR is Combinatorially Complex
5 catalytic site states x 5 s-site states x 3 a-site states x 2 h-site states = 150 states
=> (150)^6 different hexamer complexes => 2^((150)^6) models
2^((150)^6) = ~1 followed by trillions of zeros
Trillions of complexes => trillions (1 followed by only 12 to 13 zeros) of 1-parameter models
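The counting argument on this slide can be checked directly. The sketch below follows the slide's own counting, which treats the six R1 positions as distinguishable; the variable names are illustrative.

```python
import math

# States of one R1 subunit: catalytic site x s-site x a-site x h-site
states_per_subunit = 5 * 5 * 3 * 2

# One state per subunit across a hexamer, positions treated as distinguishable
hexamer_complexes = states_per_subunit ** 6

# Each complex is either in or out of a model, so there are 2**hexamer_complexes
# models. That integer is too large to build; count its decimal digits instead.
digits_in_model_count = hexamer_complexes * math.log10(2)

if __name__ == "__main__":
    print(f"states per subunit: {states_per_subunit}")
    print(f"hexamer complexes:  {hexamer_complexes:.3e}")
    print(f"log10(model count): {digits_in_model_count:.3e}")
```

The exact figures are about 1.1 x 10^13 complexes and a model count with about 3.4 x 10^12 digits, so the slide's "trillion" is an order-of-magnitude statement.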
Michaelis-Menten Model
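RNR: no NDP and no R2 dimer => kcat of complex is zero, else different R1-R2-NDP complexes can have different kcat values.

E + S ⇌ ES → E + P

\[
v = k[ES] = [E_T]\,\frac{0\cdot[E] + k\,[ES]}{[E] + [ES]}
\]
but
\[
K_m = \frac{[E][S]}{[ES]} \quad\Rightarrow\quad [ES] = \frac{[E][S]}{K_m}
\]
so
\[
v = [E_T]\,\frac{0\cdot 1 + k\,[S]/K_m}{1 + [S]/K_m}
  = \frac{V_{max}\,[S]/K_m}{1 + [S]/K_m}
  = \frac{V_{max}\,[S]}{K_m + [S]}, \qquad V_{max} = k[E_T]
\]
Key perspective: the rate is [E_T] times the average of kcat over the enzyme states (0 for E, k for ES), weighted by the fraction of enzyme in each state.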
Free Concentrations Versus Totals
[Figure: Percent Activity (0 to 100) versus Total [r] (uM), log axis from 0.005 to 0.5; solid line = Eqs. (1-2), dotted = Eq. (3).]
R = R1, r = R2 dimer (R2_2), G = GDP, t = dTTP
Data from Scott, C. P., Kashlan, O. B., Lear, J. D., and Cooperman, B. S. (2001) Biochemistry 40(6), 1651-166

Model          Parameter   Initial Value   Optimal Value   Confidence Interval
RRGGttr1.1.0   RRGGtt_r    0.020           0.012           (0.007, 0.024)
               SSE         1070.252        823.793
               AIC         45.006          42.650
MM             Kd          0.020           0.033           (0.022, 0.049)
               SSE         2016.335        1143.682
               AIC         50.706          45.603
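Total concentration constraints on the free concentrations [E] and [S]:
\[
0 = [E_T] - [E] - \frac{[E][S]}{K_{E\_S}} \qquad (1)
\]
\[
0 = [S_T] - [S] - \frac{[E][S]}{K_{E\_S}} \qquad (2)
\]
Equation (1) gives
\[
[E] = \frac{[E_T]}{1 + [S]/K_{E\_S}}
\quad\Rightarrow\quad
[ES] = \frac{[E_T]\,[S]/K_{E\_S}}{1 + [S]/K_{E\_S}}
\quad\text{versus}\quad
[ES] = \frac{[E_T]\,[S_T]/K_{E\_S}}{1 + [S_T]/K_{E\_S}} \qquad (3)
\]
The difference is [S] vs. [S_T].
Substitute [E] from (1) into (2) to get a quadratic whose solution is
\[
[ES] = \tfrac{1}{2}\left( [E_T] + [S_T] + K_{E\_S} - \sqrt{\left([E_T] + [S_T] + K_{E\_S}\right)^2 - 4[E_T][S_T]} \right)
\]
Bigger systems of higher polynomials cannot be solved algebraically => use ODEs:
\[
\frac{d[E]}{d\tau} = [E_T] - [E] - \frac{[E][S]}{K_{E\_S}}, \qquad
\frac{d[S]}{d\tau} = [S_T] - [S] - \frac{[E][S]}{K_{E\_S}}, \qquad
[E](0) = 0,\; [S](0) = 0
\]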
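The gap between using free [S] and total [S_T] can be illustrated numerically. The sketch below is a minimal example with made-up concentrations (not the Scott et al. data): `bound_exact` solves the two total concentration constraints for a single E + S ⇌ ES equilibrium through the quadratic, and `bound_mm_totals` plugs the total into the Michaelis-Menten form.

```python
import math


def bound_exact(e_total, s_total, kd):
    """[ES] that satisfies both total concentration constraints.

    From kd * ES = (e_total - ES) * (s_total - ES); the smaller root is the
    physical one because it cannot exceed either total.
    """
    b = e_total + s_total + kd
    return 0.5 * (b - math.sqrt(b * b - 4.0 * e_total * s_total))


def bound_mm_totals(e_total, s_total, kd):
    """[ES] when total [S] is used where free [S] belongs (Eq. 3 style)."""
    return e_total * s_total / (kd + s_total)


if __name__ == "__main__":
    for e_total, s_total in ((1.0, 1.0), (0.001, 10.0)):
        exact = bound_exact(e_total, s_total, kd=1.0)
        approx = bound_mm_totals(e_total, s_total, kd=1.0)
        print(f"E_T={e_total}, S_T={s_total}: exact={exact:.6g}, totals={approx:.6g}")
```

The two agree when substrate is in large excess over enzyme and diverge when the totals are comparable, which is the regime where fitting with totals matters.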
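Enzyme, Substrate and Inhibitor

Full model (E, ES, EI, ESI) with binary constants:
\[
\begin{aligned}
0 &= [E_T] - [E] - \frac{[E][S]}{K_{E\_S}} - \frac{[E][I]}{K_{E\_I}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}} \\
0 &= [S_T] - [S] - \frac{[E][S]}{K_{E\_S}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}} \\
0 &= [I_T] - [I] - \frac{[E][I]}{K_{E\_I}} - \frac{[E][I][S]}{K_{E\_I}K_{EI\_S}}
\end{aligned}
\]
Model without ESI (E, ES, EI):
\[
\begin{aligned}
0 &= [E_T] - [E] - \frac{[E][S]}{K_{E\_S}} - \frac{[E][I]}{K_{E\_I}} \\
0 &= [S_T] - [S] - \frac{[E][S]}{K_{E\_S}} \\
0 &= [I_T] - [I] - \frac{[E][I]}{K_{E\_I}}
\end{aligned}
\]
Full model with complete constants:
\[
\begin{aligned}
0 &= [E_T] - [E] - \frac{[E][S]}{K_{ES}} - \frac{[E][I]}{K_{EI}} - \frac{[E][S][I]}{K_{ESI}} \\
0 &= [S_T] - [S] - \frac{[E][S]}{K_{ES}} - \frac{[E][S][I]}{K_{ESI}} \\
0 &= [I_T] - [I] - \frac{[E][I]}{K_{EI}} - \frac{[E][S][I]}{K_{ESI}}
\end{aligned}
\]
Special cases of the E, ES, EI, ESI grid:
Competitive inhibition: E, ES, EI (no ESI)
Uncompetitive inhibition: E, ES, ESI (no EI), if kcat_ESI = 0
Noncompetitive inhibition: E, ES, EI, ESI with equal constants across the grid (E | ES and EI | ESI); an example of a K = K' model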
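For the competitive case (E, ES, EI), the three total concentration constraints can be solved numerically even though there is no simple closed form. The sketch below uses bisection on free [E], which is a swapped-in technique chosen for simplicity, not the ODE approach the slides describe; `solve_competitive` and the example concentrations are hypothetical.

```python
def solve_competitive(e_total, s_total, i_total, k_es, k_ei, iterations=200):
    """Free and bound concentrations for E + S <-> ES and E + I <-> EI.

    For a trial free [E], the S and I balances give free [S] and [I] directly,
    leaving one enzyme balance that decreases monotonically in [E].
    """
    def free_ligands(e):
        s = s_total / (1.0 + e / k_es)   # from 0 = S_T - S - E*S/K_ES
        i = i_total / (1.0 + e / k_ei)   # from 0 = I_T - I - E*I/K_EI
        return s, i

    def enzyme_balance(e):
        s, i = free_ligands(e)
        return e_total - e - e * s / k_es - e * i / k_ei

    lo, hi = 0.0, e_total                # balance is >= 0 at lo and <= 0 at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if enzyme_balance(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    e = 0.5 * (lo + hi)
    s, i = free_ligands(e)
    return {"E": e, "S": s, "I": i, "ES": e * s / k_es, "EI": e * i / k_ei}


if __name__ == "__main__":
    with_inhibitor = solve_competitive(1.0, 2.0, 3.0, k_es=0.5, k_ei=0.2)
    without = solve_competitive(1.0, 2.0, 0.0, k_es=0.5, k_ei=0.2)
    print(f"[ES] without inhibitor: {without['ES']:.4f}")
    print(f"[ES] with inhibitor:    {with_inhibitor['ES']:.4f}")
```

Reducing the problem to one unknown keeps the solver robust; larger systems lose this structure, which is where the ODE formulation becomes useful.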
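dTTP induced R1 dimerization
Complete K Hypotheses
Radivoyevitch, (2008) BMC Systems Biology 2:15

\[
\begin{aligned}
0 &= p[R_T] - [R] - \frac{[R][t]}{K_{Rt}} - \frac{2[R]^2}{K_{RR}} - \frac{2[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}} \\
0 &= [t_T] - [t] - \frac{[R][t]}{K_{Rt}} - \frac{[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}}
\end{aligned}
\]
solved as the steady state of
\[
\begin{aligned}
\frac{d[R]}{d\tau} &= p[R_T] - [R] - \frac{[R][t]}{K_{Rt}} - \frac{2[R]^2}{K_{RR}} - \frac{2[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}} \\
\frac{d[t]}{d\tau} &= [t_T] - [t] - \frac{[R][t]}{K_{Rt}} - \frac{[R]^2[t]}{K_{RRt}} - \frac{2[R]^2[t]^2}{K_{RRtt}}
\end{aligned}
\qquad [R](0) = 0;\; [t](0) = 0.
\]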
[Figure: grid of complete K models over the complexes (RR, Rt, RRt, RRtt), each drawn as a spur graph from R. Model labels: JJJJ; IJJJ, JJIJ, JJJI, JIJJ; IJIJ, IJJI, JJII, JIIJ, IIJJ, JIJI; IJII, IIIJ, IIJI, JIII; IIII; I0II, III0, II0I, 0III.]
R = R1, t = dTTP
Binary K Hypotheses
[Figure: binary K hypotheses on the thermodynamic cycle linking R, t, Rt, RR, RRt, and RRtt. Binary dissociation constants: Kd_R_R, Kd_Rt_R, Kd_Rt_Rt, Kd_R_t, Kd_RR_t, Kd_RRt_t (also labeled KR_R, KRt_R, KRt_Rt, KR_t, KRR_t, KRRt_t). Equalities (=) between constants define the models; labeled examples: HIFF, HDFF, HDDD.]
[Figure: Average Mass (kDa), 100 to 180, versus Total [dTTP] (uM), 5 to 15; fitted curves for III0m, IIIJ, and HDFF.]
Fits to Data
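[Equation, not recoverable from the extraction: the expected average mass E(y) in terms of M1, p, [R_T], and the concentrations [R], [Rt], [RR], [RRt], [RRtt], together with the residual y - E(y).]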
AICc = N*log(SSE/N)+2P+2P(P+1)/(N-P-1)
Data from Scott, C. P., et al. (2001) Biochemistry 40(6), 1651-166
Radivoyevitch, (2008) BMC Systems Biology 2:15
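The AICc formula on this slide translates directly into code. The sketch below is a minimal Python version; it assumes `log` is the natural logarithm and takes P exactly as supplied by the caller, since the slide does not say whether P counts the error variance. The example numbers are made up.

```python
import math


def aicc(sse, n, p):
    """AICc = N*log(SSE/N) + 2P + 2P(P+1)/(N-P-1), as written on the slide."""
    if n - p - 1 <= 0:
        raise ValueError("AICc needs N > P + 1")
    return n * math.log(sse / n) + 2 * p + 2 * p * (p + 1) / (n - p - 1)


if __name__ == "__main__":
    # Two hypothetical fits to the same 12 data points
    print(f"1-parameter fit: {aicc(sse=500.0, n=12, p=1):.3f}")
    print(f"3-parameter fit: {aicc(sse=480.0, n=12, p=3):.3f}")
```

In this made-up comparison the small drop in SSE does not pay for two extra parameters, so the simpler model has the lower AICc.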
Rank  Model   Parameter  Initial Value  Optimal Value  Confidence Interval
1     III0m   m1         90.000         82.368         (79.838, 84.775)
              SSE        4397.550       525.178
              AIC        71.965         57.090
2     IIIJ    R2t2       1.000^3        2.725^3        (2.014^3, 3.682^3)
              SSE        2290.516       557.797
              AIC        67.399         57.512
27    HDFF    R2t0       1.000          12369.79       (0, 1308627507869)
              R1t0_t     1.000          1.744          (0.003, 1187.969)
              R2t0_t     1.000          0.010          (0.000, 403.429)
              SSE        25768.23       477.484
              AIC        105.342        77.423

Each complex concentration follows from the free concentrations and its complete dissociation constant:
\[
[R_i t_j] = \frac{[R]^i[t]^j}{K_{R_i t_j}}
\]
ATP-induced R1 Hexamerization
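R = R1, X = ATP
2+5+9+13 = 28 parameters => 2^28 ≈ 2.7x10^8 spur graph models via Kj=∞ hypotheses
28 models with 1 parameter, 428 models with 2, 3278 models with 3, 20475 with 4

\[
\begin{aligned}
0 &= [R_T] - [R] - \sum_{i=1}^{2}\frac{[R][X]^i}{K_{RX_i}} - 2\sum_{i=2}^{6}\frac{[R]^2[X]^i}{K_{R_2X_i}} - 4\sum_{i=4}^{12}\frac{[R]^4[X]^i}{K_{R_4X_i}} - 6\sum_{i=6}^{18}\frac{[R]^6[X]^i}{K_{R_6X_i}} \\
0 &= [X_T] - [X] - \sum_{i=1}^{2} i\,\frac{[R][X]^i}{K_{RX_i}} - \sum_{i=2}^{6} i\,\frac{[R]^2[X]^i}{K_{R_2X_i}} - \sum_{i=4}^{12} i\,\frac{[R]^4[X]^i}{K_{R_4X_i}} - \sum_{i=6}^{18} i\,\frac{[R]^6[X]^i}{K_{R_6X_i}}
\end{aligned}
\]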
Yeast R1 structure. Dealwis Lab, PNAS 102, 4022-4027, 2006
Data of Kashlan et al. Biochemistry 2002 41:462
Fits to R1 Mass Data
28 of top 30 did not include an h-site term; 28/30 ≠ 503/2081 with p < 10^-16. This suggests no h-site.
Top 13 all include R6X8 or R6X9, save one, the single edge model R6X7. This suggests less than 3 a-sites are occupied in the hexamer.
Radivoyevitch, T., Biology Direct 4, 50 (2009).
[Figure: AIC densities for the 2088 models with SSE < 2 min(SSE), split by h-site content. Panel A: no h site (508), h site (1580). Panel B: no h site (77), h site (78). Panel C: no h site (287), h site (1174). Panel D: no h site (146), h site (329).]
[Figure: Mass (kDa), 100 to 500, versus [ATP] (uM), log axis from 50 to 2000, with one curve per single-complex model. Legend:]
R6X8   141.23
R6X9   142.88
R6X7   143.12
R6X10  146.12
R6X6   148.92
R6X11  149.60
R6X12  152.82
R6X13  155.68
R6X14  158.17
R6X15  160.33
R6X16  162.20
R6X17  163.84
R6X18  165.26
Data from Kashlan et al. Biochemistry 2002 41:462
~1/2 a-sites not occupied by ATP?
[Figure: R1 hexamer schematic with a-sites (a) marked]
[ATP] = ~1000 [dATP]
So the system prefers to have 3 a-sites empty and ready for dATP
Inhibition versus activation is partly due to differences in pockets
Fits to RNR Activity Data
\[
0 = [R_T] - [R] - \frac{2[R]^2}{K_0} - \frac{2[R]^2[X]^{i_1}}{K_1} - \frac{6[R]^6[X]^{i_2}}{K_2} - \frac{6[R]^6[X]^{i_3}}{K_3}
\]
\[
[R_iX_j] = \frac{[R]^i[X]^j}{K_{R_iX_j}}
\]
\[
k = k_0[R_2X_0] + k_1[R_2X_{i_1}] + k_2[R_6X_{i_2}] + k_3[R_6X_{i_3}]
\]
[Figure: CDP Reductase Activity (1/sec), 0.1 to 0.3, versus [ATP] (uM), 0 to 10000, with fits of the two models below.]

Model    AIC    SSE      i1  i2  i3  k0    K0    K1     K2     K3
4.13.18  -65    0.00257  4   13  18  0.31  3.66  25.86  21.26  56.23
2.7.12   -58.9  0.00386  2   7   12  0.31  3.86  14.29  12.48  54.67
Distribution of Model Space SSEs
Models with occupied h-sites are in red, those without are in black. Sizes of spheres are proportional to 1/SSE.
[Figure: Probability Density versus SSE (0.002 to 0.012) for models with occupied h-sites (171 models) and with no occupied h-sites (54 models).]
Microfluidics
Figure 8. T. Thorsen et al. (S. R. Quake Lab) Science 2002
Figure 9. J. Melin and S. R. Quake Annu. Rev. Biophys. Biomol. Struct. 2007. 36:213–31
Figure 9 shows how a peristaltic pump is implemented by three valves that cycle through the control codes 101, 100, 110, 010, 011, 001, where 0 and 1 represent open and closed valves; note that the 0 in this sequence is forced to the right as the sequence progresses.
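The valve sequence described for Figure 9 can be written as a small state generator. This is an illustrative Python sketch; `pump_states` is a hypothetical helper, and, following the text, 1 means a closed valve and 0 an open one.

```python
from itertools import cycle, islice

# Control codes for the three valves, in the order given in the text
PUMP_CODES = ("101", "100", "110", "010", "011", "001")


def pump_states(n_steps):
    """Return n_steps successive valve states as tuples of booleans (True = closed)."""
    return [tuple(bit == "1" for bit in code)
            for code in islice(cycle(PUMP_CODES), n_steps)]


if __name__ == "__main__":
    for step, state in enumerate(pump_states(7)):
        print(step, "".join("1" if closed else "0" for closed in state))
```

Every code in the cycle keeps at least one valve closed, and each step changes exactly one valve, so the trapped fluid can only move one way.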
Adaptive Experimental Designs
Find best next 10 measurement conditions given models of data collected.
Need automated analyses in feedback loop of automatic controls of microfluidic chips
[Figure: µFluidic M-input mixer. (a) Inputs C1, C2, C3, ..., CM are gated by mixing control bits with period TP to form streams of pulses (e.g. C1, C2, C2, C3, buff, buff, buff); the N-plug stream (C1:2C2:C3:C4)/N is filtered to give the output. (b) Chip layout: Water/Solvent, Dye-1 (C1), Dye-2 (C2), Dye-3 (C3), control lines, mixing channel, output mixer; scale bar 3 mm.]
Flow velocity = 2 cm/s, TP = 100 ms, M = 4, N = 20, Levels = 64
Why Systems Biology
Model components: (Deterministic = signal) + (Stochastic = noise)

Statistics
Emphasis is on the stochastic component of the model.
Is there something in the black box or are the input wires disconnected from the output wires such that only thermal noise is being measured? Do we have enough data?

Engineering
Emphasis is on the deterministic component of the model.
We already know what is in the box, since we built it. The goal is to understand it well enough to be able to control it.

Increasing amounts of data/knowledge move a field from the statistics view toward the engineering view.
Predict the best multi-agent drug dose time course schedules
Indirect Approach pro-B Cell Childhood ALL
Legend (the slide distinguished the two kinds of t and of b by styling that is lost in this extraction):
• T: TEL-AML1 with HR
• t: TEL-AML1 with CCR
• t: other outcome
• B: BCR-ABL with CCR
• b: BCR-ABL with HR
• b: censored, missing, or other outcome
[Figure: scatter plot of patient samples drawn as the letters above; x axis DNTS Flux (uM/hr), 0 to 14; y axis DNPS Flux (uM/hr), 0 to 1200.]
Ross et al: Blood 2003, 102:2951-2959
Yeoh et al: Cancer Cell 2002, 1:133-143
Radivoyevitch et al., BMC Cancer 6, 104 (2006)
Folate Cycle (dTTP Supply)
[Figure: folate cycle reaction network. Metabolites: THF, CH2THF, CH3THF, CHOTHF, DHF, CHODHF, HCHO, HCOOH, GAR, FGAR, AICAR, FAICAR, dUMP, dTMP, Ser, Gly, Met, Hcys, NADP+/NADPH, ATP/ADP. Enzymes: TS, DHFR, SHMT, MTHFD, MTHFR, MTR, GART, ATIC, FTS, FDS. Reactions are numbered 1 to 13, with 1R and 2R marked separately.]
Morrison PF, Allegra CJ: Folate cycle kinetics in human breast cancer cells. JBiolChem 1989, 264:10552-10566.
Conclusions
• For systems biology to succeed:
  – move biological research toward systems which are best understood
  – specialize modelers to become experts in biological literatures (e.g. dNTP Supply)
• Systems biology is not a service
Acknowledgements
• Case Comprehensive Cancer Center
• NIH (K25 CA104791)
• Charles Kunos (CWRU)
• John Pink (CWRU)
• James Jacobberger (CWRU)
• Anders Hofer (Umea)
• Yun Yen (COH)
• And thank you for listening!
Comments on Methods
• Fast Total Concentration Constraint (TCC; i.e. g=0) solvers are critical to model estimation/selection. TCC ODEs (#ODEs = #reactants) solve TCCs faster than kon =1 and koff = Kd systems (#ODEs = #species = high # in combinatorially complex situations)
• Semi-exhaustive approach = fit all models with same number of parameters as parallel batch, then fit next batch only if current shows AIC improvement over previous batch.
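The semi-exhaustive rule in the last bullet can be sketched as a loop over batches of equal parameter count. This is an illustrative Python sketch, not the ccems code: `semi_exhaustive` and `toy_fit` are hypothetical, the batch is fitted sequentially rather than in parallel, and the toy scoring function stands in for a real fit that returns an AIC.

```python
from itertools import combinations


def semi_exhaustive(candidates, fit, max_params):
    """Fit all P-parameter models as one batch, for P = 1, 2, ...

    fit(model) must return that model's AIC. The search stops as soon as a
    batch fails to improve on the best AIC seen in earlier batches.
    Returns (best_model, best_aic, batches_fitted).
    """
    best_model, best_aic = None, float("inf")
    batches_fitted = 0
    for p in range(1, max_params + 1):
        batch = [(fit(model), model) for model in combinations(candidates, p)]
        batches_fitted += 1
        batch_aic, batch_model = min(batch)
        if batch_aic >= best_aic:
            break                      # no improvement: skip all larger batches
        best_model, best_aic = batch_model, batch_aic
    return best_model, best_aic, batches_fitted


def toy_fit(model, needed=frozenset({"RRtt"})):
    """Toy AIC: 10 per needed complex that is missing, plus 2 per parameter."""
    return 10.0 * len(needed - set(model)) + 2.0 * len(model)


if __name__ == "__main__":
    print(semi_exhaustive(["RR", "Rt", "RRt", "RRtt"], toy_fit, max_params=4))
```

In the toy run the 2-parameter batch cannot beat the best 1-parameter model, so the 3- and 4-parameter batches are never fitted.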
Conjecture
• Greater X/R ratios dominate at high ligand concentrations. In this limit the system wants to partition as much ATP into a bound form as possible.
[Figure: Mass (kDa), 100 to 500, versus [ATP] (uM), log axis from 1e+02 to 1e+06; curves for the two-complex models R2X4.R6X8, R2X3.R6X9, and R2X3.R6X12.]
ccems Sample Code
library(ccems) # Ribonucleotide Reductase Example
topology <- list(
  heads=c("R1X0","R2X2","R4X4","R6X6"),
  sites=list(
    # s-sites are already filled only in (j>1)-mers
    a=list( # a-site thread
      m=c("R1X1"),                                        # monomer 1
      d=c("R2X3","R2X4"),                                 # dimer 2
      t=c("R4X5","R4X6","R4X7","R4X8"),                   # tetramer 3
      h=c("R6X7","R6X8","R6X9","R6X10","R6X11","R6X12")   # hexamer 4
    ), # tails of a-site threads are heads of h-site threads
    h=list( # h-site
      m=c("R1X2"),                                        # monomer 5
      d=c("R2X5","R2X6"),                                 # dimer 6
      t=c("R4X9","R4X10","R4X11","R4X12"),                # tetramer 7
      h=c("R6X13","R6X14","R6X15","R6X16","R6X17","R6X18") # hexamer 8
    )
  )
)
g=mkg(topology,TCC=TRUE)
dd=subset(RNR,(year==2002)&(fg==1)&(X>0),select=c(R,X,m,year))
cpusPerHost=c("localhost" = 4,"compute-0-0"=4,"compute-0-1"=4,"compute-0-2"=4)
top10=ems(dd,g,cpusPerHost=cpusPerHost, maxTotalPs=3, ptype="SOCK",KIC=100)