Statistical learning and docking recover the reaction coordinates of a GPCR

Evan N. Feinberg, M.M. Sultan, R.T. McGibbon, M.P. Harrigan, C.X. Hernandez, W.R. Fletcher and V.S. Pande

Abstract
GPCRs comprise one-third of targets of all FDA-approved drugs. Molecular dynamics (MD) simulations of GPCRs can contain over 60,000 atoms, accounting for over 180,000 degrees of freedom. The technique described here reduces the dimensionality of GPCR MD simulations through a combination of unsupervised and supervised learning. In particular, time-structure Independent Component Analysis (tICA) [2,3] and molecular docking are used complementarily to determine reaction coordinates relevant to agonist binding and receptor activation.

Step 1: Featurize and tICA

Step 2: Cluster tICA Coordinates
Input: Projection of trajectories onto tICA coordinates
Learning class: Unsupervised
For illustration purposes, we project our K=1000 clusters onto the a priori reaction coordinates described in [1] by Dror et al. and color by the tIC value of each cluster.

Step 3: Dock Agonists to Conformers
Take s sample conformations from each of the K clusters, and dock each of several agonists with known pharmacology [4] to the binding site of each conformer. In this case, this amounts to 70,000 docking calculations.

[Figure: ROC curve (true positive rate vs. false positive rate) for the aggregate docking score, five agonists, two inverse agonists.]

As the ROC curve shows, docking score is a reliable metric of how active a given conformation is, with activity defined as in [1]. AUC = 0.84

Step 4: Docking Score as a Response Variable to Choose tICA Coordinates

[Figure: LASSO coefficient paths for the 25 tICA coordinates; x-axis: log lambda, y-axis: coefficients.]

Perform logistic regression of docking score on tICA coordinates for each conformation. Use the LASSO to choose from the candidate tICA coordinates for further analysis.
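The LASSO selection in Step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the poster does not say how the docking score enters the logistic regression, so here it is binarized at its median; the data are synthetic; and the function name `lasso_logistic`, the penalty `lam=0.1`, and the proximal-gradient solver are all choices made for this example.

```python
import numpy as np

def lasso_logistic(X, y, lam, n_iter=2000):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Step size 1/L, where L bounds the curvature of the mean logistic loss
    # over the design matrix augmented with an intercept column.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(y = 1)
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = w - step * grad_w
        # Soft-thresholding is the proximal step for the L1 penalty;
        # it sets weak coefficients exactly to zero.
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
        b -= step * grad_b                        # intercept is not penalized
    return w, b

# Synthetic stand-in data (assumption): 400 conformers, 5 candidate tICs.
rng = np.random.default_rng(0)
tics = rng.standard_normal((400, 5))
# Hypothetical aggregate docking score driven by the tIC at index 1 only.
docking = tics[:, 1] + 0.3 * rng.standard_normal(400)
# Assumption: binarize the score at its median to get a two-class response.
y = (docking > np.median(docking)).astype(float)

w, b = lasso_logistic(tics, y, lam=0.1)
selected = [j for j in range(tics.shape[1]) if w[j] != 0.0]
print("coefficients:", np.round(w, 3))
print("selected tIC indices:", selected)
```

Coordinates whose coefficients survive the penalty are the candidates carried forward; sweeping `lam` over a range would trace out coefficient paths like those plotted against log lambda.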
Step 5: Analyze Candidate tICs
Feature importance in each tIC is displayed by color, with blue being least and red being most important.

[Figure: structural panels. tIC9: Connector Region (labeled TM7, Tyr326). tIC7: NPxxY Interconnectivity (labeled TM7, Ile325; TM6, Phe282). Overlays are colored blue (inactive), green (active), and orange (MD conformer), shown for tIC7 high, tIC9 high, and tIC7 and tIC9 high. Labeled residues: Ile325, Asn318, Ile121, Phe328.]

References:
1. Dror, Ron O., et al. "Activation mechanism of the β2-adrenergic receptor." Proceedings of the National Academy of Sciences 108.46 (2011): 18684-18689.
2. Schwantes, Christian R., and Vijay S. Pande. "Improvements in Markov state model construction reveal many non-native interactions in the folding of NTL9." Journal of Chemical Theory and Computation 9.4 (2013): 2000-2009.
3. Pérez-Hernández, Guillermo, et al. "Identification of slow molecular order parameters for Markov model construction." The Journal of Chemical Physics 139.1 (2013): 015102.
4. De Graaf, Chris, and Didier Rognan. "Selective structure-based virtual screening for full and partial agonists of the β2 adrenergic receptor." Journal of Medicinal Chemistry 51.16 (2008): 4978-4985.
[Figure: structural panels for tIC2, tIC7, and tIC9, each annotated with the inactive and active crystal structures. tIC2: Helix 6 Motion (labeled TM6, TM5, TM1). tIC3: Binding Pocket Affinity (labeled Agonist, Phe290, Phe208, Trp286, Ile121, Asn318, Ile325).]

[Figure: two scatter plots of clusters, one colored by tIC 2 and one colored by tIC 7. x-axis: TM6-TM3 distance; y-axis: RMSD of NPxxY to active. Point types: crystal, MD.]

Aggregate docking score accurately predicts if a given conformer is "active."

Step 4 input: tICA coordinates and docking scores per cluster
Learning class: Supervised

Step 1 input: Trajectories
Learning class: Unsupervised
a. Features: Distances between all pairs of residues with an initial heavy-atom distance ≤ 10 Å, giving 3,365 features.
b. Compute first 25 tICA coordinates.
c. Note: This method never "sees" a priori data on the B2AR, e.g. the reaction coordinates described in [1].
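The featurization and tICA of Step 1 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: each residue is reduced to a single point (the poster uses heavy-atom distances), the trajectory is a synthetic four-residue toy, and the function names, the lag of 10 frames, and the whitening-based solver are choices made for this example.

```python
import numpy as np

def contact_distance_features(xyz, cutoff):
    """Distances over time for residue pairs within `cutoff` in the first frame.

    xyz has shape (n_frames, n_residues, 3); one point per residue (assumption).
    """
    i, j = np.triu_indices(xyz.shape[1], k=1)
    d0 = np.linalg.norm(xyz[0, i] - xyz[0, j], axis=1)
    keep = d0 <= cutoff
    i, j = i[keep], j[keep]
    feats = np.linalg.norm(xyz[:, i] - xyz[:, j], axis=2)
    return feats, [(int(a), int(b)) for a, b in zip(i, j)]

def tica(X, lag, n_components):
    """tICA by whitening the covariance, then diagonalizing the lagged covariance."""
    X = X - X.mean(axis=0)
    A, B = X[:-lag], X[lag:]
    c0 = (A.T @ A + B.T @ B) / (2 * len(A))   # instantaneous covariance
    ct = (A.T @ B + B.T @ A) / (2 * len(A))   # symmetrized time-lagged covariance
    evals, evecs = np.linalg.eigh(c0)
    whiten = evecs / np.sqrt(evals)           # whiten.T @ c0 @ whiten = identity
    lam, v = np.linalg.eigh(whiten.T @ ct @ whiten)
    order = np.argsort(lam)[::-1][:n_components]   # slowest modes first
    return X @ (whiten @ v[:, order]), lam[order]

# Toy trajectory (assumption): residue 2 drifts slowly, everything jitters fast.
rng = np.random.default_rng(0)
n_frames = 2000
base = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 6.0, 0.0], [30.0, 30.0, 0.0]])
slow = np.sin(2 * np.pi * np.arange(n_frames) / 500.0)
xyz = base + 0.1 * rng.standard_normal((n_frames, 4, 3))
xyz[:, 2, 1] += slow

feats, pairs = contact_distance_features(xyz, cutoff=10.0)
tics, timescale_eigs = tica(feats, lag=10, n_components=2)
corr = np.corrcoef(tics[:, 0], slow)[0, 1]
print("pairs kept:", pairs)
print("tICA eigenvalues:", np.round(timescale_eigs, 3))
print("|correlation| of tIC 1 with the slow motion:", round(abs(corr), 3))
```

The distant fourth residue is excluded by the cutoff, and the leading tIC tracks the slow motion without being told which distance carries it, which mirrors the note in item c that the method uses no a priori reaction coordinates.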

Poster slides: forum.stanford.edu/events/posterslides/Statistical...

