ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1925 Free energy calculations of G protein-coupled receptor modulation New methods and applications WILLEM JESPERS ISSN 1651-6214 ISBN 978-91-513-0927-9 urn:nbn:se:uu:diva-407840


Dissertation presented at Uppsala University to be publicly examined in A1:111a, Uppsala Biomedical Centre (BMC), Husargatan 3, Uppsala, Wednesday, 20 May 2020 at 13:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Professor Jonathan Essex (University of Southampton).

Abstract

Jespers, W. 2020. Free energy calculations of G protein-coupled receptor modulation. New methods and applications. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1925. 101 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0927-9.

G protein-coupled receptors (GPCRs) are membrane proteins that transduce the signals of extracellular ligands, such as hormones, neurotransmitters and metabolites, through an intracellular response via G proteins. They are abundant in human physiology and approximately 34% of the marketed drugs target a GPCR. Upon activation, the receptor undergoes conformational changes to accommodate the binding of the G protein. Our insight into the structural determinants of ligand binding and receptor activation has increased tremendously over the past decade. This is largely a consequence of the growing number of experimentally determined structures, which provide crucial insights into ligand binding mechanisms. However, in a typical drug design project it is unlikely that such structures can be generated for all ligands. In those cases, computationally derived models of the protein-ligand complex can be generated. Rigorous free energy calculations, such as the free energy perturbation (FEP) method, can subsequently provide the missing link between those structures and experimental ligand binding data, and provide further insights into the binding mechanism.

In this thesis, two workflows are presented to calculate free energies of binding for ligands to wildtype (QligFEP) and mutant (QresFEP) receptors. Both methods were tested on a set of solvation free energies of sidechain mimics. QligFEP was furthermore applied to three protein-ligand binding datasets, including pair comparisons between topologically unrelated molecules (scaffold hopping). QresFEP was used to calculate protein-ligand binding affinities to mutants of the neuropeptide Y1 receptor, and to predict the effect of receptor modifications on the thermal stability of T4 lysozyme.

The remainder of this work focussed on the application of these protocols in the design, synthesis and molecular pharmacology of ligands for the family of adenosine receptors (ARs). These receptors, involved in many physiological processes such as promotion of sleep (caffeine is a well-known inhibitor), have recently been pursued as drug targets in immuno-oncology. QligFEP was used in the design of novel series of antagonists for the A3AR and A2BAR. QresFEP was used to study ligand binding to the A1AR and in a multidisciplinary approach to characterize binding to the orphan receptor GPR139. Both approaches were combined to design a series of A2AAR antagonists, and to propose a binding mode later confirmed by new crystal structures. Finally, a new application of FEP is introduced based on conformational equilibria between the active and inactive A2AAR, to elucidate the regulation mechanism of receptor activation by ligands and receptor mutations.

Keywords: G protein-coupled receptor, adenosine receptor, molecular dynamics, free energy perturbation, homology modeling, computer simulations, conformational selectivity, binding free energy.

Willem Jespers, Department of Cell and Molecular Biology, Computational Biology and Bioinformatics, Box 596, Uppsala University, SE-751 24 Uppsala, Sweden.

© Willem Jespers 2020

ISSN 1651-6214
ISBN 978-91-513-0927-9
urn:nbn:se:uu:diva-407840 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-407840)


To Florence


List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Jespers, W., Esguerra, M., Åqvist, J., Gutiérrez-de-Terán, H. (2019) QligFEP: an automated workflow for small molecule free energy calculations in Q. Journal of cheminformatics, 11 (1):26

II Jespers, W.,§ Isaksen, G.V.,§ Andberg, T.A.H., Vasile, S., van Veen, A., Åqvist, J., Brandsdal, B.O., Gutiérrez-de-Terán, H. (2019) QresFEP: An Automated Protocol for Free Energy Calculations of Protein Mutations in Q. Journal of chemical theory and computation, 15 (10):5461-5473

III Jespers, W., Åqvist, J., Gutiérrez-de-Terán, H. Free energy calculations for protein-ligand binding prediction. Methods in Molecular Biology (In Press).

IV Jespers, W., Schiedel, A.C., Heitman, L.H., Cooke, R.M., Kleene, L., van Westen, G.J.P., Gloriam, D.E., Müller, C.E., Sotelo, E., Gutiérrez-de-Terán, H. (2017) Structural Mapping of Adenosine Receptor Mutations: Ligand Binding and Signaling Mechanisms. Trends in pharmacological sciences, 39 (1):75-89

V Azuaje, J.,§ Jespers, W.,§ Yaziji, V., Mallo, A., Majellaro, M., Caamaño, O., Loza, M.I., Cadavid, M.I., Brea, J., Åqvist, J., Sotelo, E., Gutiérrez-de-Terán, H. (2017) Effect of Nitrogen Atom Substitution in A3 Adenosine Receptor Binding: N-(4,6-Diarylpyridin-2-yl)acetamides as Potent and Selective Antagonists. Journal of medicinal chemistry, 60 (17):7502-7511

VI Majellaro, M.,§ Jespers, W.,§ Crespo, A., Núñez, M.J., Novio, S., Garcia-Santiago, C., Azuaje, J., Gioé, C., Prieto, R., Brea, J., Loza, M.I., García-Mera, X., Caamaño, O., El Maatougui, A., Mallo-Abreu, A., Freire-Garabal, M., Sardina, F.J., Stefanachi, A., Åqvist, J., Gutiérrez-de-Terán, H., Sotelo, E. 3,4-Dihydropyrimidin-2(1H)-ones as Antagonists of the Human A2B Adenosine Receptor: SAR Studies, Enantiospecific Recognition and Evidences of Antimetastatic Effect. Manuscript

VII Mallo-Abreu, A.,§ Majellaro, M.,§ Jespers, W., Azuaje, J., Caamaño, O., García-Mera, X., Brea, J.M., Loza, M.I., Gutiérrez-de-Terán, H., Sotelo, E. Trifluorinated Pyrimidine-Based A2B Antagonists: Optimization and Evidence of Stereospecific Recognition. Journal of medicinal chemistry, 62 (20):9315-9330

VIII Nøhr, A.C.,§ Jespers, W.,§ Shehata, M.A.,§ Floryan, L., Isberg, V., Andersen, K.B., Åqvist, J., Gutiérrez-de-Terán, H., Bräuner-Osborne, H., Gloriam, D.E. (2017) The GPR139 reference agonists 1a and 7c, and tryptophan and phenylalanine share a common binding site. Scientific reports, 7 (1):1128

IX Jespers, W., Oliveira, A., Prieto-Díaz, R., Majellaro, M., Åqvist, J., Sotelo, E., Gutiérrez-de-Terán, H. (2017) Structure-Based Design of Potent and Selective Ligands at the Four Adenosine Receptors. Molecules, 22 (11):1945

X Jespers, W., Verdon, G., Azuaje, J., Keränen, H., García-Mera, X., Congreve, M., Deflorian, F., de Graaf, C., Zhukov, A., Dore, A., Mason, J.S., Åqvist, J., Cooke, R.M., Sotelo, E., Gutiérrez-de-Terán, H. (2019) X-Ray Crystallography and Free Energy Calculations Reveal the Binding Mechanism of A2A Adenosine Receptor Antagonists. ChemRxiv, doi:10.26434/chemrxiv.11444877.v1

XI Jespers, W., Torres, K.V., van Veen, A., Wang, X., van der Ent, F., Heitman, L.H., IJzerman, A.P., Sotelo, E., van Westen, G.J.P., Åqvist, J., Gutiérrez-de-Terán, H. Predicting conformational selectivity in a prototypical GPCR. Manuscript

§ Authors contributed equally

Reprints were made with permission from the respective publishers.


Contents

Introduction

Methods
    Παντα ρει
    Kinetic energy
    Potential energy
    Sampling
    Molecular dynamics engines
    Ligand binding
    Homology modelling
    Protein-ligand docking and free energies
    Free Energy Perturbation
    GPCR signaling in a nutshell
    Understanding GPCRs through thermodynamic cycles

QresFEP and QligFEP (I-III)
    Performance of QresFEP and QligFEP
        QresFEP
        QligFEP

The adenosine receptor family (IV)
        Recognition of the core scaffold of agonists and antagonists
        Selectivity hotspots
        Residues involved in the recognition of agonists

Protein-ligand binding (V-VII)
    A3AR antagonists
    A2B antagonists
        3,4-Dihydropyrimidin-2(1H)-ones (VI)
        Trifluorinated Pyrimidine-Based A2B Antagonists

Site-directed mutagenesis (VIII, IX)
    Ligand-receptor model optimization for GPR139
    A1AR – DPCPX recognition

Two sides of the same coin (X)

Conformational selectivity (IX, XI)
    Ligand conformational selectivity
    The effect of mutations on conformational selectivity

Conclusions

Popular science summary (Populärvetenskaplig sammanfattning)

Acknowledgements

Bibliography


Abbreviations

AMBER: Assisted Model Building with Energy Refinement
API: Application Programming Interface
AR: Adenosine Receptor
ACTH: Adrenocorticotropic Hormone
BAR: Bennett's Acceptance Ratio
BPM: Biophysical Mapping
cAMP: Cyclic Adenosine Monophosphate
CDK2: Cyclin-Dependent Kinase 2
CHARMM: Chemistry at Harvard Macromolecular Mechanics
Chk1: Checkpoint Kinase 1
Cryo-EM: Cryogenic Electron Microscopy
DHPM: 3,4-dihydropyrimidin-2(1H)-one
EL: Extracellular Loop
FDA: Food and Drug Administration
FEP: Free Energy Perturbation
FN: False Negative
FP: False Positive
GBSA: Generalized Born and Surface Area continuum solvation
GDP: Guanosine Diphosphate
GPCR: G Protein-Coupled Receptor
GROMACS: GROningen MAchine for Chemical Simulations
GTP: Guanosine Triphosphate
HMR: Hydrogen Mass Repartitioning
HPLC: High Pressure Liquid Chromatography
HTS: High Throughput Screening
IL: Intracellular Loop
LIE: Linear Interaction Energy
LoMAP: Lead Optimization Mapper
LRF: Local Reaction Field
MAE: Mean Absolute Error
MCC: Matthews Correlation Coefficient
MCS: Maximum Common Substructure
MD: Molecular Dynamics
MM: Molecular Mechanics
MPP+: 1-methyl-4-phenylpyridinium
mRNA: Messenger Ribonucleic Acid
MSH: Melanocyte-Stimulating Hormone
NMR: Nuclear Magnetic Resonance
NPY1: Neuropeptide Y1
OPLS: Optimized Potentials for Liquid Simulations
PBC: Periodic Boundary Conditions
PBSA: Poisson–Boltzmann and Surface Area continuum solvation
PCM: Proteo-Chemometrics
PDB: Protein Data Bank
PET: Positron Emission Tomography
PME: Particle Mesh Ewald
POMC: Pro-opiomelanocortin
POPC: 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine
RMSD: Root Mean Square Deviation
SAR: Structure Affinity Relationship
SBC: Spherical Boundary Conditions
SBDD: Structure Based Drug Design
SCAAS: Surface Constrained All Atom Solvent
SDM: Site-Directed Mutagenesis
SEM: Standard Error of the Mean
SPR: Surface Plasmon Resonance
StaR: Stabilized Receptor
TM: Transmembrane
TN: True Negative
TP: True Positive


Introduction

G protein-coupled receptors (GPCRs) are membrane proteins that transduce the signals of extracellular ligands, such as hormones, neurotransmitters and metabolites, through an appropriate cellular response [1]. This response is canonically mediated through intracellular G proteins, a family of heterotrimeric proteins consisting of subunits α, β (both interacting with the GPCR), and γ. There are 16 subtypes of the α subunit of the G protein, which are grouped in 4 families. Additionally, a non-canonical signalling pathway involves the recruitment of β-arrestin, for which two subtypes exist. GPCRs are abundant in human physiology, with over 800 genes encoding for five GPCR classes [2], of which class A (rhodopsin) is the most common. Furthermore, approximately 34% of the marketed drugs target a GPCR [3]. GPCRs share a topology of seven transmembrane (TM) α-helices, connected by 3 extracellular (EL) and 3 intracellular (IL) loops. Upon activation, the receptor undergoes conformational changes to accommodate the binding of the intracellular signalling proteins. Whilst this mechanism of activation is relatively conserved, ligand binding sites vary greatly between receptor families. Over the past decade, our insight into the structural determinants of ligand binding and receptor activation has increased tremendously. This is largely a consequence of the growing number of solved GPCR structures from X-ray crystallography and, recently, cryo-EM. Structures of the inactive, active and G protein-bound state of the receptor have been solved with X-ray crystallography. In addition, recent advances in cryo-EM have made this technique the most common approach to determine the structures of larger complexes of GPCRs bound to an intracellular binding partner. Consequently, the available structures now cover a large extent of the GPCR-ome. However, a majority of these receptors still do not have a structure. In such cases, one can employ computational methods such as homology modelling. Even though the overall sequence identity between GPCRs is relatively low (generally under 30%), the conserved topology of GPCRs still makes it possible to generate reasonably accurate models. Structures of GPCR-ligand complexes provide crucial insights into the binding mechanism of a compound. However, in a typical drug design project, it is unlikely that such structures can be generated for the plurality of chemotypes under investigation. In such cases, protein-ligand complexes can be obtained by (computational) enumeration of the potential binding modes of the ligand, for example by docking algorithms. Whilst these algorithms are quite efficient in generating possible ligand poses, they typically fail to accurately score these in terms of the actual binding affinity of the ligand. Therefore, it is advisable to additionally validate the ligand-receptor model using experimental data, e.g. by mapping known (experimental) information on the receptor-ligand complex. Such experimental data include, amongst others, structure affinity relationships (SAR) within a compound series and/or site-directed mutagenesis (SDM) of putative binding site residues. To gain further insights into the determinants of binding, these data can be supplemented by rigorous free energy calculations, such as the free energy perturbation (FEP) method. The computed (relative) free energies can then be used to correlate computationally determined binding mode(s) to experimental readout.

In this thesis, the development and application of computational methods to characterize and understand ligand binding will be discussed. Workflows to calculate relative free energies of binding for ligands to wildtype (QligFEP, Paper I) and mutant (QresFEP, Paper II) receptors are presented, for which practical examples are given in Paper III. These protocols were applied to characterize ligand binding on the prototypical GPCR family of adenosine receptors (ARs), and a comprehensive review of these receptors from the structural and mechanistic perspective is given in Paper IV. QligFEP was used in the design of novel series of ligands to study the molecular determinants of antagonist binding to the adenosine receptors A3AR (Paper V) and A2BAR (Paper VI and Paper VII). QresFEP was used in a multidisciplinary study to characterize binding to the orphan receptor GPR139 (Paper VIII) and ligand binding to the A1AR (Paper IX). Both approaches were combined to design a series of A2AAR antagonists, for which the proposed binding mode was confirmed by new crystal structures (Paper X). Finally, our computational methodology is revisited to predict ligand pharmacological profiles and the role of mutations in receptor activation of the A2AAR, based on a new representation of the receptor conformational equilibrium (Paper IX and XI).


Methods

Παντα ρει

Everything around us is in constant motion, said Heraclitus of Ephesus: παντα ρει, everything flows. Still, a river remains a river, even though the water in it is constantly moving. Nothing changes, said Democritus: everything is built from a set of indivisible entities, or ατομος. These two philosophies formed the basis of extensive debate, dividing philosophers into two camps that at the time seemed irreconcilable. The latter notion formed the basis of atomism: nature is built from a set of indivisible particles, which determine the behavior of all we know around us. The first notion is also known as emergence: a collection of entities has a property that a single entity does not have.

Let us imagine a single particle moving in space (vacuum). We can describe its motion, and predict with good accuracy where the particle will be, by considering Newton's laws of motion. We can expand this to a multiple-particle system, where we can for instance calculate the trajectory of a spacecraft landing on the moon, or describe how two small particles (atoms) interact with each other. Such a system can be considered deterministic: we can precisely calculate its evolution and properties on the basis of a set of rules. Given the initial positions and momenta, and given that we could sample the system for an infinitely long time, we can obtain time-averaged values of a property $A$ of the system:

$$\bar{A} = \lim_{\tau \to \infty} \frac{1}{\tau} \int_{0}^{\tau} A\big(p^N(t), r^N(t)\big)\,\mathrm{d}t \qquad (1)$$

However, in practice (when considering many particles) these equations become quite complicated and include many variables (in the order of magnitude of Avogadro's number!). In addition, there is no chance we will acquire all information on the initial macroscopic system, and it is nearly impossible to integrate all these equations successfully, i.e. the system is deterministically chaotic. Still, for a small enough (microscopic) system, it is possible to find these properties. Let us consider an ideal monoatomic gas as an example, and describe its microscopic configuration using the positions x and velocities v of the particles. Each particle has a total of 6 degrees of freedom in Cartesian space, i.e. $\mathbf{x} = (x, y, z)$ and $\mathbf{v} = (v_x, v_y, v_z)$. We can then relate a property $A$ of the system, such as its total energy, to the positions and momenta of the particles through an ensemble average:

$$\langle A \rangle = \iint A(p^N, r^N)\,\rho(p^N, r^N)\,\mathrm{d}p^N\,\mathrm{d}r^N \qquad (2)$$

where p is the momentum and r the position of each of the N particles in the system, and every configuration of the system is related to a probability density ρ.

How then are the microscopic and macroscopic systems related? It was Boltzmann who coined the term ergoden or ergodisch (from the Greek εργον, energy, and οδος, path), which was freely translated to ergodic in English. The concept was later expanded by Gibbs, and it postulates that the macroscopic system, starting from any given point, will eventually pass through all possible configurations of the system on the potential energy surface. Under this assumption, it follows that the time average of a property $A$ equals its ensemble average:

$$\bar{A} = \langle A \rangle \qquad (3)$$

It was also known as the continuity of path in Maxwellian thermodynamics. The ergodic postulate forms the basis of the work presented in this thesis.

Let us have a look at the total energy of our system in more detail. This is constant if the system is isolated, and can be written as a combination of its potential and kinetic energies:

$$E = U(x) + K(v) = H(x, v) \qquad (4)$$

The latter term is the Hamiltonian, which is a more general construct that allows us to talk about a coordinate system when we want to relate energies to the positions and velocities of the particles in the system (we will see below why this is useful). We can then relate the energy of the system to the probability of finding the system in a particular configuration $(x, v)$, if we have a fixed number of atoms (N) at thermal equilibrium at temperature T:

$$P(x, v) \propto e^{-\beta H(x, v)} \qquad (5)$$

where $P(x, v)$ is the probability of observing the system in configuration x with an instantaneous velocity v, and $\beta = 1/(k_B T)$ is the inverse temperature, where $k_B$ is the Boltzmann constant. This formula is also known as Boltzmann's law, and is a consequence of counting energy levels and densities in systems that can exchange energy with other systems. In this particular case, we are looking at a system having a constant number of particles N, volume V and temperature T, also known as the (N,V,T) or canonical ensemble. The associated Helmholtz free energy F is related to this canonical ensemble through the partition function $Z$, the integral of the Boltzmann factor $e^{-\beta H(x, v)}$ over all configurations:

$$F = -k_B T \ln Z \qquad (6)$$

Note that this relationship formally holds only under the canonical (N,V,T) ensemble; however, for the consideration of biological processes we can assume that both T and P are constant (to a large extent), in which case the respective Helmholtz and Gibbs free energies become identical. Hence, from here on we refer to the Gibbs free energy.
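As a minimal illustration of Boltzmann's law and the associated free energy, the sketch below treats a system with a handful of discrete states instead of a continuous phase space (an assumption made purely for brevity). The function name `boltzmann_populations` and the example energies are hypothetical and not part of the workflows described in this thesis.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_populations(energies, temperature=298.0):
    """Return (probabilities, Helmholtz free energy) for discrete states.

    energies: state energies in kcal/mol (illustrative values only).
    """
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)
    # Shift by the lowest energy so the exponentials stay well behaved.
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z_shifted = sum(weights)
    probabilities = [w / z_shifted for w in weights]
    # F = -kT ln Z, with Z = exp(-beta * e_min) * z_shifted
    free_energy = e_min - math.log(z_shifted) / beta
    return probabilities, free_energy


if __name__ == "__main__":
    # Two hypothetical states separated by 1 kcal/mol
    probs, f = boltzmann_populations([0.0, 1.0])
    print("populations:", probs)
    print("free energy (kcal/mol):", f)
```

The lower-energy state is always the more populated one, and the free energy lies below the lowest state energy as soon as more than one state is thermally accessible.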

Kinetic energy

In the case of our monoatomic, ideal gas, the particles are inseparable, and we can factor out the potential energy term. Thus, we can describe the total energy of the system by just considering the kinetic energy. We will see that this energy is related only to the mass of the particles under investigation:

(7)

(8)

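Note that this is a probability distribution, which can be normalized such that we can describe the probability of finding a certain particle (with a given mass $m$ and speed $v$) in the system at a temperature T:

$$f(v) = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} v^2 \exp\left(-\frac{m v^2}{2 k_B T}\right) \qquad (9)$$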

Thus, if we are to plot such a distribution for atoms of various masses in a system, we obtain (at T = 298K):


Figure 1: probability densities of atoms of different mass at a given speed.

Heavier particles will have lower speeds, and their probability density is narrower. In general, the Maxwell-Boltzmann distribution adheres to a chi probability distribution (with three degrees of freedom), with a scaling factor $\sqrt{k_B T/m}$. We will show later how this distribution can be used to initiate a molecular dynamics simulation.
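The relation between mass and speed can be made concrete with a short sketch. It assumes SI units and the standard Maxwell-Boltzmann form with scale $a = \sqrt{k_B T/m}$; all function names are hypothetical helpers. Velocity components are drawn from a Gaussian with standard deviation $a$, which is one common way to assign initial velocities and not necessarily the procedure used by any particular MD engine.

```python
import math
import random

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def scale_factor(mass_amu, temperature=298.0):
    """Scale a = sqrt(k_B T / m) of the speed distribution, in m/s."""
    return math.sqrt(K_B * temperature / (mass_amu * AMU))


def speed_density(speed, mass_amu, temperature=298.0):
    """Maxwell-Boltzmann probability density of the speed (s/m)."""
    a = scale_factor(mass_amu, temperature)
    return (math.sqrt(2.0 / math.pi) * speed**2 / a**3
            * math.exp(-speed**2 / (2.0 * a**2)))


def mean_speed(mass_amu, temperature=298.0):
    """Analytic mean speed, 2a*sqrt(2/pi) = sqrt(8 k_B T / (pi m))."""
    return 2.0 * scale_factor(mass_amu, temperature) * math.sqrt(2.0 / math.pi)


def sample_velocity(mass_amu, temperature=298.0, rng=random):
    """Draw one (vx, vy, vz); each component is Gaussian with sigma = a."""
    a = scale_factor(mass_amu, temperature)
    return tuple(rng.gauss(0.0, a) for _ in range(3))


if __name__ == "__main__":
    for name, mass in (("H", 1.008), ("C", 12.011), ("O", 15.999)):
        print(name, round(mean_speed(mass)), "m/s")
```

Sampling many velocities for one mass and histogramming their norms should reproduce the shape of the corresponding curve in Figure 1.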

Potential energy

In this thesis, we are mainly interested in the properties of proteins and ligands, which definitely do not behave like an ideal, monoatomic gas. Thus, to calculate the total energy of the system we also need to consider the potential energy. We previously noted that it was useful to use a coordinate system, and here we will show how we can calculate the potential energy of that system.

Let us consider a simple molecule of butane in gas phase:


Figure 2: a butane molecule in ball and stick representation. The carbon atoms are colored in orange, the hydrogen atoms in white.

We are dealing with a set of particles (representing the atoms, the balls in the figure), connected by bonds (sticks in the figure), which together describe the topology of the system. All atoms are interacting with each other via non-bonded interactions, and in some cases via bonded interactions. Let us consider both terms by decomposing the total potential energy:

$$U_{\mathrm{total}} = U_{\mathrm{bonded}} + U_{\mathrm{non\text{-}bonded}} \qquad (10)$$

I will first describe the bonded interactions, i.e. all forces associated with atoms that are connected with each other. We can represent two bonded particles as beads on a spring, in which case we can calculate the potential energy of that spring using a harmonic oscillator:

$$U_{\mathrm{bond}} = \frac{1}{2} k_b (r - r_0)^2 \qquad (11)$$

where $k_b$ is some force constant, $r$ the distance between two particles, and $r_0$ is the minimum of the function. The angular motion of a particle can also be calculated using harmonic oscillators:


(12)

(13)

Besides vibrational motion, molecules have rotational motion, i.e. rotation around a certain bond. The associated energy is also referred to as torsional energy, and can be calculated using a periodic function:

$$U_{\mathrm{torsion}} = k_\phi \left[1 + \cos(n\phi - \delta)\right] \qquad (14)$$

where $k_\phi$ is a force constant, $n$ the periodicity, $\delta$ the phase shift and $\phi$ the torsion angle. The potential energy associated with the rotational motion is typically a collection of several of these Fourier terms (N) describing multiple minima, as shown in Fig. 4. Lastly, the bonded interactions contain the energies associated with out-of-plane motions, or impropers. These can be calculated using either a harmonic or periodic potential.
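The bonded terms described above can be sketched in a few lines. The functional forms are assumptions: a harmonic term written with a prefactor of 1/2 (conventions differ between force fields) and a torsion written as a sum of terms $k[1 + \cos(n\phi - \delta)]$. The parameter values in the example are purely illustrative and are not taken from any force field.

```python
import math


def harmonic(value, equilibrium, force_constant):
    """Harmonic energy 0.5 * k * (x - x0)^2, usable for bonds and angles."""
    return 0.5 * force_constant * (value - equilibrium) ** 2


def torsion(phi, terms):
    """Periodic torsion energy as a sum of Fourier-like terms.

    phi: torsion angle in radians.
    terms: iterable of (k, n, delta) = force constant, periodicity, phase.
    """
    return sum(k * (1.0 + math.cos(n * phi - delta)) for k, n, delta in terms)


if __name__ == "__main__":
    # Hypothetical C-C bond stretched by 0.1 Å from its equilibrium length
    print("bond:", harmonic(1.6, 1.5, 600.0))
    # Hypothetical three-fold torsion: minima at 60, 180 and 300 degrees
    three_fold = [(1.4, 3, 0.0)]
    for degrees in (0, 60, 120, 180):
        print(degrees, torsion(math.radians(degrees), three_fold))
```

Adding further `(k, n, delta)` tuples to `terms` gives the multi-term torsion profile with several distinct minima.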

Figure 3: vibrational motions between two bonded atoms (left) and angular motion between three atoms (right) can be calculated with harmonic oscillators.


Figure 4: rotational motion (left) is calculated using a periodic potential

The remainder of the potential energy is made up by the non-bonded interactions between all the atoms in the system. These are decomposed into electrostatic and van der Waals interactions. The first can be calculated using Coulomb's law:

$$U_{\mathrm{el}} = k_e \frac{q_i q_j}{r_{ij}} \qquad (15)$$

where $k_e$ is Coulomb's constant ($k_e$ = 332 kcal/mol if considering the charges q in elementary charges, the distance r in Å and energies in kcal/mol). The second term is calculated using the Lennard-Jones potential:

(16)

By combining both terms we can calculate the total non-bonded interaction energy between any particle pair. For the simple case of a sodium and a chlorine ion, the total energy as a function of their relative distance will look like this:


Figure 5: potential energy surface of a sodium and chlorine ion, as a function of their relative distance
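A curve of this kind can be sketched as follows. The Coulomb constant of 332 is the value quoted in the text; the Lennard-Jones term is written in the common $4\varepsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$ form, which is an assumption about the functional form. The values of `SIGMA` and `EPSILON` are hypothetical placeholders, not force field parameters for sodium or chloride.

```python
COULOMB_CONSTANT = 332.0  # kcal Å / (mol e^2), value quoted in the text
SIGMA = 3.5               # Å, hypothetical Lennard-Jones diameter
EPSILON = 0.1             # kcal/mol, hypothetical well depth


def coulomb(q_i, q_j, r):
    """Electrostatic energy (kcal/mol) of two point charges at distance r (Å)."""
    return COULOMB_CONSTANT * q_i * q_j / r


def lennard_jones(r, sigma, epsilon):
    """Lennard-Jones energy in the 4*eps*[(s/r)^12 - (s/r)^6] form."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def nonbonded(r, q_i, q_j, sigma, epsilon):
    """Total non-bonded pair energy: Coulomb plus Lennard-Jones."""
    return coulomb(q_i, q_j, r) + lennard_jones(r, sigma, epsilon)


def find_minimum(r_start, r_stop, n_points, q_i, q_j, sigma, epsilon):
    """Scan distances on a regular grid and return (r, energy) at the minimum."""
    step = (r_stop - r_start) / (n_points - 1)
    grid = [r_start + i * step for i in range(n_points)]
    return min(((r, nonbonded(r, q_i, q_j, sigma, epsilon)) for r in grid),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    r_min, e_min = find_minimum(1.5, 10.0, 851, +1.0, -1.0, SIGMA, EPSILON)
    print(f"minimum at r = {r_min:.2f} Å, U = {e_min:.1f} kcal/mol")
```

At short distances the repulsive $r^{-12}$ term dominates, while at long range the energy decays as $1/r$ because of the opposite charges; the minimum sits where the two balance.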

As indicated in Eq. 10, the total potential energy is described by enumerating all bonded and non-bonded interactions as a function of all the particles in the system. The force constants used in the resulting combined equation (Eq. 17), together with the corresponding equilibrium values, define a forcefield. All forcefields, such as AMBER [4], CHARMM [5] or OPLS [6], adhere to similar functional forms as described below. Throughout this thesis, we have mainly used the OPLS forcefield, and thus report its detailed form here:

(17)

Note that impropers (out-of-plane motions) in OPLS are described using a periodic function with a single large Fourier coefficient, where the other Fourier terms are zero.


Sampling

Now that we have described a way to calculate the energy of our system, we need a way to obtain meaningful ensemble averages (Eq. 2), which is achieved by statistical sampling of all possible configurations of the system. In simple cases, such as our butane molecule in vacuum, we could systematically analyze the energy for every possible configuration. However, this approach quickly becomes computationally inaccessible if we are dealing with larger systems (cf. Levinthal's paradox [7]), and we need different ways to sample the configurational space. Broadly speaking, there are two ways of doing this, based on either stochastic or deterministic algorithms. In the first case we can use the Monte Carlo algorithm, which is based on equilibrium statistical mechanics, where new configurations are generated at random and accepted based on some (usually energetic) criterion. On the other hand, molecular dynamics (MD) is a deterministic algorithm based on solving Newton's laws of motion to obtain new configurations of the system.
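The stochastic route can be sketched with a Metropolis Monte Carlo walk on a one-dimensional potential. This is an illustration only: the function names, the uniform trial move and the harmonic test potential are assumptions, and the acceptance rule shown is the standard Metropolis criterion rather than a scheme taken from this thesis.

```python
import math
import random


def metropolis_sample(energy, x0, n_steps, step_size, kT, seed=0):
    """Sample configurations x with probability proportional to exp(-U(x)/kT)."""
    rng = random.Random(seed)
    x, u = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        # Generate a new configuration at random.
        x_new = x + rng.uniform(-step_size, step_size)
        u_new = energy(x_new)
        # Energetic criterion: downhill moves are always accepted,
        # uphill moves with the Boltzmann probability.
        if u_new <= u or rng.random() < math.exp(-(u_new - u) / kT):
            x, u = x_new, u_new
        samples.append(x)  # a rejected move counts the old configuration again
    return samples


def harmonic_energy(x, k=1.0):
    """Simple test potential U = k x^2 / 2."""
    return 0.5 * k * x * x


if __name__ == "__main__":
    xs = metropolis_sample(harmonic_energy, 0.0, 20000, 1.0, kT=0.6)
    mean = sum(xs) / len(xs)
    print("mean position:", round(mean, 3))
```

For the harmonic potential the sampled variance should approach kT/k, which provides a simple check that the ensemble average is reproduced.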

Throughout this thesis, we have mostly used MD sampling, which we will consequently describe in more detail. We can solve Newton’s second law of motion:

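$$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i = -\frac{\partial U}{\partial \mathbf{r}_i} \tag{18}$$

for i = 1, 2, 3, …, N (the number of particles in the system). In this case, we calculate the force on every atom i in the system at each step t, as the negative positional derivative of the potential energy. We can then advance the position (r) and velocity (v) over time:

$$\mathbf{v}_i\left(t + \tfrac{\Delta t}{2}\right) = \mathbf{v}_i\left(t - \tfrac{\Delta t}{2}\right) + \frac{\mathbf{F}_i(t)}{m_i}\,\Delta t \tag{19}$$

$$\mathbf{r}_i(t + \Delta t) = \mathbf{r}_i(t) + \mathbf{v}_i\left(t + \tfrac{\Delta t}{2}\right)\Delta t \tag{20}$$

Note that in this case we first calculate the velocities at time $t + \Delta t/2$, and from those calculate the new positions at time $t + \Delta t$, which is why this type of algorithm is also known as the leap-frog integrator. The initial (at $t = 0$) positions are obtained from structural information, for instance from an X-ray structure or homology modelling (see below), whereas the initial velocities (formally at $t = -\Delta t/2$) can be obtained from the Maxwell distribution (Eq. 9).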
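The leap-frog update of Eqs. 19 and 20 can be sketched for a single particle in one dimension. The function name and the harmonic test force are assumptions for illustration; a production MD engine applies the same two-step update to all atoms in three dimensions.

```python
import math


def leapfrog(force, mass, r0, v_half, dt, n_steps):
    """Propagate one particle with the leap-frog scheme.

    force:  callable returning the force at position r
    v_half: velocity at t = -dt/2 (the velocities lag the positions by dt/2)
    Returns the list of positions at t = 0, dt, 2*dt, ...
    """
    r, v = r0, v_half
    trajectory = [r]
    for _ in range(n_steps):
        v = v + force(r) / mass * dt  # Eq. 19: velocity at t + dt/2
        r = r + v * dt                # Eq. 20: position at t + dt
        trajectory.append(r)
    return trajectory


if __name__ == "__main__":
    dt = 0.01
    # Harmonic oscillator with k = m = 1; the exact solution is r(t) = cos(t),
    # so the velocity at t = -dt/2 is sin(dt/2).
    positions = leapfrog(lambda r: -r, 1.0, 1.0, math.sin(dt / 2), dt, 628)
    print("r after ~one period:", round(positions[-1], 4))
```

For the harmonic oscillator the trajectory stays close to the analytical cosine over a full period, which illustrates the good long-term stability of this integrator for small timesteps.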


Molecular dynamics engines

Now that we have a formalism to propagate the system over time, we need to describe the system in such a way that the properties estimated from the MD simulation are a reliable estimate of the real properties of the system (ergodic postulate, Eq. 3). Whilst we need to adhere to this postulate, such a representation scheme should be as simple as possible to save computational time. A plurality of MD engines is available, and it goes beyond the scope of this thesis to describe them all. Therefore, we focus on two of the engines used in this work, GROMACS [8] and Q [9,10].

The size of the system (N) dictates the number of computations needed for each integration step, in particular the number of non-bonded interaction calculations, which is approximately N²/2. To reduce the number of calculations needed, cut-off radii are used to define which of these interactions are explicitly calculated. A simplified model for the long-range electrostatics beyond this cut-off is used for the remainder of the atom pairs (see below). Furthermore, we have to somehow limit the size of the system by introducing boundaries. We can do this by using periodic boundary conditions (PBC) or spherical boundary conditions (SBC). In the first case, a box¹ is placed around the protein and infinitely replicated along the x, y and z axes. As a consequence, the system box needs to be large enough that the protein does not interact with its (artificial) replicas. Technically, this means that the embedding environment needs to be larger than the cut-off used for the explicit calculation of electrostatic interactions (see below). Since we are dealing with a constant number of particles, whenever a particle leaves the box, it is put back on the opposite side of the box. Throughout this thesis, we have performed MD simulations under PBC with GROMACS [8].
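The two PBC ingredients described above, putting a particle back on the opposite side of the box and restricting explicit interactions to a cut-off, can be sketched for a rectangular box as follows. The function names are assumptions, and the brute-force double loop is for illustration only; engines such as GROMACS use neighbour lists instead.

```python
import math


def wrap_into_box(position, box):
    """Put a particle that left the box back on the opposite side."""
    return tuple(x % length for x, length in zip(position, box))


def minimum_image_distance(r_i, r_j, box):
    """Distance between i and the nearest periodic image of j."""
    d2 = 0.0
    for a, b, length in zip(r_i, r_j, box):
        d = a - b
        d -= length * round(d / length)  # shift to the closest image
        d2 += d * d
    return math.sqrt(d2)


def pairs_within_cutoff(positions, box, cutoff):
    """All pairs (i < j) treated explicitly; scales as N^2/2."""
    n = len(positions)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if minimum_image_distance(positions[i], positions[j], box) <= cutoff]


if __name__ == "__main__":
    box = (10.0, 10.0, 10.0)  # Å, hypothetical box dimensions
    atoms = [(0.5, 0.0, 0.0), (9.5, 0.0, 0.0), (5.0, 5.0, 5.0)]
    print(pairs_within_cutoff(atoms, box, cutoff=2.0))
```

The first two atoms sit on opposite faces of the box but are only 1 Å apart through the periodic boundary, so they form the only pair inside the cut-off.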

Instead of using a periodic box, in SBC a single finite spherical droplet of water is placed around the protein, or more commonly around the area of interest (e.g. the ligand-binding site), in which case any part of the protein outside the sphere is tightly restrained. Waters at the boundary of the sphere are treated with the surface constrained all atom solvent (SCAAS) model, which mimics the properties of bulk water [11]. While a simple half-harmonic boundary potential can be employed, it has been shown that this yields unsatisfactory number density, orientational sampling and free energy results, and it is more appropriate to additionally introduce an attraction term for waters towards the sphere boundary [12]. To further reduce the number of non-bonded calculations, several cut-off schemes can be applied. The most delicate simplification pertains to the long-range electrostatic interactions beyond the cut-off for explicit non-bonded interactions (typically > 10 Å): in GROMACS, these are treated with the particle mesh Ewald (PME) [13] model, while a multipole expansion known as the local reaction field (LRF) [14] is implemented in Q. Finally, an efficient way to reduce the number of computations is to increase the integration timestep, which can be achieved using the SHAKE [15] algorithm, possibly combined with hydrogen mass repartitioning (HMR) [16].

¹ From hereon we will refer to a rectangular box, though other geometrical shapes are possible using PBC, such as the octahedral box used in the GROMACS simulations in this thesis.

Ligand binding

The main focus of this thesis is to describe the process of small-molecule recognition (binding) by a target protein. As a biological process, this is a dynamic event and can be captured by considering the equilibrium between the ligand (L) in its free (i.e. water solvated) state, and in complex with the target protein (P), i.e. the bound (PL) state:

$$\mathrm{P} + \mathrm{L} \rightleftharpoons \mathrm{PL} \tag{21}$$

From this, we can determine the corresponding equilibrium constant (in this case, the dissociation constant, $K_d$). This is related to the on and off rates of the ligand ($k_{\mathrm{on}}$ and $k_{\mathrm{off}}$), therefore to the relative concentrations of free and bound protein and ligand, and ultimately to the free energy of binding (∆Gbind):

$$K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}} = \frac{[\mathrm{P}][\mathrm{L}]}{[\mathrm{PL}]}, \qquad \Delta G_{\mathrm{bind}} = RT \ln \frac{K_d}{c^{\circ}} \tag{22}$$

Note that $K_d$ is related to $\Delta G_{\mathrm{bind}}$ only when considering a standard concentration $c^{\circ}$ (conventionally 1 M). The molecular mechanics approaches described above allow us to estimate the free energy of binding (see below), and thus we can actually capture the properties of ligand binding using computer simulations. In order to do this, we need accurate descriptions of both the protein and the ligand. These can be obtained experimentally (e.g. via X-ray crystallography, cryo-EM or NMR), or alternatively predicted using for instance homology modelling and protein-ligand docking. Since the latter two are used throughout this thesis, I will describe them in more detail below.
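The relation between the dissociation constant and the binding free energy can be made concrete with a short conversion routine. This is a sketch under the usual assumptions (1 M standard concentration, room temperature); the function names are chosen here for illustration.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15, standard_conc=1.0):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return GAS_CONSTANT * temperature * math.log(kd_molar / standard_conc)


def dissociation_constant(dg_bind, temperature=298.15, standard_conc=1.0):
    """Inverse relation: dissociation constant (mol/L) from a free energy."""
    return standard_conc * math.exp(dg_bind / (GAS_CONSTANT * temperature))


if __name__ == "__main__":
    for label, kd in (("1 uM", 1e-6), ("10 nM", 1e-8), ("1 nM", 1e-9)):
        print(f"Kd = {label:>6}: dG_bind = {binding_free_energy(kd):6.2f} kcal/mol")
```

A useful rule of thumb that follows from this relation is that a tenfold change in affinity corresponds to roughly 1.4 kcal/mol at room temperature, which sets the scale for the accuracy required from free energy calculations.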


Homology modelling

In many cases, including some studies within this thesis, we do not have an experimental structure of the protein: how can we then generate a reliable initial configuration for the MD simulations? Homology modelling is one of the answers to this question, and it was used in this work. The underlying principle of this method is to derive the structure of a protein based on its homology to other protein(s) having a known structure. This is done by aligning a query sequence to potential template sequences and identifying the best template(s)², typically the protein(s) with the highest sequence identity with the query.³ The aligned sequence will then be built satisfying constraints imposed by the 3D structure of the template(s) in the homologous regions, followed by a brief minimization of the generated structure. It is typically advisable to generate multiple models, score them with an estimation of the internal energy and select the best model.⁴ A commonly used homology modeling package is MODELLER [17]. Typically a threshold of 35% sequence identity between query and template sequences is needed to obtain a reliable homology model. However, by taking advantage of the conserved 7TM topology of GPCRs, these sequence identity percentages can actually be lower and still provide useful models [18].

Our group has developed one of these protocols, available as a webserver, called GPCR-ModSim [19]. In this thesis, we have used this protocol to generate homology models for the A2B and A3 ARs (see papers V, VI, VII and IX). GPCR-ModSim is an integrated webservice that takes a multiple sequence alignment of the query protein against a selection of GPCRs with known structures as its starting point. From this alignment, the user can identify the template receptor(s) (which can be different for different topological segments of the protein), and refine the alignment with the filtered templates. The server uses MODELLER to produce homology models, which are ranked based on their DOPE score. The resulting model(s) can thereafter be prepared for MD simulations using the Python engine PyMemDyn [20] embedded in the webserver. This protocol automates the insertion of the solvated model in the membrane, and runs a dedicated equilibration protocol of MD simulations. PyMemDyn has been used for the setup of all initial GPCR systems in this thesis.
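The template selection step can be illustrated with a small routine that computes the sequence identity from a pairwise alignment and picks the template with the highest value. This is a sketch only: the function names and the toy sequences are hypothetical, and the identity is defined here over positions aligned without gaps, which is one of several conventions in use.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Sequence identity (%) over the positions where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_query, aligned_template)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def best_template(aligned_query, aligned_templates):
    """Name of the template with the highest identity to the query.

    aligned_templates maps a template name to its aligned sequence.
    """
    return max(aligned_templates,
               key=lambda name: percent_identity(aligned_query, aligned_templates[name]))


if __name__ == "__main__":
    query = "MPIMGSSVYI-TVELAIAVLA"          # toy sequence (assumption)
    templates = {
        "template_1": "MPNNSTALSLANVTYITMEIF",  # toy sequence (assumption)
        "template_2": "MPIMGSSVYIATVELAIAVLS",  # toy sequence (assumption)
    }
    print(best_template(query, templates))
```

In practice the ranking by identity is only a starting point; as noted above, other descriptors and visual inspection complement it.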

² Single or multiple templates can be used for homology modeling, a feature that is also implemented in the GPCR-ModSim webserver used throughout this work.

³ Other descriptors can also be considered, such as phylogeny or, in the particular case of GPCRs, sharing the same class of natural ligand (e.g. peptide, catecholamine, nucleotide).

⁴ Similar to the case of protein-ligand docking (see below), the “best” model should be selected by complementing the ranking from the scoring function with visual inspection and with an interpretation, on the generated structure, of the available experimental data coming from e.g. site directed mutagenesis or other sources.


Protein-ligand docking and free energies

In typical drug design projects, we have to manage many ligands, and it is unlikely that we will obtain crystal structures for all these compounds with our target protein. Thus, even if we have a reliable starting configuration of the protein, we need to obtain reasonable ligand conformations in the binding pocket. A typical way to obtain these is via automated docking algorithms, which generate possible conformations of the ligand in the binding site, and score these in terms of their binding energy. In this thesis, we have used the GLIDE-SP algorithm [21] for automated docking. In some cases, knowledge-based docking was employed, by aligning a series of ligands to a reference ligand using flexible ligand alignment in Maestro [22]. The obtained structures were thereafter refined by manual intervention using PyMOL [23].

Whilst docking algorithms are efficient tools to acquire plausible binding modes, they typically fail to accurately describe binding free energies [24]. Alternative methods combine the ergodic postulate with more rigorous sampling of the protein-ligand complex, e.g. by employing MD simulations. Some methods introduce additional empirical fitting to increase throughput, such as the MM/GBSA and MM/PBSA [25] or linear interaction energy (LIE) [26] methods. Alternatively, the binding (or unbinding) of a ligand can be simulated by unbiased MD simulations [27], from which free energies of binding can be obtained via for instance the potential of mean force [28]. However, obtaining these values from unbiased simulations can be problematic, as these simulations in many cases need to encompass time ranges of seconds to hours. Alternatively, biased MD methods, such as metadynamics [29], can be used, though finding the correct collective variables tends to be challenging [24].

Free energy perturbation

The free energy perturbation (FEP) methodology is a rigorous method based on first principles. In this case, we are interested in two (closely related) systems at equilibrium, A and B. We can describe the difference between their free energies in terms of the two potential energies ($U_A$ and $U_B$), averaged over the ensemble of one of the two, such that:

$$\Delta G_{A \to B} = G_B - G_A = -k_B T \ln \left\langle e^{-(U_B - U_A)/k_B T} \right\rangle_A \tag{23}$$

where $\langle \cdot \rangle_A$ is the ensemble average over the potential $U_A$. The above formula is also known as Zwanzig's equation [30], and the ensemble averages are related to averages over a set of many sampled conformations (e.g. obtained by MD simulations). Practically, this means that we will perform simulations of the molecular system A (with potential $U_A$) and for each of the configurations we also calculate the energies for system B ($U_B$). Consequently, since we are comparing these conformations directly, this only works if the two potentials sufficiently overlap, i.e. the low-energy conformers of system B must be correctly sampled in system A. In fact, the allowed changes are so small that this would not yield reliable results even for minor modifications, say a simple methylation of a small-molecule inhibitor. To overcome this, one can sample a collection of intermediate potentials between $U_A$ and $U_B$ to ensure sufficient phase-space overlap, in which case the total free energy difference can be collected as a summation over all these intermediates:

$$\Delta G_{A \to B} = -k_B T \sum_{i=1}^{n-1} \ln \left\langle e^{-(U_{i+1} - U_i)/k_B T} \right\rangle_i \tag{24}$$

These intermediate potentials are defined by applying a scaling factor λ to each of the two potentials:

$$U_i = (1 - \lambda_i)\,U_A + \lambda_i\,U_B \tag{25}$$

where $\lambda_i$ runs from 0 (state A) to 1 (state B). This means that we are in fact creating unphysical intermediates of a molecule, which is why this method is sometimes referred to as alchemy. Still, we do not have to worry about these unphysical intermediates: since the free energy is a state function, the difference in free energies obtained is path independent. Another way to obtain free energies, which has been used throughout this thesis, is Bennett's acceptance ratio (BAR) [31]:

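$$\Delta G_{A \to B} = \sum_{i=1}^{n-1} \left[ k_B T \ln \frac{\left\langle f(U_i - U_{i+1} + C_i) \right\rangle_{i+1}}{\left\langle f(U_{i+1} - U_i - C_i) \right\rangle_i} + C_i \right], \qquad f(x) = \frac{1}{1 + e^{x/k_B T}} \tag{26}$$

where the constants $C_i$ are optimized iteratively such that the two ensemble averages become equal, yielding $\Delta G_{A \to B} = \sum_i C_i$.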
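The estimators discussed in this section can be sketched in a few lines for a single pair of neighbouring states. The code below is an illustration under stated assumptions, not the implementation used in Q or GROMACS: the function names are hypothetical, the acceptance-ratio constant is found by simple bisection, and equal weighting of both averages is assumed, so that the free energy difference equals the converged constant.

```python
import math
import random


def mixed_potential(u_a, u_b, lam):
    """Linear combination of the two end-state potentials for a given lambda."""
    return (1.0 - lam) * u_a + lam * u_b


def zwanzig(delta_u, kT):
    """Exponential averaging of delta_u = U_next - U_current, sampled in the current state."""
    scaled = [-du / kT for du in delta_u]
    shift = max(scaled)  # log-sum-exp shift for numerical stability
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kT * (shift + math.log(mean_exp))


def fermi(x):
    """f(x) = 1 / (1 + exp(x)), written to avoid overflow for large |x|."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar(forward, reverse, kT, n_iter=200):
    """Acceptance-ratio estimate for one pair of states.

    forward: U_next - U_current sampled in the current state
    reverse: U_current - U_next sampled in the next state
    Returns the constant C for which both Fermi averages are equal.
    """
    def imbalance(c):
        avg_fwd = sum(fermi((du - c) / kT) for du in forward) / len(forward)
        avg_rev = sum(fermi((du + c) / kT) for du in reverse) / len(reverse)
        return avg_fwd - avg_rev  # increases monotonically with c

    spread = max(abs(du) for du in list(forward) + list(reverse))
    lo, hi = -spread - 50.0 * kT, spread + 50.0 * kT
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Toy system: two harmonic wells of equal width, shifted by 1 (exact dG = 0).
    rng = random.Random(0)
    x_a = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    x_b = [rng.gauss(1.0, 1.0) for _ in range(5000)]
    fwd = [0.5 - x for x in x_a]  # U_B - U_A evaluated on samples of A
    rev = [x - 0.5 for x in x_b]  # U_A - U_B evaluated on samples of B
    print("Zwanzig:", round(zwanzig(fwd, 1.0), 3), " BAR:", round(bar(fwd, rev, 1.0), 3))
```

For a full transformation the same functions would be applied to every pair of neighbouring λ windows and the results summed. The acceptance ratio uses the samples of both states, which is why it generally converges with fewer data than one-sided exponential averaging.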

Since both $U_A$ and $U_B$ have to be evaluated on the same configurations, we need a configurational framework that relates to both A and B. There are two common ways to describe such a system, based on either a single or a dual topology representation of the atomic coordinates. In the first case, equivalent atoms in A and B will have the same coordinates, and dissimilar atoms are represented by non-interacting (dummy) atoms in the opposite state. In a dual topology representation, the full coordinate sets of A and B are present, and dummy atoms represent the non-interacting atoms in the ‘off’ state for either A or B. Notably, since the total potential energy of the system is a linear combination of both potentials, the system interacts with a mixture of both states, and the two entities A and B do not ‘see’ each other. This also means that equivalent atoms in both A and B will be seen by the system as one complete atom. An additional advantage of the dual topology approach is that we do not have to change any bonded terms, which are typically more difficult to converge. I have used both coordinate systems for different problems in this thesis, and more specific information is given in papers I, II and III.

GPCR signaling (in a nutshell)

During the 1980s, functional analogies were made between visual signaling in the retina and the response to hormones in other tissues [32]. These were thought to be regulated by a ‘receptor’ (GPCR), a ‘transducer’ (G protein) and an ‘effector’ (enzyme). This notion was confirmed upon the first cloning of the β2 adrenergic receptor [33], which was shown to share sequence homology with rhodopsin and was presumed to have a conserved heptahelical (7TM) architecture. Subsequent cloning of other receptors, including more receptors of the adrenergic family, as well as muscarinic cholinergic and several serotonin receptors [34], led to the same conclusions. The connection between receptors and effectors was made much earlier by Martin Rodbell, who postulated that a guanine-nucleotide regulatory protein was functionally connected to receptors, through which signaling by a second messenger (cAMP) could be achieved [35]. This regulatory protein Gs (originally termed Ns) was later purified and shown to consist of three subunits (see above), of which the Gα subunit signals via the hydrolysis of GTP to GDP, and the β and γ subunits are tightly linked in a βγ complex [36].

Both the βγ dimer and the α subunit act as intracellular switches in the cellular response to receptor activation, which occurs via the inhibition or activation of an ever-expanding list of intracellular effectors. To activate this switch, G proteins must first bind to an activated conformation of the receptor (denoted as R*), which is reached from the “resting” or inactive state (R) through a series of conformational changes converging in an outward movement of TM helices 6 and 3 [37,38]. Initially proposed on the basis of biochemical evidence, these conformational changes were confirmed by the determination of an experimental structure of the β2-Gs complex [39]. Recent advances in molecular and structural biology further confirmed the so-called ‘allosteric ternary complex model’ (initially proposed in the mid-1980s), which involves the isomerization of the receptor from an inactive (R) to an active (R*) conformation, stabilized both by the extracellular signaling ligand and the effector G protein [40]. The extent to which a receptor can be activated in the absence of the ligand is known as constitutive activity, which varies between receptors. Moreover, receptor mutations can affect the constitutive activity, which can result in disease or altered responses to drugs [41]. Receptor activation can moreover be modulated by (endogenous) agonists. Synthetically derived agonists can similarly activate a receptor, but it is not uncommon that these activate the receptor to a lower extent, which is known as partial agonism. These typically show lower affinities for the fully active conformation of the receptor, and possibly bind different conformations along the activation pathway [42]. The inactive state on the other hand can be stabilized by inverse agonists (characterized by inhibiting the basal level of receptor signaling). These molecules are pharmacologically distinct from neutral antagonists, which bind the R and R* states of the receptor equally and do not alter the receptor's conformational equilibrium [43]. In the last chapter of this thesis, I present a new approach to predict these pharmacological concepts using FEP.

Understanding GPCRs through thermodynamic cycles

FEP calculations are based on closed thermodynamic cycles [44–47]. This means that the free energy associated with the chemical modification occurring in the binding site must be connected to the associated free energy change occurring in the reference state. The binding of a ligand to a protein involves the transfer of the ligand from the aqueous medium (reference state) to the binding site of the protein. The concept of free energy perturbation is not only useful for ligand transformations, but can be used to calculate various properties, as long as the two potential energies to be compared can be described using a thermodynamic cycle. In this thesis, I describe “classical” cycles to calculate protein-ligand binding affinities for both ligand pairs (paper I) and ligand affinities between wildtype and mutant proteins (paper II). In addition, a conceptually new thermodynamic cycle is presented. Here, pharmacological properties of a ligand can be calculated based on the preferential affinity for a given conformational state of the receptor (paper III), which was later extended to include the effect of protein mutations (paper XI). I have written a Python-based API framework to set up these calculations, and a brief overview of the workflow is described in the next part of this thesis, which summarizes papers I-III.
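The three cycles mentioned above share the same arithmetic: the quantity of interest is the difference between two alchemical legs. A minimal sketch is given below; the function names, the sign conventions in the docstrings and the numerical values are assumptions for illustration and do not reproduce the API framework developed in this thesis.

```python
def cycle_difference(dg_leg_1, dg_leg_2):
    """Close a thermodynamic cycle: difference between two alchemical legs."""
    return dg_leg_1 - dg_leg_2


def relative_binding(dg_protein, dg_water):
    """Ligand A -> B in the binding site versus in water.

    A negative value means that B binds more tightly than A.
    """
    return cycle_difference(dg_protein, dg_water)


def mutation_effect(dg_holo, dg_apo):
    """Wildtype -> mutant residue with (holo) and without (apo) the ligand.

    A positive value means that the mutation weakens ligand binding.
    """
    return cycle_difference(dg_holo, dg_apo)


def conformational_selectivity(dg_active, dg_inactive):
    """Antagonist -> agonist in the active (R*) versus the inactive (R) receptor.

    A negative value means that, relative to the antagonist, the agonist
    prefers the active conformation.
    """
    return cycle_difference(dg_active, dg_inactive)


if __name__ == "__main__":
    # Hypothetical leg values in kcal/mol.
    print("ddG_bind(A->B):", round(relative_binding(-3.2, -1.7), 2))
    print("ddG_bind(mut): ", round(mutation_effect(12.4, 11.1), 2))
    print("ddG(R* vs R):  ", round(conformational_selectivity(-2.0, 0.5), 2))
```

Because only differences between legs enter, contributions that are identical in both environments cancel, which is what makes the alchemical route tractable.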


Figure 6: A) The free energy difference between two ligands (A and B) can be calculated by transforming one ligand into another in water (W) and protein (P). Alternatively, the effect of a single point mutation (Rmut) on the affinity for a given ligand (L) can be calculated, by transforming the corresponding sidechain from the wildtype (RWT) to the mutant, in the presence or absence of a ligand. The conformational selectivity between two ligands can be calculated by transforming an antagonist (Lant) into an agonist (Lago), in both the inactive (R) and active (R*) conformation of the receptor.

Broadly speaking, the remainder of the thesis is subdivided into three parts, which cover the thermodynamic cycles related to i) protein-ligand binding, ii) site directed mutagenesis and iii) conformational selectivity. These approaches have been tested on various systems, including hydration free energies of sidechains, and protein stability and ligand binding on globular proteins. However, most of the work performed focused on GPCRs, with an emphasis on the family of ARs (papers IV-VIII and X-XII), though we show that our approaches can be equally useful for orphan receptors (paper IX). Papers V-VII are mostly related to protein-ligand binding, whereas papers VIII and X include cases related to site directed mutagenesis; in paper IX I show the complementarity of both approaches. Finally, I show how these approaches can be used to predict conformational selectivity.


QresFEP and QligFEP (I-III)

In this thesis, I have developed two protocols to routinely perform FEP calculations on either small molecules (QligFEP) or protein mutations (QresFEP). Effectively, this means that the two potentials in Eqs. 23, 25 and 26 describe either two ligands, or a residue in the wildtype and mutant protein. Though the two approaches are different from a technical point of view, both can provide valuable insights into ligand-protein recognition mechanisms, and I like to see them as two complementary methods, i.e. two sides of the same coin (see Fig. 7).

Figure 7: Schematic representation of the alchemical transformation protocols used in QligFEP (left, based on a dual topology representation) and QresFEP (right, based on a single topology, using predesigned protocols gradually annihilating any sidechain to Ala). Both methods can be used in a complementary way to provide insights into ligand binding mechanisms.

The largest difference between the two methods is the representation of the atomic coordinate system. QligFEP is based on a dual topology approach, where a mixture of both ligands is present along the simulation (see Fig. 8). Along the progression of λ windows, all the atoms of one ligand are turned off whilst the atoms of the other ligand are turned on. QresFEP on the other hand is based on a single topology approach, where a given wildtype residue is gradually annihilated to Ala through a series of cumulative “subperturbations”, applied to groups of atoms of the sidechain (see Fig. 9). Whilst it is possible to collect a library of pre-generated protocols for residue mutations (since we are dealing with only 20 building blocks), this is definitely not possible in the case of ligand perturbations (the small-molecule chemical space is estimated to be around 10^33 molecules [48,49]). In this case, we have to be able to describe the FEP transformation on the fly (i.e. based on the input of two given molecules), and nowadays there are plenty of examples of protocols that do so [50–54]. A key difference between those methods and the QligFEP approach is that they all use a single topology, constructing the hybrid molecule based on the identification of the maximum common substructure (MCS). A challenge for that approach is to deal with larger topological changes, in particular those that involve bond modifications, which are typically associated with convergence problems [55]. This can be overcome by for instance employing soft-core bond potentials [56], or by constructing intermediate (alchemical) molecules along the transition pathway. The latter method is used in the QresFEP protocols (see Fig. 9).

Figure 8: general scheme of the dual topology approach used in QligFEP. Two ligands (orange and blue) are shown in sticks, and the change between the two is shown in a circle. The script will generate a mixture state of the two ligands, which can subsequently be used in the FEP simulations to describe the potential energy (Ui), which is a linear combination of both potentials.
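The progression of λ windows in a dual topology setup can be sketched as a list of weight pairs, one weight per ligand, that sum to one in every window. The function names and the linear spacing are assumptions for illustration; the actual spacing of the windows is a protocol setting and need not be linear.

```python
def lambda_windows(n_windows):
    """Weights (ligand A, ligand B) for each window, from A fully on to B fully on."""
    if n_windows < 2:
        raise ValueError("at least two windows are needed")
    weights = []
    for i in range(n_windows):
        lam = i / (n_windows - 1)
        weights.append((1.0 - lam, lam))
    return weights


def mixture_energy(u_ligand_a, u_ligand_b, weights):
    """Interaction energy of the surroundings with the ligand mixture."""
    w_a, w_b = weights
    return w_a * u_ligand_a + w_b * u_ligand_b


if __name__ == "__main__":
    # Hypothetical interaction energies (kcal/mol) of each ligand with its surroundings.
    for w in lambda_windows(5):
        print(w, round(mixture_energy(-10.0, -4.0, w), 2))
```

In the first window the surroundings only feel ligand A, in the last only ligand B, and in between they feel a weighted mixture of both, while the two ligands never interact with each other.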


Figure 9: Overview of the FEP scheme for side chain mutations to alanine. The figure shows the annihilation of a Trp side chain, which is gradually broken down in 9 FEP subperturbations each consisting of 20 λ windows. Annihilation of partial charges is indicated with a gray color and introduction of soft-core potentials with dots on the molecular structure. Blue and red lines account for the accumulated free energy (ΔG, kcal/mol) along the transformation in water or vacuum, respectively (the dark line represents the average of 10 independent simulations, each of which is depicted as a light-colored line). By means of the thermodynamic cycle employed, the difference between the final values (ΔΔG kcal/mol) is the estimation of the hydration free energy of the Trp sidechain compared to the Ala sidechain.

In both QresFEP and QligFEP, the protein under investigation needs to be prepared for spherical boundary conditions. This includes the de-ionization of residues outside the sphere (to avoid insufficient dielectric screening in the vacuum environment) and of residues that are within the restrained area of the sphere boundary. After this, a spherical droplet of water is added to the system, and all crystallographic waters in the system are kept. In the case of QligFEP, the ligand mixture is added to the system (water droplet or protein), whereas in the case of QresFEP only one ligand is added to the holo simulations, and the reference simulations are run on the pseudo-apo system. The protocols consist of 10 replicates for every individual leg of the thermodynamic cycle, each replicate typically running on an individual node of a high-performance computing (HPC) cluster with different initial velocities, assigned following Eq. 9. This setup allows both for the flexible use of computational resources through distributed computing, and for the estimation of average ΔG values with error estimates given as the standard error of the mean (SEM), which is key to assessing the convergence of the calculation (see Fig. 10 for an overview of the preparation steps for an example case with QligFEP).

Figure 10: example of the preparation steps for QligFEP. The protein is prepared and solvated in a spherical droplet of water. In addition, a ligand mixture is added to the system. Thereafter, 10 individual replicates of the transformation of ligand A to ligand B in the system are performed. The resulting ΔG values of each replicate are collected and averaged, yielding ΔGavg ± SEM.
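The collection step can be sketched as follows: the ΔG values of the replicates of each leg are averaged, the SEM is computed, and the two legs are combined into a ΔΔG. The function names and the numbers are hypothetical, and combining the two SEM values in quadrature is an assumption (it presumes independent legs) rather than a description of the analysis scripts used in this thesis.

```python
import math
import statistics


def mean_and_sem(values):
    """Average and standard error of the mean of independent replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def ddg_with_error(leg_bound, leg_reference):
    """Difference between two legs, with errors combined in quadrature."""
    mean_b, sem_b = mean_and_sem(leg_bound)
    mean_r, sem_r = mean_and_sem(leg_reference)
    return mean_b - mean_r, math.sqrt(sem_b ** 2 + sem_r ** 2)


if __name__ == "__main__":
    # Hypothetical dG values (kcal/mol) of 10 replicates per leg.
    protein = [-3.1, -2.8, -3.4, -3.0, -2.9, -3.3, -3.2, -2.7, -3.0, -3.1]
    water = [-1.6, -1.8, -1.7, -1.5, -1.9, -1.7, -1.6, -1.8, -1.7, -1.6]
    ddg, sem = ddg_with_error(protein, water)
    print(f"ddG = {ddg:.2f} +/- {sem:.2f} kcal/mol")
```

A large SEM relative to the ΔΔG of interest is a warning sign that the replicates have not converged to the same state.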

Performance of QresFEP and QligFEP

QresFEP and QligFEP have been extensively benchmarked on several systems. For both protocols, the solvation free energies of side chain mimics were calculated. In addition, relative protein-ligand binding affinities were estimated between ligand pairs (QligFEP) and between wildtype and mutant proteins (QresFEP). The first included calculations on three systems, namely CDK2, Chk1 and the A2AAR. The latter focused on protein-ligand binding to the neuropeptide Y1 (NPY1) receptor and the calculation of protein stability. The main results will be summarized for each case below.

QresFEP

The QresFEP protocols have been applied to different systems by me and previous members of the lab, to estimate the effect of amino acid mutations on ligand binding. The results of all of these studies are summarized in Table 1, which includes calculations that were done for this thesis (indicated with their roman numbering). In general the R² for these calculations is low (ranging between 0.01 and 0.59), though the meaningfulness of this statistical figure is limited given the low number of datapoints in each study [57]. The mean absolute error (MAE) is a more useful measure of the accuracy of the method. The MAE values range from 0.42 to 1.41 kcal/mol, and fall within the range of other reported values for analogous FEP calculations on protein mutations [58]. Notably, many of the mutations include qualitative datapoints, since the effect on ligand binding is actually larger than the observable binding affinity range. In addition, there can be significant differences in binding affinity changes for a mutation between labs [59–61]. Therefore, it is important to interpret the results in the light of the experimental data available, rather than only focusing on variations in the statistical parameters associated with the particular models. A way to quantify the qualitative agreement between calculated and experimental values is to use the Matthews correlation coefficient (MCC), which is calculated from the confusion matrix:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{27}$$

where TP = true positive, TN = true negative, FP = false positive and FN = false negative. The MCC ranges from -1 (completely anti-correlated) to 1 (perfect correlation). For QresFEP, values between -0.19 and 1 were found, depending on the dataset.
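The two accuracy measures discussed here, the MAE and the MCC of Eq. 27, can be computed with a few lines of code. The sketch below uses hypothetical function names and data; in particular, classifying a mutation as “positive” when its ΔΔG exceeds a threshold of 0 kcal/mol is an assumption for illustration, and returning 0 for an empty row or column of the confusion matrix is a common convention rather than part of Eq. 27.

```python
import math


def mean_absolute_error(calculated, experimental):
    """Average absolute deviation between calculated and experimental values."""
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(calculated)


def confusion_counts(calculated, experimental, threshold=0.0):
    """Count TP, TN, FP, FN, calling a value 'positive' if it exceeds the threshold."""
    tp = tn = fp = fn = 0
    for c, e in zip(calculated, experimental):
        predicted, observed = c > threshold, e > threshold
        if predicted and observed:
            tp += 1
        elif not predicted and not observed:
            tn += 1
        elif predicted:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from the confusion matrix (Eq. 27)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0.0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    calc = [1.2, -0.3, 0.8, -1.0]   # hypothetical ddG values (kcal/mol)
    expt = [0.9, 0.4, 1.5, -0.7]    # hypothetical ddG values (kcal/mol)
    print("MAE:", round(mean_absolute_error(calc, expt), 2))
    print("MCC:", round(matthews_corrcoef(*confusion_counts(calc, expt)), 2))
```

Unlike the MAE, the MCC only asks whether the direction of the effect is predicted correctly, which makes it applicable to datapoints for which only a qualitative experimental outcome is available.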


Table 1: Overview of ligand binding calculations with QresFEP.

References                    System     Mutationsᵃ   MAEᵇ    R²     MCCᶜ
Keranen et al. [60]           A2AAR      17 (15)      1.31    0.53    0.58
Keranen et al. [62]           A2AAR      26 (13)      1.07    0.28    0.59
Boukharta et al. [63]         Y1Rᵈ       13 (10)      0.96    0.34    0.57
Xu et al. [64]                Y2Rᵈ       9 (9)        0.78    0.41    1.00
Jespers et al. [59], II       Y1R        16 (3)       1.41    0.01   -0.19
Nøhr et al. [65], VIII        GPR139ᵈ    6 (0)        N/Aᵉ    N/A     1.00
Jespers et al., IX            A2AAR      24 (3)       0.69    0.50    0.41
Jespers et al. [61], X        A1AR       6 (4)        0.42    0.59    1.00

ᵃ In parentheses, the number of mutations with quantitative experimental data used for the calculation of MAE and R² (in the remaining cases the observed effect was larger than the experimental cutoff, or the experimental data was based on functional data).
ᵇ MAE = mean absolute error (kcal/mol).
ᶜ MCC = Matthews correlation coefficient.
ᵈ A homology model of the receptor and a docked position of the ligand were used for the simulations.
ᵉ N/A = not available.

Besides protein-ligand binding calculations, QresFEP has been benchmarked on a set of mutations associated with protein stability. Here, instead of performing mutations in the holo and apo state of the protein, one compares the folded state of the protein with a representation of the denatured protein, which is a simple tripeptide (Ala-Xxx-Ala) of the amino acid undergoing the mutation (see Fig. 11). The observed quantitative and qualitative correlation is typically lower than for protein-ligand binding FEP studies [55,66]. The performance of QresFEP on a test set consisting of 43 mutations on T4-lysozyme follows this trend, and the MAE on this set was 1.66 kcal/mol. Still, the qualitative performance was quite reasonable (MCC = 0.53). In other words, while we do not accurately predict the quantitative effects, we are able to predict whether a mutation will be stabilizing or not. This is particularly relevant considering that the experimental values, taken from the FoldX database, show relatively large internal discrepancies (to the extent of 0.81 kcal/mol on average, with an R² of 0.71) [67].
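The cycle described above can be written compactly as follows. This is a sketch in illustrative notation (the symbols are labels chosen here, not notation defined elsewhere in this thesis), in which each term is the alchemical free energy of the wild-type to mutant transformation in the indicated environment:

$$\Delta\Delta G_{\mathrm{stab}} = \Delta G^{\mathrm{folded}}_{\mathrm{wt \to mut}} - \Delta G^{\mathrm{tripeptide}}_{\mathrm{wt \to mut}}$$

Under this assumed sign convention, a positive value means that the mutation is less favorable in the folded protein than in the Ala-Xxx-Ala reference, i.e. the mutation is destabilizing.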


Figure 11: Calculations of the relative stability changes of T4-lysozyme caused by single point mutations. (A) Thermodynamic cycle employed. Top leg: the folded state of the protein, with the mutated residue represented in orange sticks. Bottom leg: the unfolded state, represented by a simple Ala-Xxx-Ala tripeptide. (B) Experimental versus calculated shifts in thermal stability, expressed as ΔΔG (kcal/mol), for the 43 mutants calculated with QresFEP (blue dots) and FEP+ (orange dots). The solid line represents a perfect correlation, and the dotted and dashed lines indicate the thresholds at ±1 and ±2 kcal/mol, respectively.


QligFEP

Benchmarks for QligFEP included three protein-ligand binding systems: two kinases (CDK2 and Chk1) and one GPCR (A2AAR). The CDK2 dataset was first published in 2013 [68] and has since been used extensively to assess the performance of FEP protocols. It consists of 16 ligands, many of which have been co-crystallized with CDK2. The first step in calculating free energies of binding for the ligands in the set is to define pair comparisons that sufficiently cover the whole dataset. The matrix covering all possible perturbations is of size N(N − 1)/2 for N ligands, which in this case amounts to 120 pairwise comparisons. Many of these comparisons would be redundant, and in addition this strategy would yield large errors (as a consequence of error propagation over each perturbation). Therefore, a proper design of the experiment is needed to obtain a balance between computational cost, robustness and accuracy. Ideally one wants to connect those perturbations that 1) consider ligand pairs with maximum similarity (to ensure proper phase-space overlap along the transformation) and 2) minimize the total number of calculations. One approach that covers these two points is LoMAP [52]. An alternative is the star map strategy, in which the MCS of the set is taken as the reference compound, provided that this is actually a real compound with a measured binding affinity. Given its inherent simplicity, this map can contain large perturbations associated with poor convergence. Since these can be identified, e.g. by an unusually high standard error and SEM over replicates, a correction can be added by excluding these perturbations and designing new nodes that provide a smoother path along one or multiple intermediates. Consider the star map of the CDK2 compound series (see Fig. 12), where in most cases the calculations are in good agreement with experiment (MAE 0.85 kcal/mol, R² 0.59). Two datapoints are clearly outliers, and both have very large associated SEM values, indicative of poor convergence. By redesigning the pathway via two smaller perturbations, these large perturbations can be avoided. In this case, the errors for the comparison accumulate over each edge, which is not problematic as long as these errors are relatively small. The resulting new values are much closer to the actual experimental values, and the overall statistics indeed improve (MAE = 0.52 kcal/mol; R² = 0.71). It is important to note that this approach is only valuable if it can actually be performed before the affinities of the compounds are known. This means that the convergence measurement should help in identifying problematic predictions for which we have low confidence, such that we can exclude them and redesign pathways to obtain more accurate results.
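The bookkeeping behind this design choice can be sketched in a few lines. The helper names below are hypothetical, the star map is assumed to use one of the ligands in the set as its hub, and the errors on individual edges are assumed to be independent so that they add in quadrature; none of these details are prescribed by QligFEP itself.

```python
import math

def n_all_pairs(n_ligands):
    """Number of unique ligand pairs, N(N - 1)/2."""
    return n_ligands * (n_ligands - 1) // 2

def n_star_edges(n_ligands):
    """Edges in a star map whose hub is one of the ligands (assumption)."""
    return n_ligands - 1

def path_ddg(edges):
    """Sum ddG along a path of (ddG, SEM) edges.

    The SEM is propagated in quadrature, which assumes that the
    errors on the individual edges are independent.
    """
    total = sum(ddg for ddg, _ in edges)
    sem = math.sqrt(sum(s * s for _, s in edges))
    return total, sem

print(n_all_pairs(16), n_star_edges(16))      # 120 versus 15 calculations
print(path_ddg([(1.0, 0.3), (0.5, 0.4)]))     # hypothetical two-edge path
```

Routing a poorly converged perturbation through an intermediate replaces one edge with two, so the propagated error only improves if both new edges converge markedly better than the original one.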


Figure 12: Top left, binding mode of one of the CDK2 compounds, with the substituents at the R group circled. Top right, star map used to calculate free energies; the two compounds in red circles did not converge sufficiently (bottom right). The redesigned path included sufficiently converged edges, significantly improving the results (bottom left).

The next benchmark example, the Chk1 series [56], was originally published in a study considering scaffold hopping transformations [69,70]. The key concept of these transformations is to change a core scaffold such that it no longer resembles the parent scaffold, which can be useful to avoid toxicity or to circumvent a patentability issue. These modifications typically involve large changes to the bond topology of the compound, and are thus challenging to converge with regular FEP calculations. The dual topology approach avoids changing any bonds along the transformation, which should help increase the efficiency of the method. As shown in Fig. 13A, our approach performs comparably to Schrödinger's FEP+ (presented in that paper), which employs a single topology approach and introduces softcore bond potentials to avoid the convergence issue. Using the same approach, in this thesis I also present another application of dual topology, shown in Fig. 13B: the calculation of relative free energies between two different ligand poses (A and B). Two prototypical A2AAR antagonists, the xanthine-based compounds caffeine and theophylline, were considered. It is known from experiment (via X-ray crystallography [71]) that caffeine binds the receptor in two alternate conformations, whereas theophylline only binds in one orientation, locked by a dual hydrogen bond network with the conserved N253^6.55 (see paper IV). These experimental observations are correctly captured in the FEP calculations. No significant difference in free energy between the two poses is observed for caffeine, whereas the calculations favor the experimentally observed binding mode for theophylline. I will show in more detail how this approach can be used in paper X.
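To make the link between a pose free energy difference and the observed binding modes concrete, the sketch below converts a free energy difference into two-state Boltzmann populations. The function name, the temperature of 298.15 K and the restriction to exactly two poses are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def pose_populations(ddg_a_to_b, temperature=298.15):
    """Two-state Boltzmann populations of poses A and B.

    ddg_a_to_b is G(B) - G(A) in kcal/mol; a positive value favors pose A.
    Only two poses are considered (assumption).
    """
    rt = R_KCAL * temperature
    weight_b = math.exp(-ddg_a_to_b / rt)
    p_a = 1.0 / (1.0 + weight_b)
    return p_a, 1.0 - p_a

print(pose_populations(0.0))  # equal free energies: both poses equally populated
print(pose_populations(2.0))  # hypothetical 2 kcal/mol preference for pose A
```

A difference close to zero is therefore consistent with two poses being observed side by side, whereas a difference of a few kcal/mol leaves essentially a single populated pose.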

Figure 13: Left) example of scaffold hopping perturbations on a series of Chk1 inhibitors. The results obtained with QligFEP (Q) and Schrödinger's FEP+ are given in the table. Right) besides scaffold hopping perturbations, QligFEP allows the direct comparison of free energy differences between binding modes, as observed for the A2AAR antagonist caffeine.

The final benchmark set included binding free energies of a series of antagonists for the A2AAR. Similar to the CDK2 case, the initial star map had to be adjusted to converge two of the larger perturbations. The final results, summarized in Fig. 14, show that the experimental effects of both the alkyl substitutions and the changes in the core ring of the scaffold are reproduced correctly, and compare well to previously reported values computed using FEP+. Note that preparation of this system included insertion and equilibration of the protein in the natural environment of the cellular membrane using PyMemDyn. Thereafter, the spherical system of the binding site is extracted and considered for the FEP calculations. This approach was generalized throughout this thesis for any membrane receptor, as we will see in the next chapters for the full AR family as well as the GPR139 receptor system.

Figure 14: Preparation steps for membrane proteins. Here, a pseudo-apo complex is generated and equilibrated with GROMACS using PyMemDyn, as implemented in GPCR-ModSim. Thereafter, a ligand mixture is added to this system and the FEP transformation is performed under spherical boundary conditions. The protocol has been tested on a series of A2AAR antagonists (right).


The adenosine receptor family (IV)

The adenosine receptor (AR) family consists of four subtypes, namely A1, A2A, A2B and A3 [72]. These receptors mediate many processes in the body; in the central nervous system they exert depressant, anticonvulsant and sleep-promoting effects. They are furthermore associated with antidiuretic, negative inotropic, negative chronotropic, anti-inflammatory, immunosuppressive and angiogenic effects. As a consequence of these mechanisms, they exert pathophysiological effects in cardiovascular and neurodegenerative diseases, as well as in cancer growth and immune responses [73,74]. Whilst the interest in this receptor family has traditionally been high, the ability to bring effective drugs to the market has been limited. To date, the FDA has approved the A2A agonist Regadenoson (as a coronary vasodilator used in cardiac imaging) and, more recently, the antagonist Istradefylline (2019) for the treatment of Parkinson's disease in combination with Levodopa, after it was first approved in Japan [75].

The tremendous interest in this receptor family has resulted in a substantial amount of experimental data. These include a vast number of SDM studies (over 2500 datapoints have been collected in this thesis). Additionally, a large amount of data is collected in public databases. Advances in membrane protein engineering and crystallography have sparked a surge of experimental GPCR structures [76], and the AR family is one of the most thoroughly characterized families from the structural point of view. There is now a collection of 49 structures in the PDB (not including the two structures reported in paper X). These include structures of the inactive conformation of the A2AAR bound to different chemotypes of orthosteric antagonists [77–79], see Fig. 15. Some of these also revealed the presence of allosteric sites, for instance the binding site of the negative allosteric modulator sodium (Na+) [80], or allosteric sites in the extracellular loop (EL) region [81]. Recently, the A1AR has been crystallized in complex with xanthine antagonists [82,83]. In addition, active-like (agonist-bound) structures of the A2AAR have been obtained with several adenosine derivatives [84–86] (Fig. 15), lately complemented with the first fully active conformation of the A2AAR bound to an engineered G protein fragment [87] and a cryo-EM structure of the active A1 G protein complex [88]. These structures are further complemented by data from NMR, which provide more insights into the conformational flexibility of the receptor [89,90]. Additionally, the integration of experimental data with computational models provides new insights into ligand recognition patterns, for instance via proteochemometrics (PCM) [91]. Another technique, known as biophysical mapping (BPM) [92], combines mutagenesis with pharmacological and structural analyses, and has successfully been used in the design of new A2AAR antagonists [93,94]. In paper X we have complemented this technique with rigorous FEP calculations, to provide new insights into ligand recognition of the chromone scaffold.

Figure 15: Ligands co-crystallized with adenosine receptors. The top row contains agonist structures and the bottom row agonist structures. The box indicates hydrogen bond donors and/or acceptors forming key interactions with the receptor.

Recognition of the core scaffold of agonists and antagonists

All classic AR agonists and antagonists exhibit common interaction patterns with residues in the orthosteric binding pocket (see Fig. 16). They are typically derived from a planar heterocycle, which has key hydrogen bond donors/acceptors that form a common interaction with the completely conserved N253^6.55 (superscript refers to Ballesteros-Weinstein numbering, a common numbering scheme based on the topological conservation between GPCRs [95]). This scaffold can be decorated with substituents conferring high affinity, selectivity and/or intrinsic activity (i.e. the ribose moiety). All crystal structures of the A2AAR and A1AR show a common binding interaction with the conserved asparagine residue. This interaction is typically complemented with π−π stacking of the core moiety with F168^EL2 (F/F/F/F) in the second extracellular loop (EL2). Many SDM studies have confirmed the important role of N253^6.55 in all AR subtypes [92,96–100]. In all cases, the mutation of this residue completely abolishes ligand binding (Fig. 16B). Still, the receptor is expressed at normal levels, and no change in basal activity levels is observed [96]. F168^EL2 can be replaced by another aromatic residue (tryptophan [96,101]), but mutations to Ala or Asp result in a drastic reduction in ligand affinity [102]. The next residue in sequence, E169^EL2 (E/E/E/V), is also involved in ligand recognition, forming an additional H-bond to ligands bearing an exocyclic amino group (Fig. 16B). In addition, this residue interacts with H264^EL3 (H/H/N/-) [103], forming a salt bridge commonly associated with ligand residence time [104]. Alanine mutations at this position reduce the binding affinities of the full agonist CGS21680 [105] and the antagonist ZM241385 [106], but in some cases the effect is milder [102]. Interestingly, the salt bridge between His and Glu is not present in all of the A2AAR structures, and in the A1AR K265^EL3 (K/A/G/P) might replace H264^EL3 in this salt bridge.


Figure 16: Orthosteric ligand binding. a) Topological overview of the adenosine receptor. The encircled regions are shown in panels b to d. Antagonist structures are colored blue and agonist structures orange; the A1AR structure is colored pink. b) The main anchor points for agonists and antagonists, indicated by the adenine cores of the agonist NECA and the antagonist ZM241385. c) Crystal structures of the antagonist PSB36 bound to A2AAR (5N2R) and A1AR (5N2S). Residue M270^7.34 (T/M/M/L) has been identified as a key residue involved in ligand selectivity. d) Residues involved in agonist binding. An inward movement of TM1, TM5 and TM7 is observed. The top half of TM6 remains stable and a kink is induced from W246^6.48. These movements propagate into the sodium binding pocket, resulting in a collapse of this pocket. e) The effect of alanine mutations of the A2AAR on binding and potency of partial and full agonists and on binding of antagonists. Reducing values are shown in red, increasing values in blue. A darker color means a more pronounced effect (>30 fold).

Selectivity hotspots

The residue H250^6.52 (H/H/H/S) is positioned more deeply in the ligand binding pocket (Fig. 16B), and the H250A^6.52 mutant receptor [98,100] recognizes neither agonists nor antagonists. However, affinity of both is retained upon mutation to more bulky residues such as phenylalanine [100,107]. The A3AR is the only AR bearing a Ser residue at this position, and this might provide an explanation for the tolerance of more bulky substituents of A3 ligands (see paper V). A key selectivity hotspot between adenosine A1 and A2A receptors is located at position 7.34, which is a Met residue in both A2 receptors, but a Thr and a Leu in A1 and A3, respectively. In A1, the Thr residue accommodates cycloalkyl groups at the C8-position, characteristic of A1-selective xanthine derivatives (Fig. 16C) [82,83]. Indeed, the M270T^7.34 mutant A2AAR showed increased affinity for A1-selective ligands. The opposite effect is observed for the T270M^7.34 mutant A1AR [82,83]. Besides these single residue hotspots, differences in binding site architecture can have a large influence on selectivity. An example is the N1-substituent of the xanthine derivative PSB36, which occupies a narrow cavity between TM3, TM5 and TM6 in the A1AR. This cavity is not present in the A2AAR and consequently, PSB36 is forced to sit in an alternative, less favorable position [83].

Residues involved in the recognition of agonists

All known full agonists of ARs contain a ribose moiety, which specifically interacts with four key residues in the adenosine receptor binding site, as observed in the A1 and A2AAR active structures. These residues include S277^7x42 (T/S/S/S) – H278^7x43 (H/H/H/H), together with T88^3x36 (T/T/T/T) and H250^6x52 (H/H/H/S) (Fig. 16D). Many of these residues were already identified in the early 1990s [108], and their specific hydrogen bonding patterns were subsequently confirmed by the so-called 'neoceptor' approach [109,110], where hydrogen bond acceptors and donors are swapped between receptor and ligand. These patterns are much less pronounced in the case of partial agonists, which exert different changes in levels of activation depending on the mutation [96]. Typically, these partial agonists include small hydroxy substituents that form hydrogen bond interactions with (one of) the four residues discussed above. Mutating either of the two histidine sidechains in the binding site, in the H250A^6x52 [98,100] and H278A^7x43 [100] mutants, strongly decreased agonist potency and binding, and had the same effect on antagonist binding.

On the other hand, mutations S277A^7x42 and T88A^3.36 exert a more variable effect. The binding affinity of antagonists typically increases for these mutations, and they are included in the set of mutations that form the 'stabilized receptor' (StaR) inactive A2AAR [111]. Full agonists such as NECA and CGS21680 respond with a large decrease in potency, with the most pronounced effect observed in the Thr mutant [112–114]. Whilst the mutations of the polar sidechain (Thr or Ser) at position 7x42 mildly to severely decrease agonist binding [115–118], the same mutants have negligible or even positive effects on the potency and/or efficacy of partial agonists [96,116–118]. The role of these mutations is further explored in the section covering conformational equilibria (paper XI).


Protein-ligand binding (V-VII)

In this chapter, I will summarize those projects that involved the design and elucidation of the binding mechanisms of novel series of small molecules for the A3AR (paper V) and A2BAR (papers VI and VII). The first project describes the development of 2-acetamidopyridines as novel, potent and selective A3 adenosine receptor (AR) antagonists. Here, we focused on the bioisosteric replacement of the N1 atom of a previously reported series of diarylpyrimidines by a CH group. With FEP simulations we demonstrated the previously hypothesized role of the second nitrogen in the parent series in the stabilization of a water network in the binding site. In addition, we identified the binding mode of the most potent compound in the parent series, for which the bioisosteric replacement abolished binding affinity.

In papers VI and VII we used both QligFEP and QresFEP to investigate the binding mode and SAR of various series of A2BAR antagonists. In paper VI, I performed over 100 FEP calculations, which allowed the identification of protein-ligand binding mechanisms and compound optimization. In paper VII, I explored the role of fluorine-containing functional groups on the most potent compound in the series, ISAM140. Here, the influence of changes in the pKa of the fluorinated compounds and the role of the conformation of N254^6.55 proved essential to understand receptor-ligand recognition.

A3AR antagonists

The design of the A3AR antagonists is based on structurally simple, monocyclic pyri(mi)dines, allowing variability at three points on the core scaffold (see Fig. 17), leading to low molecular weight, potent and selective A3 antagonists [119,120]. A key difference between the three series presented in Fig. 17 is the presence of a second nitrogen atom and its position in the monocyclic core of the scaffold. Previously, Pennington et al. reviewed the role of what they coined a “necessary nitrogen” and concluded that the SAR around nitrogen-containing heterocycles is a complex phenomenon [121]. A few key examples include: 1) the involvement in hydrogen-bond interactions with the receptor and/or solvent in the binding cavity, 2) altering the energy profile associated with the bioactive conformation and 3) altering the physicochemical properties of the scaffold.

Figure 17. Top) Rationale of the design strategy of targeted structures; the arrows indicate hydrogen bonds with N250^6.55. Bottom) Based on the multicomponent reaction used here, variability can be introduced at three points, where the pink substituents are always symmetrical. These can be optimized such that they fit the binding site of the receptor (schematically depicted as L1, L2 and L3).

The N1 (series 3) and N3 (series 1) in the heterocycle, together with the exocyclic nitrogen of the amido group (incoming and outgoing arrows, respectively), are essential for a hydrogen bond network with N250^6.55 (see paper IV). On the other hand, the second nitrogen atom of the 4-amidopyrimidine ring (series 2, N1 in red circles in Fig. 17) yielded increased binding affinities compared to series 1 of 2-amidopyrimidines. This effect was initially associated with the stabilization of a water network in the binding site, in analogy to observations in the A2AAR crystal structures [120]. Thus, we set out to understand the role of the second nitrogen in the 4-amidopyrimidine series, by comparing the N1 in the red circle for series 2 with a new series of pyridines (series 3). The effect of this substitution on the A3AR affinities was subsequently investigated, and the results are reported in Fig. 18. The new series contains potent and highly selective A3 antagonists, in a few cases equipotent to, and in one case more potent than, the corresponding pyrimidines. However, in most cases the bioisosteric replacement results in a loss of potency. Here, we set out to understand this reduction in affinity for the three compound pairs showing the largest effect, indicated by red bars in Fig. 18.

Figure 18: The hypothesized bioisosteric replacement of the pyrimidine by the pyridine scaffold (panel A) in the binding site (panel B). In some cases, the change of the scaffold introduces large decreases in binding affinity (panel C).

The core scaffold, containing the N1 → CH substitution, is located deep in the binding pocket, and there are no residues nearby that could form a hydrogen bond with the nitrogen. Instead, this part of the ligand was previously hypothesized to interact with a water molecule in the binding site [119]. Thus, we compared the water occupancy in the binding pocket of the A3AR-2a and A3AR-3a complexes, using a grid analysis to quantify the occupancy. We identified a more stable water network in the first hydration shell around the pyrimidine (Fig. 19A, density corresponding to waters W5-W9) than in the pyridine complex (see the lower density in this region in the corresponding map in Fig. 19B). We additionally showed that this water network is highly similar to the one observed in the high-resolution crystal structure of the A2AAR – ZM241385 complex, which was previously shown to be key for high-affinity receptor binding [122].
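A grid-based occupancy analysis of this kind can be sketched as follows. This is an illustrative, pure-Python version under stated assumptions: the input is a list of frames, each holding water-oxygen coordinates already aligned to a common reference, and the grid origin and spacing are hypothetical parameters. It is not the analysis code used in paper V.

```python
import math
from collections import Counter

def water_occupancy(frames, origin=(0.0, 0.0, 0.0), spacing=1.0):
    """Fraction of frames in which each grid voxel contains a water oxygen.

    frames  : list of frames, each a list of (x, y, z) water-oxygen positions
    origin  : grid origin (assumed; same length unit as the coordinates)
    spacing : voxel edge length (assumed)
    """
    counts = Counter()
    for waters in frames:
        # A voxel is counted once per frame, however many waters it holds.
        voxels = {
            tuple(math.floor((c - o) / spacing) for c, o in zip(xyz, origin))
            for xyz in waters
        }
        counts.update(voxels)
    n_frames = len(frames)
    return {voxel: n / n_frames for voxel, n in counts.items()}

# Hypothetical two-frame trajectory, for illustration only.
frames = [
    [(0.5, 0.5, 0.5), (0.6, 0.4, 0.2)],
    [(1.5, 0.5, 0.5), (0.2, 0.1, 0.9)],
]
print(water_occupancy(frames))
```

Voxels with an occupancy close to 1 correspond to structural waters that persist throughout the trajectory, while a diffuse, low-occupancy region indicates a less stable network.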

Figure 19. Structure of the complex of pyrimidine 2a (A) and pyridine 3a (B) with the hA3AR. The average water occupancy calculated from unbiased MD trajectories is illustrated as a volume density map (red), with assigned structural waters labeled as referred to in the text.

Finally, we compared the relative free energies of the three pairs using FEP (see Table 2). For both the unsubstituted and the para-methoxy-substituted compounds, we correctly captured the changes in binding free energies. However, the 2j/3j pair additionally contains an ortho-methoxy substitution, effectively resulting in four possible conformational minima (denoted poses a-d in Table 2), each of which could theoretically be accommodated in the binding site. The calculated differences in free energy only match experiment when the methoxy substituent in L2 in fact forms an additional hydrogen bond with N250^6.55. The differences at the L3 position, which is more solvent exposed, are less pronounced, and in both cases match the observed detrimental effect on binding affinity.

Taken together, the results from the FEP simulations thus further reinforce the stabilization of a water network in the binding site suggested by the unbiased MD simulations.


Table 2: Experimental and calculated relative binding free energies between pairs of pyrimidine/pyridine ligands.

Transformation   ΔΔGexp (kcal/mol)^a   ΔΔGcalc (kcal/mol)
2a → 3a          1.24                  1.83 ± 0.14
2g → 3g          2.01                  3.19 ± 0.10
2j → 3j          > 3.09^b              pose a: -0.52 ± 0.15
                                       pose b: 7.44 ± 0.22
                                       pose c: 5.17 ± 0.19
                                       pose d: -2.59 ± 0.06

a The relative binding free energies (ΔΔGexp) were calculated from experimentally determined Ki values using the relation ΔG°bind = RT ln Ki.
b No experimental value could be determined (no full displacement at 1 µM), and the calculated ΔΔGexp represents the detection threshold.
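As a worked illustration of how an experimental ΔΔG follows from two Ki values, the sketch below applies ΔΔG = RT ln(Ki,new / Ki,ref). The function name and the temperature of 298.15 K are assumptions, and the Ki values in the example are hypothetical rather than taken from paper V.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_ki(ki_ref, ki_new, temperature=298.15):
    """Relative binding free energy (kcal/mol) from two Ki values.

    Both Ki values must be in the same concentration unit. A positive
    result means the new compound binds more weakly than the reference.
    The default temperature is an assumption.
    """
    return R_KCAL * temperature * math.log(ki_new / ki_ref)

# Hypothetical example: a tenfold loss in affinity (10 nM -> 100 nM).
print(round(ddg_from_ki(10.0, 100.0), 2))
```

At room temperature each tenfold change in Ki corresponds to roughly 1.4 kcal/mol, which is a convenient scale for reading the differences in Table 2.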

A2B antagonists

The A2BAR is a low affinity receptor that requires micromolar concentrations of adenosine to produce functional signalling and thus remains silent at low extracellular adenosine concentrations [123–125]. Under some pathophysiological conditions, however, adenosine concentrations will increase, leading to subsequent activation of A2BAR signalling pathways [123–125]. It has recently been shown that the A2BAR is transcriptionally regulated by factors implicated in inflammatory hypoxia, and is moreover involved in cardiac contractility, glucose homeostasis, pulmonary inflammation, inflammatory response and pain [123,126]. Both its unique functional behaviour and its distinct tissue distribution have drawn increasing attention to the A2BAR as a promising drug target [123,126]. Consequently, this interest has yielded a variety of A2BAR antagonists (see Fig. 20). Two papers (VI and VII) in this thesis report the design, pharmacological characterisation and receptor-ligand binding modelling of such antagonists, which I will discuss below.

Figure 20. Representative structures of A2B antagonists [127,180–187].

3,4-Dihydropyrimidin-2(1H)-ones (VI)

Recently, we reported a novel series of potent A2BAR antagonists assembled by a Biginelli multicomponent approach (Fig. 20, Cpds 7-9). All of these compounds contain a chiral center, which is crucial for their biological profile, as shown below. In this work we performed an exhaustive and multidisciplinary exploration of the structural determinants governing the A2B antagonistic effect of the 3,4-dihydropyrimidin-2(1H)-one (DHPM) scaffold. Starting from our binding model for DHPMs, a large library consisting of 160 derivatives was designed to evaluate the contribution of the different positions in the heterocyclic core (see Fig. 21). The exploration further investigated the binding mode and hydrogen bond network with N254^6.55, as well as different bioisosteric replacements at position 5. Furthermore, the stereospecific recognition of this compound series was explored. A large-scale FEP-based exploration was performed on more than 100 ligand pairs, providing key insights into the protein-ligand binding mechanism.

The FEP calculations covered all compound pairs where at least one of the molecules involved showed measurable A2B binding affinity (Ki < 1 μM). The first set of calculations was performed on series I-III to explore the effect of changes at position R1, through the replacement of 3,4-dihydropyrimidin-2(1H)-one by -thione (series I → II), as well as the role of N1 methylation (series I → III). Thus, this first series of calculations evaluated the proposed binding mode and the role of the key hydrogen bonding network with N254^6.55 (see Fig. 22 and 23). Initial comparison between series I and III showed drastic decreases in binding affinity due to N1-methylation, presumably caused by the disruption of the hydrogen bond network. Accordingly, all FEP calculations show a similar decrease in binding affinities but, most importantly, this effect was only captured in one of the two potential binding modes (see Fig. 23A, MAE = 0.71 kcal/mol, as opposed to the alternative model in Fig. 23B, MAE = 3.51 kcal/mol). The single heteroatom substitution between series I and II leads to a more variable effect on ligand binding free energies, which is also observed in the FEP calculations. In most cases both the experimental and FEP data indicate a decrease in affinity, though the most potent compound reported (47) in fact belongs to this series.


Based on these results, we retained pose A, containing two hydrogen bonds with residue N254^6.55 and π-π stacking of the core ring with F173^EL2 (see paper IV). These interactions are complemented by an enantiospecific shape complementarity of the central ring with the A2BAR-specific V250^6.51, providing a possible source of selectivity over the other AR members, which all have a Leu at this position. The binding orientation determined for DHPMs is indeed analogous to that of a previously reported series of tricyclic compounds [127], which were the basis of the ligand series discussed in the next chapter (paper VI).

Figure 21. General structure of previously explored chemotypes (panel A) and newly reported series (panel B, series I-IX).

[Structure drawings not reproduced. Panel A, previous works: 3,4-dihydropyrimidin-2(1H)-ones, 3,4-dihydropyridin-2(1H)-ones, 2-cyanoimino-pyrimidines, [1,2,4]triazolo[1,5-a]pyrimidines and benzo[4,5]imidazo[1,2-a]pyrimidines. Panel B, current exploration: Series I (Cmpds 16-43), Series II (Cmpds 44-71), Series III (Cmpds 72-99), Series IV (Cmpds 105-120), Series V (Cmpds 121-136), Series VI (Cmpds 137-144), Series VII (Cmpds 145-148), Series VIII (Cmpds 149-152) and Series IX (Cmpds 153-180). General structure of series I-III: X = O, S; R1 = H, Me; R5 = Me, Et, Pr, i-Pr, i-Bu, t-Bu, Bn. General structure of series IV-IX: Y = COR5, CONR5R5´, COSR5, PO(OR)2, oxazol-2-yl, COOR5.]


Figure 22. Effect of methylation on R1. The binding modes considered are depicted in panels A and B for the compound pair 19 (R1 = NH) and 75 (R1 = N-CH3). Panel C depicts the result of the FEP simulations, performed on each binding pose for 10 pairs of compounds. Color code corresponds to the binding pose: blue for pose A and orange for pose B. * = No detectable binding.


Figure 23. Binding mode of compounds 19 (subset I, orange) and 47 (subset II, blue), before and after the corresponding FEP transformation in the A2BAR model (top). The shift in binding orientation due to the heteroatom change can be appreciated. Experimental and calculated binding free energies for this and analogous compound pairs in subsets I and II are depicted in the bar graph (bottom). * No detectable binding.

This study also aimed to determine the role of position R5, through the design of bioisosteric replacements of the parent ester (COOR) at this position (series IV-VIII). These series included ketone (subset IV), amide (subset V), thioamide (subset VI), phosphate (series VII) or oxazolyl (subset VIII) substituents. In addition, series I was extended with another subset of alkylic substituted esters at R5. These substituents would be accommodated in the same binding pocket in all series (see Fig. 24 showing the example of SYAF014 (19)). Moreover, this site overlays with a pocket of conserved water molecules in the A2AAR binding site [128], some of which were shown to be ‘happy waters’. This pocket is likely relatively small as well as mostly hydrophobic, as we observed a size optimum for the ester substituents in both series I and IX. Indeed, larger hydrophilic changes (in particular series VII) resulted in non-binding compounds. To confirm these qualitative observations, we performed FEP calculations between the parent compounds in series I (SYAF014, SYAF101 and SYAF080) and their counterparts in series IV-IX. This included 19 pair comparisons in each set.

Figure 24: Based on the previously established model (panel SYAF014), the substituents at R5 for series IV-IX would be accommodated in the binding site similarly.

As can be seen in Fig. 25, all substitutions in series IV-IX resulted in unfavorable binding affinities compared to the parent compound. The mildest change is observed in series IV, which carries COR-substituted functionalities at R5, and some of the original affinity can be recovered, for instance in the i-Bu substituted compounds in series IV (e.g. compare compound 108, 240 nM, with SYAF014, 40.8 nM). The series with more hydrophilic and/or larger substituents in most cases lead to abolished binding affinities, clearly indicating the preference for small hydrophobic substitutions at R5. This is reproduced correctly in our FEP calculations, with the exception of series VII (COSR): here the FEP calculations show a relative tolerance for the substitution, possibly due to an incorrect representation in the force field, or to small imperfections in the homology model. The exploration of the alkylic substituent on the ester function in series IX indicates more subtle changes in binding affinity, with a size optimum for medium-sized substituents (i-Pr and Et). Whilst some of this SAR is reproduced in the computational model (particularly for the 3-substituted furyl and thienyl series), in the 2-furyl series there is no clear correlation between calculations and experiment. This might point again to deficiencies in the model around the R5 pocket, in particular the delicate water network in a homology model. Still, even with these limitations the overall agreement of over 100 FEP calculations with experiment is quite good, with an MAE of 1.15 kcal/mol, falling well within values previously reported by others for FEP on homology models [129].
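To relate the affinities quoted above to the free energy scale of the FEP results, a Ki ratio can be converted into a relative binding free energy. The sketch below assumes the standard relation ΔΔG = RT ln(Ki,new / Ki,ref), treats Ki as a dissociation constant and assumes a temperature of 298.15 K; the function name is illustrative and the temperature is an assumption, not a value taken from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def relative_binding_free_energy(ki_ref_nm, ki_new_nm, temperature_k=298.15):
    """ddG = RT ln(Ki_new / Ki_ref) in kcal/mol; positive means weaker binding."""
    return R_KCAL * temperature_k * math.log(ki_new_nm / ki_ref_nm)


# Ki values quoted in the text: SYAF014 (40.8 nM) and compound 108 (240 nM).
ddg_108 = relative_binding_free_energy(40.8, 240.0)
print(f"{ddg_108:.2f} kcal/mol")
```

Under these assumptions the roughly six-fold loss in affinity of compound 108 corresponds to about 1 kcal/mol, which is of the same order as the overall MAE reported for the FEP calculations.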

Figure 25: Experimental (grey) and calculated (orange) free energies between the most potent compound for each R4 substitution (2- and 3-furyl and 3-thienyl). A * indicates values for which no experimental value could be determined (larger than the experimental cut-off), whereas a ǂ indicates that the compound was predicted to be a non-binder.

Finally, we calculated the relative preference for the active stereoisomer. The calculated preference (S → R, see Fig. 26) of ΔΔG = 6.04 ± 1.06 kcal/mol for the most potent compound 47 (SYAF030) is in line with our previous modeling hypothesis [130]. Our collaborators furthermore experimentally established that this stereoisomer was indeed the active one within this series (see Table 3).


Figure 26: Binding mode and structure of the two enantiomers of SYAF030.

Table 3: Experimental affinities of the two enantiomers of SYAF030 and its racemic mixture. Values are Ki (nM) or % at 1 μM.

Compound                hA1           hA2A          hA2B           hA3
(±)-47 (SYAF030)        15%           18%           10.2 ± 0.5     1%
(R)-47 (R-SYAF030)      10%           6%            12%            2%
(S)-47 (S-SYAF030)      8%            13%           6.30 ± 1.1     1%
ISAM-140                20%           25%           3.49 ± 0.2     2%
ZM241385                683 ± 4       1.9 ± 0.1     65.7 ± 1.7     863 ± 4
DPCPX                   2.20 ± 0.2    157 ± 2.9     73.24 ± 2.0    1722 ± 11


Trifluorinated Pyrimidine-Based A2B Antagonists

The first clinical drug containing a fluorine atom was fludrocortisone, introduced in 1954 [131], and since then a large number of fluorine-containing compounds have been added to the therapeutic arsenal. It is estimated that about 25% of drugs contain one or more fluorine atoms [132,133]. The properties of the carbon-fluorine bond and the unique nature of the fluorine atom itself are now well documented, and extensively exploited, in medicinal chemistry [132,133]. Replacement of hydrogen atoms by fluorine has been shown to exert significant effects on several structural, pharmacodynamic and pharmacokinetic parameters, leading to improved metabolic stability or optimized ligand efficiency [134]. In addition, fluorinated compounds are used in positron emission tomography (PET), and as such have become highly appreciated probes, e.g. in the case of xanthine derivatives for the A2BAR [134]. In this study, we designed, synthesized and characterized, from pharmacological and structural perspectives, a new series of non-xanthine fluorinated A2BAR antagonists.

Figure 27. Design strategy and diversity elements explored during the study.

Based on the monocyclic DHPMs and tricyclic analogues discussed in the previous chapter, we obtained two subsets of fluorinated derivatives, where fluorine moieties were introduced at various positions (see Fig. 27). Several of these ligands were both highly potent (Ki < 15 nM) and selective towards the A2BAR. The design of these molecules was based on models described in the previous section, and their proposed binding modes are depicted in Fig. 28. Based on the obtained experimental data, these models were further refined with short MD simulations, and provide an explanation for the variable effects observed upon fluorination of the parent compound.

[Structure drawings for Figure 27 not reproduced. Labels shown: Model Series I and Model Series II; Series III (Cpds 13a-h), Series IV (Cpds 14a-p), Series V (Cpds 15a-h) and Series VI (Cpds 16a-p); diversity elements R4 and R5/R3 (CH3 or CF3).]


Figure 28: Effects of trifluoromethylation on ligand binding affinities. A) binding mode of the most potent compounds (ISAM-140 in orange and SYAF-080 in teal). B) Positions of fluorination (R2/R6 and R3/R5), together with the variability at position R4.

Introduction of fluorine atoms at the ester residue (R5/R3) resulted in a more variable effect on the binding affinity of both monocyclic and tricyclic series. While in some cases trifluoromethylation is tolerated, leading to comparable (e.g., compounds 14i, 16a, 16b, 16i, 16j) or even improved binding affinities (14a, 14k) as compared to the corresponding parent compounds in the model series, the rest of the ligands showed at least a 10-fold affinity decrease. In particular, the simultaneous introduction of two trifluorinated moieties yielded drastic reductions in binding affinities, in line with the observation from the previous chapter that these substituents are accommodated in a very tight pocket. Fluorination at the R2/R6 position proved to be detrimental for binding in all cases, which we found was due to substantial changes in the physicochemical properties of the heterocyclic cores induced by fluorination. In particular, we predicted that the stronger electronegativity of the CF3 group as compared to CH3 induced a pKa shift of the nitrogen, resulting in a change in tautomerization of the tricyclic core, which was confirmed by the crystal structure of one of the tricyclic compounds. To accommodate the alternative tautomer, the sidechain of N254^6.55 needs to flip to maintain the double hydrogen bond, as illustrated in Fig. 29. This alternative rotamer has only been observed in one crystal structure of the A2AAR, in complex with the xanthine antagonist XAC [135]. Via FEP calculations, we show that this alternative orientation of the residue is mildly unfavourable, by 1.3 ± 0.3 kcal/mol. This energetic penalty could therefore explain the systematic decrease in binding affinities for the R6 trifluorinated series.
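To put the calculated rotamer penalty into perspective, a free energy difference can be translated into a fold change in affinity. The sketch below assumes the standard relation fold = exp(ΔΔG / RT) and a temperature of 298.15 K; both the function name and the temperature are assumptions made for illustration only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def fold_change_in_affinity(ddg_kcal, temperature_k=298.15):
    """Fold loss in affinity implied by an unfavourable ddG (kcal/mol)."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


# Penalty for the alternative N254 rotamer quoted in the text: 1.3 +/- 0.3 kcal/mol.
penalty, error = 1.3, 0.3
central = fold_change_in_affinity(penalty)
low = fold_change_in_affinity(penalty - error)
high = fold_change_in_affinity(penalty + error)
print(f"fold change: {central:.1f} (range {low:.1f} to {high:.1f})")
```

Under these assumptions the penalty corresponds to roughly an order of magnitude loss in affinity, with the stated uncertainty spanning approximately 5- to 15-fold.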


Finally, the previously mentioned stereospecific recognition of these compounds was confirmed via chiral HPLC, circular dichroism, diastereoselective synthesis, molecular modelling and X-ray crystallography, providing further experimental evidence for the stereospecific recognition of these series of A2BAR antagonists.

Figure 29: Binding mode of compound IIb (panel A, orange) and its fluorinated version 15b in two different tautomeric states (panels B and C). Binding mode of compound IIe (panel D, blue) and the fluorinated diastereoisomers (4S,3R)-16i (panel E) and (4S,3S)-16i (panel F).


Site-directed mutagenesis (VIII, IX)

In this chapter, I will describe the two projects that made use of the QresFEP approach. These include examples of the development of protein-ligand binding models for the orphan GPCR GPR139, and a set of mutations on the A1AR in complex with the antagonist DPCPX. In the first case, the calculations are performed on a homology model of the receptor, which is iteratively optimized via subsequent steps of docking and FEP calculations. The second case includes a direct application of the protocol on a crystal structure of the A1AR and docking of the compound via flexible-ligand alignment.

Ligand-receptor model optimization for GPR139

A total of 121 GPCRs are non-sensory orphan receptors [136] and have unknown endogenous ligands, which could represent yet untapped targets for novel treatments [137]. One such receptor is GPR139, which is a class A orphan GPCR [138]. mRNA for this receptor is predominantly expressed in the striatum, habenula and hypothalamus [139–142], and the cross-species expression of GPR139 mRNA in the striatum [139–141,143] suggests that GPR139 may play a role in locomotor activity. Indeed, activation of GPR139 with a surrogate agonist 7c (also known as JNJ-63533054) leads to decreased spontaneous locomotion activity in rats [139]. A recent study by Andersen et al. showed that GPR139 agonists protect primary dopaminergic neurons against MPP+ in vitro [144]. Based on these findings, GPR139 has been hypothesized as a potential target for the treatment of diseases with impaired movement control, e.g. Parkinson’s disease.

Furthermore, the GPR139 mRNA expression in hypothalamus and habenula suggests a role in the regulation of food consumption and/or energy expenditure [142]. L-Trp and L-Phe [139,145] activate GPR139, and therefore the receptor has been proposed as a nutrient-sensing receptor [139,145].


Figure 30: workflow applied for the generation of the protein-ligand complex models.

In support of this hypothesis, the closest homolog GPR142 is also activated by L-Trp and L-Phe [146,147], and activation of this receptor has been shown to lower blood glucose levels and increase insulin secretion in mice [146,148,149], making it a new putative target for the treatment of diabetes. Since GPR139 shares the same ligands and is expressed in the hypothalamus, it is possible that GPR139 is also involved in the pathophysiology of diabetes. Furthermore, it was recently shown that the endogenous POMC-derived peptides ACTH, α-MSH and β-MSH, known to be involved in energy homeostasis, also activate GPR139 in vitro [150]. Taken together, GPR139 has been hypothesized as a potential target for the treatment of metabolic syndromes, e.g. diabetes and eating disorders.

Figure 31: Structure of the GPR139 agonists studied herein. (a) Surrogate agonist 1a from Shi et al. [151] (b) Surrogate agonist 7c from Dvorak et al. [152,153] (c,d) Trp and Phe from Isberg et al. [150]. Coloring denotes chemical commonalities (supported by mutations herein); grey: major hydrophobic part, red: polar linkers (1a and 7c) or carboxyls (Phe and Trp), green: hydrophobic element unique for the larger 1a and 7c.

Besides the natural aromatic amino acids L-Trp and L-Phe [139,145] and the endogenous POMC-derived peptides [150], GPR139 has been reported to be activated by surrogate small molecules (e.g. 1a and 7c) [151,153–157]. To provide further insights into the binding mode of these GPR139 agonists, a series of mutations was proposed based on an initial GPR139-1a model (see Fig. 30 for the complete workflow). These were based on a commonly shared SAR of these molecules (see Fig. 31) reported earlier [156], which indicated that they were recognized by the receptor in a similar binding mode. From this initial set of mutations, three were selected for further model optimization, which was done via an iterative process of FEP calculations and MD simulations around compound 1a. Briefly, the procedure uses the calculated relative binding free energies for each mutation and compares these to the in vitro potency. If these did not match, another iteration of MD and FEP simulations was performed, until the calculated and experimental values matched. After the binding mode of 1a was established, the same analysis was performed on the final model for compound 7c, which indicated that the same residues were responsible for ligand recognition (see Table 4). Indeed, a second round of SDM confirmed that these residues were also involved in the same recognition pattern with compound 7c. Furthermore, both L-Phe and L-Trp could be accommodated in the binding site similarly, with their functional groups overlaying those of the reference compounds (see Fig. 32).
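The iterative procedure described above can be summarized as a loop that alternates FEP evaluation and MD refinement until the calculated mutational effects agree with the in vitro data. The sketch below is purely schematic: `run_fep`, `refine_with_md`, the agreement tolerance, the iteration limit and the toy numbers are all hypothetical stand-ins, since the text does not specify how agreement was judged.

```python
def optimize_binding_model(model, experimental_ddg, run_fep, refine_with_md,
                           tolerance=1.0, max_iterations=5):
    """Alternate FEP and MD refinement until all mutations agree with experiment.

    Returns (final_model, iterations_used, converged).
    """
    for iteration in range(1, max_iterations + 1):
        calculated = run_fep(model)  # dict: mutation -> calculated ddG (kcal/mol)
        mismatches = [
            mutation
            for mutation, exp_value in experimental_ddg.items()
            if abs(calculated[mutation] - exp_value) > tolerance
        ]
        if not mismatches:
            return model, iteration, True
        model = refine_with_md(model)  # hypothetical refinement step
    return model, max_iterations, False


# Toy stand-ins: the "model" is just an integer refinement counter, and the
# experimental values are placeholders, not data from the study.
toy_experiment = {"F109A": 2.0, "H187A": 1.5, "N271A": 1.0}


def toy_fep(model):
    offset = max(0.0, 3.0 - 1.5 * model)  # error shrinks with each refinement
    return {mutation: value + offset for mutation, value in toy_experiment.items()}


def toy_md(model):
    return model + 1


final_model, n_iterations, converged = optimize_binding_model(
    0, toy_experiment, toy_fep, toy_md
)
print(final_model, n_iterations, converged)
```

Passing the FEP and MD steps in as callables keeps the loop independent of any particular simulation software.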


Figure 32. a) Binding mode of 1a (blue) and 7c (yellow) and b) endogenous amino acids L-Trp (cyan) and L-Phe (magenta). Residues that showed a significant effect when mutated are colored orange. Residues with thick sticks have been mutated in silico and in vitro (F109^3x33, H187^5x43 and N271^7x39) and those with thin sticks in vitro only (W241^6x48). The latter was excluded due to the dynamic role of this residue as an activation switch in class A GPCRs [158]. Residues colored in grey showed no significant changes in potency (E108^3x32) and those in black were not expressed (R244^6x51). c) Overlay of all four studied ligands within the GPR139 binding pocket shown as a surface. All tested agonists bind a deep hydrophobic pocket and are shown to undergo hydrogen bonding with R244^6x51.

Table 4. GPR139 in silico mutant effects on 1a and 7c binding. The FEP relative binding free energies that are in agreement with in vitro data are shown in bold for each iteration.

A1AR – DPCPX recognition

The A1AR was the first AR to be characterized [72]. It plays a significant role in the regulation of cardiac, neural and renal systems [159]. Most A1AR antagonists are derivatives of the xanthine scaffold, such as those described in paper III. The reference antagonist DPCPX is no exception, and the xanthine core is decorated with alkyl substitutions on the 1 and 3 positions and a cyclic substituent on the 8 position, which is associated with A1 subtype selectivity. A key selectivity hotspot that accommodates this moiety is T270^7.35 (Met in A2AAR and A2BAR, Leu in A3AR), which was recently confirmed by two crystal structures in complex with selective xanthine derivatives [83,160]. In addition, both structures show a slightly widened area as compared to the A2AAR in the bottom of the binding pocket, which accommodates the alkylic substituents at position 1. Both substituents tightly lock the core scaffold in one conformation, as opposed to the dual binding mode observed for caffeine.⁵

DPCPX was manually docked on the A1AR crystal structure 5N2S, based on its structural homology to the co-crystallized xanthine PSB36 (see Fig. 33). The conserved double hydrogen bond with N254^6.55 (discussed in paper IV) formed the main anchoring point between the core scaffold and the receptor, similar to A2AAR.

Figure 33. (A) The average structure from MD simulations of the DPCPX-A1AR complex (blue) superimposed with the crystal structure of the same receptor in complex with PSB36 (PDB 5N2S); and (B) selected residues for in silico mutagenesis studies (see Table 1).

No affinity data is available for DPCPX for the N254A^6.55 mutant in the A1AR. However, the fact that many other ligands show severely reduced affinities for any N6.55A AR mutant (37 receptor-ligand pairs in the GPCRdb, as we described in Paper IV) indicates that this residue is essential for ligand binding, which was confirmed by the FEP calculations (see Table 4). The aromatic sidechain of F171^EL2 forms the second main anchor for ligand binding (see paper IV), and the alanine mutant consequently reduced affinity for DPCPX binding [161], which was also reproduced in our FEP calculations.

⁵ I will discuss the role of substituents in locking a scaffold in one or another conformation in the next chapter.


Notably, T277A^7.41 is the only mutation resulting in favorable binding affinities as compared to WT A1AR. This mutation has been thoroughly studied in ligand binding, is included as a thermostabilizing mutant in the crystal structure 5N2S [83], and is likely associated with active/inactive equilibria of the receptor (see also chapter XI, conformational selectivity). We correctly captured the favorable effect on ligand binding, although our initial FEP calculations overestimated this effect by 2 kcal/mol. However, the free energies calculated using the crystal structure with PDB code 5UEN, which did not contain this thermostabilizing mutation in the construct, resulted in much closer values to experiment (Table 5).

Table 5. Changes in free energies of DPCPX for different A1AR mutants. The experimental affinities were retrieved from the GPCRdb and the reference(s) are given in brackets. The binding affinity (pKD) was converted to ∆∆G following:

∆∆G (kcal/mol)
Mutation                        In vitro            In silico
F171A^EL2 [161]                 4.32 a              6.15 ± 0.69
I175A^EL2 [161]                 0.98                0.26 ± 0.39
M177A^5.37 [161]                1.06                1.84 ± 0.46
N254A^6.55                      ND b                4.14 ± 0.67
T270A^7.34 [161]                0.46                0.40 ± 0.48
T277A^7.41 [116,117,162]        −0.32 ± 0.19 c      −2.77 ± 0.55
                                                    −0.42 ± 0.54 d

a No detectable binding, the value represents the detection threshold of the experiment; b No experimental value determined in literature; c An average value and associated s.e.m. were calculated based on the reported values from literature (n = 3); d Calculations performed on the 5UEN crystal structure.
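The conversion formula referred to in the caption is not reproduced in this text. A minimal sketch, assuming the standard thermodynamic relation ΔΔG = RT ln(10) · (pKD,WT − pKD,mut) and a temperature of 298.15 K, is given below; the function name, the temperature and the example pKD values are assumptions for illustration, not values from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def ddg_from_pkd(pkd_wt, pkd_mut, temperature_k=298.15):
    """ddG = RT ln(10) (pKD_wt - pKD_mut) in kcal/mol.

    Positive values mean that the mutation weakens binding.
    """
    return R_KCAL * temperature_k * math.log(10.0) * (pkd_wt - pkd_mut)


# Hypothetical pKD values, chosen only to illustrate the sign convention.
loss = ddg_from_pkd(pkd_wt=9.0, pkd_mut=8.0)  # ten-fold weaker binding
gain = ddg_from_pkd(pkd_wt=9.0, pkd_mut=9.2)  # slightly stronger binding
print(f"{loss:.2f} {gain:.2f}")
```

With this convention one pKD unit corresponds to about 1.36 kcal/mol at 298.15 K, and a mutation that improves affinity, such as T277A in the table above, gives a negative value.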


Two sides of the same coin (X)

In this chapter, I will describe how we have used both QligFEP and QresFEP to understand the binding mechanism of a series of antagonists for the A2AAR. This receptor is associated with a number of pathologies [72,75,163], with an increasing interest in antagonists for immuno-oncology over the last few years. As a consequence, many antagonists have been developed targeting this receptor, of which a number have been co-crystallized with the A2AAR. As such, this receptor stands out as one of the best characterized receptors from the structural point of view, which together with significant amounts of SDM data [164] provides unique insights into ligand binding mechanisms. In turn, these provide valuable information for structure-based drug design (SBDD) programs of antagonist molecules [165]. Still, there has been limited success in reaching the drug market, motivating the search for novel chemotypes as A2AAR antagonists [166]. Frequently such searches are done via high-throughput screening (HTS) campaigns, in which case it is uncommon to obtain a crystal structure of the receptor-ligand complex. Instead, approximate binding modes are often inferred from SAR or SDM data [164], which can be complemented by computational models of the protein-ligand complex [167].

Figure 34: Binding mode and chemical structures of antagonists ZM241385 (A) and triazine 4b (B), both co-crystallized with the A2AAR (ribbons, corresponding to the 4EIY structure) [80] and characterized by BPM (residues labelled and depicted in grey sticks). Receptor-ligand hydrogen bonds are depicted as magenta lines.

Here we combine the qualitative mapping of these data with QresFEP, where the correlation of the FEP values with experiment is used as a filter to identify the most probable binding mode. An initial test was based on two ligands for which a structure of the protein-ligand complex was available, namely ZM241385 [80] and Triazine 4g (see Fig. 34) [168,169]. The calculated values are in excellent agreement with experiment (see Table 6; MAE = 0.41 and 0.94 kcal/mol, R2 = 0.94 and 0.66, respectively).

Table 6: Comparison between experimental and calculated relative binding free energies (ΔΔG, in kcal/mol) for A2AAR mutants.

                  ZM241385                        Triazine 4g
Mutant            Exp.         Calc.              Exp.         Calc.
I66A^2.64         0.14         0.83 ± 0.34        0.41         1.94 ± 0.34
L85A^3.33         2.45         3.30 ± 0.41        1.09         1.65 ± 0.37
L167A^5.28        0.00         0.60 ± 0.31        -0.14        -0.39 ± 0.36
M177A^5.38        0.14         -0.09 ± 0.44       -0.27        1.66 ± 0.49
N181A^5.42        1.23         1.47 ± 0.57        0.82         -0.63 ± 0.55
N253A^6.55        ≥ 5.86 c     5.81 ± 0.57        ≥ 4.36 c     5.64 ± 0.56
Y271A^7.36        1.09         0.84 ± 0.74        0.41         -0.1 ± 0.68

a Free energies are calculated from the reported binding affinities (pKD) and errors are ± 0.1 kcal/mol [170]. b Data for the mutant receptor constructs reported in [170]. c Binding affinity of the ligand to the (mutant) receptor was lower than the experimental threshold (pKD < 5 in all cases). Errors are SEM over a total of 10 replicates.
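The agreement metrics quoted for Table 6 (MAE and R2) can be computed from paired experimental and calculated values. The sketch below uses the rounded ZM241385 entries from the table and makes two assumptions that are not stated in the text: the N253A lower bound (≥ 5.86) is used at its threshold value, and R2 is taken as the squared Pearson correlation coefficient. Because it works from rounded table entries, it need not reproduce the reported statistics exactly.

```python
def mean_absolute_error(experimental, calculated):
    """Average of |calc - exp| over all paired values."""
    return sum(abs(c - e) for e, c in zip(experimental, calculated)) / len(experimental)


def pearson_r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# ZM241385 columns of Table 6 (kcal/mol), in the order the mutants are listed.
exp_zm = [0.14, 2.45, 0.00, 0.14, 1.23, 5.86, 1.09]  # N253A taken at its threshold
calc_zm = [0.83, 3.30, 0.60, -0.09, 1.47, 5.81, 0.84]

mae_zm = mean_absolute_error(exp_zm, calc_zm)
r2_zm = pearson_r_squared(exp_zm, calc_zm)
print(f"MAE = {mae_zm:.2f} kcal/mol, R2 = {r2_zm:.2f}")
```

The same two functions can be applied to alternative binding poses of one ligand, which is how the correlation with experiment serves as a filter for the most probable binding mode.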

After the performance on known binding configurations was established, we applied the approach to the chromone series, starting with the most potent compound Chromone 14 (see Fig. 35). During this optimization, two putative binding modes could explain the BPM data, which were related via a symmetry axis along the bicyclic chromone core, which would possibly allow the swapping of the substituents at the R6 and R7 position (see below). The docking scores for both poses were energetically equivalent (see Fig. 36), though some SAR data indicated that pose A might be the most favorable [93], which at the time could not be validated by X-ray crystallography [93,171].


Figure 35: Putative binding modes A (orange) and B (cyan) of Chromone 14 to the A2AAR (H-bonds in magenta). (C) Experimental and calculated changes in binding free energies for each mutation in the BP. The error bars correspond to the s.e.m. of the replica calculations for the calculated values, or are adjusted to the reported value of 0.1 pKD unit in the case of experimental data [170].

Thus, both binding poses were used as a starting point for the FEP calculations. We found pose A to have the best correlation with the experimental data, with an MAE of 0.50 kcal/mol and R2 = 0.74. Conversely, the corresponding error calculated on pose B is much higher (MAE = 1.53 kcal/mol) and the correlation is completely lost (R2 = 0.03). Based on these findings, a series of compounds was designed which included the introduction of a methyl group at R2 (see Table 7). This led to severely reduced affinities, caused by a steric clash with N253^6.55 in binding mode A. Moreover, we showed that the substituents at R6 and R7 were not interchangeable, and consequently that the two poses are not related via the symmetry axis proposed from the docking experiments.

Figure 36: (A) Dual binding mode of caffeine, as extracted from the crystal structure of the A2AAR (PDB code 5MZP). Color code is orange (binding mode A) and blue (binding mode B). (B) Modelled binding modes of chromone 4a, following the same coloring scheme as in panel A.

Thereafter, we studied the energetic differences between poses A and B directly. This can be done easily in QligFEP, as the dual topology allows a direct comparison between unrelated topologies (see paper I and III). First, we studied the binding modes of two well-known A2AAR antagonists, caffeine and theophylline. The first was shown to adopt an isoenergetic dual binding mode [83,172,173] as discussed above, see also Fig. 36A. The N7-demethylated analogue theophylline, however, forms an additional H-bond with N253^6.55. The negligible calculated free energy difference between the two poses of caffeine agrees with the experimentally determined dual binding mode. In contrast, the single binding mode observed in the crystal structure of theophylline is energetically favored by 1.6 kcal/mol. A similar energy gap was observed between the two binding poses considered for the simplest chromone in our series (4a, Fig. 36B), with pose A being 1.7 kcal/mol more favorable than pose B. Notably, this energy gap increases significantly for the highest affinity compound 4d, suggesting an optimal anchoring of the R6 acetamide and R7 n-propyl substituents in pose A. Furthermore, the observed effect of methylation in pose A is in excellent agreement with experiment, for both the caffeine → theophylline and the chromones 4a → 5a and 4d → 5d (see Table 7). In the latter case, a shift of the core scaffold was observed to accommodate the methyl group, which was in agreement with the X-ray crystallography data (see below).

Table 7: Calculated free energy difference between two alternative poses for A2AAR antagonists. Energies are in kcal/mol.

                A → B           H → CH3
Ligand          ΔΔGcalc         ΔΔGexp           ΔΔGcalc a
Caffeine        0.47 ± 0.49     -                -
Theophylline    1.59 ± 0.87     0.6 b            0.31 ± 0.32
4a              1.65 ± 0.93     1.16 ± 0.1 c     2.06 ± 0.72
4d              8.76 ± 0.81     3.68 ± 0.1 d     1.66 ± 1.09

a Calculations performed in pose A (see text). b ΔGbind (caffeine – theophylline), extracted from ChEMBL [174]. c ΔGbind (5a – 4a) and d ΔGbind (5d – 4d), from Table 2.
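To give a feel for what the A → B values in Table 7 imply, the sketch below converts a free energy difference between two poses into two-state Boltzmann populations. It is an illustration only, not part of the QligFEP workflow: the function name `pose_populations`, the temperature of 298.15 K and the reading of the tabulated value as G(pose B) − G(pose A) are assumptions made here, and the statistical uncertainties in the table are ignored.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def pose_populations(ddg_a_to_b, temperature=298.15):
    """Two-state Boltzmann populations (p_A, p_B).

    ddg_a_to_b is assumed to be G(pose B) - G(pose A) in kcal/mol,
    so a positive value favors pose A.
    """
    rt = R_KCAL_PER_MOL_K * temperature
    ratio_b_over_a = math.exp(-ddg_a_to_b / rt)
    p_a = 1.0 / (1.0 + ratio_b_over_a)
    return p_a, 1.0 - p_a


# Calculated A -> B values copied from Table 7 (kcal/mol); errors ignored.
table7_a_to_b = {"caffeine": 0.47, "theophylline": 1.59, "4a": 1.65, "4d": 8.76}

for ligand, ddg in table7_a_to_b.items():
    p_a, p_b = pose_populations(ddg)
    print(f"{ligand:>12}: pose A {100 * p_a:5.1f}%, pose B {100 * p_b:5.1f}%")
```

Under these assumptions caffeine keeps a substantial population in both poses (roughly two thirds in pose A), and its value of 0.47 ± 0.49 kcal/mol is within error of zero, whereas theophylline and the chromones are predicted to populate pose A for more than 90%.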

Finally, two chromone compounds, 4d and its 2-methyl analogue 5d, were crystallized in complex with the A2AAR-StaR2. The structures were obtained following the in meso soaking approach [71], and could be refined to a resolution of 1.92 and 2.13 Å, respectively. Overall, the structure of the A2AAR is highly similar to previously solved structures (see Fig. 37A), with an RMSD of 0.46 Å for the Cα trace as compared to the ZM241385 structure (PDB 4EIY, calculated using PyMOL [23]). The two structures show clear positive omit density at 1σ for the presence of the chromone compounds in the orthosteric binding site (Fig. 37B and C), in both cases adopting binding mode A (Fig. 37D and E). Given the good resolution of the structure, we could observe a well-defined water network at the interface between the carbonyl moiety of the chromone core and the receptor (Fig. 38D). While most water positions are comparable to waters in the A2AAR-ZM241385 structure, ligand 4d displaces a number of water molecules in the first layer. Some of these waters were previously associated with a high-energy or “unhappy” state [171], which could partially explain the high affinity of this particular compound.


Figure 37. Crystal structure (orange) and modelled coordinates (cyan) of (A) the highest affinity compound 4d and (B) the methylated derivative 5d in complex with the A2AAR. H-bond interactions are indicated in magenta.

The high affinity ligand 4d shows excellent agreement with the computational model (see Figure 38A), with an RMSD of 0.67 Å between the docked and experimental poses, and is in agreement with previous binding hypotheses for this compound [93,171]. The A2AAR-5d complex (Fig. 37D and Fig. 38B) further shows that methylation at position R2 displaces the chromone core in pose A by 1.79 Å, in accordance with observations from the FEP simulations (Table 7 and Figure 38B, RMSD = 1.24 Å between docked and experimental ligand configuration).

Figure 38: Computationally derived (teal) and experimentally determined (orange) binding modes of chromone 4d (panel A) and 5d (panel B).


Conformational selectivity (IX, XI)

Traditionally, the activation of a GPCR has been seen as a two-state pharmacological model [175], which assumes that receptors transition between two distinct conformations, i.e. the inactive and active states (see Fig. 39A). Following the ‘allosteric ternary complex model’, the equilibrium between these states can be modulated both by the intracellular signalling protein and the signalling ligands (Fig. 39B) [40]. In addition, certain mutations in the receptor can also alter its conformational stability (Fig. 39C), enhancing or reducing the receptor’s basal activity.

Our current understanding of GPCR activation has benefitted tremendously from the elucidation of GPCR structure-function relationships, largely fuelled by the rise in experimentally determined structures. In the previous chapters I discussed how receptor structures, SDM and SAR data can be integrated, via free energy computations, to provide further understanding of receptor-ligand binding. I showed that we can calculate changes in binding affinity between ligand pairs (QligFEP) or between wild-type and mutant receptors (QresFEP) to achieve this goal. In this chapter, I will show how we can extend the applicability of these methods to study receptor equilibria, and how these can be affected by modifications of either ligands (Fig. 39B) or receptor structures (Fig. 39C). This concept is generally applicable to any protein that transitions between clear end-state conformations, following an equilibrium that we can represent through thermodynamic cycles. Here, I focus on the change in conformational equilibria of the A2AAR caused either by agonist ligands with different efficacies, or by single point mutations. Taken together, these calculations provide novel insights into the structure-function determinants of activation of this receptor.


Figure 39: (A) Two-state model of GPCR activation, with the inactive structure in orange and the active structure, bound to a G protein, in blue. (B) Experimentally (horizontal legs), a given ligand L2 shifts the receptor equilibrium to the inactive state (left shift), and is a stronger antagonist than a ligand L1 (thicker arrow). (C) The receptor equilibrium can be shifted for a certain mutant construct of the receptor; in this case the mutant shifts the receptor to the inactive state (thicker arrow). In both cases, a thermodynamic cycle can be closed to link experimental values to computational results.

Ligand conformational selectivity

In our first study in the area of GPCR conformational selectivity, we aimed to predict relative ligand affinities for a receptor state (Fig. 39B), in order to explain different ligand efficacies. We explored the effect of varying substituents in two series of molecules in which the pharmacological profile is modified (i.e. antagonists that changed to agonists) upon relatively minor structural changes, in particular the addition of hydroxy substituents. It has previously been shown that these substituents on the ribose group of adenosine and NECA are key for receptor activation [176]. Thus, as a control, the conformational preference of this pair of full agonists was calculated. In this case, we used the antagonist 9-cyclopentylpurin-6-amine as a reference, a compound similar to adenosine in which the ribose is replaced by a cyclopentyl group (code CHEMBL294590, see Fig. 40). As expected, in both cases removing the -OH groups drastically reduced the relative affinity for the active state (see Fig. 40). A larger effect is observed for NECA than for adenosine, in agreement with the higher potency of the former for the A2AAR. Next, we investigated the agonistic properties of a series of 7-(prolinol-N-yl)-2-phenylaminothiazolo[5,4-d]pyrimidines, with a partial agonist profile associated with the prolinol substituent and the corresponding pyrrolidine-substituted compounds acting as neutral antagonists [177]. The efficacy of the partial agonists was modulated by decorations on R1 (see Fig. 40). In agreement with these experimental data, we found that prolinol-decorated compounds showed a conformational preference for the active state of the A2AAR, with the most potent compound in the series (10m) showing the highest relative affinity towards this active state (see Fig. 40).

Based on these encouraging results, we moved on to investigate a different series based on the non-ribose partial agonist LUF5833. In this case, hydroxy decorations on the 4-phenylpyrimidine scaffold lead to agonists with different efficacies, depending on the position of this group (i.e. p-OH being less potent than m-OH) [178]. It was earlier postulated that these compounds in some cases interact with different residues in the binding site than the full agonist CGS21680, but maintain some interactions with residues related to the stabilization of the active state of the receptor. Interestingly, the undecorated compound and its two hydroxylated counterparts show the same overall affinity (in terms of Ki), so one can hypothesize that this structural effect can be directly associated with the relative conformational affinity for one receptor state over the other. In this case, the p-OH substituted compound showed a similar agonist efficacy to the undecorated LUF5833 compound, of about 55% activation (as compared to the potent full agonist CGS21680 used as a reference in the study). However, the m-OH compound shows a significant increase in potency, and this is correctly matched by the increased preference of this compound for the active state observed in our calculations. The structural reason in terms of ligand-receptor interactions is a hydrogen bond with T88^3.36, which is well known to be involved in agonist recognition [164].
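The thermodynamic cycle of Fig. 39B, as used in this section, can be written out compactly. The notation below is introduced here for illustration and is an assumption rather than the notation of the underlying papers: R and R* denote the inactive and active receptor, L1 and L2 the two ligands of a pair, and ΔG with a superscript X the FEP-calculated free energy of transforming L1 into L2 in environment X.

```latex
\begin{align}
\Delta\Delta G_{\mathrm{bind}}^{X}(L_1 \to L_2)
  &= \Delta G_{L_1 \to L_2}^{X} - \Delta G_{L_1 \to L_2}^{\mathrm{water}},
  \qquad X \in \{R, R^{*}\} \\
\Delta\Delta G_{\mathrm{sel}}(L_1 \to L_2)
  &= \Delta\Delta G_{\mathrm{bind}}^{R^{*}} - \Delta\Delta G_{\mathrm{bind}}^{R}
   = \Delta G_{L_1 \to L_2}^{R^{*}} - \Delta G_{L_1 \to L_2}^{R}
\end{align}
```

The water leg cancels in the second line, so only the two receptor-bound transformations are needed. With this sign convention, a negative value indicates that L2 prefers the active state more than L1 does.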


Figure 40: Top, ligands studied in this chapter. Bottom: relative free energies between the agonist/antagonist pair. * refers to CHEMBL294590, ** to LUF5833

The effect of mutations on conformational selectivity

As mentioned before, agonists typically interact with specific residues in the binding site associated with receptor activation. As we discussed in paper IV, mutagenesis and modeling studies indicated that the full agonist CGS21680 engaged different residues in the binding site than the partial agonist LUF5834 [102]. Two of those mutations, T88^3.36A and S277^7.42A, showed distinctly different effects on both receptor basal activity and ligand efficacies. Whilst the T88^3.36A mutation reduced the basal activity, little effect was observed as a consequence of the S277^7.42A mutation. It is worth noting that the T88^3.36A mutation was previously identified as crucial for stabilization of the inactive state of the receptor [179]. Furthermore, the binding affinity and agonist potency of CGS21680 were severely reduced for both mutant receptors, whereas the potency of LUF5834 was slightly increased (see also Table 8). The S277^7.42A mutation furthermore increased the efficacy of LUF5834 to near full agonist levels.

Table 8: Experimental efficacy data taken from [96].

               CGS21680                     LUF5834
Construct      pEC50          Fold change   pEC50         Fold change   Efficacy a
Wildtype       7.70 ± 0.1     1             7.8 ± 0.2     1.0           41 ± 4
T88^3.36A      4.5 ± 0.0      1700          7.9 ± 0.1     0.8           N.D. b
S277^7.42A     5.6 ± 0.1      110           8.2 ± 0.2     0.4           94 ± 2

a Percentage of activation compared to CGS21680. b Could not be measured due to low activation of the receptor by CGS21680.
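As a reading aid for Table 8, the sketch below shows how a fold change in potency relates to a pair of pEC50 values, using only the definition pEC50 = −log10(EC50). The function name `fold_change` is introduced here for illustration. Recomputing from the rounded pEC50 values gives figures close to, but not identical with, the tabulated fold changes for CGS21680, which presumably derive from unrounded data; that explanation is an assumption.

```python
def fold_change(pec50_wildtype, pec50_mutant):
    """EC50(mutant) / EC50(wildtype), using pEC50 = -log10(EC50).

    Values above 1 mean the mutation reduces potency.
    """
    return 10 ** (pec50_wildtype - pec50_mutant)


# Rounded pEC50 values copied from Table 8: (wildtype, mutant).
table8 = {
    ("CGS21680", "T88A"): (7.70, 4.5),
    ("CGS21680", "S277A"): (7.70, 5.6),
    ("LUF5834", "T88A"): (7.8, 7.9),
    ("LUF5834", "S277A"): (7.8, 8.2),
}

for (ligand, mutant), (wt, mut) in table8.items():
    print(f"{ligand:>9} {mutant:>6}: {fold_change(wt, mut):8.1f}-fold")
```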

From a computational perspective, we can build different thermodynamic cycles to calculate the effect of a mutation at different stages of the GPCR activation pathway. For instance, by combining calculations of either the active or inactive state with the unfolded state (represented by a tripeptide system) we can determine the change in stability of that conformation of the receptor. If we compare calculations within the inactive and active states, we directly gain insight into the basal activity of the receptor, since the legs in the unfolded state cancel in this case. Finally, we already showed that we can calculate the effect of the mutation on ligand binding to the antagonist state, and here we calculate those effects on the active state of the receptor (see Fig. 41).


Figure 41: Thermodynamic cycles to predict protein stability (blue), basal activity (orange) and efficacy (green). The various vertical legs include FEP calculations of a protein mutation in various systems. The unfolded state (U) is represented by a tripeptide model, whereas the inactive (R) and active (R*) states are based on pseudo-apo structures of the experimentally derived inactive and active receptors. The holo state (R*-L) in this case includes the (partial) agonist under investigation.

To provide structural-energetic insights into the role of the T88^3.36A and S277^7.42A mutations in receptor activation, we calculated the energetic effects of both mutations in four models. These included free energies of protein stability in general (by taking into account the unfolded state, see Fig. 42), the free energy changes between mutant and wildtype inactive and active receptors (basal activity), and finally the role of each individual mutation on ligand binding to the active state of the receptor (see Table 9). Compared to the unfolded states, the only significant difference was found for the T88^3.36A mutation, which resulted in unfavorable free energies (3.98 ± 0.18 kcal/mol) for the active state, and was predicted to destabilize the active (but not the inactive) state of the receptor. In contrast, in all other cases the mutations showed no energetic effect on stability (see Table 9).

Figure 42: The stability of a given receptor state can be calculated by taking into consideration the unfolded state. Note that in this case, the two unfolded legs will cancel and we can calculate the effect on basal activity directly.


Table 9: Effects of the two mutations on the unfolded (U), inactive (R) and active (R*) states of the receptor. By nature of the thermodynamic cycles used, the unfolded states will cancel and we can calculate the effect on basal activity by taking the difference between inactive and active states (ΔΔG_R*→R) directly.

Mutation       ΔΔG_U            ΔΔG_R            ΔΔG_R*           ΔΔΔG_R*→R
T88^3.36A      -2.26 ± 0.09     -2.64 ± 0.29     1.34 ± 0.16      3.98 ± 0.35
S277^7.42A     -7.50 ± 0.20     -7.37 ± 0.59     -7.34 ± 0.06     0.03 ± 0.67
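The cycle closures behind Table 9 amount to simple differences between the vertical legs. The sketch below recomputes them from the tabulated mean values; the dictionary layout and the function name `cycle_differences` are illustrative assumptions, values are taken to be in kcal/mol, and uncertainties are left out because the text does not state how they were propagated.

```python
# Mean free energy changes upon mutation, copied from Table 9.
table9 = {
    "T88A": {"U": -2.26, "R": -2.64, "R*": 1.34},
    "S277A": {"U": -7.50, "R": -7.37, "R*": -7.34},
}


def cycle_differences(legs):
    """Close the cycles of Fig. 41 for one mutation.

    Stability compares a folded state with the unfolded reference;
    the basal activity term compares the active with the inactive state,
    so the unfolded leg drops out.
    """
    return {
        "stability_R": round(legs["R"] - legs["U"], 2),
        "stability_R*": round(legs["R*"] - legs["U"], 2),
        "basal_activity": round(legs["R*"] - legs["R"], 2),
    }


for mutation, legs in table9.items():
    print(mutation, cycle_differences(legs))
```

The basal activity terms recover the last column of Table 9 (3.98 and 0.03). A positive value means that the mutation disfavors the active state relative to the inactive state.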

Next, we analyzed the effect on partial agonist binding for these two mutations. By combining the effects of the same mutation on conformational stability (see Table 9), we can estimate how much the ability of the ligand to bind the active state of the receptor is affected (see Fig. 43). The effects are more pronounced than in the previous analysis of receptor conformational stability, and in all cases indicate a significant difference in ligand efficacy between wild-type and mutant receptors, though each ligand is clearly affected in a different way: the effect on the efficacy of CGS21680 is unfavorable in both cases, whereas the mutations favorably influence the predicted efficacy of LUF5834 (see Table 10).

Table 10: Calculated effects between active and inactive receptor states (ΔΔG_R*→R), and on the relative affinity of each ligand for the mutant (active) receptor (ΔΔG_R*-L→R*). Combining these values gives the difference in efficacy of a ligand between wildtype and mutant receptors (ΔΔG_efficacy).

                                CGS21680                           LUF5834
Mutant         ΔΔG_R*→R         ΔΔG_R*-L→R*     ΔΔG_efficacy       ΔΔG_R*-L→R*      ΔΔG_efficacy
T88^3.36A      3.98 ± 0.35      2.17 ± 0.23     5.77 ± 0.40        -4.21 ± 0.26     -0.61 ± 0.40
S277^7.42A     0.03 ± 0.67      1.72 ± 0.31     1.88 ± 0.53        -1.97 ± 0.30     -1.81 ± 0.60

However, the effect of the T88^3.36A mutation on the efficacy of LUF5834 becomes negligible, since the increase in relative affinity for the mutant’s active state is counterbalanced by a decrease in the relative stability (i.e. the population) of the active state due to the same mutation. The experimental effect of this mutation on the efficacy of this ligand could not be determined, due to the low activation levels of the reference ligand CGS21680 for this mutant (see below). Conversely, the S277^7.42A mutation did not affect the preference for either conformational state, and consequently the favorable effect on binding of LUF5834 predicted for this mutant receptor explains the experimental observation of an increased efficacy of this ligand upon receptor mutation. Notably, the unfavorable effect of the T88^3.36A mutation on both the relative stability of the active receptor and the binding of CGS21680 to the active conformation yields large detrimental effects on the ability of this ligand to activate the receptor, which is in perfect agreement with the observed experimental effect of a more than 10^3-fold reduction in potency (Table 8).
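A short numerical check shows how the two contributions discussed above combine. The tabulated ΔΔG_efficacy values of Table 10 are reproduced to two decimals when the ligand term (ΔΔG_R*-L→R*) is added to the stability of the active state relative to the unfolded state (ΔΔG_R* − ΔΔG_U from Table 9). That this is the combination used to build the table is an inference from the numbers, not something stated in the text, and all names below are illustrative.

```python
# Mean values copied from Tables 9 and 10 (taken to be in kcal/mol).
ddg_unfolded = {"T88A": -2.26, "S277A": -7.50}
ddg_active = {"T88A": 1.34, "S277A": -7.34}
ddg_ligand = {  # effect of the mutation on ligand binding to R*
    ("T88A", "CGS21680"): 2.17,
    ("T88A", "LUF5834"): -4.21,
    ("S277A", "CGS21680"): 1.72,
    ("S277A", "LUF5834"): -1.97,
}
tabulated_efficacy = {
    ("T88A", "CGS21680"): 5.77,
    ("T88A", "LUF5834"): -0.61,
    ("S277A", "CGS21680"): 1.88,
    ("S277A", "LUF5834"): -1.81,
}


def efficacy_shift(mutant, ligand):
    """Active-state stability term plus ligand binding term (assumed form)."""
    active_state_term = ddg_active[mutant] - ddg_unfolded[mutant]
    return round(active_state_term + ddg_ligand[(mutant, ligand)], 2)


for key, tabulated in tabulated_efficacy.items():
    print(key, "recomputed:", efficacy_shift(*key), "tabulated:", tabulated)
```

For T88^3.36A and LUF5834, the favorable ligand term (-4.21) is largely offset by the destabilized active state, which is the compensation described in the paragraph above.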

Figure 43: By combining information on the distribution of inactive and active states of the receptor with the effects on ligand binding, the total change in ligand efficacy can be calculated.

The results of this chapter thus support the presented conceptually new thermodynamic cycles, and show how FEP calculations can yield insights into receptor stability and activation. Both basal activity and ligand-associated signaling can be understood by considering end-point conformations, in analogy with the pharmacological models that relate end-point experimental measurements. Our approach combines ligand perturbations, comparing agonists and antagonists differing in chemical groups, with point mutations examined in different states (i.e. folded, inactive, active and ligand-bound), to examine receptor stability and its modulation by ligand binding. This approach could be useful in the design of conformationally selective ligands with tailored pharmacological profiles, and provide new insights into receptor activation.


Conclusions

The family of GPCRs has a large therapeutic potential, and our insights into the determinants of GPCR binding have increased significantly as a consequence of a near exponential growth in the availability of experimentally determined structures. In addition, a substantial amount of data is nowadays available through public databases. These data provide us with excellent starting points for the characterization and identification of (small-molecule) modulators of this receptor family via computational approaches. In addition, the associated methods have benefited from significant efforts in software and algorithm development, alongside increases in computational power. Examples of the development of such software are given in papers I-III, which report our QligFEP and QresFEP approaches. Both methods have been tested extensively on various benchmark sets.

In addition, these approaches have been used to understand protein-ligand binding mechanisms for a prototypical GPCR family, the adenosine receptors, which was introduced in paper IV. The methods presented here are based on the design of thermodynamic cycles for different goals: to predict binding free energy differences between ligand pairs (papers V-VII and X), or to estimate the effect of mutations on the binding of one ligand (papers VIII-X). Finally, a new application of thermodynamic cycles is presented to understand GPCR pharmacology, and to elucidate the preference of a compound for a specific activation state of the receptor (conformational selectivity, papers IX and XI).

The first protein-ligand binding case study (paper V) included the identification of a series of new A3AR antagonists. Here, the role of a single nitrogen substitution between pyridine and pyrimidine-based scaffolds was examined, and related to the role of water molecules in the binding site. The approach combined unbiased MD with FEP simulations around three compound pairs where significantly altered affinities were observed. Papers VI and VII describe the optimization of A2BAR antagonists, where our calculations assisted the synthetic chemistry of our collaborators. In paper VI, more than 200 compounds were synthesized and used to refine an in-house protein-ligand binding model. This included more than 100 FEP simulations using QligFEP, which provided clear evidence for a conserved binding mode of this series of antagonists. Further biological experiments additionally point to promising anti-carcinogenic properties of these compounds. In paper VII, the introduction of fluorine-containing substituents was studied on the monocyclic compounds described in paper VI, as well as on a related series of tricyclic compounds previously reported. Here, the role of the fluorine moieties on the pKa of nitrogen atoms in the heterocycle was analyzed, and related to changes in binding free energies for these series.

The effect of single-point mutations on ligand binding free energies was examined with QresFEP in a variety of cases. In paper VIII, this approach was used to generate and further optimize a binding hypothesis for surrogate agonists of the orphan GPR139, a receptor without an experimental crystal structure. An initial series of experimental mutations indicated several residues crucial for the binding of one of these reference ligands. Our FEP calculations revealed the binding mode, and indicated that this was indeed a common mechanism for other ligands extracted from the literature. These findings were later confirmed by further SDM experiments, pointing to a common binding mechanism for these compounds. In paper IX, we examined a series of mutations affecting the affinity of the A1AR antagonist DPCPX, on the basis of a recently published crystal structure of the A1AR in complex with the closely related compound PSB36. We determined that a number of common residues in AR recognition (as reviewed in paper IV) were also involved in the recognition of this ligand.

Paper X provides an excellent example of the complementarity of QligFEP and QresFEP to characterize ligand binding modes. A series of chromone antagonists for the A2AAR was explored, which proved challenging due to the potential dual binding mode of the core scaffold. By examining the SDM data with QresFEP, the most probable orientation was identified. To confirm this binding mode, additional compounds were designed and analyzed with QligFEP, to explore the underlying SAR around this scaffold. The experimental SAR obtained supported the binding mode hypothesis, which was finally confirmed by X-ray crystallography of two compounds in the series.

Finally, in papers IX and XI, we explored the applicability of FEP to elucidate ligand pharmacological profiles. In molecular terms, this translated into the determination of ligand preferences for either the inactive or active conformation of the A2AAR. We initially showed that conformational selectivity could be calculated between compound pairs, e.g. between an agonist and an antagonist. We also showed how thermodynamic cycles could be used to interpret the effect of point mutations on the receptor’s basal activity by comparing the active and inactive conformations of wildtype and mutant receptors. Moreover, this approach could be expanded to explain how such mutations could affect the pharmacological profiles of full and partial agonists. The encouraging correlation found between our calculations and the experiments shows the applicability of this new FEP approach to examine these pharmacological problems.

Taken together, in this thesis I described two new FEP methods, QligFEP and QresFEP. These methods have been applied to solve questions related to ligand binding, site-directed mutagenesis and conformational selectivity. As a result, we provided a rational understanding of the mechanisms of ligand-receptor modulation and assisted in the design of new ligands with interesting properties for the AR family. However, these methods are broadly applicable to many proteins and receptors, and this work provides an initial exploration thereof.


Popular science summary

G protein-coupled receptors (GPCRs) are membrane proteins that convert the signals of extracellular ligands, such as hormones, neurotransmitters and metabolites, into an appropriate cellular response. This response is mediated by intracellular G proteins, a family of proteins consisting of three subunits: α, β (both of which interact with the GPCR) and γ. Another signalling pathway involves the recruitment of β-arrestin. GPCRs are abundant in human physiology, with over 800 genes encoding five GPCR classes. Approximately 34% of the marketed drugs target a GPCR. GPCRs share the same topology and undergo conformational changes upon activation to accommodate the binding of the intracellular signalling proteins. Although this activation mechanism is relatively conserved, the ligand binding site varies greatly between receptor families. Over the past decade, our insights into the structural determinants of ligand binding and receptor activation have increased enormously. Consequently, the available structures now cover many GPCRs. 3D structures of GPCR-ligand complexes provide crucial insights into the binding mechanism of a compound. However, a majority of these receptors still lack a structure, and given how a typical drug design project is set up, it is unlikely that such 3D structures can be generated for all compounds under investigation. Protein-ligand complexes can then be predicted by calculating the potential binding mode of the ligand, for example via docking algorithms. Although these algorithms are quite efficient at generating possible protein-ligand complexes, they usually fail to calculate the actual binding affinity of the ligand accurately. It is therefore advisable to further validate the ligand-receptor model using experimental data, e.g. by mapping known (experimental) information onto the receptor-ligand complex. Such experimental data include, among others, structure-affinity relationships within a compound series, or mutagenesis of putative amino acids at the binding positions. To gain further insights into binding determinants, these data can be complemented with rigorous energy calculations, such as the free energy perturbation (FEP) method. Here, the calculated (relative) free energies can then be used to correlate computationally determined binding modes to the experimental readout.

This thesis discusses the development and application of computational methods to characterize and understand ligand binding. Workflows to calculate relative free energies of ligand binding to wild-type (QligFEP) and mutant (QresFEP) receptors are presented. These protocols were applied to characterize ligand binding to the prototypical GPCR family of adenosine receptors, and a comprehensive overview of these receptors from a structural and mechanistic perspective is given. The family of adenosine receptors consists of four subtypes, namely A1, A2A, A2B and A3. These receptors mediate many processes in the body; in the central nervous system they exert depressant, anticonvulsant and sleep-promoting effects. They are furthermore associated with antidiuretic, negative inotropic, negative chronotropic, anti-inflammatory, immunosuppressive and angiogenic effects. In addition, they have pathophysiological effects in cardiovascular and neurodegenerative diseases, as well as in cancer growth and immune response. Although interest in this receptor family has traditionally been high, the ability to bring effective drugs to the market has been relatively limited. To date, the FDA has approved the A2A agonist Regadenoson (a coronary vasodilator used in cardiac imaging) and the antagonist Istradefylline (for the treatment of Parkinson's disease in combination with Levodopa), which was first approved in Japan. On the other hand, the enormous interest in this receptor family has resulted in a substantial amount of experimental data. These include a large number of site-directed mutagenesis studies (over 2500 data points have been collected in this thesis). In addition, a large amount of data is collected in public databases. Advances in membrane protein engineering and crystallography have led to a sharp increase in experimental GPCR structures, and the adenosine receptor family is one of the best characterized families from a structural point of view.

QligFEP was used in a case study on protein-ligand binding that included the identification of a series of new A3 adenosine receptor antagonists. Here, the role of a single nitrogen substitution between pyridine- and pyrimidine-based scaffolds was analyzed. The influence of water molecules on the binding mode was studied using molecular dynamics (MD). In addition, FEP was used to analyze three matched molecular pairs of compounds that showed significantly different binding free energies. Two other chapters describe (part of) a large-scale optimization of A2B adenosine receptor antagonists. In total, more than 200 compounds were synthesized and used to refine an in-house protein binding model. This included extensive FEP simulations (> 100) using QligFEP, which provided further evidence for a conserved binding mode for this series of antagonists. Further biological experiments additionally point to anti-carcinogenic properties of these compounds. In addition, the introduction of fluorine-containing substituents was studied on the monocyclic compounds, as well as on previously reported tricyclic compounds. Here, the role of the fluorine groups on the pKa of nitrogen atoms in the heterocycle was analyzed, together with how this relates to changes in binding free energy for these series.

Next, the application of QresFEP to characterize the free energies of binding of a ligand to wild-type and mutant receptors is shown. This was used to generate and optimize a protein-ligand binding hypothesis for GPR139, for which no crystal structure is available. An initial series of mutants indicated that a few amino acids were crucial for the binding of one of the available reference ligands, which FEP showed to be a common mechanism for other ligands known from the literature. These findings were later confirmed by additional mutagenesis experiments, which indicated a common binding mechanism for these compounds. Subsequently, a series of mutations was carried out on a crystal structure of the A1AR and a model of the DPCPX-A1 adenosine receptor complex generated by docking. Here, we analyzed whether frequently occurring amino acids are involved in ligand recognition for this particular ligand.

We then combined QligFEP and QresFEP to characterize the binding mode of a series of chromone antagonists for the A2A adenosine receptor. An initial investigation pointed to a potential dual binding mode, and by correlating site-directed mutagenesis data to FEP the most probable orientation was identified. To confirm this binding mode, additional compounds were generated to explore the underlying structure-activity relationships (SAR), which again pointed to a single binding mode. This binding mode was finally confirmed by X-ray crystallography, which provided experimental binding positions for two of the compounds in the series.

Finally, we investigated the usefulness of FEP in determining the preference of a ligand for an inactive or active conformation of A2A adenosine receptors (conformational selectivity). Here, the relative preference for a given receptor state can be calculated for a compound pair, e.g. between an agonist and an antagonist. The preference of the agonist should then be highest for the active state of the receptor. In addition, we linked mutagenesis data both to the basal activity (by connecting the active and inactive wild-type and mutant states in the thermodynamic cycle) and to the interaction of full and partial agonists with specific amino acids in the binding site.
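The thermodynamic cycle for a compound pair can be sketched as follows, assuming the same ligand transformation (e.g. antagonist to agonist) is simulated once in the active-state receptor and once in the inactive-state receptor. The function name, the sign convention and the numbers are illustrative assumptions, not results from this work.

```python
def conformational_selectivity(dg_active, dg_inactive):
    """Relative state preference (kcal/mol) for a ligand pair A -> B.

    dg_active:   free energy of transforming A into B in the active receptor
    dg_inactive: free energy of the same transformation in the inactive receptor
    A negative result means B favors the active state more than A does.
    """
    return dg_active - dg_inactive


# Hypothetical FEP results for an antagonist -> agonist transformation
ddg = conformational_selectivity(dg_active=-3.0, dg_inactive=-1.0)
prefers_active = ddg < 0
```

With this convention, an agonist that stabilizes the active state relative to an antagonist gives a negative value.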

Acknowledgements

It is not easy to capture five years of scientific collaboration and friendship in approximately two pages (correction: three pages), and although these words are written sincerely, they will almost certainly fail miserably in describing my appreciation to everyone who helped see this work come to light.

First of all, to Hugo, friend, company co-founder and not to forget supervisor of all you have read in this thesis. We have shared not only our love and appreciation for science, but plenty a beer, story and occasional jam (with mixed success). To Johan, who gave the final push to join the group at a very cold (yes, it was once cold in Sweden!) julbord, thanks for all the inspirational late-night (sometimes scientific) discussions. You both have taught me all I know, but definitely not all you know.

A sincere thanks to all the collaborators involved in this work. To the team in Leiden, for giving me an office far away from office. Gerard, I appreciate all the time you have spent being my ‘unofficial’ third supervisor, and for being a great mentor. Ad, Daan and Laura, thanks for all the help and insightful discussions. Bart, thanks for sending me off to Uppsala, for the scientific hangout chats, and of course my first scientific paper. Lindsey, thanks for not kicking my ass and for the great times we had in Leiden (and in Uppsala, albeit brief). Brandon and Olivier, the office is not the same without the both of you, and I mean not only acoustically. Yes, I am also looking at you Hein, thanks for making the new building one ‘totale escalatie’. Marina, we have only been colleagues briefly, but quality over quantity (exception for beers at science club). Speaking of which, a big thanks to all of MedChem (sorry I mean DDS4), for making science club and coffee breaks great (again and again and again). A special thanks to Xuesong, for all the amazing work you have done for our projects.

The Santiago team, one cannot wish for a more amazing and talented bunch of chemists. Eddy, thanks for being so extremely nice, and your patience with my (sometimes very wrong, and expensive) synthesis proposals. Jhonny, thanks for all the work you have done, and for unknowingly serving as my go-to FEP example (Jhilly for the win! I will explain later). To María and Ana, for the papers we have together, but more importantly the fun and drinks we shared at the GLISTEN and ERNEST conferences. A big thanks to all the others in the ComBioMed group.

Geir, Tor Arne and Bjørn, thanks so much for the scientific collaborations. Not to forget the skiing trips (let’s forget about the actual skiing) and for showing me around beautiful Tromsø.

To the team in Copenhagen. David for being such a kind host during my STSM stays, your helpful insights and support in writing my first grant application. To Anne Cathrine and Mohamed, for having me on your (and my first ‘first author’) paper. To Alex, Christian, Albert, Kasper, Tsonko, Stefan et al. for the amazingly fun times during my (short) stays over there.

To Zuzana and Chris, not only for the nice scientific collaboration, but for hosting me in Vienna, showing me around and allowing me to enjoy a very nice defense party. Also, to Jan and all the others for being partners in crime in said party (and not to forget my first days as a PhD student, on a boat cruise (‘summer school’) in Croatia!).

Finally, to the Heptares team, Jon, Rob, Chris and all the other people involved in the paper. If you had told me at the beginning of my PhD that there would be crystal structures involved in one of the projects, I would probably not have believed you (I still barely do).

Of course, work is not work without all the awesome (present and past) group members. Jaka, we have spent so many hours together discussing who knows what, also thanks to Nisa for joining us on many of those occasions. Silvana, thanks for the collaboration, pizza, drinks and for the football evenings, Forza Roma! My former roommates: Masoud, for being such an inspiration in the first year of my PhD and beyond, and Christoffer, who first inspired me to use dual topology for ligands, but more importantly always was there for a nice chat and a coffee (or beer). Mauricio, you have been such an amazing help, your kindness and patience are unmatched. Thanks for the great times in Stockholm. Yasmin and Jessica, you were always so nice to me, we worked together far too briefly. Sudarsan, ever the gentleman, your kindness and hospitality are an inspiration. Ana, for all the social get-togethers and showing me around Porto. Miha, for introducing me to poker, great talks, good beer and even better company. Florian, from awesome student to even better colleague, I wish you all the best in the coming years of your PhD. Of course, to all the master students that worked with me, from whom I have learned so much. Amber, you were such a great student, I was lucky you initiated my training as supervisor. Belma, we have shared many laughs, and I am glad we became friends; even though there is a significant distance separating us now, I am sure we will meet again. Gabriel, hard-working and ever so friendly, I am sure you will succeed wherever you might go. Katherin, your intelligence is only surpassed by your work ethic and collaborative spirit, I am sure you will go on to inspire many more. To Katarina for being an excellent running partner, and for the nice talks we had. Thanks also to the three of you for helping out with the Swedish translation. Martijn, I was so flattered when you came asking if you could do an internship with me, I hope I could live up to your expectations. I really enjoyed our talks during and outside of office hours. Alejandro, I was so impressed by how you dived into the project and made it your own, hats off to you. Finally, Lucien, thanks for all the hard work, great scientific insights, programming, and lest we forget the memes and football.

To all the people in and around BMC that made the Swedish winters more bearable and the summers more enjoyable. Showgy, tiger, princess, I am so glad I met you, thanks for all the fun and cheerfulness in and outside of BMC. Emil, thanks for the beer clubs and the sharing of songs and guitars. Javier, for your amazing laugh and friendliness. Katrin, I really enjoyed our hikes and the hospitality of your home. Luka, who coincidentally also went on a cruise in Croatia. Laura, if only I was a bit more social at the beginning of my PhD, we could’ve had so much more fun. Thanks for adding a bit of Mexican flavor to my life! Nima, you crazy mofo, thanks for allowing me to be as weird as you. Gabriela, crazy partying mixed with good conversations, a rare cocktail. Lastly, Antonio, dude we owe Belma so many beers for our formal introduction. Thanks for being a true friend, our record listening sessions, sharing good wines, conversations, coffee and the ‘occasional’ ‘few’ ‘beers’.

I would not be who I am now without my lifelong friends. David, Boris, Mink and Frank, oude garde super mooi (de rest van het lied sla ik even over). A special thanks to Boris for helping out with the cover of this book. To all members of ‘de Bazenclub’, Geert, Reza, Steven, Rebecca and Rianne, luckily our friendship has far exceeded the net ECTS collected in our first year, though the competition with hours spent in Jan de Winter is still ongoing. Diek, we have spent so many moments together, I don’t know where to start: thanks for being there always, and that epic hike through Israel (or Jordan) is still on my list ;-). I also owe you for introducing me to Hannah, Koen and Kate, may there be many more New Year’s Eve dinners together.

Finally, the most important people in my life, family. My two brothers Ben and Chiel, everything they say about younger brothers being annoying is completely based on fiction. My parents in law, Ton and Tineke, who have helped and supported me for so many years now. To Thomas, with whom I can always ‘even een Uyltje knappen’ and his fiancée Charlotte, who tries to teach me wine. To my parents, without whom none of this ever would’ve happened. Thanks so much for always being there for me, and providing me with all that I could ever need. Paps, thanks for all the creative inspiration, I don’t think there are many sons lucky enough to play in a band together with their dads. I have learned so much from you. Mams, the best mom in the world, and a great boss for that matter, you helped me through the hardest moments, and were there for all the best ones.

Finally, to my beautiful wife, Florence, you were so supportive and patient during these hard years apart. You are the love of my life, my inspiration, muse, my everything. I am so glad I met you all these years ago, and I look forward to the rest of our lives together.

Bibliography

1 Rosenbaum, D.M. et al. The structure and function of G-protein-coupled receptors. , Nature, 459. 20-May-(2009) , 356–363

2 Fredriksson, R. et al. (2003) The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol. Pharmacol. 63, 1256–1272

3 Hauser, A.S. et al. (2017) Trends in GPCR drug discovery: new agents, targets and indications. Nat. Rev. Drug Discov. 16, 829–842

4 Wang, J. et al. (2006) Automatic atom type and bond type perception in molecular mechanical calculations. J. Mol. Graph. Model. 25, 247–260

5 Vanommeslaeghe, K. et al. (2010) CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 31, 671–690

6 Jorgensen, W.L. et al. (1996) Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J. Am. Chem. Soc. 118, 11225–11236

7 Zwanzig, R. et al. (1992) Levinthal’s paradox. Proc. Nail. Acad. Sci. USA 89, 20–22

8 Hess, B. et al. (2008) GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4, 435–447

9 Marelius, J. et al. (1998) Q: a molecular dynamics program for free energy calculations and empirical valence bond simulations in biomolecular systems. J. Mol. Graph. Model. 16, 213–225

10 Bauer, P. et al. (2018) Q6: A comprehensive toolkit for empirical valence bond and related free energy calculations. SoftwareX 7, 388–395

11 King, G. and Warshel, A. (1989) A surface constrained all-atom solvent model for effective simulations of polar solutions. J. Chem. Phys. 91, 3647

12 Essex, J.W. and Jorgensen, W.L. (1995) An empirical boundary potential for water droplet simulations. J. Comput. Chem. 16, 951–972

13 Darden, T. et al. (1993) Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 98, 10089–10092

14 Lee, F.S. and Warshel, A. (1992) A local reaction field method for fast evaluation of long-range electrostatic interactions in molecular simulations. J. Chem. Phys. 97, 3100

15 Ryckaert, J.-P.J. et al. (1977) Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327–341

16 Hopkins, C.W. et al. (2015) Long-time-step molecular dynamics through hydrogen mass repartitioning. J. Chem. Theory Comput. 11, 1864–1874

Page 92: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

92

17 Fiser, A. and Šali, A. MODELLER: Generation and Refinement of Homology-Based Protein Structure Models. , Methods in Enzymology, 374. 01-Jan-(2003) , Academic Press, 461–491

18 Kufareva, I. et al. (2011) Status of GPCR modeling and docking as reflected by community-wide GPCR Dock 2010 assessment. Structure 19, 1108–26

19 Esguerra, M. et al. (2016) GPCR-ModSim: A comprehensive web based solution for modeling G-protein coupled receptors. Nucleic Acids Res. 44, W455–W462

20 Rodríguez, D. et al. (2011) Molecular Dynamics Simulations Reveal Insights into Key Structural Elements of Adenosine Receptors. Biochemistry 50, 4194–4208

21 Halgren, T. a et al. (2004) Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J. Med. Chem. 47, 1750–9

22 Schrödinger Release 2015-4. . Maestro, Schrödinger, LLC, New York, NY, 2015, New York, NY

23 The PyMOL Molecular Graphics System, Version 1.4 Schrödinger, LLC. . 24 Scior, T. et al. (2012) Recognizing pitfalls in virtual screening: a critical

review. J. Chem. Inf. Model. 52, 867–81 25 Genheden, S. and Ryde, U. The MM/PBSA and MM/GBSA methods to

estimate ligand-binding affinities. , Expert Opinion on Drug Discovery, 10. 01-May-(2015) , Informa Healthcare, 449–461

26 Gutiérrez-De-Terán, H. and Åqvist, J. (2012) Linear interaction energy: Method and applications in drug design. Methods Mol. Biol. 819, 305–323

27 Dror, R.O. et al. (2013) Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs. Nature 503, 295–299

28 Deng, Y. and Roux, B. (2009) Computations of standard binding free energies with molecular dynamics simulations. J. Phys. Chem. B 113, 2234–46

29 Incerti, M. et al. (2017) Metadynamics for perspective drug design: Computationally driven synthesis of new protein-protein interaction inhibitors targeting the EphA2 receptor. J. Med. Chem. 60, 787–796

30 Zwanzig, R.W. (1954) High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys. 22, 1420

31 Bennett, C.H. (1976) Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys. 22, 245–268

32 Lefkowitz, R.J. (2000) The superfamily of heptahelical receptors. Nat. Cell Biol. 2, E133–E136

33 Dixon, R.A.F. et al. (1986) Cloning of the gene and cDNA for mammalian β-adrenergic receptor and homology with rhodopsin. Nature 321, 75–79

34 Dohlman, H.G. et al. (1991) Model Systems for the Study of Seven-Transmembrane-Segment Receptors. Annu. Rev. Biochem. 60, 653–688

35 Birnbaumer, L. et al. (1971) The glucagon-sensitive adenyl cyclase system in plasma membranes of rat liver. II. Comparison between glucagon- and fluoride-stimulated activities. J. Biol. Chem. 246, 1857–1860

36 Gilman, A.G. (1987) G Proteins: Transducers of Receptor-Generated Signals. Annu. Rev. Biochem. 56, 615–649

Page 93: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

93

37 Farahbakhsh, Z.T. et al. (1995) Mapping Light-Dependent Structural Changes in the Cytoplasmic Loop Connecting Helices C and D in Rhodopsin: A Site-Directed Spin Labeling Study. Biochemistry 34, 8812–8819

38 Ballesteros, J.A. et al. (2001) Activation of the β2-Adrenergic Receptor Involves Disruption of an Ionic Lock between the Cytoplasmic Ends of Transmembrane Segments 3 and 6. J. Biol. Chem. 276, 29171–29177

39 Rasmussen, S.G.F. et al. (2011) Crystal structure of the β2 adrenergic receptor–Gs protein complex. Nature 477, 549–555

40 Lefkowitz, R.J. et al. (1993) Constitutive activity of receptors coupled to guanine nucleotide regulatory proteins. Trends Pharmacol. Sci. 14, 303–307

41 Hauser, A.S. et al. (2018) Pharmacogenomics of GPCR Drug Targets. Cell 172, 41-54.e19

42 Staus, D.P. et al. (2016) Allosteric nanobodies reveal the dynamic range and diverse mechanisms of G-protein-coupled receptor activation. Nature 535, 448–452

43 Zhu, J. et al. (2000) Inverse agonism and neutral antagonism at a constitutively active alpha-1a adrenoceptor. Br. J. Pharmacol. 131, 546–552

44 Tembre, B.L. and Mc Cammon, J.A. (1984) Ligand-receptor interactions. Comput. Chem. 8, 281–283

45 Brandsdal, B.O. et al. (2003) Free energy calculations and ligand binding. Adv. Protein Chem. 66, 123–158

46 Jorgensen, W.L. (2004) The Many Roles of Computation in Drug Discovery. Science (80-. ). 303, 1813–1818

47 Pohorille, A. et al. (2010) Good practices in free-energy calculations. J. Phys. Chem. B 114, 10235–10253

48 Polishchuk, P.G. et al. (2013) Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput. Aided. Mol. Des. 27, 675–679

49 Reymond, J.L. (2015) The Chemical Space Project. Acc. Chem. Res. 48, 722–730 50 Wang, L. et al. (2015) Accurate and Reliable Prediction of Relative Ligand

Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 137, 2695–2703

51 Loeffler, H.H. et al. (2015) FESetup: Automating Setup for Alchemical Free Energy Simulations. J. Chem. Inf. Model. 55, 2485–90

52 Liu, S. et al. (2013) Lead optimization mapper: Automating free energy calculations for lead optimization. J. Comput. Aided. Mol. Des. 27, 755–770

53 Homeyer, N. and Gohlke, H. (2013) FEW: A workflow tool for free energy calculations of ligand binding. J. Comput. Chem. 34, 965–973

54 Christ, C.D. and Fox, T. (2014) Accuracy assessment and automation of free energy calculations for drug design. J. Chem. Inf. Model. 54, 108–120

55 Steinbrecher, T. et al. (2017) Predicting the Effect of Amino Acid Single-Point Mutations on Protein Stability—Large-Scale Validation of MD-Based Relative Free Energy Calculations. J. Mol. Biol. 429, 948–963

56 Wang, L. et al. (2017) Accurate modeling of scaffold hopping transformations in drug discovery. J. Chem. Theory Comput. 13, 42–54

57 Lenselink, E.B. et al. (2016) Predicting Binding Affinities for GPCR Ligands Using Free-Energy Perturbation. ACS Omega 1, 293–304

Page 94: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

94

58 Gapsys, V. et al. (2016) Accurate and Rigorous Prediction of the Changes in Protein Free Energies in a Large-Scale Mutation Scan. Angew. Chemie 128, 7490–7494

59 Jespers, W. et al. (2019) QresFEP: An Automated Protocol for Free Energy Calculations of Protein Mutations in Q. J. Chem. Theory Comput. 15, 5461–5473

60 Keränen, H. et al. (2015) Free energy calculations of A 2A adenosine receptor mutation effects on agonist binding. Chem. Commun. 51, 3522–3525

61 Jespers, W. et al. (2017) Structure-Based Design of Potent and Selective Ligands at the Four Adenosine Receptors. Molecules 22, 1–17

62 Keränen, H. et al. (2014) Structural and energetic effects of A2A adenosine receptor mutations on agonist and antagonist binding. PLoS One 9, e108492

63 Boukharta, L. et al. (2014) Computational prediction of alanine scanning and ligand binding energetics in G-protein coupled receptors. PLoS Comput. Biol. 10, e1003585

64 Xu, B. et al. (2018) Elucidation of the Binding Mode of the Carboxyterminal Region of Peptide YY to the Human Y2 Receptor. Mol. Pharmacol. 93, 323–334

65 Nøhr, A.C. et al. (2017) The GPR139 reference agonists 1a and 7c, and tryptophan and phenylalanine share a common binding site. Sci. Rep. 7, 1–9

66 Seeliger, D. and de Groot, B.L. (2010) Protein Thermostability Calculations Using Alchemical Free Energy Simulations. Biophys. J. 98, 2309–2316

67 Gapsys, V. et al. (2016) Accurate and Rigorous Prediction of the Changes in Protein Free Energies in a Large-Scale Mutation Scan. Angew. Chemie - Int. Ed. 55, 7364–7368

68 Wang, L. et al. (2013) Modeling local structural rearrangements using FEP/REST: Application to relative binding affinity predictions of CDK2 inhibitors. J. Chem. Theory Comput. 9, 1282–1293

69 Böhm, H.J. et al. (2004) Scaffold hopping. Drug Discov. Today Technol. 1, 217–224

70 Zhao, H. Scaffold selection and scaffold hopping in lead generation: a medicinal chemistry perspective. , Drug Discovery Today, 12. 01-Feb-(2007) , Elsevier Current Trends, 149–155

71 Rucktooa, P. et al. (2018) Towards high throughput GPCR crystallography: In Meso soaking of Adenosine A2A Receptor crystals. Sci. Rep. 8, 41

72 Fredholm, B.B. et al. (2011) International Union of Basic and Clinical Pharmacology. LXXXI. Nomenclature and classification of adenosine receptors--an update. Pharmacol. Rev. 63, 1–34

73 Haskó, G. et al. (2008) Adenosine receptors: therapeutic aspects for inflammatory and immune diseases. Nat. Rev. Drug Discov. 7, 759–770

74 Leone, R.D. et al. A2aR antagonists: Next generation checkpoint blockade for cancer immunotherapy. , Computational and Structural Biotechnology Journal, 13. (2015) , 265–272

75 Chen, J.-F. et al. (2013) Adenosine receptors as drug targets--what are the challenges? Nat. Rev. Drug Discov. 12, 265–86

76 Katritch, V. et al. (2013) Structure-function of the G protein-coupled receptor superfamily. Annu. Rev. Pharmacol. Toxicol. 53, 531–56

Page 95: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

95

77 Jaakola, V.-P. et al. (2008) The 2.6 angstrom crystal structure of a human A2A adenosine receptor bound to an antagonist. Science 322, 1211–7

78 Doré, A.S. et al. (2011) Structure of the adenosine A(2A) receptor in complex with ZM241385 and the xanthines XAC and caffeine. Structure 19, 1283–93

79 Congreve, M. et al. (2012) Discovery of 1,2,4-Triazine Derivatives as Adenosine A(2A) antagonists using structure based drug design. J. Med. Chem. 55, 1898–1903

80 Liu, W. et al. (2012) Structural basis for allosteric regulation of GPCRs by sodium ions. Science 337, 232–6

81 Sun, B. et al. (2017) Crystal structure of the adenosine A2A receptor bound to an antagonist reveals a potential allosteric pocket. Proc. Natl. Acad. Sci. U. S. A. 114, 2066–2071

82 Glukhova, A. et al. (2017) Structure of the Adenosine A1 Receptor Reveals the Basis for Subtype Selectivity. Cell 168, 867-877.e13

83 Cheng, R.K.Y. et al. (2017) Structures of Human A1 and A2A Adenosine Receptors with Xanthines Reveal Determinants of Selectivity. Structure 25, 1275-1285.e4

84 Lebon, G. et al. (2011) Agonist-bound adenosine A2A receptor structures reveal common features of GPCR activation. Nature 474, 521–5

85 F Xu, H Wu, V Katritch, GW Han, KA Jacobson, Z Gao, V Cherezov, R.S. et al. (2011) Structure of an agonist-bound human A2A adenosine receptor. Science 332, 322–7

86 Lebon, G. et al. (2015) Molecular Determinants of CGS21680 Binding to the Human Adenosine A2A Receptor. Mol. Pharmacol. 87, 907–15

87 Carpenter, B. et al. (2016) Structure of the adenosine A2A receptor bound to an engineered G protein. Nature 536, 104–107

88 Draper-Joyce, C.J. et al. (2018) Structure of the adenosine-bound human adenosine A1 receptor-Gi complex. Nature 558, 559–565

89 Ye, L. et al. (2016) Activation of the A2A adenosine G-protein-coupled receptor by conformational selection. Nature 533, 265–268

90 Fredriksson, K. et al. (2017) Nanodiscs for INPHARMA NMR Characterization of GPCRs: Ligand Binding to the Human A2A Adenosine Receptor. Angew. Chemie Int. Ed. 56, 5750–5754

91 Van Westen, G.J.P. et al. (2012) Identifying novel adenosine receptor ligands by simultaneous proteochemometric modeling of rat and human bioactivity data. J. Med. Chem. 55, 7010–20

92 Zhukov, A. et al. (2011) Biophysical Mapping of the Adenosine A 2A Receptor. J. Med. Chem. 54, 4312–4323

93 Langmead, C.J. et al. (2012) Identification of novel adenosine A(2A) receptor antagonists by virtual screening. J. Med. Chem. 55, 1904–9

94 Gutiérrez-de-Terán, H. et al. (2017) Structure-Based Rational Design of Adenosine Receptor Ligands. Curr. Top. Med. Chem. 17, 40–58

95 Ballesteros, J.A. and Weinstein, H. (1995) Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in G protein-coupled receptors. Methods Neurosci. 25, 366–428

Page 96: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

96

96 Lane, J.R. et al. (2012) A novel nonribose agonist, LUF5834, engages residues that are distinct from those of adenosine-like ligands to activate the adenosine A(2a) receptor. Mol. Pharmacol. 81, 475–87

97 Li, Q. et al. (2007) ZM241385, DPCPX, MRS1706 are inverse agonists with different relative intrinsic efficacies on constitutively active mutants of the human adenosine A2B receptor. J. Pharmacol. Exp. Ther. 320, 637–45

98 Gao, Z. et al. (2002) Identification by Site-directed Mutagenesis of Residues Involved in Ligand Recognition and Activation of the Human A 3 Adenosine Receptor. J. Biol. Chem. 277, 19056–19063

99 May, L.T. et al. (2011) Allosteric interactions across native adenosine-A3 receptor homodimers: quantification using single-cell ligand-binding kinetics. FASEB J. 25, 3465–76

100 Kim, J. et al. Site-directed Mutagenesis Identifies Residues Involved in Ligand Recognition in the Human A2a Adenosine Receptor. , Journal of Biological Chemistry, 270. (1995) , 13987–13997

101 Jaakola, V.-P. et al. (2010) Ligand binding and subtype selectivity of the human A(2A) adenosine receptor: identification and characterization of essential amino acid residues. J. Biol. Chem. 285, 13032–44

102 Lane, J.R. et al. (2012) A novel nonribose agonist, LUF5834, engages residues that are distinct from those of adenosine-like ligands to activate the adenosine A 2a receptor. Mol. Pharmacol. 81, 475–487

103 Wheatley, M. et al. (2012) Lifting the lid on GPCRs: the role of extracellular loops. Br. J. Pharmacol. 165, 1688–1703

104 Segala, E. et al. (2016) Controlling the Dissociation of Ligands from the Adenosine A 2A Receptor through Modulation of Salt Bridge Strength. J. Med. Chem. 59, 6470–6479

105 Kim, J. et al. (1996) Glutamate residues in the second extracellular loop of the human A2a adenosine receptor are required for ligand recognition. Mol. Pharmacol. 49, 683–91

106 Guo, D. et al. (2016) Molecular basis of ligand dissociation from the adenosine A2A receptor. Mol. Pharmacol. 89, 485–491

107 Jiang, Q. et al. (1997) Mutagenesis reveals structure-activity parallels between human A(2A) adenosine receptors and biogenic amine G protein-coupled receptors. J. Med. Chem. 40, 2588–2595

108 IJzerman, A.P. et al. (1996) Site-directed mutagenesis of the human adenosine A(2A) receptor. Critical involvement of Glu13 in agonist recognition. Eur. J. Pharmacol. 310, 269–272

109 Gao, Z.G. et al. (2006) Orthogonal activation of the reengineered A3 adenosine receptor (neoceptor) using tailored nucleoside agonists. J. Med. Chem. 49, 2689–2702

110 Zeevaart, J.G. et al. (2001) Neoceptor Concept Based on Molecular Complementarity in GPCRs: A Mutant Adenosine A3 Receptor with Selectively Enhanced Affinity for Amine-Modified Nucleosides. J. Med. Chem. 130, 9492–9499

111 Lebon, G. et al. (2011) Thermostabilisation of an agonist-bound conformation of the human adenosine A2A receptor. J. Mol. Biol. 409, 298–310

Page 97: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

97

112 Bertheleme, N. et al. (2013) Loss of constitutive activity is correlated with increased thermostability of the human adenosine A2A receptor. Br. J. Pharmacol. 169, 988–998

113 Jiang, Q. et al. (1996) Hydrophilic side chains in the third and seventh transmembrane helical domains of human A2A adenosine receptors are required for ligand recognition. Mol. Pharmacol. 50, 512–21

114 Magnani, F. et al. (2008) Co-evolving stability and conformational homogeneity of the human adenosine A2a receptor. Proc. Natl. Acad. Sci. 105, 10744–10749

115 Thimm, D. et al. (2013) Ligand-specific binding and activation of the human adenosine A(2B) receptor. Biochemistry 52, 726–40

116 Townsend-Nicholson, A. and Schofield, P.R. (1994) A Threonine Residue in the Seventh Transmembrane Domain of the Human AI Adenosine Receptor Mediates Specific Agonist Binding. J. Biol. Chem. 269, 2373–2376

117 Dalpiaz, A. et al. (1998) Thermodynamics of full agonist, partial agonist, and antagonist binding to wild-type and mutant adenosine A1 receptors. Biochem. Pharmacol. 56, 1437–1445

118 Kourounakis, A. et al. (2001) Differential effects of the allosteric enhancer (2-amino-4,5-dimethyl-trienyl)[3-(trifluoromethyl) phenyl]methanone (PD81,723) on agonist and antagonist binding and function at the human wild-type and a mutant (T277A) adenosine A1 receptor. 61, 137–144

119 Yaziji, V. et al. (2011) Pyrimidine Derivatives as Potent and Selective A 3 Adenosine Receptor Antagonists. J. Med. Chem. 54, 457–471

120 Yaziji, V. et al. (2013) Selective and potent adenosine A3 receptor antagonists by methoxyaryl substitution on the N-(2,6-diarylpyrimidin-4-yl)acetamide scaffold. Eur. J. Med. Chem. 59, 235–242

121 Pennington, L.D. and Moustakas, D.T. (2017) The Necessary Nitrogen Atom: A Versatile High-Impact Design Element for Multiparameter Optimization. J. Med. Chem. 60, 3552–3579

122 Bortolato, A. et al. (2013) Water Network Perturbation in Ligand Binding: Adenosine A2A Antagonists as a Case Study. J. Chem. Inf. Model. 53, 1700–1713

123 Aherne, C.M. et al. The resurgence of A2B adenosine receptor signaling. , Biochimica et Biophysica Acta - Biomembranes, 1808. 01-May-(2011) , Elsevier, 1329–1339

124 Gessi, S. et al. (2005) Expression, pharmacological profile, and functional coupling of A 2b receptors in a recombinant system and in peripheral blood cells using a novel selective antagonist radioligand, [3H]MRE 2029-F20. Mol. Pharmacol. 67, 2137–2147

125 Pierce, K.D. et al. (1992) Molecular cloning and expression of an adenosine A2b receptor from human brain. Biochem. Biophys. Res. Commun. 187, 86–93

126 Müller, C.E. et al. (2018) Medicinal chemistry of a 2B adenosine receptors. Receptors 34, 137–168

127 El Maatougui, A. et al. (2016) Discovery of Potent and Highly Selective A 2B Adenosine Receptor Antagonist Chemotypes. J. Med. Chem. 59, 1967–1983

Page 98: Free energy calculations of G protein-coupled receptor ...uu.diva-portal.org/smash/get/diva2:1417794/FULLTEXT01.pdf · ACTA UNI VERSITATIS UPSALIENSIS UPPSALA 2020 Digital Comprehensive

98

128 Bortolato, A. et al. (2013) Water network perturbation in ligand binding: Adenosine A2A antagonists as a case study. J. Chem. Inf. Model. 53, 1700–1713

129 Cappel, D. et al. (2016) Relative Binding Free Energy Calculations Applied to Protein Homology Models. J. Chem. Inf. Model. 56, 2388–2400

130 Carbajales, C. et al. (2017) Enantiospecific Recognition at the A2B Adenosine Receptor by Alkyl 2-Cyanoimino-4-substituted-6-methyl-1,2,3,4-tetrahydropyrimidine-5-carboxylates. J. Med. Chem. 60, 3372–3382

131 Fried, J. and Sabo, E.F. (1954) 9α-Fluoro derivatives of cortisone and hydrocortisone. J. Am. Chem. Soc. 76, 1455–1456

132 Meanwell, N.A. (2018) Fluorine and Fluorinated Motifs in the Design and Application of Bioisosteres for Drug Design. J. Med. Chem. 61, 5822–5880

133 Gillis, E.P. et al. (2015) Applications of Fluorine in Medicinal Chemistry. J. Med. Chem. 58, 8315–8359

134 Ojima, I. (2009) Fluorine in Medicinal Chemistry and Chemical Biology, John Wiley and Sons.

135 Doré, A.S. et al. (2011) Structure of the adenosine A2A receptor in complex with ZM241385 and the xanthines XAC and caffeine. Structure 19, 1283–1293

136 Alexander, S.P. et al. (2015) The concise guide to pharmacology 2013/14: G Protein-Coupled Receptors. Br. J. Pharmacol. 170, 1459–1581

137 Garland, S.L. (2013) Are GPCRs still a source of new targets? J. Biomol. Screen. 18, 947–966

138 Gloriam, D.E.I. et al. (2005) Nine new human Rhodopsin family G-protein coupled receptors: identification, sequence characterisation and evolutionary relationship. Biochim. Biophys. Acta 1722, 235–46

139 Liu, C. et al. (2015) GPR139, an Orphan Receptor Highly Enriched in the Habenula and Septum, Is Activated by the Essential Amino Acids l-Tryptophan and l-Phenylalanine. Mol. Pharmacol. 88,

140 Matsuo, A. et al. (2005) Molecular cloning and characterization of a novel Gq-coupled orphan receptor GPRg1 exclusively expressed in the central nervous system. Biochem. Biophys. Res. Commun. 331, 363–9

141 Süsens, U. et al. (2006) Characterisation and differential expression of two very closely related G-protein-coupled receptors, GPR139 and GPR142, in mouse tissue and during mouse development. Neuropharmacology 50, 512–20

142 Wagner, F. et al. (2016) Microarray analysis of transcripts with elevated expressions in the rat medial or lateral habenula suggest fast GABAergic excitation in the medial habenula and habenular involvement in the regulation of feeding and energy balance. Brain Struct. Funct.

143 Dvorak, C.A. et al. (2014) Physiological ligands for GPR139. International Patent WO 2014/152917 A2, Janssen Pharmaceutica

144 Andersen, K.B. et al. (2016) Protection of primary dopaminergic midbrain neurons by GPR139 agonists supports different mechanisms of MPP+ and rotenone toxicity. Front. Cell. Neurosci. 10, 1–10

145 Isberg, V. et al. (2014) Computer-Aided Discovery of Aromatic l-α-Amino Acids as Agonists of the Orphan G Protein-Coupled Receptor GPR139. J. Chem. Inf. Model. 54, 1553–1557

146 Lin, H. V. et al. (2016) GPR142 controls tryptophan-induced insulin and incretin hormone secretion to improve glucose metabolism. PLoS One 11, 1–17

147 Wang, J. et al. (2016) GPR142 agonists stimulate glucose-dependent insulin secretion via Gq-dependent signaling. PLoS One 11, 1–14

148 Yu, M. et al. (2013) Aminopyrazole-phenylalanine based GPR142 agonists: Discovery of tool compound and in vivo efficacy studies. ACS Med. Chem. Lett. 4, 829–834

149 Guo, L. et al. (2016) Discovery and optimization of a novel triazole series of GPR142 agonists for the treatment of type 2 diabetes. ACS Med. Chem. Lett. 22, 5942–7

150 Nøhr, A.C. et al. (2016) The orphan G protein-coupled receptor GPR139 is activated by the peptides: Adrenocorticotropic hormone (ACTH), α-, and β-melanocyte stimulating hormone (α-MSH, and β-MSH), and the conserved core motif HFRW. Neurochem. Int. 102, 105–113

151 Shi, F. et al. (2011) Discovery and SAR of a Series of Agonists at Orphan G Protein-Coupled Receptor 139. ACS Med. Chem. Lett. 2, 303–306

152 Dvorak, C.A. et al. (2015) Identification and SAR of Glycine Benzamides as Potent Agonists for the GPR139 Receptor. ACS Med. Chem. Lett. 6, 1015–1018

153 Dvorak, C.A. et al. (2015) Identification and SAR of Glycine Benzamides as Potent Agonists for the GPR139 Receptor. ACS Med. Chem. Lett. 6, 1015–1018

154 Hu, L.A. et al. (2009) Identification of surrogate agonists and antagonists for orphan G-protein-coupled receptor GPR139. J. Biomol. Screen. 14, 789–97

155 Hitchcock, S. et al. (2016) 4-Oxo-3,4-dihydro-1,2,3-benzotriazine modulators of GPR139. International Patent WO 2016/081736, Takeda Pharmaceutical Company Limited

156 Shehata, M.A. et al. (2016) Novel Agonist Bioisosteres and Common Structure-Activity Relationships for The Orphan G Protein-Coupled Receptor GPR139. Sci. Rep. 6, 36681

157 Kuhne, S. et al. (2016) Radiosynthesis and characterisation of a potent and selective GPR139 agonist radioligand. RSC Adv. 6, 947–952

158 Ahuja, S. and Smith, S.O. (2009) Multiple Switches in G Protein-Coupled Receptor Activation. Trends Pharmacol. Sci. 30, 494–502

159 Jacobson, K.A. and Gao, Z.-G. (2006) Adenosine receptors as therapeutic targets. Nat. Rev. Drug Discov. 5, 247–64

160 Glukhova, A. et al. (2017) Structure of the Adenosine A1 Receptor Reveals the Basis for Subtype Selectivity. Cell 168, 867-877.e13

161 Nguyen, A.T.N. et al. (2016) Extracellular Loop 2 of the Adenosine A1 Receptor Has a Key Role in Orthosteric Ligand Affinity and Agonist Efficacy. Mol. Pharmacol. 90, 703–714

162 Palaniappan, K.K. et al. (2007) Probing the binding site of the A1 adenosine receptor reengineered for orthogonal recognition by tailored nucleosides. Biochemistry 46, 7437–7448

163 Müller, C.E. and Jacobson, K.A. (2011) Recent developments in adenosine receptor ligands and their potential as novel drugs. Biochim. Biophys. Acta 1808, 1290–308

164 Jespers, W. et al. (2018) Structural Mapping of Adenosine Receptor Mutations: Ligand Binding and Signaling Mechanisms. Trends Pharmacol. Sci. 39, 75–89

165 Jazayeri, A. et al. (2017) Structurally Enabled Discovery of Adenosine A2A Receptor Antagonists. Chem. Rev. 117, 21–37

166 Richardson, C.M. et al. (2006) Identification of non-furan containing A2A antagonists using database mining and molecular similarity approaches. Bioorg. Med. Chem. Lett. 16, 5993–7

167 Lounnas, V. et al. (2013) Current progress in Structure-Based Rational Drug Design marks a new mindset in drug discovery. Comput. Struct. Biotechnol. J. 5, e201302011

168 Langmead, C.J. et al. (2012) Identification of novel adenosine A2A receptor antagonists by virtual screening. J. Med. Chem. 55, 1904–1909

169 Congreve, M. et al. (2012) Discovery of 1,2,4-Triazine Derivatives as Adenosine A2A antagonists using structure based drug design. J. Med. Chem. 55, 1898–1903

170 Zhukov, A. et al. (2011) Biophysical Mapping of the Adenosine A2A Receptor.

171 Andrews, S.P. et al. (2014) Structure-based drug design of chromone antagonists of the adenosine A2A receptor. Medchemcomm 5, 571

172 Liu, Y. et al. (2011) Computational study of the binding modes of caffeine to the adenosine A2A receptor. J. Phys. Chem. B 115, 13880–13890

173 Doré, A.S. et al. (2011) Structure of the adenosine A2A receptor in complex with ZM241385 and the xanthines XAC and caffeine. Structure 19, 1283–93

174 Gaulton, A. et al. (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100-7

175 Leff, P. (1995) The two-state model of receptor activation. Trends Pharmacol. Sci. 16, 89–97

176 van der Wenden, E.M. et al. (1995) Ribose-Modified Adenosine Analogues as Potential Partial Agonists for the Adenosine Receptor. J. Med. Chem. 38, 4000–4006

177 Bharate, S.B.S.S. et al. (2016) Discovery of 7-(Prolinol-N-yl)-2-phenylamino-thiazolo[5,4-d]pyrimidines as Novel Non-Nucleoside Partial Agonists for the A2A Adenosine Receptor: Prediction from Molecular Modeling. J. Med. Chem. 59, 5922–5928

178 Beukers, M.W. et al. (2004) New, Non-Adenosine, High-Potency Agonists for the Human Adenosine A2B Receptor with an Improved Selectivity Profile Compared to the Reference Agonist N-Ethylcarboxamidoadenosine. J. Med. Chem. 47, 3707–3709

179 Magnani, F. et al. (2008) Co-evolving stability and conformational homogeneity of the human adenosine A2a receptor. Proc. Natl. Acad. Sci. U. S. A. 105, 10744–10749

180 Vidal, B. et al. (2007) Discovery and characterization of 4′-(2-furyl)-N-pyridin-3-yl-4,5′-bipyrimidin-2′-amine (LAS38096), a potent, selective, and efficacious A2B adenosine receptor antagonist. J. Med. Chem. 50, 2732–2736

181 Taliani, S. et al. (2012) 3-Aryl-[1,2,4]triazino[4,3-a]benzimidazol-4(10H)-one: A novel template for the design of highly selective A2B adenosine receptor antagonists. J. Med. Chem. 55, 1490–1499

182 Stewart, M. et al. (2004) [3H]OSIP339391, a selective, novel, and high affinity antagonist radioligand for adenosine A2B receptors. Biochem. Pharmacol. 68, 305–312

183 Kalla, R. V. et al. (2008) Selective, high affinity A2B adenosine receptor antagonists: N-1 monosubstituted 8-(pyrazol-4-yl)xanthines. Bioorg. Med. Chem. Lett. 18, 1397–1401

184 Borrmann, T. et al. (2009) 1-Alkyl-8-(piperazine-1-sulfonyl)phenylxanthines: Development and characterization of adenosine A2B receptor antagonists and a new radioligand with subnanomolar affinity and subtype specificity. J. Med. Chem. 52, 3994–4006

185 Jiang, J. et al. (2019) A2B Adenosine Receptor Antagonists with Picomolar Potency. J. Med. Chem. 62, 4032–4055

186 Crespo, A. et al. (2017) Exploring the influence of the substituent at position 4 in a series of 3,4-dihydropyrimidin-2(1H)-one A2B adenosine receptor antagonists. Chem. Heterocycl. Compd. 53, 316–321

187 Carbajales, C. et al. (2017) Enantiospecific Recognition at the A2B Adenosine Receptor by Alkyl 2-Cyanoimino-4-substituted-6-methyl-1,2,3,4-tetrahydropyrimidine-5-carboxylates. J. Med. Chem. 60, 3372–3382

Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1925

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology”.)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-407840