13.11.2013. Bioinformatics - Proteomics, Lecture 9. Prof. László Poppe, BME Department of Organic Chemistry and Technology. Bioinformatics - Proteomics: lecture and practice.


  • 13.11.2013. Bioinformatics - Proteomics

    Bioinformatics − Proteomics Lecture 9

    Prof. László Poppe

    BME Department of Organic Chemistry and Technology

    Bioinformatics – Proteomics

    Lecture and practice

  • 2 Bioinformatics 2, 2009. 04. 17.

    Map of bioinformatics: a medicinal chemistry point of view

  • 3

    Drug-receptor interaction modeling

    Flow chart of the theoretical modeling of drug-receptor interactions and its relationship with experiments.

    Rectangles denote theoretical studies; ellipses represent experimental data. The gray area covers the modeling of small molecules; the white areas are related to bioinformatics.

    Advanced quantum mechanical methods (e.g. QM/MM) can be applied.

  • 4

    Pharmacophore model construction: D1 dopamine receptor

    Figure labels: allowed pharmacophore regions; forbidden pharmacophore regions.

    Pharmacophore model: a spatial alignment of ligands. It models a 3D image that is complementary to the active site.

  • 5

    Docking molecules to proteins

    Prediction / testing of the binding of a small molecule (ligand, substrate, coenzyme, etc.) inside or on the surface of a protein (receptor).

    Prediction / testing of the binding of two proteins to each other.

    Prediction / analysis of the binding of a protein to DNA.

    Ways to assess the fit:

    1. Consideration of simple geometric fit

    2. Evaluation of the fit by a complex energy function, electrostatic complementarity, etc.

    According to the model:

    1. Both molecules are rigid

    2. One molecule (usually the ligand) is flexible, and the other (usually the protein) is rigid

    3. Both are flexible (the search is very time-consuming)

    According to the algorithm:

    1. Molecular dynamics

    2. Monte Carlo methods (generate random positions)

    3. Simulated annealing: simulation of the slow cooling of a high-temperature system, which helps to reach the energy minimum

    4. Other methods
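The simulated annealing idea listed on this slide (random Monte Carlo moves accepted or rejected while the temperature is slowly lowered) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the one-dimensional `toy_energy` function stands in for a real docking score, and the cooling schedule, step size and function names are hypothetical choices, not taken from the lecture or from any docking program.

```python
import math
import random


def toy_energy(x):
    # Hypothetical 1D "binding energy" with several local minima (illustration only).
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(energy, x0, t_start=5.0, t_end=1e-3, cooling=0.95,
                        steps_per_temp=50, step_size=0.5, seed=0):
    """Minimize energy(x) by Metropolis Monte Carlo with geometric cooling."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            # Monte Carlo move: random displacement of the current position.
            x_new = x + rng.uniform(-step_size, step_size)
            e_new = energy(x_new)
            # Metropolis criterion: always accept downhill moves, accept
            # uphill moves with probability exp(-dE / T).
            if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
        # Slow cooling: lower the temperature by a constant factor.
        t *= cooling
    return best_x, best_e


if __name__ == "__main__":
    position, value = simulated_annealing(toy_energy, x0=4.0)
    print(f"lowest energy found: {value:.3f} at x = {position:.3f}")
```

At high temperature almost every move is accepted, so the search can escape local minima; as the temperature falls, the walk settles into a low-energy region. In a real docking run the single coordinate would be replaced by ligand position, orientation and torsion angles.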

  • 6

    QM/MM methods

    Figure label: MM region

    The protein is treated in several regions.

    The important part of the active site and the substrate (or product, reactive intermediate, transition state) form the QM region, which is calculated more accurately.

    The QM region is suitable for analyzing electronic / quantum interactions (semi-empirical, HF and DFT methods).

    The rest of the protein is treated with a classical MM force field.
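The partition into a QM region and an MM region is often written as an additive energy expression. The form below is a commonly used scheme given here as an illustrative assumption; the slide itself does not state which coupling scheme is meant, and the symbol names are chosen only for this sketch.

```latex
E_{\mathrm{total}} = E_{\mathrm{QM}}(\text{QM region})
                   + E_{\mathrm{MM}}(\text{MM region})
                   + E_{\mathrm{QM/MM}}(\text{QM region}, \text{MM region})
```

Here the first term is the quantum mechanical energy of the active-site atoms, the second is the force-field energy of the rest of the protein, and the third collects the interactions between the two regions.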

  • 7

    QM/MM methods: the boundary of classical and quantum regions

    Splitting of a glutamate side chain into quantum and classical regions.

    The terminal CH2CO2 group is treated with quantum mechanics, and a molecular mechanics force field is applied to the main-chain atoms.

    Deciding where to cut is the most difficult question (the cut is usually made along a C(sp3)-C(sp3) bond).

    There are two main approaches to managing the boundary.

    One method is the "link atom approach" [MJ Field, PA Bash, M Karplus: J Comput Chem, 1990, 11, 700]: the QM region is "closed" by an appropriate virtual ligand atom.

    Another method is the "frozen orbital approach" [G Monard, M Loos, V Thery, K Baka, J-L Rivail: Int J Quant Chem, 1996, 58, 153]. The continuity of the electron density at the boundary is ensured by "frozen" orbitals between the quantum and classical atoms (local self-consistent field, LSCF).
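As a geometric illustration of the link atom approach, the sketch below places a capping atom on the line of the cut bond, at a fixed distance from the QM boundary atom. The function name and the default distance of 1.09 Å (a typical C-H bond length, assuming a hydrogen is used as the link atom) are assumptions for illustration; actual QM/MM programs differ in how they position and constrain link atoms.

```python
import math


def place_link_atom(qm_atom, mm_atom, bond_length=1.09):
    """Return link-atom coordinates on the line from the QM boundary atom
    towards the MM boundary atom, at bond_length (in angstrom) from the QM atom.

    bond_length=1.09 is an assumed typical C-H distance, not a value
    taken from the lecture.
    """
    direction = [m - q for q, m in zip(qm_atom, mm_atom)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("QM and MM boundary atoms coincide")
    return tuple(q + bond_length * d / norm for q, d in zip(qm_atom, direction))


if __name__ == "__main__":
    # Hypothetical coordinates of the two carbon atoms of the cut C(sp3)-C(sp3) bond.
    c_qm = (0.0, 0.0, 0.0)
    c_mm = (1.53, 0.0, 0.0)
    print(place_link_atom(c_qm, c_mm))
```

The link atom takes part only in the QM calculation, where it saturates the valence left open by cutting the bond.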

  • 8

    Application of QM/MM methods: mechanism of triose-phosphate isomerase (TIM)

    The important parts of the active site, together with the substrate, reactive intermediates, transition states and product, were treated by QM/MM methods to elucidate and correctly interpret the reaction mechanism of the enzyme triose-phosphate isomerase (TIM) [PA Bash, MJ Field, RC Davenport, GA Petsko, D Ringe, M Karplus: Biochemistry 1991, 30, 5826–5832; JR Knowles: Phil Trans Roy Soc Lond B 1991, 332, 115–121].

  • 9

    Binding free-energy calculations

    Two independent experiments determine the binding free energies of the two ligands to the receptor:

    Ligand 1 + Receptor  →(ΔG1)  Ligand 1/Receptor

    Ligand 2 + Receptor  →(ΔG2)  Ligand 2/Receptor

    The relative binding free energy of the two ligands can be determined by using the following cyclic scheme:

    Ligand 1 + Receptor  →(ΔG1)  Ligand 1/Receptor

         ⇓ ΔG3                        ⇓ ΔG4

    Ligand 2 + Receptor  →(ΔG2)  Ligand 2/Receptor

    where ΔG3 and ΔG4 are the formal free-energy differences of the chemical transformation Ligand 1 -> Ligand 2 in solution and bound to the receptor, respectively. As the free-energy change around a closed cycle is zero,

    ΔΔGcycle = ΔG1 + ΔG4 - ΔG2 - ΔG3 = 0

    therefore

    ΔΔGbinding = ΔG1 - ΔG2 = ΔG3 - ΔG4

    Using the relative ΔΔGbinding values eliminates the need to determine the absolute ligand-receptor binding free energies ΔG1 and ΔG2, which are computationally quite demanding.
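The relation ΔΔGbinding = ΔG1 - ΔG2 = ΔG3 - ΔG4 can be checked numerically with a small sketch. The function names and all numeric values below are hypothetical and serve only to show the bookkeeping of the thermodynamic cycle; they are not results from the lecture.

```python
def relative_binding_free_energy(dg3_solution, dg4_bound):
    """ddG_binding = dG3 - dG4, from the two Ligand 1 -> Ligand 2 transformations
    (dG3 in solution, dG4 bound to the receptor)."""
    return dg3_solution - dg4_bound


def cycle_residual(dg1, dg2, dg3, dg4):
    """Free-energy sum around the closed cycle; zero for consistent values."""
    return dg1 + dg4 - dg2 - dg3


if __name__ == "__main__":
    # Hypothetical values (e.g. in kcal/mol), chosen only to be mutually consistent.
    dg1, dg2 = -9.5, -8.0   # absolute binding free energies of ligands 1 and 2
    dg3, dg4 = 2.5, 4.0     # transformation free energies in solution / bound
    print("ddG from transformations:", relative_binding_free_energy(dg3, dg4))
    print("ddG from binding:        ", dg1 - dg2)
    print("cycle residual:          ", cycle_residual(dg1, dg2, dg3, dg4))
```

In practice only ΔG3 and ΔG4 are computed; a non-zero residual against experimentally known ΔG1 and ΔG2 would indicate sampling or force-field error.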

  • 10

    Identification of target proteins

    Direct identification of target proteins has been possible only for about a decade.

    Historically, there are only a few drugs whose target protein became known at the same time as the drug itself. The reason is that the development of new drugs has traditionally been based largely on modifying known drugs through the intuitive use of molecular similarities. The changes were immediately tested experimentally in vitro and in vivo. Thus, the effectiveness of a drug was judged even without knowledge of the target protein. As a consequence, the drugs currently on the market act on members of a set of approx. 500 target proteins [Drews, J.: Die verspielte Zukunft, 1998, Basel: Birkhäuser Verlag].

    Identification of protein targets is the bottleneck of today's medical and pharmaceutical science.

  • 11

    Target protein identification - genomics

    The figure shows a portion of a DNA chip.

    This DNA chip shows the difference in the proteins produced by yeast cells in two different states. The first state (green), in the presence of glucose, represents the "healthy" condition of the cells; the second state (red), in the absence of glucose, represents the "hungry" condition of the cells.

    The bright green spots indicate proteins that are expressed at a high level in the "healthy" state. The red spots are proteins that are formed mainly in the "hungry" state. When a protein is produced in both states, the spot is yellow (additive mixture of green and red). The dark spots are proteins that are not expressed at a high level in either state.

    Therefore, the color of a spot shows in which state of the cell the protein in question is formed more frequently.

    Today, new methods of molecular biology, which were developed only a few years ago, provide fundamentally new opportunities for the identification of target proteins. This development can be exemplified by DNA chip technology [DeRisi, J. L., Iyer, V. R., Brown, P. O. Science, 1997, 278 (5338) 680-686]. The overall picture of course also includes a number of additional methods / options that are under development.
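The color reading of the DNA chip spots described on this slide (green, red, yellow, dark) can be expressed as a simple rule on the two channel intensities. The sketch below is only an illustration: the function name, the intensity scale and both thresholds are assumptions, and real microarray analysis adds background correction and normalization steps that are not shown.

```python
import math


def classify_spot(green, red, min_signal=100.0, log2_threshold=1.0):
    """Classify one spot from its green and red channel intensities.

    min_signal and log2_threshold are hypothetical cut-offs chosen for
    illustration, not values from the lecture.
    """
    if green + red < min_signal:
        return "dark"  # not expressed at a high level in either state
    # Log ratio of the channels; +1 avoids division by zero for empty channels.
    log_ratio = math.log2((red + 1.0) / (green + 1.0))
    if log_ratio >= log2_threshold:
        return "red"     # formed mainly in the "hungry" state
    if log_ratio <= -log2_threshold:
        return "green"   # formed mainly in the "healthy" state
    return "yellow"      # formed in both states


if __name__ == "__main__":
    for green, red in [(5000, 200), (200, 5000), (3000, 3200), (20, 30)]:
        print(green, red, classify_spot(green, red))
```

With the default threshold of 1.0 on the log2 scale, a spot counts as red or green only when one channel is at least about twice as intense as the other.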

  • 12

    Target protein identification - genomics

    It is very important to know exactly which studies can produce such an image. It is equally important to know how much information can be assigned to each colored spot of the image. The following general statements can be made:

    1. The coordinates of the colored spots are used to identify the protein. For simplicity, it can be assumed that different spots represent different proteins (although this does not always apply, as multiple spots may be used e.g. for calibration purposes). The exact positions of the spots are set prior to DNA chip manufacturing. DNA chip design requires the identification of a number of proteins and the optimization of their layout on the surface of the chip. The exact location depends on the boundary conditions and the nature of the experiment, but has no significant importance for interpreting the results.

    2. Only partial / basic information can be assigned to the individual spots. In the best case, the experiment provides the full sequence of the gene or protein. However, in many cases only a short but necessarily relevant part of the sequence is available.

  • 13

    Genomics vs. proteomics

    The methods of genomics examine the expressed genes, whose translation yields the proteins, but not the proteins themselves. Proteomic methods investigate the proteins that are actually formed.

    The previous figure shows a DNA chip that provides information on the expressed genes and therefore gives only indirect data on the actual protein products. The advantage of the genomic approach is that genes are experimentally more accessible and easier to handle than proteins. As a result, genomic methods are currently more widespread than proteomic methods. As experimental techniques develop, proteomics can be expected to spread further.

    The disadvantages of genomic approaches should also be recognized. First, a high expression level of a gene does not necessarily correspond to a high concentration of the protein in the cell, and it is the actual protein level that matters when one is interested in the relationship between the extent of a disease and protein expression.

    Perhaps even more important is that a significant proportion of proteins are modified after translation (post-translational modifications). Such alterations include glycosylation (complex sugar units attached to the protein surface) and phosphorylation (phosphate groups attached to the protein). Through post-translational modifications, proteins with the same primary amino acid sequence can exist in a number of different versions. Genomics is not able to track these modifications, which may be crucial in many cases.

  • 14

    Enzyme databases

    Enzyme nomenclature and classification databases:

    EXPASY – ENZYME: http://www.expasy.ch/enzyme/

    BRENDA: http://www.brenda-enzymes.org/

  • 15

    Enzyme databases

    Enzyme databases:

    Databases for various data and the nomenclature of enzymes

    Detailed records for every enzyme class to which the Enzyme Commission (EC) has assigned an identifier (in the format EC 0.11.22.33)
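The four-field identifier format mentioned on this slide can be parsed with a short sketch. The field names and the function `parse_ec_number` are illustrative assumptions and not part of any database API; the example identifier EC 5.3.1.1 is the EC number of triose-phosphate isomerase, the enzyme discussed earlier in the lecture.

```python
import re
from typing import NamedTuple

# Optional "EC" prefix followed by four dot-separated integer fields.
EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+)\.(\d+)\.(\d+)$")


class ECNumber(NamedTuple):
    enzyme_class: int   # first field: main enzyme class
    subclass: int       # second field
    sub_subclass: int   # third field
    serial: int         # fourth field: serial number within the sub-subclass


def parse_ec_number(text):
    """Parse a complete identifier such as 'EC 5.3.1.1' into its four fields."""
    match = EC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a complete EC number: {text!r}")
    return ECNumber(*(int(group) for group in match.groups()))


if __name__ == "__main__":
    print(parse_ec_number("EC 5.3.1.1"))
```

This sketch accepts only fully specified identifiers; partial entries with dashes in place of fields (e.g. 5.3.1.-) would need an extended pattern.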

  • 16

    ENZYME database – search and content

    ENZYME: http://www.expasy.ch/enzyme/

    Search in ENZYME database:

    By EC number

    By enzyme class

    By description (official name) or alternative name(s)

    By chemical compound

    By cofactor

    By text in comment lines

  • 17-19

    ENZYME database (screenshots)

    ENZYME: http://www.expasy.ch/enzyme/

  • 20

    BRENDA database – search and content

    BRENDA: http://www.brenda-enzymes.org/

    Search in BRENDA database (detailed search possibilities):

    By nomenclature

    By reaction & specificity

    By functional parameters

    By isolation and preparation

    By organism-related information

    By stability

    By enzyme structure

    By disease and related information

    By application and engineering aspects

  • 21-30

    BRENDA database (screenshots)

    BRENDA: http://www.brenda-enzymes.org/

  • 31-35

    KEGG databases - PATHWAY (screenshots)

    KEGG: http://www.genome.jp/kegg/