V9 – orientation of TM helices

Membrane Bioinformatics SS09


- Modelling 3D structures of helical TM bundles: Park, Staritzbichler, Elsner & Helms, Proteins (2004); Park & Helms, Proteins (2006)

- Beuming & Weinstein (2004): T. Beuming & H. Weinstein (2004) Bioinformatics 20, 1822

- Adamian & Liang (2006): L. Adamian & J. Liang (2006) BMC Struct. Biol. 6, 13

- TMX: predict lipid-accessible sides of TM helices from sequence: Park & Helms, Bioinformatics (2007); Park, Hayat & Helms, BMC Bioinformatics (2007)

Structure modelling for helical membrane proteins

>P52202 RHO -- Rhodopsin.
MNGTEGPDFYIPFSNKTGVVRSPFEYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHKKLRSPLNYILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVALWCLVVLAIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVGWSRYIPEGMQCSCGVDYYTLKPEVNNESFVIYMFVVHFAIPLAVIFFCYGRLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSFLICWVPYASVAFYIFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITT
LCCGKNPLGDDETATGSKTETSSVSTSQVSPA

[Figure: from 1D to 2D to 3D; sources: www.gpcr.org, EMBO Reports (2002)]


Design helical bundles using effective energy functions

Aim: assemble TM bundles

Glycophorin A dimer, Erb/Neu dimer, phospholamban pentamer

Method: scan 6-D conformational space of dimers of ideal helices


docking of helix-dimers: energy scoring

Search 5 degrees of freedom systematically.

Score conformations by residue-residue energy function.

[Figure: example of a parametrised energy function between 2 residues]

Park et al. Proteins (2004)

docking of helix-dimers

Test for Glycophorin A, a dimer of two identical helices for which an NMR structure is available.

The RMSD between the best model and the NMR structure is only 0.8 Å.

Energy landscape around the minimum: the minimum is truly the global minimum.

Park et al. Proteins (2004)

However, this is not the case for dimers in larger TMH proteins.
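The RMSD quoted for the Glycophorin A test can be illustrated with a minimal sketch. It assumes that model and reference coordinates are already superimposed and listed in the same atom order; the function name and the toy coordinates are made up for illustration and are not taken from Park et al.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples.

    Assumes both structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (u - v) ** 2
        for p, q in zip(coords_a, coords_b)
        for u, v in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom of the model is shifted by 0.8 Å along z
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
model = [(0.0, 0.0, 0.8), (1.5, 0.0, 0.8)]
value = rmsd(model, reference)
```

In practice an optimal superposition step would precede this calculation; that step is left out to keep the sketch short.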

Need more/other information to orient helices

Early suggestion: TM proteins are “inside-out” proteins.

That means that they are hydrophobic on the outside and hydrophilic on the inside.

Compute the hydrophobic moment = the direction of largest hydrophobicity:

$\vec{\mu} = \frac{1}{N} \sum_{i=1}^{N} H(i) \left[ \vec{r}_C(i) - \vec{r}_{proj}(i) \right]$

Here, r_proj(i) is the projection of the side-chain position r_C(i) onto the helical axis, i.e. the vector difference describes the shortest distance between residue i and the helix axis.

H(i) is the hydrophobicity of residue i.

This method was introduced by David Eisenberg (1982, Nature).
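A minimal sketch of a hydrophobic moment calculation from sequence alone is given below. It makes two simplifying assumptions that are not part of the slide: the helix is ideal (100° rotation per residue), and each residue contributes a unit vector perpendicular to the axis instead of a measured side-chain vector. The Kyte-Doolittle values are used as H(i); the function name is hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(seq, scale=KD, deg_per_residue=100.0):
    """Return (magnitude, direction in degrees) of the mean hydrophobic moment.

    Assumption: ideal helix, unit radial vectors, residue 0 points to 0 degrees.
    """
    mx = my = 0.0
    for i, aa in enumerate(seq):
        theta = math.radians(i * deg_per_residue)
        mx += scale[aa] * math.cos(theta)
        my += scale[aa] * math.sin(theta)
    mx /= len(seq)
    my /= len(seq)
    return math.hypot(mx, my), math.degrees(math.atan2(my, mx)) % 360.0

# 18 identical residues span exactly 5 turns, so the moment cancels out
uniform_magnitude, _ = hydrophobic_moment("L" * 18)
single_magnitude, single_angle = hydrophobic_moment("L")
```

A uniformly hydrophobic helix has no preferred side, which is why its moment vanishes; a helix with one hydrophobic and one polar face gives a large moment pointing towards the hydrophobic face.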


role of hydrophobic moment

According to the concept of Eisenberg, all helices would orient their most hydrophobic side towards the bilayer.

However, this measure is quite imprecise (Park & Helms, Biopolymers 2006).

Hydrophobicity scales
- ww: Wimley-White scale
- eis: Eisenberg scale
- ges: Goldman/Engelman/Steitz scale
- kd: Kyte-Doolittle scale

Specialized scales
- kP: kProt
- bw: Beuming & Weinstein scale
- tmlip1/2: Adamian & Liang


Beuming & Weinstein (2004): amino acid propensities

(1) Hydrophobic residues (A, I, L, V) make up 48.7% of all residues in TM proteins.

(2) Charged residues (D, E, H, K, R) constitute only 5.5%.

(3) Glycine (G) is relatively abundant.

(4) Small residues (A, C, S, T) form 30.6%.

(5) Aromatic residues (F, W, Y) represent 15.8%.

(6) β-branched residues (T, I, V) form 24.9%.

(7) Proline is a helix-breaker and is underrepresented.

(8) Also, Cys, Gln, and Asn are rarely found.
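Group fractions of this kind can be computed for any sequence with a short sketch like the following. The residue groups are those listed above; the function name and the toy sequence are illustrative assumptions, and the published percentages come from the authors' data set, not from this code.

```python
# Residue groups as listed above; the groups overlap (e.g. A is both
# "hydrophobic" and "small"), so the fractions need not sum to 1.
GROUPS = {
    "hydrophobic": "AILV",
    "charged": "DEHKR",
    "small": "ACST",
    "aromatic": "FWY",
    "beta_branched": "TIV",
}

def group_fractions(seq):
    """Fraction of residues in seq that belong to each group."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {
        name: sum(seq.count(aa) for aa in members) / len(seq)
        for name, members in GROUPS.items()
    }

toy_segment = "AILVDEHKRG"  # made-up sequence, not a real TM helix
fractions = group_fractions(toy_segment)
```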


amino acid propensities: conclusions

The overall amino acid composition deviates significantly from that of the whole genome.

Hydrophobic residues (A, F, G, I, L, M, V, W) occur more frequently in MPs than in whole genomes.

Conversely, residues C, D, E, K, N, P, Q, R are underrepresented in MPs.

H, S, T, and Y have equal distributions in MPs and whole genomes.


Beuming & Weinstein (2004): inside vs. outside

(1) Most of the exposed (lipid-facing) charged residues (D, E, K, H, R) that are found in TMs are located in the terminal regions (4.4%) rather than in the central region (2.7%).

(2) The exposed terminal parts are very rich in aromatic residues (21.3%) compared to the central part (16.1%).


Beuming & Weinstein (2004): surface propensity scale

Table shows fraction SF of exposed residue i.

Trp has the highest value of SF, His has the smallest value.

Normalize the SF values to SP values with respect to His (SP=0) and Trp (SP=1):

$SP_X = \frac{SF_X - SF_{HIS}}{SF_{TRP} - SF_{HIS}}$
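The normalization to His (SP=0) and Trp (SP=1) is a simple linear rescaling, sketched below. The SF numbers in the example are made up purely for illustration and are not the values from the Beuming & Weinstein table; the function name is likewise hypothetical.

```python
def surface_propensity(sf, low="H", high="W"):
    """Rescale surface fractions so that sf[low] maps to 0 and sf[high] maps to 1."""
    span = sf[high] - sf[low]
    if span == 0:
        raise ValueError("reference residues have identical SF values")
    return {aa: (value - sf[low]) / span for aa, value in sf.items()}

# Made-up SF values for three residues (one-letter codes), for illustration only
sf_demo = {"H": 0.20, "A": 0.50, "W": 0.80}
sp_demo = surface_propensity(sf_demo)
```

With these toy numbers Ala lands halfway between the two reference residues.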


correlation of SP scale with other scales

Compute correlation coefficient.

SP propensity scale has high correlation with hydrophobicity or volume scales.

Combine SP scale with conservation index:

$CI_i = \sum_{a} f_{ia} \log \frac{f_{ia}}{p_a}$

f_ia: frequency of residue a at position i of the alignment

p_a: a priori distribution of residues
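A conservation index of the relative-entropy type, CI_i = sum over a of f_ia log(f_ia / p_a), can be sketched as follows. That this is the exact form used by the authors is an assumption; the uniform background p_a = 0.05, the natural logarithm, and the function name are illustrative choices, and gaps are not handled.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform a priori distribution; a real application would use
# background frequencies estimated from a sequence database.
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

def conservation_index(column, background=UNIFORM_BACKGROUND):
    """Relative entropy of one alignment column against the background."""
    counts = Counter(column)
    total = len(column)
    ci = 0.0
    for aa, count in counts.items():
        f = count / total
        ci += f * math.log(f / background[aa])
    return ci

conserved = conservation_index("LLLL")          # one residue type only
variable = conservation_index(AMINO_ACIDS)      # every residue type once
```

A fully conserved column gives log(20) under the uniform background, while a column matching the background gives zero.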


Beuming & Weinstein (2004)

Add propensity score and conservation score:

total score(i) = SP_i + CI_i

Accuracy to detect the buried residues is ca. 70%.


Beuming & Weinstein (2004)

(top): correct SASA in X-ray structure

(middle): prediction based on amino-acid propensity + conservation. BEST!

(bottom): prediction based only on conservation


Adamian & Liang (2006): interacting helices

Example for two interacting TM helices in succinate dehydrogenase.

Interacting residues follow a heptad motif.

Note the periodicity of 3.6 residues per turn in an ideal α-helix.


Adamian & Liang (2006)

Heptad motifs are generally preferred for interacting helix pairs.

For left-handed helices, about 94.7% and 92.4% of interacting residues can be mapped to heptad repeats for parallel and anti-parallel helices.

For right-handed pairs the numbers are slightly lower.

Assume that the residues of lipid-accessible helices follow a similar pattern.


Each TM helix has “7 faces”.

A: the anchoring residues are 0, 7, 14, and 21; contacts are also formed by residues 3, 4, 10, 11, 17, 18
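The residue numbers quoted for this face follow a heptad pattern (anchor positions every 7 residues, additional contacts at anchor + 3 and anchor + 4). A minimal sketch that enumerates them is shown below; shifting the anchor by 1 to 6 to obtain the other six faces is an assumption made here, and the function name and helix length of 22 are illustrative.

```python
def face_positions(anchor, helix_length):
    """Return (anchoring positions, additional contact positions) of one helical face.

    Anchors repeat every 7 residues; extra contacts sit at anchor + 3 and anchor + 4.
    """
    anchors = list(range(anchor, helix_length, 7))
    contacts = sorted(
        pos
        for a in anchors
        for pos in (a + 3, a + 4)
        if pos < helix_length
    )
    return anchors, contacts

anchors_a, contacts_a = face_positions(anchor=0, helix_length=22)
```

For anchor 0 and a 22-residue helix this reproduces the positions listed above.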


Combine lipophilicity score Lf and positional entropy Ef of a helical face by simply multiplying them.


Adamian & Liang (2006): Test for TRP channel


Adamian & Liang (2006): discuss failures

Sometimes, binding sites for individual lipids (e.g. cardiolipin) are formed on the surfaces of TM proteins. Those residues will also be highly conserved, and the method will therefore fail.


What is needed for true de novo design of helical bundles?

Aim: explore new TM protein topologies.

Distance-dependent residue-residue force field.

Generate energetically favorable geometries of helix dimers.

Overlay helix dimers to obtain the full protein structure.


Derivation of position scores

(1) For each test protein, collect 1000 similar sequences from the non-redundant database using the BLAST URLAPI.

(2) Generate an initial multiple sequence alignment (MSA) with ClustalW.

Delete fragments < 80% of the length of the query sequence.

To this refined MSA, apply 6 different % identity criteria, yielding 6 final MSAs for each test protein.

Pei & Grishin: need to align ≥ 20 sequences to accurately estimate conservation indices from MSAs.
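The fragment filter (delete sequences shorter than 80% of the query length) can be sketched as below. Counting only non-gap characters and the function name are assumptions for illustration; the toy sequences are made up.

```python
def drop_fragments(query, hits, min_fraction=0.8):
    """Keep only hits whose ungapped length is at least min_fraction of the query length."""
    min_length = min_fraction * len(query)
    return [hit for hit in hits if len(hit.replace("-", "")) >= min_length]

query_seq = "ACDEFGHIKL"              # 10 residues, made up
candidate_hits = [
    "ACDEFGHIKL",                     # full length, kept
    "ACDEFGH",                        # 7 residues, dropped
    "ACDEFGHI",                       # 8 residues, kept (exactly 80%)
    "ACD--GHIKL",                     # 8 residues after removing gaps, kept
]
kept = drop_fragments(query_seq, candidate_hits)
```

After filtering, one would still check that at least 20 sequences remain, in line with the Pei & Grishin recommendation.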


predicting the TM-helix-orientation from sequences

CI: conservation index in MSA

SASA: solvent accessible surface area, relative to a single, free helix

$C(i) = \left[ \sum_j \left( f_j(i) - f_j \right)^2 \right]^{1/2}, \qquad CI(i) = \frac{C(i) - \bar{C}}{\sigma}$

f_j(i): frequency of amino acid j in position i.

f_j: frequency of amino acid j in full alignment.

\bar{C}: average conservation index; \sigma: standard deviation

Positive values: conserved positions

Negative values: variable positions

Test: correct orientation (0,0) has lowest score.

Assumption: lipid-exposed positions are less conserved.
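A standardized conservation index can be sketched as follows, assuming that the raw score of a position is C(i) = sqrt(sum_j (f_j(i) - f_j)^2) and that CI(i) is its z-score over all positions. This reading of the formula, the function name, the gap-free toy alignment, and the use of the population standard deviation are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def conservation_indices(msa):
    """Return one z-scored conservation index per column of a gap-free MSA."""
    n_seq, n_pos = len(msa), len(msa[0])
    overall = Counter("".join(msa))
    f_all = {aa: overall[aa] / (n_seq * n_pos) for aa in AMINO_ACIDS}

    raw = []
    for i in range(n_pos):
        column = Counter(seq[i] for seq in msa)
        raw.append(math.sqrt(sum(
            (column[aa] / n_seq - f_all[aa]) ** 2 for aa in AMINO_ACIDS
        )))

    mean = sum(raw) / n_pos
    sd = math.sqrt(sum((c - mean) ** 2 for c in raw) / n_pos)
    if sd == 0:
        return [0.0] * n_pos
    return [(c - mean) / sd for c in raw]

# Toy alignment: columns 0-2 are invariant, column 3 varies
toy_msa = ["LKAL", "LKAK", "LKAA"]
ci = conservation_indices(toy_msa)
```

In the toy alignment the three invariant columns receive positive values and the variable column a negative one, matching the sign convention stated above. With very small or compositionally biased alignments this measure can behave counterintuitively, which is one reason a sufficient number of sequences is needed.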


Ab initio structure prediction of TM bundles

Aim: construct structural model for a bundle of ideal transmembrane helices.

(1) Construct 12 good geometries for every helix pair AB, BC, CD, DE, EF, FG

(2) Overlay ABCDEFG

“Thin out” the solution space containing ca. 12^6 models:

(a) remove “solutions” where helices collide with each other

(b) delete non-compact “solutions”

(3) Score the remaining 10^6 solutions by sequence conservation

(4) Cluster the 500 best solutions in 8 models

(5) Rigid-body refinement, select 5 models with best sequence conservation.
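The size of the unfiltered solution space follows from combining 12 geometries for each of the 6 consecutive helix pairs, i.e. 12^6 combinations. The sketch below only counts and lazily enumerates index combinations; the collision, compactness, and conservation filters of the actual protocol are not implemented, and all names are illustrative.

```python
from itertools import islice, product

HELIX_PAIRS = ["AB", "BC", "CD", "DE", "EF", "FG"]
GEOMETRIES_PER_PAIR = 12

def count_models(n_pairs=len(HELIX_PAIRS), n_geometries=GEOMETRIES_PER_PAIR):
    """Number of ways to pick one geometry for every helix pair."""
    return n_geometries ** n_pairs

def enumerate_models(n_pairs=len(HELIX_PAIRS), n_geometries=GEOMETRIES_PER_PAIR):
    """Lazily yield tuples of geometry indices, one index per helix pair."""
    return product(range(n_geometries), repeat=n_pairs)

total_models = count_models()
first_models = list(islice(enumerate_models(), 3))
```

Because the generator is lazy, candidate bundles can be filtered one at a time without holding roughly three million combinations in memory.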



Rigid-body refinement


Compare best models with X-ray structures

dark: Model

light: X-ray structure

Additional input: known connectivity of the helices A-B-C-D-E-F-G. Otherwise, the search space would have been too large.

[Figure panels: Halorhodopsin, Bacteriorhodopsin, Sensory Rhodopsin, Rhodopsin, NtpK]



Comparing the best models with X-ray structures


Can one select the best model?

These are our 4 best non-native models of bR.

Because contact between A and E was not imposed, very different topologies were obtained.

In 2006, our methods could not distinguish between these models, but they could serve as input for further experiments.



“Success case”: True de novo model of 4-helix bundle


Predicting lipid-exposure

Aim: derive optimal scale to predict exposure of residues to hydrophobic part of lipid bilayer.

The scale should optimally correlate with SASA, i.e. minimize the quadratic error.

Y: SASA values of the training set (N = 2901 residue positions)

X: profile of residue frequencies from multiple sequence alignment (N × 21 matrix)

β: wanted propensity scale for 20 amino acids + 1 intercept value (21)

Solution for minimization task
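Assuming the minimization task is ordinary linear least squares, with the wanted scale written as β (a symbol chosen here for illustration), Y the vector of SASA values and X the N × 21 profile matrix, the textbook closed-form solution reads:

```latex
\hat{\beta} \;=\; \arg\min_{\beta} \, \lVert Y - X\beta \rVert^{2}
\;=\; \left( X^{T} X \right)^{-1} X^{T} Y
```

This expression requires X^T X to be invertible. If the 20 frequency columns sum to one at every position and an intercept column is also present, the columns are collinear; in that case the Moore-Penrose pseudoinverse, \hat{\beta} = X^{+} Y, gives the minimum-norm least-squares solution instead.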



What does MO scale capture?


Improved prediction of exposure by statistical learning

Prediction method          Prediction accuracy [%]
Beuming & Weinstein        68.7
TMX                        78.7
Yuan ... Teasdale          71.1

Beuming & Weinstein (2004) method


Improved method by statistical learning

The theory of Support Vector Classifiers evolves from a simpler case of optimal separating hyperplanes that, while separating two separable classes, maximize the distance between a separating hyperplane and the closest point from either class.

A: The two classes can be fully separated by a hyperplane, and the optimal separating hyperplane can be obtained by solving Eq. 9.

B: It is not possible to separate the two classes with a hyperplane, and the optimal hyperplane can be obtained by solving Eq. 17.


Stockholm Univ. Sept. 2008

http://service.bioinformatik.uni-saarland.de/tmx/

input: putative TM helices

TopoView draws snake plot (Master thesis, Nadine Schneider)


Top: TMD11, Bottom: TMD12

Top: TMD5, Bottom: TMD12

Summary TMX and related methods

Sequences of TM proteins reveal many powerful features to allow prediction of 2D- and 3D structural features, function, and oligomerization status.

TMX server can predict lipid exposure with ca. 78% accuracy.

Possible applications:

(1) predict transporter pores

(2) predict lipid-exposed surface of TM proteins:

correlate with different membrane composition

collaborate with us: do you have lots of solubility data?

(3) Conserved surface residues may indicate interaction sites

