42
Modeling the Structures of Macromolecular Assemblies Depts. of Biopharmaceutical Sciences and Pharmaceutical Chemistry California Institute for Quantitative Biomedical Research University of California at San Francisco Frank Alber Maya Topf Andrej Šali http:// salilab.org/ UC SF

Modeling the Structures of Macromolecular Assemblies

  • Upload
    avak

  • View
    50

  • Download
    0

Embed Size (px)

DESCRIPTION

U. C. S. F. Modeling the Structures of Macromolecular Assemblies. Depts. of Biopharmaceutical Sciences and Pharmaceutical Chemistry California Institute for Quantitative Biomedical Research University of California at San Francisco. Frank Alber Maya Topf Andrej Šali http://salilab.org/. - PowerPoint PPT Presentation

Citation preview

Page 1: Modeling the Structures of  Macromolecular Assemblies

Modeling the Structures of Macromolecular Assemblies

Depts. of Biopharmaceutical Sciences and Pharmaceutical ChemistryCalifornia Institute for Quantitative Biomedical Research

University of California at San Francisco

Frank Alber

Maya Topf

Andrej Šali

http://salilab.org/

UCSF

Page 2: Modeling the Structures of  Macromolecular Assemblies

Contents

Modeling of assemblies by optimization (5’, Andrej)

Representation for modeling, illustrated by NPC (15’, Frank)

Comparative modeling (10’, Andrej)

Combined comparative modeling and density fitting (10’, Maya)

Page 3: Modeling the Structures of  Macromolecular Assemblies

Acknowledgmentshttp://salilab.org

Yeast NPC

Tari SupraptoJulia KipperWenzhu ZhangLiesbeth VeenhoffSveta DokudovskayaMike RoutBrian Chait

Yeast NPC

Tari SupraptoJulia KipperWenzhu ZhangLiesbeth VeenhoffSveta DokudovskayaMike RoutBrian Chait

Our group/assemblies

Frank AlberMaya TopfFred DavisDmitry KorkinM.S. MadhusudhanDamien DevosMarc A. Martí-RenomMike Kim

Bino JohnNarayanan EswarUrsula PieperMin-yi Shen

Our group/assemblies

Frank AlberMaya TopfFred DavisDmitry KorkinM.S. MadhusudhanDamien DevosMarc A. Martí-RenomMike Kim

Bino JohnNarayanan EswarUrsula PieperMin-yi Shen

E. coli ribosome

H.Gao J.SenguptaM.Valle A.Korostelev S.StaggP.Van RoeyR.AgrawalS.HarveyM.ChapmanJ.Frank

E. coli ribosome

H.Gao J.SenguptaM.Valle A.Korostelev S.StaggP.Van RoeyR.AgrawalS.HarveyM.ChapmanJ.Frank

EM

Wah ChiuMatt BakerWen Jiang

Chandrajit Bajaj

EM

Wah ChiuMatt BakerWen Jiang

Chandrajit Bajaj

NIHNSFSinsheimer FoundationA. P. Sloan FoundationBurroughs-Wellcome FundMerck Genome Res. Inst.Mathers FoundationI.T. Hirschl FoundationThe Sandler Family FoundationSUNIBMIntelStructural Genomix

NIHNSFSinsheimer FoundationA. P. Sloan FoundationBurroughs-Wellcome FundMerck Genome Res. Inst.Mathers FoundationI.T. Hirschl FoundationThe Sandler Family FoundationSUNIBMIntelStructural Genomix

Structural GenomicsStephen Burley (SGX)John Kuriyan (UCB)NY-SGRC

Structural GenomicsStephen Burley (SGX)John Kuriyan (UCB)NY-SGRC

Page 4: Modeling the Structures of  Macromolecular Assemblies

Protein and assembly structure by experiment and computation

Sali, Earnest, Glaeser, Baumeister. From words to literature in structural proteomics. Nature 422, 216-225, 2003.

Page 5: Modeling the Structures of  Macromolecular Assemblies

Determining the structures of proteins and assemblies

Use structural information from any

source: measurement, first principles, rules,

resolution: low or high resolution

to obtain the set of all models that are consistent with it.

Page 6: Modeling the Structures of  Macromolecular Assemblies

Modeling proteins and macromolecular assemblies by satisfaction of spatial restraints

1) Representation of a system.

2) Scoring function (spatial restraints).3) Optimization.

Sali, Earnest, Glaeser, Baumeister. Nature 422, 216-225, 2003.

There is nothing but points and restraints on them.

Page 7: Modeling the Structures of  Macromolecular Assemblies

Assembly representation for modeling(visualization)

Multi-scale representation:Resolution of subunit structures can vary largely.

Hierarchical representation:Spatial information about subunit-subunit interactions may be available at different resolutions.

Flexibility of subunits:Subunit structures may be subject to conformational changes upon assembly formation.

Points and restraints between them (distance, angle dihedral restraints).

Page 8: Modeling the Structures of  Macromolecular Assemblies

Software implementation

Assembler, extension of MODELLER

Provides a framework for assembly modelling:

• multi-scale hierarchical representations,

• integration of experimental input data,

• calculates models that are consistent with input data.

Page 9: Modeling the Structures of  Macromolecular Assemblies

Resolution of restraints

Site-directedmutagenesis

Cross-linkingCryo-EM/tomographyFRET

Computational docking

1Å 300Å

Site-directedmutagenesis

Cross-linking

FRET

Binding sitepredictions

Computational docking

Correlatedmutations

Binding sitepredictions

Correlatedmutations

Site-directedmutagenesis

Cross-linking

FRET

Binding sitepredictions

Computational docking

Cryo-EM/tomography

Tandem affinity purification

FRET

Site-directedmutagenesis

Binding sitepredictions

Computational docking

Cryo-EM/tomography

Cross-linking

Immuno-EM

Yeast-two-hybrid

Tandem affinity purification

FRET

Site-directedmutagenesis

Binding sitepredictions

Computational docking

Cryo-EM/tomography

Cross-linking

Page 10: Modeling the Structures of  Macromolecular Assemblies

Resolution of restraints

Tandem affinity purification

FRET

Site-directedmutagenesis

Immuno-EM

Yeast-two-hybrid

Binding sitepredictions

Computational docking

Cryo-EM/tomography

Cross-linking

Tandem affinity purification

FRET

Site-directedmutagenesis

Binding sitepredictions

Computational docking

Cryo-EM/tomography

Cross-linking Cryo-EM/tomography

Site-directedmutagenesis

Cross-linking

FRET

Cryo-EM/tomography

Site-directedmutagenesis

Cross-linking

FRET

Cryo-EM/tomography

Site-directedmutagenesis

Cross-linking

FRET

Binding sitepredictions

Computational docking

Cryo-EM/tomography

1Å 300Å

Page 11: Modeling the Structures of  Macromolecular Assemblies

Binding patches

Properties of points

Define interaction centers at each structural level.

Interaction centers transmit interactions between assembly subunits.

Interaction centers:

Page 12: Modeling the Structures of  Macromolecular Assemblies

Allow or restrict conformational degrees of freedom within assembly subunits.

Subunit flexibility:

Design alternative internal restraints sets for different conformational states.

Properties of points

Define interaction centers at each structural level.

Interaction centers transmit interactions between assembly subunits.

Interaction centers:

Page 13: Modeling the Structures of  Macromolecular Assemblies

Modeling the Yeast Nuclear Pore complex by satisfaction of spatial

restraints

Depts. Of Biopharmaceutical Sciences and Pharmaceutical ChemistryCalifornia Institute for Quantitative Biomedical Research

University of California at San Francisco

Andrej Sali

Frank Alber, Damien DevosMike Rout, T. Suprapto, J. Kipper, L. Veenhoff,S. Dokudovskaya

Brian Chait, W. Zhang

The Rockefeller University, 1230 York Avenue, New York

http://salilab.org/

Page 14: Modeling the Structures of  Macromolecular Assemblies

Integrate spatial information

NUPLocalization

NUPStoichiometry

NUP- NUP Interactions

NU

P S

hap

e SymmetryGlobal shape

1) 2) 3)

Page 15: Modeling the Structures of  Macromolecular Assemblies

Successful optimization

Page 16: Modeling the Structures of  Macromolecular Assemblies

Analysis of models

Assessing the well scoring models:How similar are the models to each other?

Do the models make sense given other data?

Search for conservation of :Protein-protein contacts

Structural features

Visualization of the results:Probability distribution of subunit positions (density maps)

Page 17: Modeling the Structures of  Macromolecular Assemblies

Comparative modeling

Page 18: Modeling the Structures of  Macromolecular Assemblies

Steps in Comparative Protein Structure Modeling

No

Target – TemplateAlignment

MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGEASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE

Model Building

START

ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPERASFQWMNDK

TARGET

Template Search

TEMPLATE

OK?

Model Evaluation

END

Yes

A. Šali, Curr. Opin. Biotech. 6, 437, 1995.R. Sánchez & A. Šali, Curr. Opin. Str. Biol. 7, 206, 1997.M. Marti et al. Ann. Rev. Biophys. Biomolec. Struct., 29, 291, 2000. http://salilab.org/

Page 19: Modeling the Structures of  Macromolecular Assemblies

Comparative modeling by satisfaction of spatial restraints MODELLER

3D GKITFYERGFQGHCYESDC-NLQP…SEQ GKITFYERG---RCYESDCPNLQP…

1. Extract spatial restraints

F(R) = pi (fi /I)i

2. Satisfy spatial restraints

A. Šali & T. Blundell. J. Mol. Biol. 234, 779, 1993.J.P. Overington & A. Šali. Prot. Sci. 3, 1582, 1994.A. Fiser, R. Do & A. Šali, Prot. Sci., 9, 1753, 2000.

http://salilab.org/

Page 20: Modeling the Structures of  Macromolecular Assemblies

Comparative modeling of the TrEMBL database

Unique sequences processed: 1,182,126

Sequences with fold assignments or models: 659,495 (56%)

9/15/03 ~4 weeks on 660 Pentium CPUs

70% of models based on <30% sequence identity to template.

On average, only a domain per protein is modeled(an “average” protein has 2.5 domains of 175 aa).

Page 21: Modeling the Structures of  Macromolecular Assemblies

http://salilab.org/modbasePieper et al., Nucl. Acids Res. 2002.

Page 22: Modeling the Structures of  Macromolecular Assemblies

Structural Genomics

Characterize most protein sequences based on related known structures.

There are ~16,000 30% seq id families (90%)(Vitkup et al. Nat. Struct. Biol. 8, 559, 2001).

Sali. Nat. Struct. Biol. 5, 1029, 1998.Sali et al. Nat. Struct. Biol., 7, 986, 2000.Sali. Nat. Struct. Biol. 7, 484, 2001.Baker & Sali. Science 294, 93, 2001.

The number of “families” is much smaller than the number of proteins.

Any one of the members of a family is fine.

Characterize most protein sequences based on related known structures.

Page 23: Modeling the Structures of  Macromolecular Assemblies

Comparative modeling and EM density fitting

Page 24: Modeling the Structures of  Macromolecular Assemblies

E. coli ribosomeH.Gao, J.Sengupta, M.Valle, A. Korostelev, N.Eswar, S.Stagg, P.Van Roey, R.Agrawal, S.Harvey, A.Sali, M.Chapman, and J.Frank. Cell 113, 789-801, 2003.

1. Cryo-EM density map of the 70S ribosome of E. coli was determined at ~12Å.

2. Calculate comparative models of the proteins and rRNA based on T. thermophilus, H. marismortui, and D. radiodurans ribosome structures.

3. For each protein, select the best model among several alternatives based on model assessment by statistical potentials, model compactness, target-template sequence identity, quality of template structure, and fit into the EM map.

4. Obtain an assembly model by multi-rigid body real-space refinement of the fit into the EM map, by RSRef (M. Chapman).

5. Repeat 3 and 4 iteratively to get an optimized model.

Page 25: Modeling the Structures of  Macromolecular Assemblies

E. coli ribosome

Fitting of comparative models into 12Å cryo-electron density map.

All 29 proteins in 30S and 29 of 36 proteins in the 50S subunit could be modeled on 22-73% seq.id. to a known structure.

The models generally cover whole folds, but tails.

H.Gao, J.Sengupta, M.Valle, A. Korostelev, N.Eswar, S.Stagg, P.Van Roey, R.Agrawal, S.Harvey, A.Sali, M.Chapman, and J.Frank. Cell 113, 789-801, 2003.

Page 26: Modeling the Structures of  Macromolecular Assemblies

Errors in comparative models vs. resolution

Distortion/shifts in

aligned regions

Region without

a template

Sidechain packingIncorrect template MisalignmentRigid body

movement

Maps courtesy of Wah Chiu.

20Å 10Å 2Å

Page 27: Modeling the Structures of  Macromolecular Assemblies

Moulding: iterative alignment, model building, model assessment

model building

alignment

model assessment

model building

alignment

model assessment

Alignments

Mod

els

per

alig

nmen

t Comparative modeling

Threading

Moulding

1 104 1030

105

1

104

B. John, A. Sali. Nucl. Acids Res. 31, 3982, 2003.

Page 28: Modeling the Structures of  Macromolecular Assemblies

Start

Sequence profiles of target & template

Generate 25 initial target-template alignments

Build all atom models for 15 parent alignments

Rank models by the GA341 score

Best GA341 score < 0.6?

Apply genetic algorithm operators to parents

Number of child alignments >300?

Select representative alignments

Build all-atom models for all representative alignments

Select best 10 models by statistical potential score(parents for next iteration)

Number of iterations >25?

No

No

No

Rank models by composite score

Stop

Select best model

Rank alignments by alignment score (top 15 parents)

Start

Sequence profiles of target & template

Generate 25 initial target-template alignments

Build all atom models for 15 parent alignments

Rank models by the GA341 score

Best GA341 score < 0.6?

Apply genetic algorithm operators to parents

Number of child alignments >300?

Select representative alignments

Build all-atom models for all representative alignments

Select best 10 models by statistical potential score(parents for next iteration)

Number of iterations >25?

No

No

No

Rank models by composite score

Stop

Select best model

Rank alignments by alignment score (top 15 parents)

Moulding

alignment

model building

alignment

model assessment

Page 29: Modeling the Structures of  Macromolecular Assemblies

Application to a difficult modeling case 1BOV-1LTS (4.4% sequence identity)

1lts structure

1lts model

C RMSD 10.1Å

initial

C RMSD 3.6Å

final

Page 30: Modeling the Structures of  Macromolecular Assemblies

Given a sequence-structure pair and an EM map: Exploring alignment (CM) and rigid body (EM)

degrees of freedom

1. Generate a set of comparative models by a standard procedure.

2. Select the model that fits the EM map best.

More generally:

Use the quality of the best fit into the EM map as one of the

model fitness criteria in MOULDER.

With Wah Chiu et al.

Page 31: Modeling the Structures of  Macromolecular Assemblies

Moulding protocol

model building

alignment

model fitting

model assessment

helixhunter, sheehunter

PROF, PSIPRED

MODELLER

Initial alignment

fold assignment

foldhunter,

EM docking algorithm

GA341 score PROSA score

EM fitting score

MODELLER

S

E

model building

alignment

model assessment

Page 32: Modeling the Structures of  Macromolecular Assemblies

Application: Rice Dwarf Virus (6.8Å EM)

Fold assignment: HELIXHUNTER and DEJAVU - 2 templates:

Upper domain (beta sheet) - bluetongue virus VP7 (1bvp)

Lower domain (helical) - rotavirus VP6 (1qhd)

Generate initial alignments : HELIXHUNTER and secondary structure prediction methods

MOULDING:

Iterative alignment - restrained by EM data

Model building with MODELLER, loop optimization

Model assessment (GA341 score and PROSA score)

Model fitting (FOLDHUNTER)

With Wah Chiu et al.

Page 33: Modeling the Structures of  Macromolecular Assemblies

loop optimization: EM-derived restraints + MM forcefield

Upper domain - good fit

Page 34: Modeling the Structures of  Macromolecular Assemblies

MODELLER-derived restraints

Lower domain - to improve fit

EM-derived restraints

best fitted structure

model fitting

Page 35: Modeling the Structures of  Macromolecular Assemblies

END

Page 36: Modeling the Structures of  Macromolecular Assemblies

Acknowledgmentshttp://salilab.org

Comparative ModelingBino JohnNarayanan EswarUrsula PieperMarc A. Martí-Renom

Structural GenomicsStephen Burley (SGX)John Kuriyan (UCB)NY-SGRC

NIHNSFSinsheimer FoundationA. P. Sloan FoundationBurroughs-Wellcome FundMerck Genome Res. Inst.Mathers FoundationI.T. Hirschl FoundationThe Sandler Family FoundationSUNIBMIntelStructural Genomix

RibosomesH.Gao J.SenguptaM.Valle A.Korostelev S.StaggP.Van RoeyR.AgrawalS.HarveyM.ChapmanJ.Frank

LSDPatsy Babbitt, Fred Cohen, Ken Dill, Tom Ferrin, John Irwin, Matt Jacobson, Tanja Kortemme, Tack Kuntz, Brian Shoichet, Chris Voigt

Page 37: Modeling the Structures of  Macromolecular Assemblies

Analysis

Score

Assessing the well scoring modelsHow similar are the models to each other?

Do the models make sense given other data?

Using “toy” models as benchmarks.

Page 38: Modeling the Structures of  Macromolecular Assemblies

Analysis

Search conservation of :Protein-protein contacts

Structural features

freq

uen

cy

NUP type

NU

P t

yp

e

Page 39: Modeling the Structures of  Macromolecular Assemblies

E. coli ribosomeH.Gao, J.Sengupta, M.Valle, A. Korostelev, N.Eswar, S.Stagg, P.Van Roey, R.Agrawal, S.Harvey, A.Sali, M.Chapman, and J.Frank. Cell 113, 789-801, 2003.

1. …

2. Calculate comparative models of the proteins and rRNA based on T. thermophilus, H. marismortui, and D. radiodurans ribosome structures.

3. For each protein, select the best model among several alternatives based on model assessment by statistical potentials, model compactness, target-template sequence identity, quality of template structure, and fit into the EM map.

4. Obtain the final assembly model by rigid body real-space refinement of the fit into the EM map, by RSRef (M. Chapman).

5. Repeat 3 and 4.

Page 40: Modeling the Structures of  Macromolecular Assemblies

Combining Modeller and Docking of Atomic Models into EM Density Maps (ModEM?)

Page 41: Modeling the Structures of  Macromolecular Assemblies

Improving resolution and accuracy of molecular models determined by cryo-EM

and comparative protein structure prediction

Map courtesy of Wah Chiu.

Page 42: Modeling the Structures of  Macromolecular Assemblies

Typical errors in comparative models

Distortion/shifts in

aligned regions

Region without a

template

Sidechain packing

Incorrect template

MODEL

X-RAY

TEMPLATE

Misalignment

Marti-Renom et al. Annu.Rev.Biophys.Biomol.Struct. 29, 291-325, 2000.