Computational Protein Design
1. Challenges in Protein Engineering

Pablo Carbonell
iSSB, Institute of Systems and Synthetic Biology
Genopole, University d’Évry-Val d’Essonne, France

mSSB: December 2010


Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation


Protein Engineering

Protein engineering is a technology that alters protein structures in order to improve their properties in applications such as pharmaceuticals, green chemistry, and biofuels.

The main challenge is to build more accurate models to predict which substitutions are the best candidates to introduce into the parent protein in order to enhance the desired property.

Both experimental data and in silico predictions can contribute to the model.

[Figure: The Protein Engineering Cycle]

[Figure: Computational Protein Design in the Engineering Cycle]

Locating the Substitutions

How to select the best residues to mutate in the parent protein?

If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design.

When partial information on the structure is available, a semi-rational approach is used.

If there is no information available, then a random search is used.

[Figure: Choosing the Right Strategy]

Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants.

To avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations in order to find cooperativity effects. Testing for simultaneous mutations comes at the cost of a larger screening effort.

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed a larger search of the sequence space.

[Figure: Additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al. 2008]]
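The cost of testing simultaneous mutations can be made concrete with a small counting sketch. It assumes 20 standard amino acids and counts variants that carry exactly k substitutions among n candidate positions; the function name and the choice of 10 candidate positions are illustrative assumptions, not taken from the slides.

```python
from math import comb

AMINO_ACIDS = 20  # standard proteinogenic amino acids (assumption of this sketch)


def library_size(n_positions: int, n_simultaneous: int) -> int:
    """Count variants with exactly `n_simultaneous` substitutions
    chosen among `n_positions` candidate sites."""
    # Choose which sites are mutated, then one of the 19 non-parent
    # residues at each mutated site.
    return comb(n_positions, n_simultaneous) * (AMINO_ACIDS - 1) ** n_simultaneous


if __name__ == "__main__":
    # Hypothetical example: 10 candidate positions.
    for k in (1, 2, 3):
        print(f"{k} simultaneous substitution(s): {library_size(10, k)} variants")
```

Under these assumptions the library grows by roughly two orders of magnitude per additional simultaneous substitution, which is why cooperativity searches need a larger screen.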


Types of Protein Interactions

Protein-ligand binding (drug-target, enzyme-substrate)

Protein-nucleotide (DNA/RNA) binding

Protein-peptide interaction

Protein-protein interaction

[Figure: Protein-protein interactions. Adapted from [Perkins et al. 2010]]

Protein-protein complexes: homo-oligomeric or hetero-oligomeric; non-obligate or obligate; transient (weak and strong) or permanent [Nooren and Thornton 2003]

Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

[Figure: binding models. Lock and key [Fischer 1894]; induced fit [Koshland 1958]; conformational selection [Boehr et al. 2009]]

Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell.

PPI screening methods still have some limitations:
  Y2H: high false-positive (FP) rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)

Transient and PTM-dependent interactions are often missed.

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners.

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs: under debate.

[Figure: single-interface, multi-interface [Kim et al. 2006]]

Data Sources

Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites

Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  Building the network: Cytoscape

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.

Goals
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity, or affinity to interacting partners
  Increased stability

Property: Parameters
  Thermostability: $T_{50}$
  Catalytic activity: $k_{cat}$, $K_M$, $k_{cat}/K_M$
  Binding specificity: $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
  Binding affinity: $K_a = 1/K_d$, $\Delta G = -RT \ln (1/K_d)$

A paradigm shift in the last 2 decades
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms

Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.

Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes.

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions.

Main design techniques
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
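As an illustration of how a T50 value might be read off an inactivation experiment, the sketch below linearly interpolates the temperature at which the residual activity crosses 50%. The interpolation scheme, the function name, and the example data are assumptions made for this sketch; the slides do not prescribe a fitting method.

```python
def estimate_t50(temps, residual_activity):
    """Linearly interpolate the temperature at which residual activity
    (fraction of the unheated control after a fixed incubation) crosses 0.5."""
    points = sorted(zip(temps, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Look for the interval in which activity drops through 50%.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 50% in the measured range")


if __name__ == "__main__":
    # Hypothetical data: temperatures in degrees Celsius, 10-minute incubations.
    temps = [40, 50, 60, 70]
    activity = [1.0, 0.9, 0.3, 0.0]
    print(f"Estimated T50: {estimate_t50(temps, activity):.1f} C")
```

In practice a sigmoidal fit over more temperature points would usually be preferred; linear interpolation is used here only to keep the example short.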

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity: the Michaelis-Menten model of kinetics

$$ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1} $$

$$ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} $$

$$ \frac{d[P]}{dt} = k_2 [ES] \tag{3} $$

$k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases, $k_{cat}$ is a function of several rates).

$k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate.
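The rate equations (1) to (3) can be integrated numerically to see how the species evolve. The sketch below uses a plain forward-Euler step, chosen for brevity rather than accuracy, and the rate constants and concentrations in the usage example are hypothetical values in arbitrary units.

```python
def simulate_mm(k1, k_minus1, k2, e0, s0, dt=1e-4, t_end=1.0):
    """Integrate E + S <-> ES -> E + P with forward-Euler steps.

    Returns the final concentrations (E, S, ES, P)."""
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = k1 * e * s        # E + S -> ES
        unbind = k_minus1 * es   # ES -> E + S
        cat = k2 * es            # ES -> E + P
        e += (-bind + unbind + cat) * dt
        s += (-bind + unbind) * dt
        es += (bind - unbind - cat) * dt   # equation (2)
        p += cat * dt                      # equation (3)
    return e, s, es, p


if __name__ == "__main__":
    # Hypothetical parameters: enzyme much less abundant than substrate.
    e, s, es, p = simulate_mm(k1=100.0, k_minus1=10.0, k2=10.0, e0=0.01, s0=1.0)
    print(f"[E]={e:.5f} [S]={s:.5f} [ES]={es:.5f} [P]={p:.5f}")
```

With the enzyme at a much lower concentration than the substrate, [ES] settles quickly to a nearly constant value while [S] and [P] keep changing.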

Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$, so it is treated as approximately constant:

$$ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} $$

Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:

$$ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} $$

The Michaelis constant KM

Substituting $[E] = [E]_0 - [ES]$ from (5) into (4):

$$ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} $$

$$ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} $$

$$ [S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8} $$

$$ [ES] = \frac{[E]_0 [S]}{\frac{k_{-1} + k_2}{k_1} + [S]} \tag{9} $$

$K_M$: Michaelis constant

$$ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} $$

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux):

$$ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} $$

With $v_{max} = k_2 [E]_0$:

$$ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} \, v_{max} \tag{12} $$

$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum:

$$ v = \frac{v_{max}}{2} \tag{13} $$
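The half-maximum definition of the Michaelis constant can be checked numerically. This sketch evaluates the Michaelis-Menten rate law and then locates, by bisection, the substrate concentration at which the rate equals half of the maximum. The function names, the search interval, and the parameter values are assumptions of the example.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)


def half_saturation(rate_fn, vmax, s_lo=0.0, s_hi=1e3, tol=1e-9):
    """Find [S] at which rate_fn([S]) == vmax / 2 by bisection.

    Assumes rate_fn is increasing and crosses vmax / 2 inside [s_lo, s_hi]."""
    target = vmax / 2
    while s_hi - s_lo > tol:
        mid = (s_lo + s_hi) / 2
        if rate_fn(mid) < target:
            s_lo = mid
        else:
            s_hi = mid
    return (s_lo + s_hi) / 2


if __name__ == "__main__":
    vmax, km = 0.1, 0.2  # hypothetical values, arbitrary units
    s_half = half_saturation(lambda s: mm_rate(s, vmax, km), vmax)
    print(f"[S] at half-maximal rate: {s_half:.6f} (KM = {km})")
```

For data measured in the lab, the same idea applies to a fitted curve rather than to the exact rate law used here.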

[Figure: Determining KM from the concentration curve]

Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative reaction rates of pairs of substrates transformed by an enzyme.

For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:

$$ \frac{v_A}{v_B} = \frac{(k^A_{cat}/K^A_M)\,[S_A]}{(k^B_{cat}/K^B_M)\,[S_B]} \tag{14} $$

At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency.
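Equation (14) can be cross-checked against the standard rate law for two substrates competing for the same active site. That competing-substrate expression is not given in the slides and is introduced here as an assumption, as are the function names and the kinetic parameters in the example.

```python
import math


def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """Relative rate v_A / v_B from the specificity constants, equation (14)."""
    return (kcat_a / km_a * s_a) / (kcat_b / km_b * s_b)


def competing_rates(e0, kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """Rates for two substrates competing for one enzyme.

    Assumed textbook form: each substrate acts as a competitive inhibitor
    of the other, so both rates share the same denominator."""
    denom = 1.0 + s_a / km_a + s_b / km_b
    v_a = kcat_a * e0 * (s_a / km_a) / denom
    v_b = kcat_b * e0 * (s_b / km_b) / denom
    return v_a, v_b


if __name__ == "__main__":
    # Hypothetical kinetic parameters for substrates A and B.
    params = dict(kcat_a=100.0, km_a=0.1, s_a=1.0, kcat_b=10.0, km_b=1.0, s_b=1.0)
    v_a, v_b = competing_rates(e0=0.01, **params)
    print(math.isclose(v_a / v_b, rate_ratio(**params)))
```

Because the shared denominator cancels, the ratio depends only on the specificity constants and the substrate concentrations, independent of the enzyme concentration.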

Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate

Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.

Protein-protein interaction
  Permanent or obligate: in multi-unit proteins; it can have a structural or functional role
  Transient: in signaling, transport, and regulation

3.1 Protein Binding Affinity

Dissociation constant

$$ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} $$

$$ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1}[AB] \tag{16} $$

At equilibrium:

$$ 0 = k_1 [A][B] - k_{-1}[AB] \tag{17} $$

$$ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} $$

3.1 Protein Binding Affinity

Affinity constant

$$ k_a = \frac{1}{k_d} \tag{19} $$

In antibodies:

$$ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} $$

Binding free energy

$$ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} $$
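Equations (18), (19), and (21) translate directly into a few lines of code. The sketch assumes the dissociation constant is expressed in mol/L relative to a 1 M standard state and uses a temperature of 298.15 K; the rate constants in the example are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def dissociation_constant(k_on, k_off):
    """k_d = k_-1 / k_1, equation (18)."""
    return k_off / k_on


def affinity_constant(kd):
    """k_a = 1 / k_d, equation (19)."""
    return 1.0 / kd


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(1 / k_d) in kJ/mol, equation (21).

    Assumes kd is given in mol/L relative to a 1 M standard state."""
    return -R * temperature * math.log(affinity_constant(kd))


if __name__ == "__main__":
    # Hypothetical association and dissociation rate constants.
    kd = dissociation_constant(k_on=1e6, k_off=1e-3)
    print(f"kd = {kd:.2e} M, dG = {binding_free_energy(kd):.2f} kJ/mol")
```

Tighter binding (smaller k_d) gives a more negative binding free energy, so each tenfold improvement in k_d contributes the same fixed increment at a given temperature.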

Simplified Thermodynamics of an Enzymatic Reaction

[Figure from [Jonas and Hollfelder in Protein Engineering Handbook (2009)]. Labels: ground-state binding ($K_M$); transition-state binding ($K_{tx}$)]

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation.

Binding specificity for a given partner is determined by comparing either $k_{cat}/K_M$, $k_a$, or $k_d$ across all partners.

$K_I$: inhibition constant, used when an inhibitor competes with a ligand.

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands).
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one.

Allostery: regulation of a protein by binding of some ligand (the effector).

Thermodynamics of a Reaction with 2 Competing Substrates

[Figure from [Desari and Miller in Protein Engineering Handbook (2009)]]

Specificity reflects differences in the absolute heights of the transition states.

Introducing the Substitutions

Site-directed (saturation) mutagenesis
  1. The DNA of interest is cloned into a plasmid vector.
  2. The plasmid DNA is denatured to produce single strands.
  3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region.
  4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template.
  5. The heteroduplex is propagated by transformation in E. coli.

Error-prone PCR
  Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase.
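The effect of an elevated polymerase error rate can be mimicked in silico. The toy model below copies a template and, at each position, substitutes a different base with a fixed probability. It is a deliberately simplified assumption: real error-prone PCR has biased mutation spectra and accumulates errors over cycles, and the template and error rate used here are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability `error_rate`."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Misincorporation: pick one of the three other bases uniformly.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def make_library(template, error_rate, size, seed=0):
    """Generate `size` independent error-prone copies of `template`."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


if __name__ == "__main__":
    template = "ATGGCC" * 10  # hypothetical 60-nt gene fragment
    library = make_library(template, error_rate=0.02, size=5)
    for variant in library:
        n_mut = sum(a != b for a, b in zip(template, variant))
        print(variant, n_mut)
```

Tuning `error_rate` trades library diversity against the fraction of variants that still fold and function.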


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination.

Circular permutation: to alter protein topology

DNA shuffling: to perform functional domain or motif shuffling in vitro
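Both operations can be illustrated on sequence strings. The sketch below builds a chimera by choosing a random parent for each segment between crossover points, and builds a circular permutant by moving the sequence start. It assumes pre-aligned parents of equal length and uniformly random crossover points, which is a simplification of real homology-dependent shuffling; all names and sequences are illustrative.

```python
import random


def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimera from aligned, equal-length parent sequences."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    # Pick distinct internal crossover points and cut the sequence there.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)  # each segment comes from a random parent
        pieces.append(donor[start:end])
    return "".join(pieces)


def circular_permutation(seq, new_start):
    """Join the original termini and open the chain at `new_start`."""
    return seq[new_start:] + seq[:new_start]


if __name__ == "__main__":
    rng = random.Random(1)
    parents = ["AAAAAAAAAA", "CCCCCCCCCC"]  # hypothetical aligned parents
    print(shuffle_parents(parents, n_crossovers=3, rng=rng))
    print(circular_permutation("ABCDEF", 2))
```

Because each segment is inherited intact from a parent, a chimera can combine mutations from several parents in a single step.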

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into soluble form when expressed in E. coli.

Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases.
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation: proteolysis

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
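The iterative process can be sketched as a simple loop: mutate the current parent into a library, screen the library with a fitness function, and keep the best variant if it improves on the parent. Everything specific here is an assumption made for illustration: single point mutations per variant, a greedy selection rule, and a toy fitness that counts matches to a hypothetical target sequence.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Return a copy of `seq` with one random single-residue substitution."""
    pos = rng.randrange(len(seq))
    new = rng.choice([a for a in AMINO_ACIDS if a != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]


def directed_evolution(start, fitness, rounds=20, library_size=50, seed=0):
    """Greedy rounds of library creation and screening."""
    rng = random.Random(seed)
    parent = start
    for _ in range(rounds):
        # Library creation: random point mutants of the current parent.
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Screening: keep the best variant only if it beats the parent.
        best = max(library, key=fitness)
        if fitness(best) > fitness(parent):
            parent = best
    return parent


if __name__ == "__main__":
    target = "MKTAYIAK"  # hypothetical optimum used only by the toy fitness

    def toy_fitness(seq):
        return sum(a == b for a, b in zip(seq, target))

    evolved = directed_evolution("AAAAAAAA", toy_fitness)
    print(evolved, toy_fitness(evolved))
```

The greedy rule accepts only improving single mutations, so it can stall at the dead ends that motivate libraries with simultaneous mutations.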

[Figure: From Natural Enzymes to Protein Engineering to Computational Protein Design]

Bibliography

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789-796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343-351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938-1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98-104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486-3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233-1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 2: Computational Protein Design. 1. Challenges in Protein Engineering

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 2 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 3 40

Protein Engineering

Protein engineering is a technology that alters protein structures in order toimprove their properties in applications such as pharmaceuticals green chemistryand biofuels

The main challenge is to build more accurate models to predict whichsubstitutions are the best candidates to insert in the parent protein in order toenhance the desired property

Both experimental data and in silico predictions can contribute to the model

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 4 40

Protein Engineering

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 5 40

The Protein Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40

Computational Protein Design in the Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40

Locating the Substitutions

How to select the best residues to mutate in theparent protein

If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design

When partial information on structure isavailable a semi-rational approach is used

If there is no information available then arandom search is used

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40

Choosing the Right Strategy

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40

Additivity and Cooperativity Effects

Additivity of the effects of substitutions israrely seen when screening mutants

In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening

Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity

antibody variants

[Chodorge et al 2008]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:

Multidomain proteins usually require the assistance of folding modulators such as chaperones and/or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:

Insoluble aggregation into inclusion bodies

Degradation (proteolysis)

[Figure: E. coli expressing human leptin as inclusion bodies]


Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:

Identifying a good starting sequence, usually one containing some level of latent promiscuity

Creating a library of variants

Selecting variants with improved function (mutation and screening)
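The iterative process above can be sketched as a loop of library creation and screening. Everything here is a stand-in: the fitness function simply counts matches to a hypothetical target sequence (a real campaign would use an assay), and the names `mutate`, `toy_fitness`, and `directed_evolution` are illustrative only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rng):
    """Introduce one random amino acid substitution."""
    pos = rng.randrange(len(seq))
    new = rng.choice([a for a in AMINO_ACIDS if a != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]

def toy_fitness(seq, target):
    """Assumed stand-in for a screen: number of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(parent, target, rounds, library_size, rng):
    best = parent
    history = [toy_fitness(best, target)]
    for _ in range(rounds):
        # Library creation: single-substitution variants of the current best sequence.
        library = [mutate(best, rng) for _ in range(library_size)]
        # Screening: keep the top variant only if it improves on its parent.
        candidate = max(library, key=lambda s: toy_fitness(s, target))
        if toy_fitness(candidate, target) > toy_fitness(best, target):
            best = candidate
        history.append(toy_fitness(best, target))
    return best, history

best, history = directed_evolution("MKTAYIAK", "MRTGYLAK", rounds=20,
                                   library_size=50, rng=random.Random(1))
print(best, history)
```

Because only improving variants are accepted, the recorded fitness never decreases; a single-substitution walk like this is also exactly the kind of search that can stall when beneficial effects require simultaneous mutations.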


From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography I

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007



Computational Protein Design in the Engineering Cycle


Locating the Substitutions

How to select the best residues to mutate in the parent protein?

If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design.

When partial information on structure is available, a semi-rational approach is used.

If there is no information available, then a random search is used.


Choosing the Right Strategy


Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants.

To avoid dead ends, a screening strategy is typically built on libraries with simultaneous mutations, in order to find cooperativity effects.

Testing for simultaneous mutations comes at the cost of a larger screening effort.

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of the sequence space.

[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al 2008]]
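Two quantities make the trade-off concrete: how fast the library grows with the number of simultaneously randomized positions, and how far a double mutant departs from additivity. The sketch below is an illustration under stated assumptions: full randomization over 20 amino acids at each position, effects expressed as free-energy changes relative to the parent, and hypothetical function names and numbers.

```python
def library_size(n_positions, alphabet=20):
    """Number of protein variants when n positions are fully randomized at once."""
    return alphabet ** n_positions

def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Departure from additivity: effect of the double mutant minus the sum of the singles.

    Zero means the two substitutions are additive; a non-zero value indicates
    cooperativity (the sign depends on the convention chosen for the effects).
    """
    return ddg_ab - (ddg_a + ddg_b)

for n in (1, 2, 3, 4):
    print(n, library_size(n))

# Hypothetical values in kcal/mol: the double mutant is better than the singles predict.
print(coupling_energy(-0.5, -0.3, -1.4))
```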


Types of Protein Interactions

Protein-ligand binding (drug-target, enzyme-substrate)

Protein-nucleotide (DNA/RNA) binding

Protein-peptide interaction

Protein-protein interaction

Protein-protein interactions (adapted from [Perkins et al 2010])

Protein-protein complexes [Nooren and Thornton 2003]:

homo-oligomeric vs. hetero-oligomeric

non-obligate vs. obligate

transient (weak and strong) vs. permanent


Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)

Small-molecule ligands: similar chemical structure, usually with stereoselectivity

Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

Binding models (figure panels): lock and key [Fischer 1894], induced fit [Koshland 1958], conformational selection [Boehr et al 2009]


Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations:

Y2H: high false-positive rate

TAP-MS: limited scalability

Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)

Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs: under debate

[Figure: single-interface and multi-interface hubs [Kim et al 2006]]


Data Sources

Enzymatic activity

BRENDA: experimental parameters

KEGG, MetaCyc: metabolic networks

Catalytic Site Atlas: catalytic sites

Data validation and prediction

GeneMANIA: lists of genes with functionally similar or shared properties

STRING: based on genomic context, HT experiments, co-expression, literature

ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases

DIP, BioGRID, IntAct, MINT

Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML

Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.

Goals

Increased catalytic function relative to the parent

Altered specificity, stereospecificity, or affinity for interacting partners

Increased stability

Property: Parameters

Thermostability: T50

Catalytic activity: kcat, KM, kcat/KM

Binding specificity: (kcat/KM)_A / (kcat/KM)_B, Kd, KI

Binding affinity: Ka = 1/Kd, ΔG = -RT ln(1/Kd)

A paradigm shift in the last 2 decades

PCR and recombinant gene technologies

Recreation of evolution in the lab

Computer algorithms


Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.

Thermostability is typically measured experimentally by T50, the temperature at which 50% of the protein is inactivated in 10 minutes.

Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions.

Main design techniques:

Sequence-based design: comparison through multiple alignments

Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
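Given residual activities measured after a 10-minute incubation at several temperatures, T50 can be read off where the curve crosses 50%. The sketch below uses plain linear interpolation between the two bracketing points, which is an assumption made for brevity (a sigmoidal fit is a common alternative); the data and the name `estimate_t50` are hypothetical.

```python
def estimate_t50(temperatures, residual_activity):
    """Interpolate the temperature at which residual activity (in %) drops to 50%.

    Temperatures must be in increasing order.
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Find the first interval that brackets 50% activity.
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 50.0) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity never crosses 50% in the measured range")

temps = [40, 50, 60, 70]      # degrees Celsius, hypothetical
activity = [100, 90, 30, 5]   # % residual activity after 10 min, hypothetical
t50 = estimate_t50(temps, activity)
print(round(t50, 1))
```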


Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity? The Michaelis-Menten model of kinetics:

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1}$$

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2}$$

$$\frac{d[P]}{dt} = k_2[ES] \tag{3}$$

k2 is also known as kcat, or turnover rate (in more complex cases kcat is a function of several rates).

kcat alone is not enough: we need to quantify the affinity of the enzyme for the substrate.
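Equations (1) to (3) can be integrated numerically to watch [ES] settle while product accumulates. The sketch below uses a plain forward-Euler step for simplicity; the rate constants, initial concentrations, and step size are arbitrary illustrative values, and the rate equations for [E] and [S] are those implied by scheme (1) rather than ones written on the slide.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt, steps):
    """Forward-Euler integration of the Michaelis-Menten mass-action scheme."""
    e, s, es, p = e0, s0, 0.0, 0.0
    trajectory = [(0.0, e, s, es, p)]
    for i in range(1, steps + 1):
        bind = k1 * e * s          # E + S -> ES
        unbind = k_minus1 * es     # ES -> E + S
        cat = k2 * es              # ES -> E + P
        d_es = bind - unbind - cat     # eq. (2)
        d_p = cat                      # eq. (3)
        d_s = -bind + unbind           # implied by scheme (1)
        d_e = -bind + unbind + cat     # implied by scheme (1)
        e += d_e * dt
        s += d_s * dt
        es += d_es * dt
        p += d_p * dt
        trajectory.append((i * dt, e, s, es, p))
    return trajectory

trajectory = simulate(k1=1.0, k_minus1=0.5, k2=0.3, e0=1.0, s0=10.0,
                      dt=0.001, steps=5000)
t, e, s, es, p = trajectory[-1]
print(f"t={t:.1f}  [E]={e:.3f}  [S]={s:.3f}  [ES]={es:.3f}  [P]={p:.3f}")
```

Total enzyme ([E] + [ES]) and total substrate material ([S] + [ES] + [P]) stay constant along the trajectory, which is a quick sanity check on the rate equations.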


Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme [ES] stays approximately constant on the time scale over which the concentrations of substrate [S] and product [P] change (steady state):

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$

Second assumption: the total concentration of enzyme [E]_0 does not change with time:

$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$


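The Michaelis constant KM

Substituting [E] = [E]_0 - [ES] from (5) into the steady-state condition (4):

$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$

$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7}$$

$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$

KM: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$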
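The step from (8) to an explicit expression for [ES] is not written out on the slide. One way to fill it in, using only (8) and the definition (10), is sketched below; it is a reconstruction, not the slide's own wording.

```latex
% Replace the fraction in (8) by K_M from (10) and factor out [ES]:
[S][E]_0 = [ES]\left([S] + K_M\right)
\quad\Longrightarrow\quad
[ES] = \frac{[E]_0\,[S]}{K_M + [S]}
```

Multiplying this [ES] by k_2 gives the rate of product formation in terms of [E]_0, [S], and K_M.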


The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux):

$$\frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\frac{[S]}{K_M + [S]} \tag{11}$$

$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\,v_{max} \tag{12}$$

Comparing (11) and (12) gives v_max = k_2[E]_0.

KM can be measured as the concentration of substrate [S] at which the rate of product formation is half of the maximum:

$$v = \frac{v_{max}}{2} \tag{13}$$


Determining KM from the concentration curve
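Besides reading KM off the saturation curve at half of v_max, KM and v_max can be estimated from a set of ([S], v) measurements. The sketch below swaps in a double-reciprocal (Lineweaver-Burk) linear fit, 1/v = (KM/v_max)(1/[S]) + 1/v_max, which is not necessarily the method shown in the figure; the data are synthetic and noise-free, and `fit_lineweaver_burk` is a hypothetical name.

```python
def fit_lineweaver_burk(substrate, rate):
    """Least-squares line through (1/[S], 1/v); returns (KM, vmax)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # KM / vmax
    intercept = mean_y - slope * mean_x  # 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax

# Synthetic data generated from the rate law (12) with assumed parameters.
true_km, true_vmax = 0.5, 2.0
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rate = [true_vmax * s / (true_km + s) for s in substrate]

km, vmax = fit_lineweaver_burk(substrate, rate)
print(round(km, 3), round(vmax, 3))
```

With noisy data the double-reciprocal transform gives disproportionate weight to low-substrate points, so a direct nonlinear fit of (12) is often preferred in practice.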


Evaluating Enzyme Efficiency

kcat/KM is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme.

For an enzyme acting simultaneously on two substrates S_A, S_B at rates v_A, v_B:

$$\frac{v_A}{v_B} = \frac{\left(k_{cat}^{A} / K_M^{A}\right)[S_A]}{\left(k_{cat}^{B} / K_M^{B}\right)[S_B]} \tag{14}$$

At [S_A] = [S_B], kcat/KM provides a measure of the efficiency of substrate promiscuity.
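Equation (14) is simple enough to evaluate directly. The numbers below are hypothetical kinetic parameters chosen for easy arithmetic, and `rate_ratio` is an illustrative name.

```python
def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """Ratio v_A / v_B for two substrates competing for the same enzyme, eq. (14)."""
    return (kcat_a / km_a * s_a) / (kcat_b / km_b * s_b)

# Hypothetical parameters: kcat in 1/s, KM and concentrations in mM.
ratio = rate_ratio(kcat_a=10.0, km_a=0.1, s_a=1.0,
                   kcat_b=5.0, km_b=0.5, s_b=1.0)
print(ratio)
```

At equal substrate concentrations the ratio reduces to the ratio of the two specificity constants, here 100 / 10.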


Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate

Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.

Protein-protein interaction

Permanent or obligate: in multi-subunit proteins, where it can have a structural or functional role

Transient: in signaling, transport, and regulation


3.1 Protein Binding Affinity

Dissociation constant

$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$

$$\frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16}$$

In equilibrium:

$$0 = k_1[A][B] - k_{-1}[AB] \tag{17}$$

$$k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$
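Equation (18) can be turned around to predict how much complex forms from given total concentrations. The sketch below combines (18) with the mass balances [A] = A_total - [AB] and [B] = B_total - [AB], which are an added assumption (no other species binds A or B), and solves the resulting quadratic; function names and values are hypothetical.

```python
import math

def dissociation_constant(k1, k_minus1):
    """k_d = k_-1 / k_1, as in eq. (18)."""
    return k_minus1 / k1

def complex_at_equilibrium(a_total, b_total, kd):
    """Equilibrium [AB] from total concentrations.

    Solves [AB]^2 - (A_total + B_total + kd)[AB] + A_total * B_total = 0
    and returns the physically meaningful (smaller) root.
    """
    total = a_total + b_total + kd
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0

kd = dissociation_constant(k1=2.0, k_minus1=1.0)  # arbitrary consistent units
ab = complex_at_equilibrium(a_total=1.0, b_total=1.0, kd=kd)
print(kd, ab)
```

Plugging the result back into (18), (1.0 - ab) * (1.0 - ab) / ab reproduces kd, which confirms that the chosen root is the equilibrium one.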


3.1 Protein Binding Affinity

Affinity constant

$$k_a = \frac{1}{k_d} \tag{19}$$

In antibodies:

$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$

Binding free energy

$$\Delta G = -RT \ln k_a = -RT \ln\frac{1}{k_d} \tag{21}$$
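Equation (21) converts a measured dissociation constant into a binding free energy. The sketch below assumes k_d is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; both are assumptions not stated on the slide, and `binding_free_energy` is a hypothetical name.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Delta G = -RT ln(k_a) = -RT ln(1/k_d), in J/mol (1 M standard state assumed)."""
    ka = 1.0 / kd_molar
    return -R * temperature_k * math.log(ka)

dg_nanomolar = binding_free_energy(1e-9)    # a nanomolar binder
dg_tighter = binding_free_energy(1e-10)     # ten times tighter
print(round(dg_nanomolar / 1000.0, 1), "kJ/mol")
print(round((dg_tighter - dg_nanomolar) / 1000.0, 1), "kJ/mol per tenfold change")
```

Each tenfold decrease in k_d makes the free energy more negative by RT ln 10, so large gains in affinity correspond to modest changes in energy.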


Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM)

Transition-state binding (Ktx)


3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation.

Binding specificity for a given partner is determined by comparing either kcat/KM, ka, or kd across all partners.

KI: inhibition constant, used when an inhibitor competes with a ligand.

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands).

Small-molecule ligands: similar chemical structure, usually with stereoselectivity.

Proteins or peptides: similar structural motifs rather than sequence motifs.

Promiscuity: the ability to participate in a function other than the native one.

Allostery: regulation of a protein by the binding of a ligand (the effector).



Protein Engineering

Protein engineering is a technology that alters protein structures in order toimprove their properties in applications such as pharmaceuticals green chemistryand biofuels

The main challenge is to build more accurate models to predict whichsubstitutions are the best candidates to insert in the parent protein in order toenhance the desired property

Both experimental data and in silico predictions can contribute to the model

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 4 40

Protein Engineering

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 5 40

The Protein Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40

Computational Protein Design in the Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40

Locating the Substitutions

How to select the best residues to mutate in theparent protein

If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design

When partial information on structure isavailable a semi-rational approach is used

If there is no information available then arandom search is used

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40

Choosing the Right Strategy

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40

Additivity and Cooperativity Effects

Additivity of the effects of substitutions israrely seen when screening mutants

In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening

Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity

antibody variants

[Chodorge et al 2008]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation

Binding specificity for a given partner is determined by comparing $k_{cat}/K_M$, $k_a$ or $k_d$ across all partners

$K_I$: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states


Introducing the Substitutions

Site-directed (saturation) mutagenesis

1. Cloning the DNA of interest into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR

Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
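The effect of an elevated polymerase error rate can be mimicked in silico. The sketch below is a toy model, not a description of any real protocol: it assumes that every base is miscopied independently with the same probability and that all three alternative bases are equally likely, whereas real error-prone PCR shows mutational bias. The function names and the template sequence are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # A misincorporation always yields a different base.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def mutant_library(template, error_rate, size, seed=0):
    """Generate `size` independent error-prone copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


# Hypothetical 18-nt template copied with a 5% per-base error rate
library = mutant_library("ATGGCTAAAGGTGAACTG", error_rate=0.05, size=5)
```

Only substitutions are modelled here; insertions and deletions, which site-directed mutagenesis can also introduce, are left out.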


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination

  • Circular permutation to alter protein topology
  • DNA-shuffling to perform functional domain or motif shuffling in vitro

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli

Some misfolding-related issues:

  • Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
  • The environment (crowding, pH, osmolarity, etc.)
  • Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:

  • Insoluble aggregation into inclusion bodies
  • Degradation (proteolysis)

[Figure: E. coli expressing human leptin as inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities

An iterative process:

  • Identifying a good starting sequence, usually containing some level of latent promiscuity
  • Creation of a library of variants
  • Selecting variants with improved function (mutation and screening)
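The iterative process can be sketched as a simple loop. This is a toy illustration under strong assumptions: the fitness function (number of positions matching a hidden target sequence) stands in for a real screening assay, one random substitution is introduced per variant, and only the single best variant is carried to the next round. All names and sequences are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng, n_mut=1):
    """Return a copy of seq with n_mut random amino acid substitutions."""
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mut):
        residues[pos] = rng.choice([a for a in AMINO_ACIDS if a != residues[pos]])
    return "".join(residues)


def directed_evolution(parent, fitness, rounds=10, library_size=50, seed=0):
    """Repeat library creation and screening, keeping the best variant so far."""
    rng = random.Random(seed)
    best = parent
    history = [fitness(best)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]  # library creation
        top = max(library, key=fitness)                             # screening
        if fitness(top) > fitness(best):                            # selection
            best = top
        history.append(fitness(best))
    return best, history


def toy_fitness(seq, target="MKTAYIAKQR"):
    """Stand-in for an assay: count positions identical to a hidden target."""
    return sum(a == b for a, b in zip(seq, target))


best, history = directed_evolution("AAAAAAAAAA", toy_fitness, rounds=5,
                                   library_size=20, seed=1)
```

Because a variant replaces the parent only when it screens better, the recorded fitness never decreases from one round to the next.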

From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007


Computational Protein Design in the Engineering Cycle


Locating the Substitutions

How to select the best residues to mutate in the parent protein?

  • If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design
  • When partial information on structure is available, a semi-rational approach is used
  • If there is no information available, then a random search is used

Choosing the Right Strategy


Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants

In order to avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations, in order to find cooperativity effects
  • Testing for simultaneous mutations comes at the cost of a larger screening effort

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of the sequence space

[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants] [Chodorge et al 2008]
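The cost of testing simultaneous mutations can be made concrete by counting variants. The sketch below assumes a 20-letter amino acid alphabet (19 possible substitutions per position) and counts variants with exactly k substituted positions; the function name and the choice of 10 candidate positions are illustrative assumptions.

```python
from math import comb


def library_size(n_positions, n_simultaneous, alphabet=20):
    """Distinct variants carrying exactly n_simultaneous substitutions
    chosen among n_positions candidate positions."""
    return comb(n_positions, n_simultaneous) * (alphabet - 1) ** n_simultaneous


# Growth of the library when 1, 2 or 3 of 10 positions are mutated together
sizes = {k: library_size(10, k) for k in (1, 2, 3)}
```

Going from single to triple mutants at only 10 positions already multiplies the number of variants to screen by more than three orders of magnitude.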


Types of Protein Interactions

  • Protein-ligand binding (drug-target, enzyme-substrate)
  • Protein-nucleotide (DNA/RNA) binding
  • Protein-peptide interaction
  • Protein-protein interaction

Protein-protein interactions (adapted from [Perkins et al 2010])

Protein-protein complexes [Nooren and Thornton 2003]

  • homo-oligomeric vs. hetero-oligomeric
  • non-obligate vs. obligate
  • transient (weak and strong) vs. permanent

Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

Binding models:
  • Lock and key [Fischer 1894]
  • Induced fit [Koshland 1958]
  • Conformational selection [Boehr et al 2009]

Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations
  • Y2H: high false-positive (FP) rate
  • TAP-MS: limited scalability
  • Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd generation DNA-seq)

Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins related to essentiality, robustness, modularity, evolvability. Party and date hubs: under debate

[Figure: single-interface and multi-interface hubs] [Kim et al 2006]

Data Sources

Enzymatic activity
  • BRENDA: experimental parameters
  • KEGG, MetaCyc: metabolic networks
  • Catalytic Site Atlas: catalytic sites

Data validation and prediction
  • GeneMANIA: lists of genes with functionally similar or shared properties
  • STRING: based on genomic context, HT experiments, co-expression, literature
  • ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  • DIP, BioGRID, IntAct, MINT
  • Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  • Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications

Goals
  • Increased catalytic function relative to the parent
  • Altered specificity, stereospecificity or affinity to interacting partners
  • Increased stability

Property               Parameters
Thermostability        $T_{50}$
Catalytic activity     $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity    $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity       $K_a = 1/K_d$, $\Delta G = -RT \ln (1/K_d)$

A paradigm shift in the last 2 decades
  • PCR and recombinant gene technologies
  • Recreation of evolution in the lab
  • Computer algorithms

Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing

Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions

Main design techniques
  • Sequence-based design: comparison through multiple alignments
  • Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
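Given residual activities measured after 10-minute incubations at several temperatures, $T_{50}$ can be read off where the curve crosses 50%. The sketch below uses linear interpolation between the two bracketing measurements, which is an assumption made here for simplicity (a sigmoidal fit is another option); the function name and the data points are hypothetical.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity is 0.5.

    temperatures: increasing incubation temperatures (10-minute incubations)
    residual_activity: fraction of initial activity remaining at each temperature
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Look for the first interval where activity falls through 50%.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 50%")


# Hypothetical inactivation data for one variant (temperatures in Celsius)
t50 = estimate_t50([40, 50, 60, 70], [1.0, 0.9, 0.3, 0.05])
```

Comparing this value between the parent and a variant gives a single number for ranking stabilizing substitutions.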

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity? Michaelis-Menten model of kinetics

\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1} \]

\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]

\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]

$k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases $k_{cat}$ is a function of several rates)

$k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate

Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$, so it can be treated as approximately constant

\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]

Second assumption: the total concentration of enzyme $[E]_0$ does not change with time

\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]
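Both assumptions can be checked numerically by integrating the full mass-action scheme $E + S \rightleftharpoons ES \rightarrow E + P$. The sketch below uses a plain forward-Euler integrator with hypothetical rate constants and concentrations (arbitrary units); it is meant only to illustrate the assumptions, not as a production ODE solver.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt=1e-4, steps=20000):
    """Forward-Euler integration of E + S <-> ES -> E + P.

    Returns the list of (E, S, ES, P) concentrations after each step.
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        bind = k1 * e * s         # E + S -> ES
        unbind = k_minus1 * es    # ES -> E + S
        cat = k2 * es             # ES -> E + P
        e += (-bind + unbind + cat) * dt
        s += (-bind + unbind) * dt
        es += (bind - unbind - cat) * dt
        p += cat * dt
        trajectory.append((e, s, es, p))
    return trajectory


# Hypothetical parameters with substrate in large excess over enzyme
trajectory = simulate(k1=100.0, k_minus1=10.0, k2=10.0, e0=0.01, s0=1.0)
e, s, es, p = trajectory[-1]
k_m = (10.0 + 10.0) / 100.0
es_steady = 0.01 * s / (k_m + s)   # [ES] predicted by the steady-state assumption
```

With substrate in large excess, the simulated $[ES]$ stays close to the steady-state prediction after a short initial transient, and $[E] + [ES]$ remains equal to $[E]_0$ throughout.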

The Michaelis Constant $K_M$

\[ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]

\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]

\[ [S][E]_0 = [S][ES] + [ES] \frac{k_{-1} + k_2}{k_1} \tag{8} \]

$K_M$: Michaelis constant

\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]
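The step from the last balance equation to an explicit expression for $[ES]$ is not written out on the slide. One way to fill it in, using only the definitions already given (this intermediate form is a reconstruction, not text from the slides), is to substitute $K_M$ and solve for $[ES]$:

```latex
\begin{align*}
[S][E]_0 &= [S][ES] + [ES]\,K_M = [ES]\,(K_M + [S]) \\
[ES] &= \frac{[E]_0\,[S]}{K_M + [S]}
\end{align*}
```

Multiplying this expression by $k_2$ gives the rate of product formation $d[P]/dt = k_2 [ES]$.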

The Michaelis Constant $K_M$ and the Steady-State Flux

Rate of product formation (flux)

\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]

\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} v_{max} \tag{12} \]

$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of its maximum

\[ v = \frac{v_{max}}{2} \tag{13} \]

Determining $K_M$ from the concentration curve
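The figure for this slide is not reproduced here. As an illustration of how $K_M$ and $v_{max}$ can be extracted from rate-versus-concentration data, the sketch below uses a double-reciprocal (Lineweaver-Burk) linear fit. That method is a choice made for this example and is not named on the slide; it amplifies noise at low $[S]$, so nonlinear regression is often preferred in practice. The data are synthetic and noise-free.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Rate law v = v_max * [S] / (K_M + [S])."""
    return v_max * s / (k_m + s)


def lineweaver_burk_fit(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (v_max, K_M).

    Uses 1/v = (K_M / v_max) * (1/[S]) + 1/v_max, a rearrangement of the rate law.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept
    return v_max, slope * v_max


# Synthetic data generated from assumed v_max = 2.0 and K_M = 0.5
concentrations = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
observed = [michaelis_menten_rate(s, 2.0, 0.5) for s in concentrations]
v_max_fit, k_m_fit = lineweaver_burk_fit(concentrations, observed)
```

At $[S] = K_M$ the rate function returns exactly half of $v_{max}$, which is the graphical definition of $K_M$.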

Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:

\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14} \]

At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency
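The ratio of rates for two competing substrates is straightforward to evaluate. In the sketch below the function names and the kinetic parameters for the native substrate A and the promiscuous substrate B are hypothetical values chosen for illustration.

```python
def specificity_constant(k_cat, k_m):
    """Specificity constant k_cat / K_M."""
    return k_cat / k_m


def rate_ratio(k_cat_a, k_m_a, conc_a, k_cat_b, k_m_b, conc_b):
    """v_A / v_B for two substrates competing for the same enzyme."""
    return (specificity_constant(k_cat_a, k_m_a) * conc_a) / (
        specificity_constant(k_cat_b, k_m_b) * conc_b)


# Native substrate A: k_cat = 100 1/s, K_M = 1e-3 M
# Promiscuous substrate B: k_cat = 10 1/s, K_M = 1e-2 M
# Both present at 1e-4 M, so the ratio reduces to the ratio of k_cat / K_M
ratio = rate_ratio(100.0, 1e-3, 1e-4, 10.0, 1e-2, 1e-4)
```

With these numbers the enzyme converts A about 100 times faster than B, so raising $k_{cat}/K_M$ for B is the quantity to track when engineering the promiscuous activity.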

Page 6: Computational Protein Design. 1. Challenges in Protein Engineering

The Protein Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 6 40

Computational Protein Design in the Engineering Cycle

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 7 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40

Locating the Substitutions

How to select the best residues to mutate in theparent protein

If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design

When partial information on structure isavailable a semi-rational approach is used

If there is no information available then arandom search is used

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40

Choosing the Right Strategy

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40

Additivity and Cooperativity Effects

Additivity of the effects of substitutions israrely seen when screening mutants

In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening

Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity

antibody variants

[Chodorge et al 2008]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM)

Transition-state binding (Ktx)

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation

Binding specificity for a given partner is determined by comparing kcat/KM, ka or kd across all partners

KI: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)

  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states
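
The statement above can be made quantitative under transition-state theory, assuming both substrates are referenced to the same free-enzyme state: the ratio of specificity constants is then exp(-ΔΔG‡/RT), where ΔΔG‡ is the difference in transition-state heights. This relation is a standard result rather than something written on the slide, and the names and values below are illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)

def specificity_ratio(ddg_ts, temperature=298.15):
    """Ratio (kcat/KM)_A / (kcat/KM)_B for ddg_ts = dG_ts(A) - dG_ts(B) in J/mol."""
    return math.exp(-ddg_ts / (R * temperature))

equal_barriers = specificity_ratio(0.0)
# A transition state for A that is lower by RT ln 10 gives a tenfold preference for A
tenfold = specificity_ratio(-R * 298.15 * math.log(10))
print(equal_barriers, tenfold)
```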

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Introducing the Substitutions

Site-directed (saturation) mutagenesis

1. Cloning the DNA of interest into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR

Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
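
The two approaches above can be mimicked in silico when building virtual libraries. The sketch below is a deliberately simplified abstraction: site saturation is done directly at the protein level (no oligonucleotide or codon design), and error-prone PCR is modeled as independent per-base substitutions. The sequences, error rate and function names are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def saturate_position(protein, position):
    """All 19 single substitutions at one 0-based position (site saturation)."""
    original = protein[position]
    return [protein[:position] + aa + protein[position + 1:]
            for aa in AMINO_ACIDS if aa != original]

def error_prone_copy(dna, error_rate, rng):
    """Copy a DNA string, replacing each base by a different base with probability error_rate."""
    copied = []
    for base in dna:
        if rng.random() < error_rate:
            copied.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            copied.append(base)
    return "".join(copied)

parent = "MKTAYIAK"                      # hypothetical parent protein
variants = saturate_position(parent, 3)  # targeted diversity at one site
rng = random.Random(0)                   # fixed seed for reproducibility
mutant_gene = error_prone_copy("ATGAAAACCGCG", 0.1, rng)  # random diversity over a gene
print(len(variants), mutant_gene)
```

Site saturation concentrates diversity at a chosen residue, whereas the error-prone copy spreads a few random changes over the whole gene.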

Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination

Circular permutation to alter protein topology

DNA-shuffling to perform functional domain or motif shuffling in vitro
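
At the sequence level, both operations can be illustrated with simple string manipulations. The sketch below is a toy model: real DNA shuffling involves fragmentation and reassembly with multiple crossovers between homologous genes, which is reduced here to a single crossover. Function names and sequences are hypothetical.

```python
def circular_permutation(protein, new_start):
    """Join the original termini and open the chain at a new position."""
    return protein[new_start:] + protein[:new_start]

def single_crossover(parent_a, parent_b, point):
    """Chimera with the first `point` residues from parent A and the rest from parent B."""
    return parent_a[:point] + parent_b[point:]

permuted = circular_permutation("ABCDEFGH", 3)
chimera = single_crossover("AAAAAAAA", "BBBBBBBB", 5)
print(permuted, chimera)  # DEFGHABC AAAAABBB
```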

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli

Some misfolding-related issues:

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation by proteolysis

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities

An iterative process:

Identifying a good starting sequence, usually containing some level of latent promiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)
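
The iterative process above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the "screen" is a made-up score counting matches to a hypothetical optimal sequence, each library member carries one random substitution, and the best variant of each round becomes the next parent.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKWVTFISLL"   # hypothetical optimum, used only to define the toy score

def score(seq):
    """Toy screen: number of positions matching the hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, TARGET))

def mutate(seq, rng):
    """Introduce one random point substitution (it may be silent)."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]

def directed_evolution(start, rounds, library_size, seed=0):
    rng = random.Random(seed)
    best, history = start, [score(start)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]  # create the library
        candidate = max(library, key=score)                          # screen it
        if score(candidate) >= score(best):   # accept improved or neutral variants only
            best = candidate
        history.append(score(best))
    return best, history

best, history = directed_evolution("MKAAAAAALL", rounds=20, library_size=50)
print(best, history)
```

Accepting neutral variants as well as improved ones is a simple stand-in for neutral drift; a stricter selection would use `>` instead of `>=`.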

From Natural Enzymes to Protein Engineering to Computational Protein Design

Computational Protein Design
1. Challenges in Protein Engineering

Pablo Carbonell
pablo.carbonell@issb.genopole.fr

iSSB, Institute of Systems and Synthetic Biology
Genopole, University d'Évry-Val d'Essonne, France

mSSB: December 2010

Bibliography I

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007

Computational Protein Design in the Engineering Cycle

Locating the Substitutions

How to select the best residues to mutate in the parent protein?

If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design

When partial information on structure is available, a semi-rational approach is used

If there is no information available, then a random search is used

Choosing the Right Strategy

Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants

To avoid dead ends, a screening strategy is typically based on building libraries with simultaneous mutations in order to find cooperativity effects

Testing for simultaneous mutations comes at the cost of a larger screening effort

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search of the sequence space

[Figure: Additivity/cooperativity experiments searching for high-affinity antibody variants] [Chodorge et al 2008]
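
The screening cost of simultaneous mutations can be estimated with simple combinatorics. The sketch below counts the protein variants that differ from the parent at exactly k positions, assuming 20 amino acids and any choice of positions; the protein length of 100 is an arbitrary example.

```python
from math import comb

def library_size(n_positions, k_simultaneous, alphabet=20):
    """Number of variants with exactly k substituted positions out of n.

    Choose the k positions, then one of (alphabet - 1) alternative residues at each.
    """
    return comb(n_positions, k_simultaneous) * (alphabet - 1) ** k_simultaneous

# Hypothetical 100-residue protein
sizes = {k: library_size(100, k) for k in (1, 2, 3)}
print(sizes)  # {1: 1900, 2: 1786950, 3: 1109100300}
```

Going from single to triple mutants increases the library by nearly six orders of magnitude, which is why cooperativity searches are usually restricted to a few selected positions.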

Types of Protein Interactions

Protein-ligand binding (drug-target, enzyme-substrate)

Protein-nucleotide (DNA/RNA) binding

Protein-peptide interaction

Protein-protein interaction

[Figure: Protein-protein interactions, adapted from [Perkins et al 2010]]

Protein-protein complexes [Nooren and Thornton 2003]:
  homo-oligomeric / hetero-oligomeric
  non-obligate / obligate
  transient (weak and strong) / permanent

Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)

  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

[Figure panels: Lock and key [Fischer 1894]; Induced fit [Koshland 1958]; Conformational selection [Boehr et al 2009]]

Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations:

  Y2H: high false-positive rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd generation DNA-seq)

Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity and evolvability. Party and date hubs: under debate

[Figure panels: single-interface and multi-interface hubs] [Kim et al 2006]
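
Hubs are usually identified from the degree of each protein in the interaction network. The sketch below counts distinct partners from a list of binary interactions and applies a degree cutoff. The protein names, the toy network and the cutoff are all hypothetical; no particular threshold is implied by the slides.

```python
def degrees(interactions):
    """Number of distinct partners per protein, from a list of undirected pairs."""
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {protein: len(found) for protein, found in partners.items()}

def hubs(interactions, min_degree):
    """Proteins with at least `min_degree` distinct partners, sorted by name."""
    return sorted(p for p, d in degrees(interactions).items() if d >= min_degree)

# Hypothetical toy network
toy_network = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
               ("P2", "P3"), ("P4", "P5"), ("P1", "P5")]
hub_list = hubs(toy_network, min_degree=4)
print(degrees(toy_network), hub_list)
```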

Data Sources

Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites

Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  Building the network: Cytoscape

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications

Goals:
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity or affinity to interacting partners
  Increased stability

Property              Parameters
Thermostability       T50
Catalytic activity    kcat, KM, kcat/KM
Binding specificity   (kcat/KM)A / (kcat/KM)B; Kd, KI
Binding affinity      Ka = 1/Kd; ∆G = −RT ln(1/Kd)

A paradigm shift in the last 2 decades:
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms

Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing

Thermostability is typically measured experimentally by T50, the temperature at which 50% of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions

Main design techniques:
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
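
Given residual activities measured after a fixed incubation at several temperatures, T50 can be estimated by interpolation. The sketch below uses plain linear interpolation between the two measurements that bracket 50% activity; the data points are invented for illustration, and in practice a sigmoidal fit may be used instead.

```python
def estimate_t50(temperatures, residual_activity):
    """Temperature at which residual activity crosses 0.5, by linear interpolation.

    Expects temperatures in increasing order and activities as fractions of the
    untreated control.
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity never crosses 50% in the measured range")

# Hypothetical measurements after 10 minutes of incubation
temps = [40.0, 50.0, 60.0, 70.0]        # degrees Celsius
activity = [1.00, 0.90, 0.30, 0.05]     # fraction of initial activity
t50 = estimate_t50(temps, activity)
print(round(t50, 2))
```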

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity? Michaelis-Menten model of kinetics:

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \qquad (1)$$

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \qquad (2)$$

$$\frac{d[P]}{dt} = k_2[ES] \qquad (3)$$

k2 is also known as kcat or turnover rate (in more complex cases, kcat is a function of several rates)

kcat alone is not enough: we need to quantify the affinity of the enzyme for the substrate

Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme [ES] is approximately constant on the time scale over which the concentrations of substrate [S] and product [P] change

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \qquad (4)$$

Second assumption: the total concentration of enzyme [E]0 does not change with time

$$[E]_0 = [E] + [ES] \approx \text{const} \qquad (5)$$
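
The first assumption can be tested numerically by integrating the full mass-action equations and comparing [ES] with its steady-state value. The sketch below uses a simple explicit Euler scheme; the rate constants, concentrations and step size are arbitrary illustrative choices, with the enzyme much less concentrated than the substrate.

```python
def simulate(k1, k_minus1, k2, e0, s0, t_end, dt=1e-3):
    """Explicit Euler integration of equations (2) and (3) plus the substrate balance."""
    es, s, p = 0.0, s0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = e0 - es                                 # free enzyme, from equation (5)
        d_es = k1 * e * s - es * (k_minus1 + k2)    # equation (2)
        d_s = -k1 * e * s + k_minus1 * es           # substrate consumed and released
        d_p = k2 * es                               # equation (3)
        es += d_es * dt
        s += d_s * dt
        p += d_p * dt
    return es, s, p

# Illustrative parameters in arbitrary units (assumed), with [E]0 << [S]0
k1, k_minus1, k2 = 10.0, 1.0, 1.0
e0, s0 = 0.01, 1.0
es, s, p = simulate(k1, k_minus1, k2, e0, s0, t_end=5.0)

# Value of [ES] obtained by setting d[ES]/dt = 0 in equation (4)
es_steady = e0 * s / ((k_minus1 + k2) / k1 + s)
print(es, es_steady)
```

After a short initial transient the simulated [ES] tracks the steady-state expression closely; the agreement degrades when the enzyme concentration is comparable to that of the substrate.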

The Michaelis constant KM

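$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \qquad (6)$$

$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \qquad (7)$$

$$[S][E]_0 = [S][ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \qquad (8)$$

$$[ES] = \frac{[E]_0 [S]}{\frac{k_{-1} + k_2}{k_1} + [S]} \qquad (9)$$

KM: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \qquad (10)$$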


Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 8 40

Locating the Substitutions

How to select the best residues to mutate in theparent protein

If detailed structural information on the parentenzyme is available a rational approach canbe applied to the design

When partial information on structure isavailable a semi-rational approach is used

If there is no information available then arandom search is used

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 9 40

Choosing the Right Strategy

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40

Additivity and Cooperativity Effects

Additivity of the effects of substitutions israrely seen when screening mutants

In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening

Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity

antibody variants

[Chodorge et al 2008]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

3.1 Protein Binding Affinity

Affinity constant

$$k_a = \frac{1}{k_d} \tag{19}$$

In antibodies:

$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$

Binding free energy

$$\Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21}$$
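The relations in Eqs. (18), (19) and (21) are easy to check numerically. The sketch below assumes kd is expressed in mol/L relative to a 1 M standard state, and the on/off rate constants are hypothetical values for an antibody-antigen pair.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def dissociation_constant(k_off, k_on):
    """kd = k_-1 / k_1 (Eq. 18); k_off in 1/s, k_on in 1/(M*s), kd in M."""
    return k_off / k_on


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(1/kd) (Eq. 21), in J/mol, for kd given in M."""
    if kd <= 0:
        raise ValueError("kd must be positive")
    return -GAS_CONSTANT * temperature * math.log(1.0 / kd)


# Hypothetical rate constants giving a dissociation constant of about 1 nM.
kd = dissociation_constant(k_off=1e-3, k_on=1e6)
delta_g = binding_free_energy(kd)  # roughly -51 kJ/mol at 25 degrees C
```

A tenfold decrease in kd makes the binding free energy more negative by RT ln 10, about 5.7 kJ/mol at room temperature.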

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder, in Protein Engineering Handbook (2009)]

Ground-state binding (KM)

Transition-state binding (Ktx)

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation.

Binding specificity for a given partner is determined by comparing kcat/KM, ka or kd across all partners.

KI: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states



Introducing the Substitutions

Site-directed (saturation) mutagenesis
  1. The DNA of interest is cloned into a plasmid vector
  2. The plasmid DNA is denatured to produce single strands
  3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
  4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
  5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR
  Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
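On the computational side, the outcome of saturation mutagenesis at chosen sites can be enumerated at the protein level. The sketch below is an in silico illustration, not the wet-lab protocol listed above; the parent sequence and the positions are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def saturation_library(parent, positions):
    """Yield every variant of `parent` with all 20 amino acids tried at each
    of the given 0-based positions (the parent itself is included)."""
    for combo in product(AMINO_ACIDS, repeat=len(positions)):
        variant = list(parent)
        for pos, aa in zip(positions, combo):
            variant[pos] = aa
        yield "".join(variant)


parent = "MKTAYIAK"  # hypothetical parent sequence
single_site = list(saturation_library(parent, [3]))     # 20 variants
double_site = list(saturation_library(parent, [3, 5]))  # 20 * 20 variants
```

The library grows as 20^k with the number k of saturated positions, which is why the choice of positions matters so much in practice.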


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination
  Circular permutation to alter protein topology
  DNA shuffling to perform functional domain or motif shuffling in vitro
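Both operations have simple sequence-level analogues. The sketch below is a toy model under strong assumptions: the parents are already aligned and of equal length, crossover positions are given explicitly rather than arising from fragment reassembly, and the sequences are placeholders.

```python
def circular_permutation(seq, new_start):
    """Move the N-terminus to `new_start` (0-based), joining the old termini."""
    if not seq:
        return seq
    new_start %= len(seq)
    return seq[new_start:] + seq[:new_start]


def recombine(parent_a, parent_b, crossovers):
    """Build a chimera of two aligned, equal-length parents, switching the
    template parent at each crossover position."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    parents = (parent_a, parent_b)
    points = set(crossovers)
    current = 0
    chimera = []
    for i in range(len(parent_a)):
        if i in points:
            current = 1 - current  # switch template
        chimera.append(parents[current][i])
    return "".join(chimera)


permuted = circular_permutation("ABCDEFGH", 3)       # "DEFGHABC"
chimera = recombine("AAAAAAAA", "BBBBBBBB", [2, 5])  # "AABBBAAA"
```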

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:
  Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
  The environment (crowding, pH, osmolarity, etc.)
  Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
  Insoluble aggregation into inclusion bodies
  Degradation (proteolysis)

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:
  Identifying a good starting sequence, usually containing some level of latent promiscuity
  Creating a library of variants
  Selecting variants with improved function (mutation and screening)
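The iterative loop can be sketched as a toy simulation. Everything here is an assumption made for illustration: the fitness function (number of residues matching a hypothetical target) stands in for an experimental screen, and the mutation step stands in for library creation.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rate, rng):
    """Redraw each residue at random with probability `rate`
    (a redraw may pick the same residue again, i.e. a silent change)."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
        for aa in seq
    )


def toy_fitness(seq, target):
    """Stand-in for a screen: number of positions matching `target`."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, fitness, rounds=10, library_size=200,
                       rate=0.05, seed=0):
    """Repeat: build a library from the current best variant, screen it,
    and carry the fittest variant into the next round."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # keep the parent so fitness never decreases
        best = max(library, key=fitness)
    return best
```

Calling `directed_evolution(parent, lambda s: toy_fitness(s, target))` with a hypothetical `parent` and `target` returns a variant at least as fit as the parent; real campaigns differ mainly in how the library is made and how fitness is measured.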

From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi:10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343–351, May 2008. doi:10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, NY), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi:10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi:10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi:10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007


Locating the Substitutions

How to select the best residues to mutate in the parent protein?

  If detailed structural information on the parent enzyme is available, a rational approach can be applied to the design
  When partial information on structure is available, a semi-rational approach is used
  If there is no information available, then a random search is used

Choosing the Right Strategy


Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants.

In order to avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations, in order to find cooperativity effects.
  Testing for simultaneous mutations comes at the cost of a larger screening effort

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift has probably allowed for a larger search in sequence space.

[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al. 2008]]
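The cost of testing simultaneous mutations can be made concrete by counting variants: choosing k of n positions and replacing each with one of the 19 other amino acids gives C(n, k) * 19^k sequences. The sketch below also includes a simple measure of deviation from additivity; the function names and the 100-residue protein are hypothetical.

```python
from math import comb


def library_size(n_positions, n_simultaneous, alphabet=20):
    """Number of distinct variants carrying exactly `n_simultaneous`
    substitutions among `n_positions` residues."""
    return comb(n_positions, n_simultaneous) * (alphabet - 1) ** n_simultaneous


def coupling(effect_a, effect_b, effect_ab):
    """Deviation from additivity for a double mutant: zero means the two
    substitutions are additive, non-zero indicates cooperativity."""
    return effect_ab - (effect_a + effect_b)


# Growth of the search space for a hypothetical 100-residue protein.
sizes = {k: library_size(100, k) for k in (1, 2, 3)}
```

For this example the single-mutant library has 1,900 members while the double-mutant library already has 1,786,950, which illustrates why exhaustive screening of simultaneous mutations quickly becomes impractical.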


Types of Protein Interactions

Protein-ligand binding (drug-target, enzyme-substrate)
Protein-nucleotide (DNA/RNA) binding
Protein-peptide interaction
Protein-protein interaction

Protein-protein interactions (adapted from [Perkins et al. 2010])

Protein-protein complexes [Nooren and Thornton 2003]
  homo-oligomeric / hetero-oligomeric
  non-obligate / obligate
  transient (weak and strong) / permanent

Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  Proteins or peptides: structurally similar motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

Lock and key [Fischer 1894]
Induced fit [Koshland 1958]
Conformational selection [Boehr et al. 2009]

Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations:
  Y2H: high false-positive (FP) rate
  TAP-MS: limited scalability
  Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd generation DNA-seq)
  Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, evolvability. Party and date hubs: under debate
  single-interface / multi-interface [Kim et al. 2006]

Data Sources

Enzymatic activity
  BRENDA: experimental parameters
  KEGG, MetaCyc: metabolic networks
  Catalytic Site Atlas: catalytic sites

Data validation and prediction
  GeneMANIA: lists of genes with functionally similar or shared properties
  STRING: based on genomic context, HT experiments, co-expression, literature
  ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  DIP, BioGRID, IntAct, MINT
  Common languages: PSICQUIC, expression, co-localization, genetic, metabolic, signaling pathways, experimental data, SBML
  Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications

Goals
  Increased catalytic function relative to the parent
  Altered specificity, stereospecificity or affinity to interacting partners
  Increased stability

Property              Parameters
Thermostability       T50
Catalytic activity    kcat, KM, kcat/KM
Binding specificity   (kcat/KM)A / (kcat/KM)B; Kd, KI
Binding affinity      Ka = 1/Kd; ∆G = −RT ln(1/Kd)

A paradigm shift in the last 2 decades
  PCR and recombinant gene technologies
  Recreation of evolution in the lab
  Computer algorithms

Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.

Thermostability is typically measured experimentally by T50, the temperature at which 50% of the proteins are inactivated in 10 minutes.

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions.

Main design techniques
  Sequence-based design: comparison through multiple alignments
  Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
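Given residual activities measured after a fixed incubation at several temperatures, T50 can be estimated by interpolation. The sketch below assumes temperatures are sorted in ascending order, activity is expressed as a fraction of the unheated control, and linear interpolation between neighboring points is adequate; the data are invented.

```python
def estimate_t50(temperatures, residual_activity):
    """Linearly interpolate the temperature at which residual activity
    (fraction of the unheated control) drops to 0.5."""
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("activity does not cross 50% in the measured range")


# Invented measurements after a 10-minute incubation at each temperature.
temps = [40.0, 50.0, 60.0, 70.0]      # degrees C
activity = [1.00, 0.80, 0.20, 0.00]   # fraction of initial activity
t50 = estimate_t50(temps, activity)   # crossing lies between 50 and 60
```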

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity? Michaelis-Menten model of kinetics:

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1}$$

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \tag{2}$$

$$\frac{d[P]}{dt} = k_2[ES] \tag{3}$$

k2 is also known as kcat or turnover rate (in more complex cases kcat is a function of several rates)

kcat alone is not enough: we need to quantify the affinity of the enzyme for the substrate
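The scheme in Eq. (1) can be integrated numerically to see how [ES] settles to a nearly constant level while product accumulates. The sketch below uses a fixed-step explicit Euler scheme, chosen for brevity rather than accuracy, with hypothetical rate constants in arbitrary consistent units.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt=1e-3, steps=5000):
    """Integrate Eqs. (1)-(3) with fixed-step explicit Euler.

    Returns the final concentrations ([E], [S], [ES], [P]).
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(steps):
        bound = k1 * e * s * dt        # E + S -> ES
        released = k_minus1 * es * dt  # ES -> E + S
        converted = k2 * es * dt       # ES -> E + P
        e += released + converted - bound
        s += released - bound
        es += bound - released - converted
        p += converted
    return e, s, es, p


# Hypothetical values with substrate in large excess over enzyme.
E0, S0 = 1.0, 100.0
K1, K_MINUS1, K2 = 1.0, 1.0, 1.0
e, s, es, p = simulate_michaelis_menten(E0, S0, K1, K_MINUS1, K2)
```

Total enzyme ([E] + [ES]) and total substrate-derived material ([S] + [ES] + [P]) are conserved by construction, which gives a convenient sanity check on the integration.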

Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be taken as approximately constant:

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$

Second assumption: the total concentration of enzyme [E]0 does not change with time:

$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$

The Michaelis constant KM

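Substituting [E] = [E]0 − [ES] from (5) into (4):

$$0 = k_1[S]\left([E]_0 - [ES]\right) - [ES](k_{-1} + k_2) \tag{6}$$

$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7}$$

$$[S][E]_0 = [S][ES] + [ES]\,\frac{k_{-1} + k_2}{k_1} \tag{8}$$

$$[ES] = \frac{[S][E]_0}{[S] + \frac{k_{-1} + k_2}{k_1}} \tag{9}$$

KM: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$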



Choosing the Right Strategy

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 10 40

Additivity and Cooperativity Effects

Additivity of the effects of substitutions israrely seen when screening mutants

In order to avoid dead ends typically ascreening strategy is designed based onbuilding libraries with simultaneous mutationsin order to find cooperativity effectsTesting for simultaneous mutations comes atthe cost of a larger screening

Natural evolution however has favoredsingle-step mutations beneficial althoughneutral drift in this case has probably allowedfor a larger search in the sequence space Additivitycooperativity experiments searching for high affinity

antibody variants

[Chodorge et al 2008]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 11 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation

Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $k_a$ or $k_d$ across all partners

$K_I$: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)

Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)


Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states


Introducing the Substitutions

Site-directed (saturation) mutagenesis:
1. The DNA of interest is cloned into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR:
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
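The two strategies above can be mimicked in silico to see how differently they sample sequence space. This is a toy sketch and not a model of the wet-lab protocols: the function names, the uniform per-base error rate and the example template are assumptions made for illustration.

```python
import random

BASES = "ACGT"

def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string; each base is replaced by a different base
    with probability error_rate (a crude stand-in for error-prone PCR)."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

def mutant_library(template, error_rate, size, seed=0):
    """Build a random library of `size` error-prone copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]

def saturation_variants(template, codon_index):
    """Site-directed saturation: place each of the 64 codons at one codon position."""
    start = 3 * codon_index
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return [template[:start] + codon + template[start + 3:] for codon in codons]

template = "ATGGCTAAA"  # hypothetical three-codon gene fragment
random_library = mutant_library(template, error_rate=0.1, size=5)
site_library = saturation_variants(template, codon_index=1)
print(random_library)
print(len(site_library), "variants at codon 1")
```

The random library spreads a few changes over the whole sequence, while the saturation library exhaustively covers a single chosen position.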


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination:

Circular permutation, to alter protein topology

DNA-shuffling, to perform functional domain or motif shuffling in vitro


Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli

Some misfolding-related issues:

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
Insoluble aggregation into inclusion bodies
Degradation: proteolysis

[Figure: E. coli expressing human leptin as an inclusion body]


Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities

An iterative process:

Identifying a good starting sequence, usually containing some level of latent promiscuity

Creating a library of variants

Selecting variants with improved function (mutation and screening)
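The iterative process above can be sketched as a simple mutate-and-select loop. This is a minimal illustration under strong assumptions: the fitness function is a made-up stand-in for an experimental screen, each variant carries a single substitution, and all names, sequences and parameters are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(sequence, rng):
    """Return a copy of the sequence with one random single-residue substitution."""
    pos = rng.randrange(len(sequence))
    new = rng.choice([a for a in AMINO_ACIDS if a != sequence[pos]])
    return sequence[:pos] + new + sequence[pos + 1:]

def toy_fitness(sequence, target):
    """Hypothetical screen: number of positions that match an arbitrary target."""
    return sum(a == b for a, b in zip(sequence, target))

def directed_evolution(parent, fitness, rounds=20, library_size=50, seed=0):
    """Iterate: build a library from the current best variant, screen it,
    and keep the top variant only if it improves on the current best."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

target = "MKTAYIAK"   # arbitrary sequence standing in for the desired function
parent = "MKTGGGGG"   # starting sequence with partial, latent activity
evolved = directed_evolution(parent, lambda s: toy_fitness(s, target))
print(parent, "->", evolved)
```

Because a variant is accepted only when the screen improves, the loop never loses ground, but it can also stall where no single substitution helps, which is the dead-end problem that libraries with simultaneous mutations try to address.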


From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography I

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature chemical biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, NY), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 09692126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007


Additivity and Cooperativity Effects

Additivity of the effects of substitutions is rarely seen when screening mutants

In order to avoid dead ends, a screening strategy is typically designed around libraries with simultaneous mutations, in order to find cooperativity effects
Testing for simultaneous mutations comes at the cost of a larger screening effort

Natural evolution, however, has favored beneficial single-step mutations, although neutral drift in this case has probably allowed for a larger search of the sequence space

[Figure: additivity/cooperativity experiments searching for high-affinity antibody variants [Chodorge et al 2008]]
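One common way to quantify departure from additivity is to compare the effect of a double mutant with the sum of the effects of the two single mutants. The sketch below is not taken from the slides: it assumes each effect is expressed as a change in binding free energy relative to the parent (negative meaning improved binding), and the function names, the tolerance and the example values are hypothetical.

```python
def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Deviation from additivity: ddG_AB - (ddG_A + ddG_B).

    Zero means the two substitutions act independently (additive).
    """
    return ddg_ab - (ddg_a + ddg_b)

def classify(ddg_a, ddg_b, ddg_ab, tolerance=0.5):
    """Label a double mutant, assuming negative ddG means improved binding.

    `tolerance` (same units as the ddG values) is a hypothetical allowance
    for experimental error.
    """
    coupling = coupling_energy(ddg_a, ddg_b, ddg_ab)
    if abs(coupling) <= tolerance:
        return "additive"
    return "cooperative" if coupling < 0 else "antagonistic"

# Made-up values in kJ/mol for two single mutants and their combination.
print(classify(-2.0, -1.5, -3.4))  # close to the sum of the single effects
print(classify(-2.0, -1.5, -6.0))  # better than the sum
print(classify(-2.0, -1.5, -1.0))  # worse than the sum
```

Only the cooperative case rewards screening the two positions together; when effects are additive, the single mutants could have been found one step at a time.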


Types of Protein Interactions

[Figure panels: protein-ligand binding (drug-target, enzyme-substrate); protein-nucleotide (DNA/RNA) binding; protein-peptide interaction; protein-protein interaction]

Protein-protein interactions (adapted from [Perkins et al 2010])

Protein-protein complexes [Nooren and Thornton 2003]:
homo-oligomeric or hetero-oligomeric
non-obligate or obligate
transient (weak and strong) or permanent


Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)

Small-molecule ligands: similar chemical structure, usually with stereoselectivity
Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

[Figure panels: lock and key [Fischer 1894]; induced fit [Koshland 1958]; conformational selection [Boehr et al 2009]]


Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations:
Y2H: high false-positive rate
TAP-MS: limited scalability
Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd generation DNA-seq)

Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, evolvability. Party and date hubs: under debate

[Figure: single-interface and multi-interface hubs [Kim et al 2006]]


Data Sources

Enzymatic activity:
BRENDA: experimental parameters
KEGG, MetaCyc: metabolic networks
Catalytic Site Atlas: catalytic sites

Data validation and prediction:
GeneMANIA: lists of genes with functionally similar or shared properties
STRING: based on genomic context, HT experiments, co-expression, literature
ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases:
DIP, BioGRID, IntAct, MINT
Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications

Goals:
Increased catalytic function relative to the parent
Altered specificity, stereospecificity or affinity to interacting partners
Increased stability

Property: Parameters
Thermostability: $T_{50}$
Catalytic activity: $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity: $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
Binding affinity: $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$

A paradigm shift in the last 2 decades:
PCR and recombinant gene technologies
Recreation of evolution in the lab
Computer algorithms


Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing

Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions

Main design techniques:
Sequence-based design: comparison through multiple alignments
Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures


Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity? Michaelis-Menten model of kinetics

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \qquad (1)$$

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \qquad (2)$$

$$\frac{d[P]}{dt} = k_2[ES] \qquad (3)$$

$k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases $k_{cat}$ is a function of several rates)

$k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate


Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme $[ES]$ is approximately constant on the time scale over which the concentrations of substrate $[S]$ and product $[P]$ change

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \qquad (4)$$

Second assumption: the total concentration of enzyme $[E]_0$ does not change with time

$$[E]_0 = [E] + [ES] \approx \text{const} \qquad (5)$$


The Michaelis constant $K_M$

$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \qquad (6)$$

$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \qquad (7)$$

$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \qquad (8)$$

$K_M$: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \qquad (10)$$
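For readers following the algebra, one way to close the derivation is to substitute the definition (10) into (8) and solve for $[ES]$. This is a sketch of the standard steady-state step, written out here as an aid and not reproduced from the slide:

```latex
\begin{align*}
[S][E]_0 &= [S][ES] + K_M\,[ES] = [ES]\,\bigl(K_M + [S]\bigr) \\
[ES] &= \frac{[E]_0\,[S]}{K_M + [S]}
\end{align*}
```

Multiplying this steady-state concentration by $k_2$, as in (3), gives the rate of product formation.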


The Michaelis Constant $K_M$ and the steady-state flux

Rate of product formation (flux)

$$\frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0 \frac{[S]}{K_M + [S]} \qquad (11)$$

$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \qquad (12)$$

$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum

$$v = \frac{v_{max}}{2} \qquad (13)$$
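Equations (11) to (13) are easy to verify numerically. The sketch below is illustrative only: the function names and the parameter values ($k_{cat}$ in 1/s, concentrations in mol/L) are assumptions, and $v_{max} = k_2[E]_0$ is read off by comparing (11) with (12).

```python
def vmax_from_kcat(kcat, e0):
    """v_max = k_2 [E]_0 with k_2 = k_cat (compare eqs. 11 and 12)."""
    return kcat * e0

def michaelis_menten_rate(s, vmax, km):
    """v = v_max [S] / (K_M + [S]) (eq. 12)."""
    return vmax * s / (km + s)

# Hypothetical enzyme: k_cat = 100 1/s, [E]_0 = 1 uM, K_M = 0.5 mM.
vmax = vmax_from_kcat(kcat=100.0, e0=1e-6)
km = 5e-4

half = michaelis_menten_rate(km, vmax, km)           # rate at [S] = K_M
saturated = michaelis_menten_rate(1.0, vmax, km)     # rate at [S] >> K_M
print(f"v at [S] = K_M: {half:.3e} (v_max / 2 = {vmax / 2:.3e})")
print(f"v at [S] = 1 M: {saturated:.3e} (v_max = {vmax:.3e})")
```

Setting $[S] = K_M$ returns exactly half of $v_{max}$, which is the property (13) used to read $K_M$ off a measured rate curve.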


Determining KM from the concentration curve


Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates of reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:

$$\frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \qquad (14)$$

At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency
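As a worked instance of (14), the ratio of rates for two competing substrates can be computed from their specificity constants. The function names and the kinetic parameters below are hypothetical values chosen for illustration, not data from the slides.

```python
def specificity_constant(kcat, km):
    """Specificity constant k_cat / K_M."""
    return kcat / km

def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """v_A / v_B for two substrates competing for the same enzyme (eq. 14)."""
    return (specificity_constant(kcat_a, km_a) * s_a) / (
        specificity_constant(kcat_b, km_b) * s_b
    )

# Made-up parameters: substrate A is native, substrate B is promiscuous.
# k_cat in 1/s, K_M and concentrations in mol/L, equal substrate concentrations.
ratio = rate_ratio(kcat_a=50.0, km_a=1e-4, s_a=1e-3,
                   kcat_b=5.0, km_b=1e-3, s_b=1e-3)
print(f"v_A / v_B = {ratio:.1f}")
```

With equal substrate concentrations the ratio reduces to the ratio of the two specificity constants, so in this example the enzyme turns over A about a hundred times faster than B.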

Page 12: Computational Protein Design. 1. Challenges in Protein Engineering

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 12 40

Types of Protein Interactions

Protein-ligand binding(drug-target enzyme-substrate)

Protein-nucleotide(DNARNA) binding)

Protein-peptide interaction Protein-protein interaction

Protein-Protein interactions

Adapted from [Perkins et al 2010]

Protein-protein complexes

homo-oligomeric hetero-oligomericnon-obligate obligate(weak and strong) transient permanent

[Nooren and Thornton 2003]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 13 40

Protein Specificity and Promiscuity

Multispecificity broad partner specificity(multiple substrates proteins ligands)

Small molecule ligand similar chemicalstructure usually with stereoselectivityProteins or peptides structural similar motifsrather than sequence motifs

Promiscuity the ability to participate in afunction other than the native one(moonlighting)

Allostery regulation of the protein by bindingof some ligand (the effector) at the allostericsite

Lock and key Induced fit

[Fischer 1894] [Koshland 1958]

Conformational selection

[Boehr et al 2009]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 14 40

Protein Specificity and Promiscuity The Case of PPIs

PPI any physical binding between proteins that occurin vivo in the cellPPI screening methods still have some limitations

Y2H high FP-rateTAP-MS limited scalabilityLuminiscence-based methods proteome chipsco-immunoprecipitation MS real-time analysis (3rdgeneration DNA-seq)

Transient and PTM-dependent interactions are oftenmissed

Biological context developmental stageco-localization protein modifications presence ofcofactors presence of other binding partners

Protein hubs highly connected proteins related toessentiality robustness modularity evolvability Partyand date hubs under debate

single-interface multi-interface

[Kim et al 2006]

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 15 40

Data Sources

Enzymatic activityBRENDA experimental parametersKEGG MetaCyc metabolic networksCatalytic Site Atlas catalytic sites

Data validation and predictionGeneMANIA lists of genes with functionally similar or shared propertiesSTRING based on genomic context HT experiments co-expression literatureComPASS assign confidence to an interaction detected by MS

Primary PPI databasesDIP BioGRID IntAct MINTCommon languages PSICQUIC expression co-localization genetic metabolicsignaling pathways experimental data SBMLBuilding the network Cytoscape

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 16 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:
  • Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases.
  • The environment (crowding, pH, osmolarity, etc.).
  • Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments).

Two possible outcomes for a misfolded protein:
  • Insoluble aggregation into inclusion bodies
  • Degradation (proteolysis)

[Figure: E. coli expressing human leptin as an inclusion body]


Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:
  • Identifying a good starting sequence, usually one containing some level of latent promiscuity
  • Creating a library of variants
  • Selecting variants with improved function (mutation and screening)
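The iterative process can be sketched as a simple mutate-screen-select loop. Everything below is a hypothetical toy: `toy_fitness` scores a variant by its identity to an assumed `target` sequence, which stands in for an experimental screen, and the mutation rate and library size are arbitrary illustration values.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rate, rng):
    """Resample each residue with probability `rate` (the new residue may equal the old one)."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def toy_fitness(seq, target):
    """Stand-in for a screen: number of positions identical to `target`."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(start, target, rounds, library_size, rate, rng):
    """Run `rounds` of library creation and selection; return the best variant and score history."""
    best = start
    history = [toy_fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        library.append(best)  # keep the parent so the score never decreases
        best = max(library, key=lambda s: toy_fitness(s, target))
        history.append(toy_fitness(best, target))
    return best, history


if __name__ == "__main__":
    best, history = directed_evolution(
        "AAAAAAAA", "MKTLLVAG", rounds=10, library_size=100, rate=0.1,
        rng=random.Random(42),
    )
    print(best, history)
```

Keeping the parent in each library is a design choice of this sketch; it guarantees a non-decreasing score, which a real campaign with noisy screening would not.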


From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi:10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi:10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi:10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi:10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi:10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007

Page 13: Computational Protein Design. 1. Challenges in Protein Engineering

Types of Protein Interactions

  • Protein-ligand binding (drug-target, enzyme-substrate)
  • Protein-nucleotide (DNA/RNA) binding
  • Protein-peptide interaction
  • Protein-protein interaction

Protein-protein interactions (adapted from [Perkins et al 2010])

Protein-protein complexes [Nooren and Thornton 2003]:
  • homo-oligomeric vs. hetero-oligomeric
  • non-obligate vs. obligate
  • transient (weak and strong) vs. permanent


Protein Specificity and Promiscuity

Multispecificity: broad partner specificity (multiple substrates, proteins, ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one (moonlighting)

Allostery: regulation of the protein by binding of some ligand (the effector) at the allosteric site

Binding models:
  • Lock and key [Fischer 1894]
  • Induced fit [Koshland 1958]
  • Conformational selection [Boehr et al 2009]


Protein Specificity and Promiscuity: The Case of PPIs

PPI: any physical binding between proteins that occurs in vivo in the cell

PPI screening methods still have some limitations:
  • Y2H: high false-positive (FP) rate
  • TAP-MS: limited scalability
  • Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
  • Transient and PTM-dependent interactions are often missed

Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners

Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. The distinction between party and date hubs is under debate.

Single-interface vs. multi-interface hubs [Kim et al 2006]
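To make the notion of a hub concrete, the sketch below counts distinct interaction partners in an undirected PPI edge list and reports proteins at or above a degree threshold. The protein names, the edge list, and the `min_degree` cutoff are all hypothetical; the slides do not fix a numerical definition of a hub.

```python
from collections import defaultdict


def degrees(edges):
    """Map each protein to its number of distinct partners (self-interactions ignored)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def hubs(edges, min_degree):
    """Return the sorted proteins whose degree is at least `min_degree`."""
    return sorted(p for p, d in degrees(edges).items() if d >= min_degree)


if __name__ == "__main__":
    # Hypothetical interactions; the duplicate P2-P1 pair is counted once.
    example = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P2", "P1")]
    print(degrees(example))
    print(hubs(example, min_degree=3))
```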


Data Sources

Enzymatic activity
  • BRENDA: experimental parameters
  • KEGG, MetaCyc: metabolic networks
  • Catalytic Site Atlas: catalytic sites

Data validation and prediction
  • GeneMANIA: lists of genes with functionally similar or shared properties
  • STRING: based on genomic context, high-throughput (HT) experiments, co-expression, literature
  • ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  • DIP, BioGRID, IntAct, MINT
  • Common languages: PSICQUIC (expression, co-localization, genetic, metabolic, signaling pathways, experimental data), SBML
  • Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.

Goals
  • Increased catalytic function relative to the parent
  • Altered specificity, stereospecificity, or affinity to interacting partners
  • Increased stability

Property              Parameters
Thermostability       T50
Catalytic activity    kcat, KM, kcat/KM
Binding specificity   (kcat/KM)_A / (kcat/KM)_B; Kd, KI
Binding affinity      Ka = 1/Kd; ∆G = −RT ln(1/Kd)

A paradigm shift in the last 2 decades
  • PCR and recombinant gene technologies
  • Recreation of evolution in the lab
  • Computer algorithms


Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.

Thermostability is typically measured experimentally by T50, the temperature at which 50% of the proteins are inactivated in 10 minutes.

Increasing the thermostability can be considered the first step in protein engineering, because it makes the protein tolerant to a greater range of amino acid substitutions.

Main design techniques
  • Sequence-based design: comparison through multiple alignments
  • Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures
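Given residual activities measured after a fixed incubation at several temperatures, T50 can be read off where the curve crosses 50%. The sketch below uses plain linear interpolation between neighboring points; this is an assumed simplification (a sigmoidal fit is a common alternative), `estimate_t50` is a hypothetical name, and the data in the usage example are invented for illustration.

```python
def estimate_t50(temperatures, residual_activity):
    """Interpolate the temperature at which residual activity (0..1) falls to 0.5.

    `temperatures` must be sorted in increasing order.
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        # Look for the first interval that brackets 50% residual activity.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 50%")


if __name__ == "__main__":
    # Invented example: fraction of activity remaining after 10 minutes.
    temps = [40.0, 50.0, 60.0, 70.0]
    activity = [1.0, 0.9, 0.3, 0.0]
    print(round(estimate_t50(temps, activity), 2))
```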


Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity: the Michaelis-Menten model of kinetics

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \tag{1}$$

$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2}$$

$$\frac{d[P]}{dt} = k_2 [ES] \tag{3}$$

k2 is also known as kcat or the turnover rate (in more complex cases, kcat is a function of several rates).

kcat alone is not enough: we also need to quantify the affinity of the enzyme for the substrate.
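The rate equations can be integrated numerically to see how [S], [ES], and [P] evolve. The sketch below uses a fixed-step forward Euler scheme, chosen only for brevity. The equation for d[S]/dt and the substitution [E] = [E]0 − [ES] follow from the reaction scheme (1) rather than from an equation written on the slide, and all rate constants in the usage example are arbitrary.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt, steps):
    """Integrate the reaction scheme with forward Euler; return (t, S, ES, P) tuples."""
    s, es, p = s0, 0.0, 0.0
    trajectory = [(0.0, s, es, p)]
    for i in range(1, steps + 1):
        e = e0 - es                            # free enzyme from conservation of enzyme
        bind = k1 * e * s                      # E + S -> ES
        d_es = bind - es * (k_minus1 + k2)     # equation (2)
        d_s = -bind + k_minus1 * es            # substrate consumed and released
        d_p = k2 * es                          # equation (3)
        s += d_s * dt
        es += d_es * dt
        p += d_p * dt
        trajectory.append((i * dt, s, es, p))
    return trajectory


if __name__ == "__main__":
    traj = simulate_michaelis_menten(
        e0=1.0, s0=10.0, k1=1.0, k_minus1=1.0, k2=1.0, dt=0.001, steps=5000
    )
    t, s, es, p = traj[-1]
    print(f"t={t:.1f}  S={s:.3f}  ES={es:.3f}  P={p:.3f}")
```

Because the three increments sum to zero at every step, [S] + [ES] + [P] stays equal to the initial substrate concentration, which is a quick sanity check on the implementation.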


Enzyme Kinetics

Assumptions

First assumption (steady state): the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant:

$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$

Second assumption: the total concentration of enzyme [E]_0 does not change with time:

$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$


The Michaelis constant KM

$$0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$

$$k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7}$$

$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$

KM: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$
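The step that connects equation (8) to the rate of product formation can be written out explicitly. Substituting the definition of KM from (10) into (8) and solving for [ES] gives the expression below; this is a standard rearrangement added for continuity, not a line taken from the slides.

```latex
\begin{align*}
[S][E]_0 &= [S][ES] + [ES]\,K_M = [ES]\,\bigl(K_M + [S]\bigr) \\
[ES] &= [E]_0\,\frac{[S]}{K_M + [S]}
\end{align*}
```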


The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux):

$$\frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11}$$

With v_max = k_2 [E]_0:

$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\, v_{max} \tag{12}$$

KM can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:

$$v = \frac{v_{max}}{2} \tag{13}$$
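A direct transcription of the rate law (12) makes the half-maximum property (13) easy to check numerically. The function name and the parameter values below are illustrative assumptions.

```python
def michaelis_menten_rate(s, vmax, km):
    """Steady-state rate v = vmax * [S] / (KM + [S]), as in equation (12)."""
    return vmax * s / (km + s)


if __name__ == "__main__":
    vmax, km = 4.0, 2.0  # arbitrary illustration values
    for s in (0.0, km, 10 * km, 1000 * km):
        print(f"[S]={s:8.1f}  v={michaelis_menten_rate(s, vmax, km):.4f}")
```

At [S] = KM the function returns vmax/2, and for [S] much larger than KM it approaches vmax.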


Determining KM from the concentration curve
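One classical way to extract KM from rate measurements is a double-reciprocal (Lineweaver-Burk) fit, which linearizes the rate law as 1/v = (KM/vmax)(1/[S]) + 1/vmax. This is a swapped-in technique chosen for its simplicity and is not necessarily the one shown in the original figure; nonlinear regression is usually preferred on real data because the reciprocal transform amplifies noise at low [S]. The function name and the noise-free example data are hypothetical.

```python
def fit_lineweaver_burk(substrate, rates):
    """Estimate (KM, vmax) by ordinary least squares on 1/v versus 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope = KM / vmax
    intercept = mean_y - slope * mean_x  # intercept = 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


if __name__ == "__main__":
    # Noise-free synthetic data generated with KM = 2 and vmax = 4.
    substrate = [0.5, 1.0, 2.0, 5.0, 10.0]
    rates = [4.0 * s / (2.0 + s) for s in substrate]
    km, vmax = fit_lineweaver_burk(substrate, rates)
    print(f"KM={km:.3f}  vmax={vmax:.3f}")
```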


Evaluating Enzyme Efficiency

kcat/KM is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates.

For an enzyme acting simultaneously on two substrates S_A and S_B at rates v_A and v_B:

$$\frac{v_A}{v_B} = \frac{\left(k_{cat}^A / K_M^A\right)[S_A]}{\left(k_{cat}^B / K_M^B\right)[S_B]} \tag{14}$$

At [S_A] = [S_B], kcat/KM provides a measure of substrate promiscuity efficiency.
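Equation (14) translates into a one-line computation. The sketch below uses a hypothetical function name and invented kinetic parameters to show how the ratio of specificity constants sets the relative rates for two competing substrates.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Relative rate vA/vB for two competing substrates, following equation (14)."""
    return (kcat_a / km_a) * conc_a / ((kcat_b / km_b) * conc_b)


if __name__ == "__main__":
    # Invented values: substrate A has a 100-fold higher kcat/KM than substrate B.
    ratio = rate_ratio(kcat_a=100.0, km_a=10.0, conc_a=1.0,
                       kcat_b=10.0, km_b=100.0, conc_b=1.0)
    print(f"vA/vB = {ratio:.1f}")
```

At equal substrate concentrations the result reduces to the ratio of the two kcat/KM values.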


Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

  • Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
  • Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.
  • Protein-protein interaction
      Permanent or obligate: in multi-subunit proteins; it can have a structural or functional role
      Transient: in signaling, transport, and regulation


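3.1 Protein Binding Affinity

Dissociation constant

$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$

$$\frac{d[AB]}{dt} = k_1 [A][B] - k_{-1}[AB] \tag{16}$$

In equilibrium:

$$0 = k_1 [A][B] - k_{-1}[AB] \tag{17}$$

$$k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$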


3.1 Protein Binding Affinity

Affinity constant

$$k_a = \frac{1}{k_d} \tag{19}$$

In antibodies:

$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$

Binding free energy

$$\Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21}$$
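Equations (19) and (21) can be evaluated directly to convert a measured dissociation constant into a binding free energy. The sketch below assumes kd is expressed in mol/L relative to a 1 M standard state and uses 298.15 K as a default temperature; both are assumptions of this example, as are the function names.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def affinity_constant(kd):
    """ka = 1 / kd, equation (19)."""
    return 1.0 / kd


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(ka) in J/mol, equation (21); kd in mol/L (1 M standard state assumed)."""
    return -GAS_CONSTANT * temperature * math.log(affinity_constant(kd))


if __name__ == "__main__":
    for kd in (1e-3, 1e-6, 1e-9):
        print(f"kd={kd:.0e} M  dG={binding_free_energy(kd) / 1000:.1f} kJ/mol")
```

A smaller kd (tighter binding) gives a more negative free energy, and kd = 1 M gives zero under the assumed standard state.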


Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM)

Transition-state binding (Ktx)


3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation.

Binding specificity for a given partner is determined by comparing kcat/KM, ka, or kd across all partners.

KI: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)

Page 15: Computational Protein Design. 1. Challenges in Protein Engineering

Protein Specificity and Promiscuity: The Case of PPIs

  • PPI: any physical binding between proteins that occurs in vivo, in the cell
  • PPI screening methods still have some limitations:
      • Y2H (yeast two-hybrid): high false-positive (FP) rate
      • TAP-MS: limited scalability
      • Luminescence-based methods, proteome chips, co-immunoprecipitation, MS, real-time analysis (3rd-generation DNA-seq)
  • Transient and PTM-dependent interactions are often missed
  • Biological context: developmental stage, co-localization, protein modifications, presence of cofactors, presence of other binding partners
  • Protein hubs: highly connected proteins, related to essentiality, robustness, modularity, and evolvability. Party and date hubs: under debate
  • Single-interface and multi-interface hubs [Kim et al 2006]

Data Sources

Enzymatic activity
  • BRENDA: experimental parameters
  • KEGG, MetaCyc: metabolic networks
  • Catalytic Site Atlas: catalytic sites

Data validation and prediction
  • GeneMANIA: lists of genes with functionally similar or shared properties
  • STRING: based on genomic context, HT experiments, co-expression, literature
  • ComPASS: assigns confidence to an interaction detected by MS

Primary PPI databases
  • DIP, BioGRID, IntAct, MINT
  • Common languages: PSICQUIC; expression, co-localization, genetic, metabolic, signaling pathways, experimental data; SBML
  • Building the network: Cytoscape


Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.

Goals
  • Increased catalytic function relative to the parent
  • Altered specificity, stereospecificity, or affinity to interacting partners
  • Increased stability

Property: Parameters
  • Thermostability: $T_{50}$
  • Catalytic activity: $k_{cat}$, $K_M$, $k_{cat}/K_M$
  • Binding specificity: $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$, $K_d$, $K_I$
  • Binding affinity: $K_a = 1/K_d$, $\Delta G = -RT \ln(1/K_d)$

A paradigm shift in the last 2 decades
  • PCR and recombinant gene technologies
  • Recreation of evolution in the lab
  • Computer algorithms

Goal 1: Increasing the Thermostability

  • Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing
  • Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the proteins are inactivated in 10 minutes
  • Increasing the thermostability can be considered the first step in protein engineering, since it makes the protein tolerant to a greater range of amino acid substitutions
  • Main design techniques:
      • Sequence-based design: comparison through multiple alignments
      • Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity: the Michaelis-Menten model of kinetics

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1}$$

$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2}$$

$$\frac{d[P]}{dt} = k_2 [ES] \tag{3}$$

  • $k_2$ is also known as $k_{cat}$ or turnover rate (in more complex cases $k_{cat}$ is a function of several rates)
  • $k_{cat}$ alone is not enough: we need to quantify the affinity of the enzyme for the substrate
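The mass-action scheme in equations (1) to (3) can be explored numerically. The following is a minimal sketch, not part of the original slides: it integrates the rate equations with a fixed-step Euler scheme, and the function name, rate constants, and concentrations are illustrative assumptions chosen only to make the example run.

```python
def simulate_michaelis_menten(k1, k_minus1, k2, e0, s0, t_end, dt=1e-4):
    """Integrate equations (1)-(3) with a fixed-step explicit Euler scheme."""
    e, s, es, p = e0, s0, 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Evaluate all three elementary fluxes from the current state
        bind = k1 * e * s           # E + S -> ES
        unbind = k_minus1 * es      # ES -> E + S
        cat = k2 * es               # ES -> E + P
        # Update every species with the same set of fluxes
        e += (-bind + unbind + cat) * dt
        s += (-bind + unbind) * dt
        es += (bind - unbind - cat) * dt   # equation (2)
        p += cat * dt                      # equation (3)
    return e, s, es, p


# Hypothetical parameters (arbitrary units), chosen for illustration only
e, s, es, p = simulate_michaelis_menten(
    k1=10.0, k_minus1=1.0, k2=1.0, e0=0.1, s0=1.0, t_end=5.0
)
```

Because each flux is added to one species and subtracted from another, the totals $[E] + [ES]$ and $[S] + [ES] + [P]$ stay constant, which gives a quick sanity check on any such simulation.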

Enzyme Kinetics

Assumptions

  • First assumption: the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$ (quasi-steady state)

$$\frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$

  • Second assumption: the total concentration of enzyme $[E]_0$ does not change with time

$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$

The Michaelis Constant $K_M$

Substituting $[E] = [E]_0 - [ES]$ from (5) into (4):

$$0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$

$$k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7}$$

$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$

$K_M$: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$

With this definition, (8) can be solved for the complex: $[ES] = [E]_0 [S] / (K_M + [S])$.

The Michaelis Constant $K_M$ and the Steady-State Flux

Rate of product formation (flux):

$$\frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11}$$

$$v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} v_{max} \tag{12}$$

where $v_{max} = k_2 [E]_0$.

$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum:

$$v = \frac{v_{max}}{2} \tag{13}$$
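As a quick numerical illustration of equations (12) and (13), the sketch below evaluates the steady-state rate at $[S] = K_M$ and at a saturating substrate concentration. The function name and the parameter values are assumptions made for this example.

```python
def mm_rate(s, vmax, km):
    """Steady-state rate v = vmax * [S] / (KM + [S]), equation (12)."""
    return vmax * s / (km + s)


# Hypothetical parameters in arbitrary units
vmax, km = 2.0, 0.5

half = mm_rate(km, vmax, km)               # rate at [S] = KM
saturated = mm_rate(1000 * km, vmax, km)   # rate far above KM
```

At $[S] = K_M$ the rate equals $v_{max}/2$, and at $[S] \gg K_M$ it approaches $v_{max}$ without reaching it.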

Determining KM from the concentration curve

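The slide itself only shows a plot. One common way to extract $K_M$ and $v_{max}$ from rate measurements, not necessarily the one used in the lecture, is the Lineweaver-Burk (double-reciprocal) fit, which rewrites equation (12) as $1/v = (K_M/v_{max})(1/[S]) + 1/v_{max}$ and fits a straight line. The sketch below assumes noise-free synthetic data and a hypothetical function name.

```python
def lineweaver_burk_fit(substrate, rates):
    """Estimate (vmax, KM) by ordinary least squares on 1/v versus 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # equals KM / vmax
    intercept = mean_y - slope * mean_x   # equals 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free data generated from assumed vmax = 2.0 and KM = 0.5
true_vmax, true_km = 2.0, 0.5
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
v_obs = [true_vmax * s / (true_km + s) for s in conc]

vmax_est, km_est = lineweaver_burk_fit(conc, v_obs)
```

With real measurements the reciprocal transform amplifies noise at low substrate concentrations, so a direct nonlinear fit of equation (12) is often preferred.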

Evaluating Enzyme Efficiency

  • $k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates
  • For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:

$$\frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14}$$

  • At $[S_A] = [S_B]$, $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency
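Equation (14) is simple to evaluate directly. The following sketch uses made-up kinetic parameters for two hypothetical substrates A and B. The function name is an assumption, and the units only need to be consistent between the two substrates.

```python
def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """Ratio v_A / v_B for two competing substrates, equation (14)."""
    return (kcat_a / km_a * s_a) / (kcat_b / km_b * s_b)


# Hypothetical values: A has a 10x higher kcat and a 10x lower KM than B
ratio = rate_ratio(kcat_a=100.0, km_a=0.01, s_a=1.0,
                   kcat_b=10.0, km_b=0.1, s_b=1.0)
```

At equal substrate concentrations the ratio reduces to the ratio of the two specificity constants, here about 100 in favor of substrate A.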

Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

  • Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate
  • Protein-nucleotide (DNA/RNA) binding: in transcription, regulation, promoters, etc.
  • Protein-protein interaction:
      • Permanent or obligate: in multi-unit proteins; it could have a structural or functional role
      • Transient: in signaling, transport, and regulation

3.1 Protein Binding Affinity

Dissociation constant

$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$

$$\frac{d[AB]}{dt} = k_1 [A][B] - k_{-1}[AB] \tag{16}$$

In equilibrium:

$$0 = k_1 [A][B] - k_{-1}[AB] \tag{17}$$

$$K_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$

3.1 Protein Binding Affinity

Affinity constant

$$K_a = \frac{1}{K_d} \tag{19}$$

In antibodies:

$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$

Binding free energy

$$\Delta G = -RT \ln K_a = -RT \ln \frac{1}{K_d} \tag{21}$$
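Equation (21) converts a measured dissociation constant into a binding free energy. The sketch below assumes that $K_d$ is expressed in mol/L relative to a 1 M standard state and that the temperature is 298.15 K. Both are assumptions of this example, not statements from the slides, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Delta G = -RT ln(Ka) with Ka = 1/Kd, equation (21), in J/mol."""
    ka = 1.0 / kd_molar
    return -R * temperature_k * math.log(ka)


dg_nm = binding_free_energy(1e-9)  # a nanomolar binder
dg_um = binding_free_energy(1e-6)  # a micromolar binder
```

Tighter binding (smaller $K_d$) gives a more negative $\Delta G$. Each tenfold decrease in $K_d$ contributes the same increment $RT \ln 10$, so the nanomolar binder has 1.5 times the free energy of the micromolar one.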

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

  • Ground-state binding ($K_M$)
  • Transition-state binding ($K_{tx}$)

3.2 Protein Binding Specificity

  • These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation
  • Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $K_a$, or $K_d$ for all partners
  • $K_I$: inhibition constant, used when an inhibitor competes with a ligand
  • Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
      • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
      • Proteins or peptides: structurally similar motifs rather than sequence motifs
  • Promiscuity: the ability to participate in a function other than the native one
  • Allostery: regulation of a protein by binding of some ligand (the effector)

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states


Introducing the Substitutions

Site-directed (saturation) mutagenesis
  1. The DNA of interest is cloned into a plasmid vector
  2. The plasmid DNA is denatured to produce single strands
  3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
  4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
  5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR
  • Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
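The two strategies above differ in the kind of sequence diversity they produce, which a toy in silico model can make concrete. The sketch below is only an illustration under simplifying assumptions: saturation mutagenesis is modeled as enumerating all 19 substitutions at one protein position, and error-prone PCR as independent per-base substitutions at a fixed rate. The sequences, function names, and error rate are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_library(protein, position):
    """All 19 single-residue substitutions at a 0-based position."""
    parent = protein[position]
    return [protein[:position] + aa + protein[position + 1:]
            for aa in AMINO_ACIDS if aa != parent]


def error_prone_copy(dna, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    out = []
    for base in dna:
        if rng.random() < error_rate:
            # Replace with one of the three other bases, chosen uniformly
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)  # fixed seed so the example is reproducible
variants = saturation_library("MKTAYIAK", position=3)
mutant = error_prone_copy("ATGAAAACCGCGTATATTGCGAAA", error_rate=0.05, rng=rng)
```

The first function gives a small, exhaustive library focused on one site, while the second scatters substitutions over the whole gene. Real polymerases have biased error spectra, which this uniform model ignores.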


Recombination and DNA-shuffling

  • A natural approach to making multiple mutations is recombination
  • Circular permutation: to alter protein topology
  • DNA-shuffling: to perform functional domain or motif shuffling in vitro
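Recombination can be sketched in silico as template switching between aligned parent sequences. This is a deliberately simplified model and not the experimental DNA-shuffling protocol: it assumes two parents of equal length and picks crossover points uniformly at random, and all names are hypothetical.

```python
import random


def recombine(parent_a, parent_b, n_crossovers, rng):
    """Build one chimera by switching template at random crossover points."""
    if len(parent_a) != len(parent_b):
        raise ValueError("this toy model assumes aligned parents of equal length")
    # Distinct interior positions, so every segment is non-empty
    points = sorted(rng.sample(range(1, len(parent_a)), n_crossovers))
    parents = (parent_a, parent_b)
    child, current, start = [], 0, 0
    for point in points + [len(parent_a)]:
        child.append(parents[current][start:point])
        current = 1 - current   # switch to the other parent
        start = point
    return "".join(child)


rng = random.Random(1)  # fixed seed so the example is reproducible
chimera = recombine("AAAAAAAAAAAA", "BBBBBBBBBBBB", n_crossovers=2, rng=rng)
```

Using the placeholder parents "AAAA..." and "BBBB..." makes the inherited blocks visible in the chimera. With real genes, each block would carry that parent's mutations.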

Recombinant Protein Folding

  • E. coli is typically the first choice for expressing a heterologous protein
  • However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli
  • Some misfolding-related issues:
      • Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
      • The environment (crowding, pH, osmolarity, etc.)
      • Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)
  • Two possible outcomes for a misfolded protein:
      • Insoluble: aggregation into inclusion bodies
      • Degradation: proteolysis

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

  • A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold
  • Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities
  • An iterative process:
      1. Identifying a good starting sequence, usually containing some level of latent promiscuity
      2. Creating a library of variants
      3. Selecting variants with improved function (mutation and screening)
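The iterative process above can be mimicked with a toy simulation. The sketch below is an assumption-laden illustration, not a laboratory protocol: the "screen" is a made-up fitness function that counts matches to a hidden target sequence, each library member carries one random point substitution, and the best variant of each round becomes the next parent. All names and parameters are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(seq, target):
    """Toy screen: number of positions matching a hidden target sequence."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, target, rounds, library_size, rng):
    """Run rounds of library creation and screening, tracking parent fitness."""
    parent = start
    history = [fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: fitness(s, target))
        # Accept the best variant only if it is no worse than the parent
        if fitness(best, target) >= fitness(parent, target):
            parent = best
        history.append(fitness(parent, target))
    return parent, history


rng = random.Random(42)  # fixed seed so the example is reproducible
best, history = directed_evolution("AAAAAAAA", "MKTAYIAK",
                                   rounds=30, library_size=50, rng=rng)
```

Because a variant is accepted only when it scores at least as well as the current parent, the recorded fitness never decreases. Real screens are noisy and real fitness landscapes are rugged, so actual campaigns rarely behave this cleanly.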

From Natural Enzymes to Protein Engineering to Computational Protein Design


Bibliography I

David D. Boehr, Ruth Nussinov, and Peter E. Wright. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology, 5(11):789–796, November 2009. ISSN 1552-4469. doi: 10.1038/nchembio.232. URL http://dx.doi.org/10.1038/nchembio.232

Matthieu Chodorge, Laurent Fourage, Gilles Ravot, Lutz Jermutus, and Ralph Minter. In vitro DNA recombination by L-Shuffling during ribosome display affinity maturation of an anti-Fas antibody increases the population of improved variants. Protein Engineering, Design and Selection, 21(5):343–351, May 2008. doi: 10.1093/protein/gzn013. URL http://dx.doi.org/10.1093/protein/gzn013

Philip M. Kim, Long J. Lu, Yu Xia, and Mark B. Gerstein. Relating three-dimensional structures to protein networks provides evolutionary insights. Science (New York, N.Y.), 314(5807):1938–1941, December 2006. ISSN 1095-9203. doi: 10.1126/science.1136174. URL http://dx.doi.org/10.1126/science.1136174

D. E. Koshland. Application of a Theory of Enzyme Specificity to Protein Synthesis. Proceedings of the National Academy of Sciences of the United States of America, 44(2):98–104, February 1958. ISSN 0027-8424. URL http://view.ncbi.nlm.nih.gov/pubmed/16590179

Irene M. Nooren and Janet M. Thornton. Diversity of protein-protein interactions. The EMBO Journal, 22(14):3486–3492, July 2003. ISSN 0261-4189. doi: 10.1093/emboj/cdg359. URL http://dx.doi.org/10.1093/emboj/cdg359

James R. Perkins, Ilhem Diboun, Benoit H. Dessailly, Jon G. Lees, and Christine Orengo. Transient Protein-Protein Interactions: Structural, Functional, and Network Properties. Structure, 18(10):1233–1243, October 2010. ISSN 0969-2126. doi: 10.1016/j.str.2010.08.007. URL http://dx.doi.org/10.1016/j.str.2010.08.007

Page 17: Computational Protein Design. 1. Challenges in Protein Engineering

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 17 40

Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes

to the challenge of generating novel proteins for therapeutic and biomedicalapplications

GoalsIncreased catalytic function related to the parent

Altered specificity stereospecificity or affinity to interacting partners

Increased stability

Property ParametersThermostability T50

Catalytic activity kcat KM kcatKM

Binding specificity (kcatKM )A(kcatKM )B

Kd KI

Binding affinity Ka = 1Kd

∆G = minusRT ln 1Kd

A paradigm shift in the last 2decades

PCR and recombinant genetechnologies

Recreation of evolution in thelab

Computer algorithms

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 18 40

Goal 1 Increasing the Thermostability

Thermostability quantifies the ability of proteinrsquos secondary and tertiarystructures to withstand high temperatures avoiding denaturation

Thermostability is typically measured experimentally by T50 the temperature atwhich 50 of the proteins are inactivated in 10 minutes

Increasing the thermostability can be considered the first step in proteinengineering in order to make the protein tolerant to a greater range of amino acidsubstitutions

Main design techniquesSequence-based design comparison through multiple alignmentsStructure-based approach assumes that a more rigid protein will be more stable athigh temperatures

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 19 40

Goal 2 Increasing the Catalytic Activity

How to quantify enzyme activity Michaelis-Menten model of kinetics

E + Sk1

kminus1

ES k2

E + P (1)

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) (2)

d [P]

dt= k2[ES] (3)

k2 is also known as kcat or turnover rate (in morecomplex cases kcat is function of several rates)

kcat alone is not enough we need to quantify the affinityof the enzyme to the substrate

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 20 40

Enzyme Kinetics

AssumptionsFirst assumption the concentration of the substrate-bound enzyme [ES] isapproximately constant compared with the rate of change of the concentration ofsubstrate [S] and product [P]

d [ES]

dt= k1[E ][S]minus [ES](kminus1 + k2) asymp 0 (4)

Second assumption the total concentration of enzyme [E ]0 does not changewith time

[E ]0 = [E ] + [ES] asymp const (5)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis
1. Clone the DNA of interest into a plasmid vector.
2. Denature the plasmid DNA to produce single strands.
3. Anneal a synthetic oligonucleotide carrying the desired mutation (point mutation, deletion, or insertion) to the target region.
4. Extend the mutant oligonucleotide using a plasmid DNA strand as the template.
5. Propagate the heteroduplex by transformation in E. coli.

Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase.
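The two strategies on this slide can be mimicked in silico. The following is only a minimal sketch: the function names, the toy sequences, and the uniform per-base error model are assumptions made for illustration, not part of the original slides or of any laboratory protocol.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
BASES = "ACGT"


def saturation_variants(protein, position):
    """Return all 19 single-site variants at a 0-based position."""
    parent = protein[position]
    return [
        protein[:position] + aa + protein[position + 1:]
        for aa in AMINO_ACIDS
        if aa != parent
    ]


def error_prone_copy(dna, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate.

    Assumption: errors are independent and uniform over the three other bases.
    """
    copied = []
    for base in dna:
        if rng.random() < error_rate:
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


# Hypothetical toy inputs
variants = saturation_variants("MKTAY", 2)
mutant = error_prone_copy("ATGAAAACCGCGTAT", 0.1, random.Random(0))
```

Saturation mutagenesis gives a small, fully enumerated library at a chosen site, whereas the error-prone copy scatters substitutions over the whole gene at a rate set by `error_rate`.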


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination:

Circular permutation, to alter protein topology.

DNA-shuffling, to perform functional domain or motif shuffling in vitro.
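Both operations are easy to picture on sequence strings. The sketch below is illustrative only: it assumes two pre-aligned parents of equal length and random crossover points, which is a simplification of how fragments reassemble in a real DNA-shuffling experiment. All names are hypothetical.

```python
import random


def circular_permutation(seq, new_start):
    """Join the original termini and open the chain at new_start (0-based)."""
    return seq[new_start:] + seq[:new_start]


def shuffle_parents(parent_a, parent_b, n_crossovers, rng):
    """Build a chimera by alternating between two aligned parents.

    Assumption: parents have equal length and crossovers fall at random
    positions; real shuffling favours regions of high sequence identity.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    points = sorted(rng.sample(range(1, len(parent_a)), n_crossovers))
    parents = (parent_a, parent_b)
    pieces, current, start = [], 0, 0
    for point in points + [len(parent_a)]:
        pieces.append(parents[current][start:point])
        current = 1 - current  # switch template at each crossover
        start = point
    return "".join(pieces)


permuted = circular_permutation("ABCDE", 2)
chimera = shuffle_parents("AAAAAAAAAA", "BBBBBBBBBB", 3, random.Random(0))
```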

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases.

The environment (crowding, pH, osmolarity, etc.).

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments).

Two possible outcomes for a misfolded protein:
Aggregation into insoluble inclusion bodies.
Degradation by proteolysis.

Figure: E. coli expressing human leptin as inclusion bodies.

Directed Evolution

A remarkable property of proteins is their evolvability: under selection pressure they can adapt by changing their behavior, function, or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:

Identifying a good starting sequence, usually one with some level of latent promiscuity.

Creating a library of variants.

Selecting variants with improved function (mutation and screening).
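The iterative process can be sketched as a simple loop. This is a toy model under stated assumptions: the "screen" is a made-up score counting matches to a hidden target sequence, each variant carries one random substitution, and only the best variant of each round is carried forward. None of these choices come from the slides.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Introduce one random substitution (it may be silent)."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, score, rounds, library_size, rng):
    """Repeat: build a library from the current best, screen, keep improvements."""
    best = start
    history = [score(best)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        candidate = max(library, key=score)
        if score(candidate) > score(best):
            best = candidate
        history.append(score(best))
    return best, history


TARGET = "MKVLAT"  # hypothetical optimum, used only by the toy screen


def toy_score(seq):
    """Toy screening assay: number of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


best, history = directed_evolution(
    "MAAAAA", toy_score, rounds=20, library_size=50, rng=random.Random(1)
)
```

Because a round only replaces the parent when the screen improves, `history` never decreases; real campaigns often keep several variants per round rather than a single winner.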

From Natural Enzymes to Protein Engineering to Computational Protein Design




Overview of Protein Engineering Technology

From a need to adjust enzyme properties for industrial processes to the challenge of generating novel proteins for therapeutic and biomedical applications.

Goals
Increased catalytic function relative to the parent
Altered specificity, stereospecificity, or affinity to interacting partners
Increased stability

Property: Parameters
Thermostability: $T_{50}$
Catalytic activity: $k_{cat}$, $K_M$, $k_{cat}/K_M$
Binding specificity: $(k_{cat}/K_M)_A / (k_{cat}/K_M)_B$; $K_d$, $K_I$
Binding affinity: $K_a = 1/K_d$; $\Delta G = -RT \ln (1/K_d)$

A paradigm shift in the last 2 decades
PCR and recombinant gene technologies
Recreation of evolution in the lab
Computer algorithms

Goal 1: Increasing the Thermostability

Thermostability quantifies the ability of a protein's secondary and tertiary structures to withstand high temperatures without denaturing.

Thermostability is typically measured experimentally by $T_{50}$, the temperature at which 50% of the protein is inactivated in 10 minutes.

Increasing the thermostability can be considered the first step in protein engineering, in order to make the protein tolerant to a greater range of amino acid substitutions.

Main design techniques:
Sequence-based design: comparison through multiple alignments.
Structure-based approach: assumes that a more rigid protein will be more stable at high temperatures.
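As an illustration of how T50 might be read off an inactivation curve, the sketch below linearly interpolates the temperature at which residual activity crosses 50%. The data points are invented, and linear interpolation is an assumption made for simplicity; fitting a sigmoid is another common choice.

```python
def t50_from_curve(temperatures, residual_activity):
    """Interpolate the temperature where residual activity crosses 0.5.

    temperatures: increasing incubation temperatures
    residual_activity: fraction of activity remaining after incubation
    """
    points = list(zip(temperatures, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            # Linear interpolation between the two bracketing measurements
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 50%")


# Hypothetical measurements (degrees Celsius, fraction of initial activity)
temps = [40, 45, 50, 55, 60]
activity = [1.00, 0.95, 0.70, 0.30, 0.05]
t50 = t50_from_curve(temps, activity)
```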

Goal 2: Increasing the Catalytic Activity

How to quantify enzyme activity: the Michaelis-Menten model of kinetics

\[ E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{1} \]

\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \tag{2} \]

\[ \frac{d[P]}{dt} = k_2 [ES] \tag{3} \]

$k_2$ is also known as $k_{cat}$, or the turnover rate (in more complex cases $k_{cat}$ is a function of several rates).

$k_{cat}$ alone is not enough: we also need to quantify the affinity of the enzyme for the substrate.
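Equations (1) to (3) can be integrated numerically to watch the complex build up and the product accumulate. The sketch below uses a plain forward-Euler step. The rate equations for [E] and [S] are not written on the slide; they are added here from the mass balance of scheme (1). The rate constants and concentrations are arbitrary illustrative values.

```python
def simulate(k1, k_minus1, k2, e0, s0, dt=1e-4, t_end=10.0):
    """Forward-Euler integration of the Michaelis-Menten scheme.

    Returns the final concentrations (E, S, ES, P).
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = k1 * e * s          # E + S -> ES
        release = k_minus1 * es    # ES -> E + S
        convert = k2 * es          # ES -> E + P
        e += dt * (-bind + release + convert)
        s += dt * (-bind + release)
        es += dt * (bind - release - convert)  # equation (2)
        p += dt * convert                      # equation (3)
    return e, s, es, p


# Hypothetical parameters in arbitrary units
e, s, es, p = simulate(k1=10.0, k_minus1=1.0, k2=1.0, e0=0.1, s0=1.0)
```

The update conserves total enzyme ([E] + [ES]) and total substrate material ([S] + [ES] + [P]), which is a quick sanity check on any such simulation.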

Enzyme Kinetics

Assumptions

First assumption: the concentration of the substrate-bound enzyme [ES] changes much more slowly than the concentrations of substrate [S] and product [P], so it can be treated as approximately constant:

\[ \frac{d[ES]}{dt} = k_1 [E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4} \]

Second assumption: the total concentration of enzyme $[E]_0$ does not change with time:

\[ [E]_0 = [E] + [ES] \approx \text{const} \tag{5} \]

The Michaelis constant $K_M$

Substituting $[E] = [E]_0 - [ES]$ from (5) into (4):

\[ 0 = k_1 [S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6} \]

\[ k_1 [S][E]_0 = k_1 [S][ES] + [ES](k_{-1} + k_2) \tag{7} \]

\[ [S][E]_0 = [S][ES] + [ES] \frac{k_{-1} + k_2}{k_1} \tag{8} \]

$K_M$: Michaelis constant

\[ K_M = \frac{k_{-1} + k_2}{k_1} \tag{10} \]

With this definition, (8) can be solved for the complex concentration: $[ES] = [E]_0 [S] / (K_M + [S])$.
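A quick numerical check of the derivation: compute K_M from the rate constants as in (10), solve (8) for the steady-state [ES], and confirm that the balance in (6) vanishes. The numbers are arbitrary and the function names are hypothetical.

```python
def michaelis_constant(k1, k_minus1, k2):
    """K_M = (k_-1 + k_2) / k_1, equation (10)."""
    return (k_minus1 + k2) / k1


def steady_state_es(e0, s, km):
    """[ES] obtained by solving equation (8) for [ES]."""
    return e0 * s / (km + s)


k1, k_minus1, k2 = 10.0, 1.0, 1.0   # illustrative rate constants
e0, s = 0.1, 1.0                    # illustrative concentrations

km = michaelis_constant(k1, k_minus1, k2)
es = steady_state_es(e0, s, km)

# Right-hand side of equation (6); it should be zero at steady state
residual = k1 * s * (e0 - es) - es * (k_minus1 + k2)
```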

The Michaelis Constant $K_M$ and the steady-state flux

Rate of product formation (flux):

\[ \frac{d[P]}{dt} = v = k_2 [ES] = k_2 [E]_0 \frac{[S]}{K_M + [S]} \tag{11} \]

\[ v = \frac{v_{max} [S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}} \, v_{max} \tag{12} \]

where $v_{max} = k_2 [E]_0$, as follows from comparing (11) and (12).

$K_M$ can be measured as the substrate concentration [S] at which the rate of product formation is half of the maximum:

\[ v = \frac{v_{max}}{2} \tag{13} \]
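The rate law in (12) is a one-line function, which makes the half-maximum property in (13) easy to verify. The parameter values below are arbitrary.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate, first form of equation (12)."""
    return vmax * s / (km + s)


def mm_rate_alt(s, vmax, km):
    """Second form of equation (12): vmax / (1 + K_M / [S])."""
    return vmax / (1.0 + km / s)


vmax, km = 2.0, 0.5            # illustrative values
v_at_km = mm_rate(km, vmax, km)        # half of vmax, equation (13)
v_saturating = mm_rate(1e6, vmax, km)  # approaches vmax when [S] >> K_M
```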

Determining KM from the concentration curve

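One common way to estimate K_M and v_max from measured rates is the Lineweaver-Burk (double-reciprocal) fit, sketched below with an ordinary least-squares line through (1/[S], 1/v). This is a substitute technique chosen for brevity, not necessarily the method shown on the slide, and it is known to be sensitive to noise at low [S]. The data are synthetic and noise-free.

```python
def fit_lineweaver_burk(substrate, rates):
    """Estimate (K_M, v_max) from 1/v = (K_M/v_max) * (1/[S]) + 1/v_max."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # K_M / v_max
    intercept = mean_y - slope * mean_x   # 1 / v_max
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic data generated with K_M = 0.5 and v_max = 2.0
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rates = [2.0 * s / (0.5 + s) for s in substrate]
km_fit, vmax_fit = fit_lineweaver_burk(substrate, rates)
```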

Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates.

For an enzyme acting simultaneously on two substrates $S_A$ and $S_B$ at rates $v_A$ and $v_B$:

\[ \frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14} \]

At $[S_A] = [S_B]$, the rate ratio reduces to the ratio of the $k_{cat}/K_M$ values, so $k_{cat}/K_M$ provides a measure of substrate promiscuity and efficiency.
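Equation (14) translates directly into a small helper. The kinetic parameters below are invented to show an enzyme that prefers substrate A over substrate B; units only need to be consistent between the two substrates.

```python
def specificity_constant(kcat, km):
    """k_cat / K_M for one substrate."""
    return kcat / km


def rate_ratio(kcat_a, km_a, s_a, kcat_b, km_b, s_b):
    """v_A / v_B for two competing substrates, equation (14)."""
    return (specificity_constant(kcat_a, km_a) * s_a) / (
        specificity_constant(kcat_b, km_b) * s_b
    )


# Hypothetical parameters: A is the native substrate, B a promiscuous one
ratio_equal_conc = rate_ratio(100.0, 0.1, 1.0, 10.0, 1.0, 1.0)
ratio_excess_b = rate_ratio(100.0, 0.1, 1.0, 10.0, 1.0, 100.0)
```

With equal concentrations the ratio equals the ratio of specificity constants. A large excess of B can compensate for its lower specificity constant.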

Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate.

Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.

Protein-protein interaction:
Permanent or obligate: in multi-subunit proteins, where it can have a structural or functional role.
Transient: in signaling, transport, and regulation.

3.1 Protein Binding Affinity

Dissociation constant

\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]

\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]

In equilibrium:

\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]

\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]
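As a small worked example of (18), the sketch below computes k_d from the two rate constants and the fraction of A that is bound at a given free concentration of B. The fraction-bound expression [B] / (k_d + [B]) is not on the slide; it follows from rearranging (18) as [AB] / ([A] + [AB]). The rate constants are illustrative.

```python
def dissociation_constant(k_on, k_off):
    """k_d = k_-1 / k_1, equation (18)."""
    return k_off / k_on


def fraction_bound(free_b, kd):
    """Fraction of A in the complex AB at free concentration [B]."""
    return free_b / (kd + free_b)


# Hypothetical values: k_1 = 1e6 per M per s, k_-1 = 1e-3 per s
kd = dissociation_constant(k_on=1e6, k_off=1e-3)   # in M
half = fraction_bound(kd, kd)                      # half bound at [B] = k_d
mostly = fraction_bound(100 * kd, kd)
```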

3.1 Protein Binding Affinity

Affinity constant

\[ k_a = \frac{1}{k_d} \tag{19} \]

In antibodies:

\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]

Binding free energy

\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]
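Equation (21) can be evaluated for a typical dissociation constant. The sketch assumes k_d is expressed in molar units relative to a 1 M standard state and a temperature of 298.15 K. Both are conventions chosen here for illustration and are not stated on the slide.

```python
import math

R = 8.314462618  # gas constant, J / (mol K)


def binding_free_energy(kd, temperature=298.15):
    """Delta G = -RT ln(1 / k_d), in J/mol (assumes k_d in M, 1 M standard state)."""
    return -R * temperature * math.log(1.0 / kd)


def binding_free_energy_from_ka(ka, temperature=298.15):
    """Delta G = -RT ln(k_a), the first form of equation (21)."""
    return -R * temperature * math.log(ka)


dg_nanomolar = binding_free_energy(1e-9) / 1000.0   # kJ/mol for a 1 nM binder
dg_micromolar = binding_free_energy(1e-6) / 1000.0  # kJ/mol for a 1 uM binder
```

Tighter binding (smaller k_d) gives a more negative free energy. Under these assumptions, each tenfold change in k_d shifts the free energy by about 5.7 kJ/mol.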


Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 21: Computational Protein Design. 1. Challenges in Protein Engineering

Enzyme Kinetics

Assumptions

First assumption (steady state): the concentration of the substrate-bound enzyme $[ES]$ changes much more slowly than the concentrations of substrate $[S]$ and product $[P]$, so it is approximately constant

$$\frac{d[ES]}{dt} = k_1[E][S] - [ES](k_{-1} + k_2) \approx 0 \tag{4}$$

Second assumption: the total concentration of enzyme $[E]_0$ does not change with time

$$[E]_0 = [E] + [ES] \approx \text{const} \tag{5}$$

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 21 40
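The steady-state assumption can be checked numerically. The following is a minimal sketch, not part of the original slides: it integrates the scheme E + S <-> ES -> E + P with explicit Euler steps and compares the simulated $[ES]$ with the value obtained by setting $d[ES]/dt = 0$. All rate constants and concentrations are assumed toy values in arbitrary units.

```python
# Toy rate constants and concentrations (assumed values, arbitrary units).
K1, K_MINUS1, K2 = 10.0, 1.0, 1.0
E0, S0 = 0.01, 1.0


def simulate(t_end=1.0, dt=1e-4):
    """Integrate E + S <-> ES -> E + P with explicit Euler steps."""
    s, es, p = S0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = E0 - es                                   # enzyme conservation, eq. (5)
        d_es = K1 * e * s - (K_MINUS1 + K2) * es      # left-hand side of eq. (4)
        d_s = -K1 * e * s + K_MINUS1 * es
        d_p = K2 * es
        s, es, p = s + d_s * dt, es + d_es * dt, p + d_p * dt
    return s, es, p


def quasi_steady_es(s):
    """[ES] predicted by setting d[ES]/dt = 0 at substrate concentration s."""
    k_m = (K_MINUS1 + K2) / K1
    return E0 * s / (k_m + s)


s, es, p = simulate()
print(f"simulated [ES] = {es:.6f}, steady-state [ES] = {quasi_steady_es(s):.6f}")
```

With $[E]_0$ much smaller than $[S]$, as chosen here, the two values should agree closely once the short initial transient has passed.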

The Michaelis constant $K_M$

Substituting $[E] = [E]_0 - [ES]$ from (5) into (4):

$$0 = k_1[S]([E]_0 - [ES]) - [ES](k_{-1} + k_2) \tag{6}$$

$$k_1[S][E]_0 = k_1[S][ES] + [ES](k_{-1} + k_2) \tag{7}$$

$$[S][E]_0 = [S][ES] + [ES]\frac{k_{-1} + k_2}{k_1} \tag{8}$$

$K_M$: Michaelis constant

$$K_M = \frac{k_{-1} + k_2}{k_1} \tag{10}$$

The Michaelis Constant $K_M$ and the steady-state flux

Rate of product formation (flux)

$$\frac{d[P]}{dt} = v = k_2[ES] = k_2[E]_0\frac{[S]}{K_M + [S]} \tag{11}$$

$$v = \frac{v_{max}[S]}{K_M + [S]} = \frac{1}{1 + \frac{K_M}{[S]}}\,v_{max} \tag{12}$$

where $v_{max} = k_2[E]_0$.

$K_M$ can be measured as the substrate concentration $[S]$ at which the rate of product formation is half of the maximum

$$v = \frac{v_{max}}{2} \tag{13}$$
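As a small illustration of equations (12) and (13), the sketch below evaluates the Michaelis-Menten rate over a range of substrate concentrations. The function name and the numerical values are assumptions chosen for illustration, not taken from the slides.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Steady-state rate v = v_max * [S] / (K_M + [S]), eq. (12)."""
    return v_max * s / (k_m + s)


# Illustrative values only (arbitrary units).
V_MAX, K_M = 2.0, 0.5
for s in (0.05, 0.5, 5.0, 50.0):
    print(f"[S] = {s:>5}: v = {michaelis_menten_rate(s, V_MAX, K_M):.3f}")
```

At $[S] = K_M$ the rate is exactly half of $v_{max}$, and it approaches $v_{max}$ only at concentrations far above $K_M$.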

Determining KM from the concentration curve

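One possible way to extract $K_M$ from a measured rate-versus-concentration curve is sketched below. It uses the double-reciprocal (Lineweaver-Burk) linearization, $1/v = (K_M/v_{max})(1/[S]) + 1/v_{max}$, which is a technique substituted here for illustration and not necessarily the method shown on the slide. The data are synthetic and noise-free, and the function name is hypothetical.

```python
def fit_lineweaver_burk(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (v_max, k_m)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept          # intercept = 1 / v_max
    k_m = slope * v_max              # slope = K_M / v_max
    return v_max, k_m


# Synthetic, noise-free data generated from v_max = 2.0 and K_M = 0.5.
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rate = [2.0 * s / (0.5 + s) for s in conc]
v_max_hat, k_m_hat = fit_lineweaver_burk(conc, rate)
print(f"v_max ~ {v_max_hat:.3f}, K_M ~ {k_m_hat:.3f}")
```

With real, noisy measurements the reciprocal transform amplifies errors at low $[S]$, so a direct nonlinear fit of equation (12) is usually a safer choice.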

Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms pairs of substrates

For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$

$$\frac{v_A}{v_B} = \frac{(k_{cat}^A / K_M^A)\,[S_A]}{(k_{cat}^B / K_M^B)\,[S_B]} \tag{14}$$

At $[S_A] = [S_B]$ the ratio of rates reduces to the ratio of the two $k_{cat}/K_M$ values, so $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency
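Equation (14) can be evaluated directly. The sketch below is illustrative only: the function name and all kinetic parameters are made-up values, chosen so that the two specificity constants differ by a factor of 100.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """v_A / v_B for two substrates competing for one enzyme, eq. (14)."""
    return (kcat_a / km_a * conc_a) / (kcat_b / km_b * conc_b)


# Hypothetical parameters: kcat in 1/s, K_M and concentrations in mol/L.
equal = rate_ratio(100.0, 1e-4, 1e-3, 10.0, 1e-3, 1e-3)
excess_b = rate_ratio(100.0, 1e-4, 1e-3, 10.0, 1e-3, 1e-1)
print(f"equal concentrations: v_A/v_B = {equal:.1f}")
print(f"100-fold excess of B: v_A/v_B = {excess_b:.1f}")
```

The second call shows that a large excess of the poorer substrate can offset the difference in specificity constants.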

Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate

Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.

Protein-protein interaction
  • Permanent or obligate: in multi-unit proteins; it can have a structural or functional role
  • Transient: in signaling, transport, and regulation

3.1 Protein Binding Affinity

Dissociation constant

$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15}$$

$$\frac{d[AB]}{dt} = k_1[A][B] - k_{-1}[AB] \tag{16}$$

In equilibrium

$$0 = k_1[A][B] - k_{-1}[AB] \tag{17}$$

$$k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18}$$
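To connect equation (18) with measurable quantities, the following sketch computes the equilibrium concentration of the complex from the total concentrations of both partners. It combines $k_d = [A][B]/[AB]$ with mass conservation and solves the resulting quadratic. The function name and the rate constants are assumptions for illustration.

```python
import math


def complex_at_equilibrium(a_total, b_total, k_d):
    """[AB] at equilibrium, from k_d = [A][B]/[AB] plus mass conservation."""
    total = a_total + b_total + k_d
    # Smaller root of x^2 - (a_total + b_total + k_d) x + a_total * b_total = 0.
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0


# Hypothetical rate constants: k1 in 1/(M s), k_minus1 in 1/s.
k1, k_minus1 = 1e6, 1e-3
k_d = k_minus1 / k1                      # eq. (18)
ab = complex_at_equilibrium(1e-8, 1e-8, k_d)
free_a, free_b = 1e-8 - ab, 1e-8 - ab
print(f"k_d = {k_d:.1e} M, fraction of A bound = {ab / 1e-8:.2f}")
```

The smaller root is the physical one because $[AB]$ cannot exceed the total concentration of either partner.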

3.1 Protein Binding Affinity (continued)

Affinity constant

$$k_a = \frac{1}{k_d} \tag{19}$$

In antibodies

$$Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20}$$

Binding free energy

$$\Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21}$$
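Equation (21) translates a dissociation constant into an energy scale. The sketch below is a minimal illustration that assumes $k_d$ is expressed in mol/L relative to a 1 M standard state and a temperature of 298.15 K; neither assumption is stated on the slide, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J / (mol K)


def binding_free_energy(k_d, temperature=298.15):
    """Delta G = -RT ln(k_a) = RT ln(k_d) in J/mol, with k_d in mol/L (1 M standard state)."""
    k_a = 1.0 / k_d                       # eq. (19)
    return -R * temperature * math.log(k_a)


for k_d in (1e-3, 1e-6, 1e-9):
    print(f"k_d = {k_d:.0e} M: Delta G = {binding_free_energy(k_d) / 1000:.1f} kJ/mol")
```

Each tenfold decrease in $k_d$ lowers the binding free energy by $RT \ln 10$, about 5.7 kJ/mol at this temperature.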

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding ($K_M$)

Transition-state binding ($K_{tx}$)

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation

Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $k_a$, or $k_d$ across all partners

$K_I$: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: similar structural motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by binding of some ligand (the effector)
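The comparison across partners can be made explicit as a ratio of dissociation constants. The sketch below is illustrative only: the partner names and $k_d$ values are invented, and the function name is hypothetical.

```python
def selectivity(k_d_by_partner, target):
    """Fold preference for `target` over each other partner: k_d(other) / k_d(target)."""
    k_d_target = k_d_by_partner[target]
    return {name: k_d / k_d_target
            for name, k_d in k_d_by_partner.items() if name != target}


# Made-up dissociation constants in mol/L, for illustration only.
partners = {"ligand_X": 1e-9, "ligand_Y": 1e-6, "ligand_Z": 5e-9}
for name, fold in selectivity(partners, "ligand_X").items():
    print(f"ligand_X is preferred over {name} by a factor of {fold:.0f}")
```

A protein with ratios close to 1 for several partners would count as multispecific in the sense used above.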

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

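The statement about transition-state heights can be made quantitative under transition-state theory. Assuming equal pre-exponential factors for both substrates, the ratio of specificity constants is $\exp(-\Delta\Delta G^{\ddagger}/RT)$, where $\Delta\Delta G^{\ddagger}$ is the difference in transition-state free energy measured from free enzyme and substrate. The sketch below rests on that assumption, and its function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J / (mol K)


def specificity_ratio(ddg_ts, temperature=298.15):
    """(kcat/KM)_A / (kcat/KM)_B for ddg_ts = dG_ts(A) - dG_ts(B) in J/mol."""
    return math.exp(-ddg_ts / (R * temperature))


for ddg_kj in (0.0, -5.7, -11.4):
    ratio = specificity_ratio(ddg_kj * 1000.0)
    print(f"ddG_ts = {ddg_kj:6.1f} kJ/mol: specificity ratio ~ {ratio:.1f}")
```

A transition state that is lower by roughly 5.7 kJ/mol at room temperature corresponds to about a tenfold preference for that substrate.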


Introducing the Substitutions

Site-directed (saturation) mutagenesis
1. The DNA of interest is cloned into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. The mutant oligonucleotide is extended using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli
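On the computational side, a saturation library at one site is simply the set of all 19 alternative residues at that position. The sketch below enumerates them at the protein level; it does not model the laboratory steps above, and the parent sequence and function name are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def saturation_variants(protein, position):
    """All 19 single substitutions at a 0-based position, as (label, sequence) pairs."""
    wild_type = protein[position]
    variants = []
    for aa in AMINO_ACIDS:
        if aa == wild_type:
            continue
        label = f"{wild_type}{position + 1}{aa}"   # e.g. T3A, using 1-based numbering
        variants.append((label, protein[:position] + aa + protein[position + 1:]))
    return variants


parent = "MKTAYIAK"   # hypothetical parent sequence, for illustration only
library = saturation_variants(parent, 2)
print(len(library), library[0])
```

Saturating several positions at once multiplies the library size by 20 per added site, which is why the choice of positions matters.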

Error-prone PCR
Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
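The effect of an elevated error rate can be imitated in silico. The sketch below is a deliberate simplification: it applies a uniform per-base substitution probability in a single pass, whereas real polymerases show biased mutation spectra and errors accumulate over cycles. The template sequence, error rate, and function names are assumptions.

```python
import random

BASES = "ACGT"


def error_prone_copy(dna, error_rate, rng):
    """Copy `dna`, replacing each base with a different one with probability `error_rate`."""
    copy = []
    for base in dna:
        if rng.random() < error_rate:
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def mutant_library(dna, error_rate, size, seed=0):
    """`size` independently mutated copies of the template (one simulated pass)."""
    rng = random.Random(seed)
    return [error_prone_copy(dna, error_rate, rng) for _ in range(size)]


template = "ATGGCTAGCAAAGGAGAA"   # hypothetical template, for illustration only
for variant in mutant_library(template, error_rate=0.05, size=5):
    print(variant)
```

Fixing the seed makes the simulated library reproducible, which is convenient when comparing different error rates.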

Screening and Library Creation

Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination

Circular permutation to alter protein topology

DNA-shuffling to perform functional domain or motif shuffling in vitro
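Both operations have simple sequence-level analogues. The sketch below is illustrative only: real circular permutants usually need a linker between the original termini, and real DNA shuffling relies on fragmentation and reassembly at homologous regions rather than on crossover points chosen in advance. The function names are hypothetical.

```python
def circular_permutation(protein, new_start):
    """Join the original termini and open the chain at `new_start` (0-based)."""
    return protein[new_start:] + protein[:new_start]


def shuffle_parents(parents, crossovers):
    """Chimera that switches to the next parent at each crossover index."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this sketch assumes aligned parents of equal length")
    bounds = [0] + sorted(crossovers) + [length]
    pieces = []
    for i in range(len(bounds) - 1):
        donor = parents[i % len(parents)]          # cycle through the parents
        pieces.append(donor[bounds[i]:bounds[i + 1]])
    return "".join(pieces)


print(circular_permutation("ABCDEF", 2))
print(shuffle_parents(["AAAAAA", "CCCCCC"], [2, 4]))
```

Placeholder letters are used instead of real sequences so that the origin of each segment in the chimera is easy to see.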

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli

Some misfolding-related issues

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein
  • Insoluble aggregation into inclusion bodies
  • Degradation (proteolysis)

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities

An iterative process
1. Identifying a good starting sequence, usually containing some level of latent promiscuity
2. Creating a library of variants
3. Selecting variants with improved function (mutation and screening)
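The iterative process can be mimicked with a toy loop. The sketch below is not a model of any real experiment: the "screen" simply counts positions that match an invented target sequence, and every name and parameter is an assumption made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Replace one random position (may occasionally re-pick the same residue)."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(start, score, rounds=20, library_size=50, seed=0):
    """Iterate: build a library from the current best, screen it, keep the winner."""
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        best = max(library + [best], key=score)   # parent kept, so the score never drops
    return best


TARGET = "MKVLAT"   # made-up "ideal" sequence used only by the toy screen


def toy_score(seq):
    """Toy screen: number of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


evolved = directed_evolution("AAAAAA", toy_score)
print(evolved, toy_score(evolved))
```

Keeping the parent in each round's pool is a design choice that prevents losing ground when a library contains no improved variant.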

From Natural Enzymes to Protein Engineering to Computational Protein Design

Computational Protein Design. 1. Challenges in Protein Engineering

Pablo Carbonell

iSSB, Institute of Systems and Synthetic Biology, Genopole, University d'Évry-Val d'Essonne, France

mSSB: December 2010


  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 22: Computational Protein Design. 1. Challenges in Protein Engineering

The Michaelis constant KM

0 = k1[S]([E ]0 minus [ES])minus [ES](kminus1 + k2) (6)

k1[S][E ]0 = k1[S][ES] + [ES](kminus1 + k2) (7)

[S][E ]0 = [S][ES] + [ES]kminus1 + k2

k1(8)

(9)

KM Michaelis constant

KM =kminus1 + k2

k1(10)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 22 40

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 23: Computational Protein Design. 1. Challenges in Protein Engineering

The Michaelis Constant KM and the steady-state flux

Rate of product formation (flux)

d [P]

dt= v = k2[ES] = k2[E ]0

[S]

KM + [S](11)

v =vmax [S]

KM + [S]=

11 + KM

[S]

vmax (12)

KM can be measured as the concentration of substrate [S] that corresponds to aproduct formation yield half of the maximum

v =vmax

2(13)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 23 40

Determining KM from the concentration curve

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 24 40

Evaluating Enzyme Efficiency

kcatKM is often used as a specificity constant to compare relative enzyme ratesof reaction of pairs of substrates transformed by an enzyme

For an enzyme acting simultaneously on two substrates SA SB at rates vA vB

vA

vB=

kAcatK A

M [SA]

kBcatK B

M [SB](14)

At [SA] = [SB] kcatKM provides a measure of substrate promiscuity efficiency

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 25 40

Goal 3 Protein Binding Affinity and Specificity

Proteins can bind to different partners

Protein-ligand binding interaction with a small molecule such as drug-target orenzyme-substrate

Protein-nucleotide (DNARNA) binding in transcription regulation promotersetc

Protein-protein interactionPermanent or obligated in multi-units proteins it could have a structural or functionalroleTransient in signaling transport and regulation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 26 40

31 Protein Binding Affinity

Dissociation constant

A + Bk1

kminus1

AB (15)

d [AB]

dt= k1[A][B]minus kminus1[AB] (16)

In equilibrium

0 = k1[A][B]minus kminus1[AB] (17)

kd =kminus1

k1=

[A][B]

[AB](18)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 27 40

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

Determining $K_M$ from the concentration curve

Evaluating Enzyme Efficiency

$k_{cat}/K_M$ is often used as a specificity constant to compare the relative rates at which an enzyme transforms a pair of substrates.

For an enzyme acting simultaneously on two substrates $S_A$, $S_B$ at rates $v_A$, $v_B$:

\[ \frac{v_A}{v_B} = \frac{\left(k_{cat}^A / K_M^A\right) [S_A]}{\left(k_{cat}^B / K_M^B\right) [S_B]} \tag{14} \]

At $[S_A] = [S_B]$, the rate ratio reduces to the ratio of the two specificity constants, so $k_{cat}/K_M$ provides a measure of substrate promiscuity efficiency.
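As a rough illustration of equation (14), the Python sketch below computes the ratio of rates for two competing substrates from their specificity constants. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def rate_ratio(kcat_a, km_a, conc_a, kcat_b, km_b, conc_b):
    """Ratio v_A / v_B for two substrates competing for the same enzyme (eq. 14)."""
    spec_a = kcat_a / km_a  # specificity constant for substrate A, in 1/(M s)
    spec_b = kcat_b / km_b  # specificity constant for substrate B, in 1/(M s)
    return (spec_a * conc_a) / (spec_b * conc_b)


# Hypothetical kinetic parameters (kcat in 1/s, KM and concentrations in M).
ratio = rate_ratio(kcat_a=100.0, km_a=1e-4, conc_a=1e-3,
                   kcat_b=10.0, km_b=1e-3, conc_b=1e-3)
print(f"v_A / v_B = {ratio:.1f}")
```

Because both substrates are at the same concentration in this example, the printed ratio equals the ratio of the two specificity constants.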

Goal 3: Protein Binding Affinity and Specificity

Proteins can bind to different partners:

Protein-ligand binding: interaction with a small molecule, such as drug-target or enzyme-substrate

Protein-nucleotide (DNA/RNA) binding: in transcription regulation, promoters, etc.

Protein-protein interaction:
  • Permanent or obligate: in multi-subunit proteins; it can have a structural or functional role
  • Transient: in signaling, transport and regulation

3.1 Protein Binding Affinity

Dissociation constant

\[ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \tag{15} \]

\[ \frac{d[AB]}{dt} = k_1 [A][B] - k_{-1} [AB] \tag{16} \]

In equilibrium

\[ 0 = k_1 [A][B] - k_{-1} [AB] \tag{17} \]

\[ k_d = \frac{k_{-1}}{k_1} = \frac{[A][B]}{[AB]} \tag{18} \]
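To make equation (18) concrete, the following sketch solves for the equilibrium concentration of the complex when only the total concentrations of A and B are known. The function name and the example values are assumptions made for illustration; the quadratic follows from substituting the mass balances into the definition of $k_d$.

```python
import math


def complex_at_equilibrium(total_a, total_b, kd):
    """Equilibrium concentration [AB] for A + B <-> AB, given total concentrations.

    Substituting [A] = total_a - [AB] and [B] = total_b - [AB] into
    kd = [A][B]/[AB] (eq. 18) gives a quadratic in [AB]; the smaller root
    is the physically meaningful one.
    """
    s = total_a + total_b + kd
    return (s - math.sqrt(s * s - 4.0 * total_a * total_b)) / 2.0


# Hypothetical values: 1 uM of each partner and a 1 uM dissociation constant.
a0, b0, kd = 1e-6, 1e-6, 1e-6
ab = complex_at_equilibrium(a0, b0, kd)
free_a, free_b = a0 - ab, b0 - ab
print(f"[AB] = {ab:.3e} M, check kd = {free_a * free_b / ab:.3e} M")
```

The printed check recomputes $k_d$ from the free and bound concentrations, which should reproduce the input value.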

3.1 Protein Binding Affinity

Affinity constant

\[ k_a = \frac{1}{k_d} \tag{19} \]

In antibodies

\[ Ab + Ag \underset{k_{back}}{\overset{k_{forward}}{\rightleftharpoons}} AbAg \tag{20} \]

Binding free energy

\[ \Delta G = -RT \ln k_a = -RT \ln \frac{1}{k_d} \tag{21} \]
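Equation (21) can be evaluated directly. The sketch below is a minimal example that assumes $k_d$ is expressed relative to a 1 M standard state and a temperature of 298.15 K; the function name and the sample dissociation constants are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy(kd, temperature=298.15):
    """Binding free energy in J/mol from eq. (21): dG = -RT ln(ka) with ka = 1/kd.

    kd is taken relative to a 1 M standard state so the logarithm is dimensionless.
    """
    ka = 1.0 / kd
    return -R * temperature * math.log(ka)


# Hypothetical dissociation constants, from weak (1 mM) to tight (1 nM) binding.
for kd in (1e-3, 1e-6, 1e-9):
    dg = binding_free_energy(kd)
    print(f"kd = {kd:.0e} M  ->  dG = {dg / 1000.0:.1f} kJ/mol")
```

Each tenfold decrease in $k_d$ lowers $\Delta G$ by $RT \ln 10$, roughly 5.7 kJ/mol at 25 °C.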

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding ($K_M$)

Transition-state binding ($K_{tx}$)

3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis and degradation.

Binding specificity to some partner is determined by comparing either $k_{cat}/K_M$, $k_a$ or $k_d$ for all partners.

$K_I$: inhibition constant, used when an inhibitor competes with a ligand

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins or ligands)
  • Small-molecule ligands: similar chemical structure, usually with stereoselectivity
  • Proteins or peptides: structurally similar motifs rather than sequence motifs

Promiscuity: the ability to participate in a function other than the native one

Allostery: regulation of a protein by the binding of some ligand (the effector)

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states
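One way to quantify this statement is sketched below. It assumes transition-state theory with equal prefactors for both substrates, so that the ratio of specificity constants depends only on the difference between the two transition-state free energies, each measured from free enzyme plus substrate. The function name and the energy gap are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def specificity_ratio(ddg_ts, temperature=298.15):
    """Ratio (kcat/KM)_A / (kcat/KM)_B from the transition-state energy gap.

    ddg_ts = dG_ts(A) - dG_ts(B) in J/mol, both measured from free enzyme plus
    substrate; assumes transition-state theory with equal prefactors.
    """
    return math.exp(-ddg_ts / (R * temperature))


# Hypothetical gap: the transition state for A lies 5.7 kJ/mol below that for B.
print(f"(kcat/KM)_A / (kcat/KM)_B = {specificity_ratio(-5700.0):.1f}")
```

Under these assumptions, a gap of about 5.7 kJ/mol at 25 °C corresponds to roughly a tenfold preference for substrate A.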


Introducing the Substitutions

Site-directed (saturation) mutagenesis:
1. Cloning the DNA of interest into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion or insertion) is annealed to the target region
4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR: modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
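The effect of error-prone PCR on a gene can be mimicked in silico. The sketch below is a deliberately simplified model: it assumes uniform, independent base substitutions and ignores insertions, deletions and the mutational bias of real polymerases. The function names, the template sequence and the error rate are all hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA sequence, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Replace with one of the three other bases (substitutions only).
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def mutant_library(template, error_rate, size, seed=0):
    """Build a library of independently mutated copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


# Hypothetical 30-nt template and a deliberately high error rate for illustration.
parent = "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"
for variant in mutant_library(parent, error_rate=0.05, size=5):
    n_mut = sum(a != b for a, b in zip(parent, variant))
    print(variant, n_mut)
```

Fixing the seed makes the library reproducible, which is convenient when comparing different error rates.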


Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination:
  • Circular permutation to alter protein topology
  • DNA-shuffling to perform functional domain or motif shuffling in vitro
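Both operations can be illustrated at the sequence level. The sketch below models only the outcome of recombination, not the laboratory protocol: it assumes the two parents are already aligned to equal length and that the crossover positions are given. Function names and sequences are hypothetical.

```python
def circular_permutation(sequence, new_start):
    """Circularly permute a sequence so that position new_start becomes the first residue."""
    return sequence[new_start:] + sequence[:new_start]


def shuffle_parents(parent_a, parent_b, crossovers):
    """Build a chimera from two aligned, equal-length parents.

    The chimera starts copying parent_a and switches parent at each
    crossover position (0-based index of the first residue after the switch).
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    parents = (parent_a, parent_b)
    chimera, current, previous = [], 0, 0
    for point in sorted(crossovers) + [len(parent_a)]:
        chimera.append(parents[current][previous:point])
        current, previous = 1 - current, point
    return "".join(chimera)


# Hypothetical toy sequences; upper and lower case mark the parent of origin.
print(circular_permutation("ABCDEFGH", 3))
print(shuffle_parents("AAAAAAAA", "bbbbbbbb", crossovers=[2, 5]))
```

Using different letter cases for the two parents makes the origin of each segment of the chimera easy to read off.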

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:
  • Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases
  • The environment (crowding, pH, osmolarity, etc.)
  • Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
  • Insoluble aggregation into inclusion bodies
  • Degradation: proteolysis

[Figure: E. coli expressing human leptin as an inclusion body]

Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:
  • Identifying a good starting sequence, usually containing some level of latent promiscuity
  • Creation of a library of variants
  • Selecting variants with improved function (mutation and screening)
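The iterative process can be sketched as a toy simulation. In a real experiment fitness comes from an assay; here it is replaced by a hypothetical score that counts matches to an assumed optimal sequence, purely so the loop can run. All names, sequences and parameters are illustrative assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(sequence, target):
    """Toy fitness: number of positions matching a hypothetical optimal sequence."""
    return sum(a == b for a, b in zip(sequence, target))


def mutate(sequence, rng):
    """Redraw one random position (the new residue may equal the old one)."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]


def directed_evolution(start, target, rounds=20, library_size=50, seed=0):
    """Iterate library creation and screening, keeping the best variant each round."""
    rng = random.Random(seed)
    best = start
    history = [fitness(best, target)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        # Screening: the parent stays in the pool, so fitness never decreases.
        best = max(library + [best], key=lambda s: fitness(s, target))
        history.append(fitness(best, target))
    return best, history


# Hypothetical starting and target sequences of equal length.
best, history = directed_evolution(start="MKTAYIAKQR", target="MKVLYIWKQR")
print(best, history)
```

Keeping the parent in the screened pool is a design choice that guarantees a non-decreasing fitness trajectory; dropping it would model a screen in which the parent can be lost.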

From Natural Enzymes to Protein Engineering to Computational Protein Design



Page 28: Computational Protein Design. 1. Challenges in Protein Engineering

31 Protein Binding Affinity

Affinity constant

ka =1kd

(19)

In antibodies

Ab + Agkforward

kback

AbAg (20)

Binding free energy

∆G = minusRT ln ka = minusRT ln1kd

(21)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 28 40

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )


3.2 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drug design, biosynthesis, and degradation.

Binding specificity for a given partner is determined by comparing k_cat/K_M, k_a, or k_d across all partners.

K_I: inhibition constant, used when an inhibitor competes with a ligand.

Multispecificity: the protein has broad partner specificity (multiple substrates, proteins, or ligands).

Small-molecule ligands: similar chemical structure, usually with stereoselectivity.
Proteins or peptides: similar structural motifs rather than sequence motifs.

Promiscuity: the ability to participate in a function other than the native one.

Allostery: regulation of a protein by binding of some ligand (the effector).
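To make the role of K_I concrete, the following minimal sketch uses the textbook Michaelis-Menten rate law with a competitive inhibitor, in which the inhibitor raises the apparent K_M by a factor (1 + [I]/K_I). The function names and the numbers in the usage lines are illustrative assumptions, not values from the slides.

```python
def apparent_km(k_m, inhibitor_conc, k_i):
    """Apparent K_M in the presence of a competitive inhibitor."""
    return k_m * (1.0 + inhibitor_conc / k_i)

def initial_rate(v_max, substrate_conc, k_m, inhibitor_conc=0.0, k_i=float("inf")):
    """Michaelis-Menten initial rate, optionally with competitive inhibition.

    All concentrations and constants must share the same units.
    """
    km_app = apparent_km(k_m, inhibitor_conc, k_i)
    return v_max * substrate_conc / (km_app + substrate_conc)

# Hypothetical enzyme: K_M = 10 uM, substrate at 10 uM, V_max normalized to 1.
rate_free = initial_rate(1.0, 10.0, 10.0)
# Same enzyme with an inhibitor present at a concentration equal to its K_I.
rate_inhibited = initial_rate(1.0, 10.0, 10.0, inhibitor_conc=5.0, k_i=5.0)
print(rate_free, rate_inhibited)
```

A smaller K_I means the inhibitor competes more effectively, so the same inhibitor concentration lowers the rate further.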


Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states
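The statement about transition-state heights can be illustrated numerically. For two substrates A and B competing for the same enzyme, the ratio of their k_cat/K_M values sets the discrimination, and under transition-state theory that ratio corresponds to a free energy gap of -RT ln(ratio). The sketch below assumes this standard relation; the function names and kinetic constants are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)

def specificity_ratio(kcat_a, km_a, kcat_b, km_b):
    """Ratio of specificity constants (k_cat/K_M) for substrate A over B."""
    return (kcat_a / km_a) / (kcat_b / km_b)

def transition_state_gap(ratio, temperature=298.15):
    """Difference in transition-state free energy (A minus B) in J/mol."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return -R * temperature * math.log(ratio)

# Hypothetical constants: A is turned over faster and bound more tightly than B.
ratio = specificity_ratio(kcat_a=100.0, km_a=1e-5, kcat_b=10.0, km_b=1e-4)
gap = transition_state_gap(ratio)
print(f"ratio = {ratio:.0f}, gap = {gap / 1000:.1f} kJ/mol")
```

A negative gap means the transition state for A lies lower than that for B, so A is the preferred substrate.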


Introducing the Substitutions

Site-directed (saturation) mutagenesis

1. Cloning the DNA of interest into a plasmid vector
2. The plasmid DNA is denatured to produce single strands
3. A synthetic oligonucleotide with the desired mutation (point mutation, deletion, or insertion) is annealed to the target region
4. Extending the mutant oligonucleotide using a plasmid DNA strand as the template
5. The heteroduplex is propagated by transformation in E. coli

Error-prone PCR

Modifications of standard PCR methods designed to alter and enhance the natural error rate of the polymerase
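The contrast between the two approaches can be sketched in silico: site-directed mutagenesis changes one chosen codon, whereas error-prone PCR scatters random substitutions over the whole gene. This is a toy model under stated assumptions (substitutions only, a uniform per-base error rate, an invented 15-base gene); the function names are hypothetical and no laboratory protocol is implied.

```python
import random

BASES = "ACGT"

def site_directed_substitution(gene, codon_index, new_codon):
    """Replace one codon (0-based index) in an in-frame coding sequence."""
    if len(new_codon) != 3 or any(base not in BASES for base in new_codon):
        raise ValueError("new_codon must be three DNA bases")
    start = 3 * codon_index
    if start < 0 or start + 3 > len(gene):
        raise IndexError("codon_index outside the gene")
    return gene[:start] + new_codon + gene[start + 3:]

def error_prone_copy(gene, error_rate, rng):
    """Copy a sequence, substituting each base with probability error_rate."""
    copied = []
    for base in gene:
        if rng.random() < error_rate:
            # Miscopy: pick one of the three other bases at random.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)

parent = "ATGGCTAAAGGTTGA"  # invented gene: Met-Ala-Lys-Gly-stop
point_mutant = site_directed_substitution(parent, 1, "GAT")  # Ala -> Asp
rng = random.Random(0)  # fixed seed so the toy library is reproducible
library = [error_prone_copy(parent, 0.05, rng) for _ in range(10)]
print(point_mutant)
print(library[:3])
```

The targeted route yields exactly one designed variant, while the random route yields a library whose diversity is controlled by the error rate.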

6 Screening and Library Creation

Recombination and DNA-shuffling

A natural approach to making multiple mutations is recombination

Circular permutation to alter protein topology

DNA-shuffling to perform functional domain or motif shuffling in vitro
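A highly simplified in silico analogue of shuffling is to build chimeras from aligned parent sequences by switching template at random crossover points. The sketch below assumes parents that are already aligned and of equal length, and it ignores the homology dependence of real reassembly; `shuffle_parents` is a hypothetical helper, not an established tool.

```python
import random

def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimera from aligned, equal-length parent sequences."""
    if len(parents) < 2:
        raise ValueError("at least two parents are required")
    length = len(parents[0])
    if any(len(parent) != length for parent in parents):
        raise ValueError("parents must be aligned to the same length")
    if not 0 <= n_crossovers < length:
        raise ValueError("n_crossovers must be between 0 and length - 1")
    # Pick distinct crossover positions strictly inside the sequence.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    current = rng.randrange(len(parents))
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        segments.append(parents[current][start:end])
        # Switch to a different parent at every crossover.
        current = rng.choice([i for i in range(len(parents)) if i != current])
    return "".join(segments)

rng = random.Random(42)  # fixed seed for a reproducible toy example
parent_a = "AAAAAAAAAAAA"
parent_b = "CCCCCCCCCCCC"
chimera = shuffle_parents([parent_a, parent_b], n_crossovers=3, rng=rng)
print(chimera)
```

With two artificial parents made of a single letter each, the crossover points are directly visible in the printed chimera.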


Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein.

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli.

Some misfolding-related issues:

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein:
Insoluble aggregation into inclusion bodies
Degradation (proteolysis)

[Figure: E. coli expressing human leptin as inclusion bodies]


Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under selection pressure by changing their behavior, function, or even fold.

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities.

An iterative process:

Identifying a good starting sequence, usually containing some level of latent promiscuity

Creating a library of variants

Selecting variants with improved function (mutation and screening)
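The iterative process above can be sketched as a simple loop: mutate the current parent into a library, screen the library, and keep the best variant if it improves on the parent. This is a toy model only. The screen used here (number of positions matching an arbitrary target sequence) stands in for an experimental assay and is purely an assumption, as are all function names and sequences.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, rng):
    """Return a copy of seq with one random single-residue substitution."""
    pos = rng.randrange(len(seq))
    new_residue = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new_residue + seq[pos + 1:]

def toy_screen(seq, target):
    """Stand-in for an assay: count positions identical to the target."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(start, target, rounds, library_size, rng):
    """Run mutate-screen-select rounds and record the parent's score."""
    parent = start
    history = [toy_screen(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda variant: toy_screen(variant, target))
        # Keep the parent unless the best variant is strictly better.
        if toy_screen(best, target) > toy_screen(parent, target):
            parent = best
        history.append(toy_screen(parent, target))
    return parent, history

rng = random.Random(7)  # fixed seed for reproducibility
start = "MKTAYIAKQR"
target = "MKVLYIGKQW"  # arbitrary illustrative optimum
evolved, history = directed_evolution(start, target, rounds=20, library_size=50, rng=rng)
print(evolved, history)
```

Because the parent is only replaced by a strictly better variant, the recorded score never decreases from round to round, mirroring the accumulation of beneficial mutations.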


From Natural Enzymes to Protein Engineering to Computational Protein Design


Page 29: Computational Protein Design. 1. Challenges in Protein Engineering

Simplified Thermodynamics of an Enzymatic Reaction

[Jonas and Hollfelder in Protein Engineering Handbook (2009)]

Ground-state binding (KM )

Transition-state binding (Ktx )

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 29 40

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 30: Computational Protein Design. 1. Challenges in Protein Engineering

32 Protein Binding Specificity

These concepts are central to modern protein design in applications such as drugdesign biosynthesis and degradation

Binding specificity to some partner is determined by comparing either kcatKM kaor kd for all partners

KI inhibition constant When an inhibitor competes with a ligand

Multispecificity the protein has broad partner specificity multiple substratesproteins or ligands

Small molecule ligand similar chemical structure usually with stereoselectivityProteins or peptides structural similar motifs rather than sequence motifs

Promiscuity the ability to participate n a function other than the native one

Allostery regulation of a protein by binding of some ligand (the effector)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 30 40

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 31: Computational Protein Design. 1. Challenges in Protein Engineering

Thermodynamics of a Reaction with 2 Competing Substrates

[Desari and Miller in Protein Engineering Handbook (2009)]

Specificity reflects differences in the absolute heights of the transition states

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 31 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 32: Computational Protein Design. 1. Challenges in Protein Engineering

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 32 40

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E coli is a typically first choice for expressing a heterologous protein

However numerous recombinant proteins fail to fold into soluble form whenexpressed in E coli

Some misfolding-related issues

Multidomains proteins usually require the assistance of folding modulators such aschaperones asor foldases

The environment (crowding pH osmolarity etc)

Post-translational modifications such as disulfide bond formation or glycoslylation (usuallyconfined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded proteinInsoluble aggregation into inclusion bodiesDegradation proteolysis

E coli expressing human leptin as

inclusion body

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 36 40

Directed Evolution

A remarkable property of proteins is their evolvability they can adapt underpressure of selection by changing their behavior function or even fold

Inspired by natural evolution directed evolution uses iterative rounds of randommutation and artificial selection or screening to discover protein variants with novelfunctionalities

An iterative process

Identifying a good starting sequence usually containing some level of latentpromiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 37 40

From Natural Enzymes to Protein Engineeringto Computational Protein Design

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 38 40

Computational Protein Design1 Challenges in Protein Engineering

Pablo Carbonellpablocarbonellissbgenopolefr

iSSB Institute of Systems and Synthetic BiologyGenopole University drsquoEacutevry-Val drsquoEssonne France

mSSB December 2010

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 39 40

Bibliography I

David D Boehr Ruth Nussinov and Peter E Wright The role of dynamic conformational ensembles in biomolecular recognition Nature chemical biology 5(11)789ndash796 November 2009 ISSN 1552-4469 doi 101038nchembio232 URL httpdxdoiorg101038nchembio232

Matthieu Chodorge Laurent Fourage Gilles Ravot Lutz Jermutus and Ralph Minter In vitro DNA recombination by L-Shuffling during ribosome displayaffinity maturation of an anti-Fas antibody increases the population of improved variants Protein Engineering Design and Selection 21(5)343ndash351 May2008 doi 101093proteingzn013 URL httpdxdoiorg101093proteingzn013

Philip M Kim Long J Lu Yu Xia and Mark B Gerstein Relating three-dimensional structures to protein networks provides evolutionary insights Science(New York NY) 314(5807)1938ndash1941 December 2006 ISSN 1095-9203 doi 101126science1136174 URLhttpdxdoiorg101126science1136174

D E Koshland Application of a Theory of Enzyme Specificity to Protein Synthesis Proceedings of the National Academy of Sciences of the United States ofAmerica 44(2)98ndash104 February 1958 ISSN 0027-8424 URL httpviewncbinlmnihgovpubmed16590179]

Irene M Nooren and Janet M Thornton Diversity of protein-protein interactions The EMBO journal 22(14)3486ndash3492 July 2003 ISSN 0261-4189 doi101093embojcdg359 URL httpdxdoiorg101093embojcdg359

James R Perkins Ilhem Diboun Benoit H Dessailly Jon G Lees and Christine Orengo Transient Protein-Protein Interactions Structural Functional andNetwork Properties Structure 18(10)1233ndash1243 October 2010 ISSN 09692126 doi 101016jstr201008007 URLhttpdxdoiorg101016jstr201008007

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 40 40

  • The Protein Design Cycle
  • Locating the Substitutions
  • Types of Protein Interactions
  • Engineering Protein Activity
  • Introducing the Substitutions
  • Screening and Library Creation
  • References
Page 33: Computational Protein Design. 1. Challenges in Protein Engineering

Introducing the Substitutions

Site-directed (saturation) mutagenesis1 Cloning the DNA of interest into a plasmid vector2 The plasmid DNA is denatured to produce single strands3 A synthetic oligonucleotide with desired mutation (point

mutation deletion or insertion) is annealed to the targetregion

4 Extending the mutant oligonucleotide using a plasmidDNA strand as the template

5 The heteroduplex is propagated by transformation in Ecoli

Error-prone PCRModifications of standard PCR methods designed to alterand enhance the natural error rate of the polymerase

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 33 40

Outline

1 The Protein Design Cycle

2 Locating the Substitutions

3 Types of Protein Interactions

4 Engineering Protein Activity

5 Introducing the Substitutions

6 Screening and Library Creation

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 34 40

Recombination and DNA-shuffling

A natural approach to making multiplemutations is recombinationCircular permutation to alter proteintopology

DNA-shuffling to perform functionaldomain or motif shuffling in vitro

Pablo Carbonell (iSSB) Computational Protein Design mSSB December 2010 35 40

Recombinant Protein Folding

E. coli is typically the first choice for expressing a heterologous protein

However, numerous recombinant proteins fail to fold into a soluble form when expressed in E. coli

Some misfolding-related issues

Multidomain proteins usually require the assistance of folding modulators such as chaperones or foldases

The environment (crowding, pH, osmolarity, etc.)

Post-translational modifications such as disulfide bond formation or glycosylation (usually confined to extra-cytoplasmic compartments)

Two possible outcomes for a misfolded protein

Insoluble aggregation into inclusion bodies

Degradation (proteolysis)

[Figure: E. coli expressing human leptin as an inclusion body]


Directed Evolution

A remarkable property of proteins is their evolvability: they can adapt under pressure of selection by changing their behavior, function, or even fold

Inspired by natural evolution, directed evolution uses iterative rounds of random mutation and artificial selection or screening to discover protein variants with novel functionalities

An iterative process

Identifying a good starting sequence, usually containing some level of latent promiscuity

Creation of a library of variants

Selecting variants with improved function (mutation and screening)
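The iterative process can be sketched as a simple loop of mutation and screening. Everything specific here is an assumption made for illustration: the names `mutate`, `directed_evolution` and `toy_fitness`, the hypothetical `TARGET` sequence that stands in for an experimental assay, and the choice to carry only the single best variant into the next round.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKTAYIAKQR"  # hypothetical optimum; a real screen has no such oracle


def toy_fitness(seq):
    """Stand-in for a screening assay: residues matching the toy target."""
    return sum(a == b for a, b in zip(seq, TARGET))


def mutate(seq, rng):
    """Introduce one random point substitution."""
    pos = rng.randrange(len(seq))
    new = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]


def directed_evolution(parent, fitness, rounds, library_size, rng):
    """Repeat: build a library from the current best, screen, keep improvements."""
    best = parent
    history = [fitness(best)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        top = max(library, key=fitness)  # screening step
        if fitness(top) > fitness(best):
            best = top  # the improved variant seeds the next round
        history.append(fitness(best))
    return best, history


start = "MKTGGGGGQR"  # starting sequence with partial (latent) activity
best, history = directed_evolution(
    start, toy_fitness, rounds=20, library_size=50, rng=random.Random(42)
)
```

Keeping only the top variant is a greedy choice; carrying several variants forward, or recombining them, would be closer to protocols that combine random mutagenesis with shuffling.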


From Natural Enzymes to Protein Engineering to Computational Protein Design
