
INTRODUCTION - Harvard Universitysites.fas.harvard.edu/~chem27/announce/RedBook.pdf · LECTURE NOTES INTRODUCTION This course will explore the organic chemistry of three major classes


LECTURE NOTES

INTRODUCTION

This course will explore the organic chemistry of three major classes of molecules famous for

their roles in biological systems. Nucleic acids, DNA and RNA, are essential for the transmission of

genetic information. We will discuss the organic chemistry related to the biosyntheses of nucleic acids,

and the organic chemistry that takes place between small organic molecules and nucleic acids leading to

the initiation or treatment of cancer. Proteins are required for the maintenance of cellular structures and

for the catalysis of biochemical reactions, and enzyme-catalyzed reactions and their cofactors will be

discussed. Finally, natural products are small organic molecules with enormous diversity of structure.

Despite this diversity, they are synthesized in cells by pathways conceptually related to those of the above

macromolecules. The biosynthetic pathways that lead to the production of natural products will be

examined in terms of the principles of organic chemistry.


STRUCTURE AND CHEMISTRY OF PROTEINS

PROTEIN STRUCTURE

AMINO ACIDS

Amino acids are the building blocks of proteins and have the following general structure:

[Figure: general structure of an amino acid in its zwitterionic form, H3N+-CH(R)-CO2-]

Amino acids are linked via peptide (amide) bonds in order to form peptides. Proteins are simply

long chains of amino acids, or polypeptides. The stereochemistry at the central α-carbon is always as

depicted in natural amino acids (usually the S configuration, but see cysteine). These are called the

L-isomers by convention. The L label denotes the configuration at the α-carbon relative to L-glyceraldehyde

(the Fischer convention); it does not indicate the direction in which a solution rotates plane-polarized light.

Note that glycine is the only amino acid that has hydrogen as the R substituent and hence is an

achiral molecule. However, the two hydrogens at the α-carbon of glycine are not identical but rather are

enantiotopic. If one of these hydrogens is substituted with a hydrogen isotope such as deuterium, the

resulting molecule is chiral. Once other amino acids are linked to glycine, a chiral environment will be

created in which the enantiotopic relationship of the two hydrogens will become important. In such an

environment, the two protons are not identical and can therefore undergo different chemistry.

The amino acids can be grouped into sets of hydrophobic, polar, and charged side chains. Each

amino acid has both a three-letter and a one-letter abbreviation.


Hydrophobic Amino Acids

[Structures of the hydrophobic amino acids, not reproduced here:]

alanine/Ala/A, valine/Val/V, leucine/Leu/L, isoleucine/Ile/I, phenylalanine/Phe/F, tyrosine/Tyr/Y,

proline/Pro/P, tryptophan/Trp/W, methionine/Met/M

Polar Amino Acids

[Structures of the polar amino acids, not reproduced here:]

glycine/Gly/G, serine/Ser/S, threonine/Thr/T, cysteine/Cys/C, asparagine/Asn/N, glutamine/Gln/Q

Charged Amino Acids

[Structures of the charged amino acids, not reproduced here:]

aspartate/Asp/D, lysine/Lys/K, histidine/His/H, glutamate/Glu/E, arginine/Arg/R


For a polypeptide chain to come together to form one unique conformation (the “folded”

structure) is one of the miracles of the life sciences. It is an enormously complex problem to fold a

polypeptide chain because of the large number of rotatable bonds in the polypeptide chain. One common

feature that is apparent in the hundreds of known crystal structures of proteins is that hydrophobic side

chains tend to pack in the middle of the protein. What is truly amazing is the jigsaw puzzle network of

these packing interactions. There is a beautiful fit of these residues in the protein, with very little space

left.

CONFORMATIONAL ANALYSIS

The three-dimensional structures of the amino acids can be determined using the principles of

conformational analysis. We will analyze each of the natural amino acids by using four very simple

model compounds: ethane, butane, pentane, and propene.

Ethane

Consider ethane. In this system, there is only one important conformational concept, that of the

staggering of bonds. The following Newman projection of ethane depicts the staggered conformation.

Rotation about the carbon-carbon bond by 60° produces the eclipsed conformation.

[Figure: Newman projections of staggered and eclipsed ethane; staggered to eclipsed, +3.0 kcal/mol]

Microwave spectroscopy can be used to examine the energetics of this process. The conversion

from the staggered to the eclipsed conformation is endothermic (requires the input of energy) by 3.0

kilocalories per mole (kcal/mol), which is a rather substantial energy requirement. In terms of an

equilibrium ratio between the two conformations, such an energy difference results in a 160 to 1 ratio of

the staggered to the eclipsed form in a population of molecules at room temperature. The energy

difference has been the subject of some controversy, but is thought to arise from a favorable interaction


between the C-H bonding orbitals of one carbon and the C-H antibonding orbitals on the adjacent carbon

in the staggered conformation. This interaction lowers the energy of the staggered conformer by 3

kcal/mol relative to the eclipsed conformer. Since three pairs of eclipsed bonds produce this energy

difference, each single eclipsed bond raises the energy by one kcal/mol. This energetic penalty applies

not only to eclipsed C-H bonds, as in ethane, but to most others as well, including eclipsed C-C or C-F σ

bonds. Throughout the course, you will see other examples where this overlap of bonding and

antibonding orbitals plays a crucial role in determining the structure and reactivity of a molecule.

[Figure: overlap of the filled σ C-H orbital on one carbon with the empty σ* C-H orbital on the adjacent carbon]
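The 160 to 1 ratio quoted for ethane follows from the Boltzmann relation K = exp(ΔE/RT). The short Python sketch below is an illustration added here, not part of the original course material. It assumes a temperature of 298.15 K and treats the 3.0 kcal/mol difference as a free energy difference between two states; the function name `equilibrium_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def equilibrium_ratio(delta_e_kcal, temp_k=298.15):
    """Ratio (low-energy form : high-energy form) for a two-state equilibrium.

    Assumes delta_e_kcal can be treated as a free energy difference.
    """
    return math.exp(delta_e_kcal / (R_KCAL * temp_k))


ethane_ratio = equilibrium_ratio(3.0)  # staggered vs. eclipsed, 3.0 kcal/mol
print(f"staggered : eclipsed is roughly {ethane_ratio:.0f} : 1")
```

Under these assumptions the ratio comes out close to the 160 to 1 value in the text; the exact number shifts slightly with the temperature chosen.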

Molecules that have cylindrical symmetry about their σ bonds and can rotate very easily will

avoid this energetic penalty at all costs and adopt a staggered conformation preferentially. In fact, there

are now over 200,000 high-resolution X-ray structures in the Cambridge Structural Database of small

organic molecules, and many of these contain methyl groups. Of these molecules, only two of them have

an eclipsed conformation. Nature will always try hard to avoid an eclipsed interaction.

Butane

Now consider the conformations of butane. The conformation of larger molecules in this course

will be drawn according to the following lattice skeleton, which we will call the template projection, in

order to standardize our analyses:

[Figure: the template projection lattice skeleton]

In the case of butane, there is a new conformational issue with respect to the relationship of the

two methyl groups. As in ethane, the central bond is a σ bond, which has almost free rotation, but in this

case two different staggered relationships can exist: the anti and gauche conformations.

[Figure: Newman and template projections of anti and gauche butane; anti to gauche, +0.9 kcal/mol]

Note that the anti conformation produces a symmetrical molecule. However, if butane could be

locked into the gauche conformation it would be chiral. Of course, the rotation is so rapid that butane is,

overall, an achiral molecule although at fractions of an instant it can adopt chiral shapes.


The energetics of this process are shown above. The conversion of the anti to the gauche

conformation is endothermic; the anti conformation is favored by 0.9 kcal/mol. It is a less dramatic

difference than that between the staggered and eclipsed forms; in this case, at equilibrium at room

temperature the ratio of these two conformers would be on the order of 82% anti, 18% gauche. The origin

of this destabilizing force is an electrostatic repulsion between electrons in the C-H bonds of the methyl

groups that begin to overlap slightly when the two methyls are close to each other in the gauche

conformation. The 0.9-kcal/mol energy increase due to the gauche conformation can be applied to a

variety of systems other than butane, as will be demonstrated shortly.
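The 82% anti, 18% gauche estimate can be reproduced with a two-state Boltzmann calculation. The sketch below is an added illustration under stated assumptions: 298.15 K, the 0.9 kcal/mol difference treated as a free energy difference, and a hypothetical helper named `populations`. It also shows, as a side note that goes beyond the text, what happens if the two mirror-image gauche conformers are counted separately.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at an assumed 298.15 K


def populations(energies, degeneracies=None):
    """Boltzmann fractions for conformers with relative energies in kcal/mol."""
    if degeneracies is None:
        degeneracies = [1] * len(energies)
    weights = [g * math.exp(-e / RT) for e, g in zip(energies, degeneracies)]
    total = sum(weights)
    return [w / total for w in weights]


# Two-state picture used in the notes: one anti, one gauche conformer.
two_state = populations([0.0, 0.9])

# Variant (an assumption beyond the notes): count both mirror-image gauche forms.
with_degeneracy = populations([0.0, 0.9], degeneracies=[1, 2])

print(f"two-state anti fraction: {two_state[0]:.2f}")
print(f"anti fraction with two gauche forms: {with_degeneracy[0]:.2f}")
```

The two-state result matches the roughly 82% anti figure quoted above. Counting both gauche forms lowers the anti fraction to about 70%, so the percentage depends on which counting convention is adopted.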

Pentane

Pentane has two rotatable bonds, not including the terminal methyl groups (note that these will

simply adopt the staggered conformation about their C-C σ bonds). The conformation below is called the

anti-anti conformation because it has an anti conformation about both central bonds. Note that the

overall structure about each bond is similar to that seen in anti-butane.

[Figure: template projection of anti-anti pentane]

What are the energetic consequences of rotating these carbon-carbon bonds? Consider what

happens when you rotate the indicated carbon-carbon bond by 120°. This conformation of pentane is

called anti-gauche, hopefully for obvious reasons. One central σ bond still contains an anti butane-like

conformation, but the other has been rotated to adopt a gauche conformation. The energetic consequence

of this rotation is +0.9 kcal/mol. You can dissect out the two butane-like conformations; one stayed anti

and therefore had no energy change, while the other converted to a gauche conformation. Recall that for

butane, the gauche conformation is +0.9 kcal/mol more energetic than the anti is, and this quantity holds

in this more complex system. A second 120° rotation about the other C-C bond produces gauche-gauche

pentane. This conformation is 1.8 kcal/mol higher in energy than the anti-anti conformation, since

now a second gauche interaction has been introduced. A third rotation of 120° about this same bond

produces a gauche-gauche interaction again. However, this conformation is approximately

4-5 kcal/mol higher in energy than the anti-anti conformation. This conformation has a new kind of

effect, which is manifest only with pentane and not with butane. This particular gauche-gauche pentane

has conspired, through the five carbon atoms, to place the two methyl groups particularly close to each

other — this is called the syn-pentane conformation, and it is avoided at all costs.

[Figure: template projections of the anti-gauche, gauche-gauche, and syn-pentane conformations of pentane]
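The bookkeeping for pentane can be written as a small additive model. The Python sketch below is an added illustration, not the course's own method. The labels "a", "g+", and "g-" for the two central dihedrals are an assumed notation (the notes do not use signs), the function name `pentane_energy` is hypothetical, and 4.5 kcal/mol is simply the midpoint of the 4-5 kcal/mol range quoted for syn-pentane.

```python
GAUCHE = 0.9       # kcal/mol per gauche-butane interaction (value from the notes)
SYN_PENTANE = 4.5  # kcal/mol, assumed midpoint of the quoted 4-5 kcal/mol range


def pentane_energy(chi_a, chi_b):
    """Energy relative to anti-anti pentane for two central dihedral labels.

    Labels: 'a' (anti), 'g+' or 'g-' (gauche, with the sense of rotation).
    """
    # Gauche turns of opposite sense bring the two terminal methyls together:
    # this is the syn-pentane arrangement.
    if {chi_a, chi_b} == {"g+", "g-"}:
        return SYN_PENTANE
    n_gauche = sum(1 for d in (chi_a, chi_b) if d != "a")
    return n_gauche * GAUCHE


for pair in [("a", "a"), ("a", "g+"), ("g+", "g+"), ("g+", "g-")]:
    print(pair, pentane_energy(*pair))
```

The first three cases reproduce the 0, 0.9, and 1.8 kcal/mol values in the text. The last case is where simple addition of gauche terms fails and the separate syn-pentane penalty is needed.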

Cyclohexane

Now consider cyclohexane. Recall that this molecule can undergo a chair-chair flip. In the case

of 1,3-dimethylcyclohexane with the cis stereochemistry, there are two possible chair conformations. The

energetic difference between these two can be estimated based on the principles outlined in the model

systems above.

[Figure: diequatorial and diaxial chair conformations of cis-1,3-dimethylcyclohexane]

When both groups are equatorial, the molecule has an anti-anti pentane conformation. From the

point of view of the methyl groups, the same molecule, flipped into the diaxial conformation, has a syn-

pentane interaction. Hence, only the diequatorial conformation of 1,3-dimethylcyclohexane is observed.

The previous model systems also allow an estimation of energy differences for cyclohexanes with single

substituents, such as a single methyl group.

[Figure: equatorial and axial chair conformations of methylcyclohexane]

It is obvious that axial substituents are disfavored relative to equatorial ones, but now the energy

differences between the two conformations can be estimated. Note that when the methyl is equatorial,

there are two anti butane conformations, whereas after a ring flip there are two gauche interactions. The

difference between these two conformations is thus +1.8 kcal/mol. The term for such an energy

difference based on one substituent is the A value. It is a number assigned to a particular substituent on a

cyclohexane ring, based on the energy difference between the axial and equatorial conformation. Hence,

the methyl group A value is +1.8 kcal/mol. This is the energetic cost of putting a methyl group in the

axial position of cyclohexane, relative to the equatorial position.
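The A value estimate is a count of gauche interactions, and it translates directly into an equilibrium population. The sketch below is an added illustration under stated assumptions: 298.15 K, the A value treated as a free energy difference, and hypothetical helper names `a_value` and `fraction_equatorial`.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at an assumed 298.15 K
GAUCHE = 0.9            # kcal/mol per gauche-butane interaction (value from the notes)


def a_value(n_gauche_axial, n_gauche_equatorial=0):
    """Estimate an A value by counting gauche-butane interactions."""
    return (n_gauche_axial - n_gauche_equatorial) * GAUCHE


def fraction_equatorial(a_kcal):
    """Equilibrium fraction of the equatorial chair for a given A value."""
    k_eq = math.exp(a_kcal / RT)
    return k_eq / (1.0 + k_eq)


methyl_a = a_value(2)  # axial methyl: two gauche interactions
methyl_eq = fraction_equatorial(methyl_a)
print(f"A(Me) = {methyl_a:.1f} kcal/mol, equatorial fraction = {methyl_eq:.2f}")
```

With these assumptions, roughly 95% of methylcyclohexane molecules have the methyl group equatorial at room temperature.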

Finally, why is the chair conformation of cyclohexane very stable despite the fact that it

contains a series of apparent syn-pentane interactions? Cyclohexane does have the built-in syn-pentane

geometry, but in this case, one of the hydrogens from each of the two terminal methyl (-CH3) groups has

been replaced with a methylene (-CH2R) unit. Hence, the electrons from the C-H bonds that would be

repelling each other in the syn-pentane are instead forming bonds to the same carbon atom; they are part

of a bond, which is a very stabilizing situation. Hence, there are no true syn-pentane interactions in chair

cyclohexane.

Valine

These principles of conformational analysis can be applied to amino acids as well. Peptide

chemists have devised a nomenclature for the rotatable bonds along a polypeptide chain, as shown here:

[Figure: segment of a polypeptide chain with the rotatable bonds labeled ω, φ, ψ, and χ1]

Consider the amino acid valine, depicted below in the standard template projection. With valine

and other amino acids, the carboxylic acid and amino groups will be considered here, to a first

approximation, to be about the size of a methyl group. Note that in the case of peptides, however, the

actual steric effects seen in simple hydrocarbons tend to be magnified. This is because, in the case of the

amino substituent, there is a bulky carbonyl group attached to it. The same holds true for the carboxyl

group. Hence, these structures are actually more sterically bulky than what is observed in the simple

hydrocarbon models. Nevertheless, the same conformational principles still apply.

[Figure: template projection of valine (Valine/Val/V) showing the χ1 bond]

The χ1 bond of valine is rotatable, with three different possible positions for the hydrogen in the

staggered conformation produced by a 120° rotation. There are three possible conformations: one with

two gauche and two anti interactions, while the other two have one anti and three gauche interactions.

The first has the lowest energy, and 90-95% of the side chains will be in this conformation. I will call this

the 180° conformation, since there is an angle of 180° between the two C-H bonds that are attached to the

carbon atoms of the χ1 bond. The remaining molecules in the population have side chains that will be in

one of the two 60° orientations. If butane were the perfect model, the energy difference would be

estimated at +0.9 kcal/mol, with about 82% of the structures in the 180° conformation. However, because

the amine and carboxylic acid groups are large, the percentage of structures having this conformation is

greater than 82%.
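The gauche and anti counts for the three χ1 rotamers of valine can be checked by enumeration. The sketch below is an added illustration: it places the substituents on a Newman projection with arbitrarily chosen angles, counts only heavy-atom pairs (amino nitrogen and carbonyl carbon against the two methyls), and uses hypothetical names throughout. It reports relative energies only and makes no attempt to model the extra bulk of the peptide groups.

```python
GAUCHE = 0.9  # kcal/mol per gauche interaction (value from the notes)

# Newman projection down the C-alpha/C-beta bond, angles in degrees.
# Front carbon (C-alpha): heavy substituents; H-alpha is assumed to sit at 240.
FRONT_HEAVY = {"N": 0, "C(=O)": 120}
H_ALPHA = 240
# Back carbon (C-beta): three staggered slots; one holds H-beta, two hold methyls.
BACK_SLOTS = (60, 180, 300)


def dihedral(a, b):
    """Smallest angle between two positions on the Newman circle."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def count_gauche(h_beta_slot):
    """Number of gauche heavy-atom pairs when H-beta occupies h_beta_slot."""
    methyl_slots = [s for s in BACK_SLOTS if s != h_beta_slot]
    return sum(
        1
        for m in methyl_slots
        for f in FRONT_HEAVY.values()
        if dihedral(m, f) == 60
    )


gauche_counts = {slot: count_gauche(slot) for slot in BACK_SLOTS}
lowest = min(gauche_counts.values())
relative_energy = {s: (n - lowest) * GAUCHE for s, n in gauche_counts.items()}

for slot in BACK_SLOTS:
    hh = dihedral(slot, H_ALPHA)  # angle between the two C-H bonds
    print(f"H-C-C-H {hh:3d} deg: {gauche_counts[slot]} gauche, "
          f"+{relative_energy[slot]:.1f} kcal/mol")
```

The enumeration gives one rotamer with two gauche interactions (the 180° conformation) and two rotamers with three, each one gauche interaction higher in energy, in line with the counts stated above.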

Leucine

In the case of leucine, the side chain is longer by one carbon atom and thus the conformation of a

second C-C bond in the side chain (χ2) must be considered.

[Figure: template projection of leucine (Leucine/Leu/L) showing the χ1 and χ2 bonds]

First, consider the effect of a rotation about χ2 by 120°. The major effect of this rotation is

that the methyl group is placed in a syn-pentane position relative to the carbonyl. This is a prohibitive

interaction, and hence this conformation is not observed. What happens if χ1 is rotated by 120°

(counterclockwise as drawn)? If χ1 is rotated and χ2 remains fixed, another syn-pentane interaction is

created (you should make a plastic model to convince yourself of this fact). Hence, this conformation is

not observed either. However, there is a 120° rotation about χ1 that is allowed; it simply requires a

simultaneous rotation about χ2. The small hydrogen can now go back into that very crowded position and

a syn-pentane interaction can be avoided. To a first approximation, this conformation is isoenergetic with

the first one. The Keq for the two conformers related by that simultaneous rotation of both χ1 and χ2 is

about one. Any other rotation about χ2 is not allowed because it will create another syn-pentane

interaction. Hence there are only two lower energy conformations for the leucine side chain and these are

approximately equally populated. Note also that leucine is a very large amino acid with a hydrophobic

side chain. Therefore, leucine is usually found in the hydrophobic interior of a folded protein in water.

Isoleucine

Isoleucine is an isomer of leucine. Again, the starting conformation will be similar to that of

valine, with the preferred χ1 value.

[Figure: template projection of isoleucine (Isoleucine/Ile/I)]

The hydrogen will again be placed in the more crowded “down” position. What is the location of

the methyl and ethyl groups? These are on a chiral carbon atom; hence, the absolute stereochemistry

dictates the position of the methyl. The question then becomes, on the ethyl group, in which of the three

different positions should the methyl be placed? It is placed in the more extended position to avoid a syn-

pentane interaction. Therefore isoleucine has one preferred conformation, and 95% of isoleucine amino

acid side chains adopt this single conformation. Isoleucine can thus be viewed as a rigid amino acid in

spite of its seemingly rotatable bonds.

Methionine

The final amino acid we will consider is methionine. It has an unusually floppy side chain and

therefore can access a very large number of conformations.

[Figure: template projection of methionine (Methionine/Met/M)]

The first issue concerns the χ1 value. To a first approximation, there will be an equal population

of conformers with the methylene either to the right or to the left. Note again that it would be unfavorable

to place the chain in the more hindered down position. The next substituent on the side chain is the sulfur

atom. Again, this will not be placed in the very crowded position since this would produce a syn-pentane

interaction. Therefore, as in the case of isoleucine, it is placed in the least crowded “forward” position.

Finally, how are the three groups attached to sulfur arranged? Note that there are three — two of them

are lone pairs of electrons. By analogy to the previous discussion, the methyl, the largest of the three, is

placed in the least crowded position between the two hydrogens.

Consider now a rotation about χ2. Placing the S-methyl group in the more hindered "up" position

creates a gauche-butane-like interaction. However, it turns out that the conformations with the S-methyl up

or in the first position drawn are almost isoenergetic. This is due to the nature of the C-S bond,

which is a very long bond relative to a C-C bond. There is thus very little steric clash between the α-

carbon and the thiomethyl.

Now consider a rotation about χ3. In one conformation, there is a gauche interaction between the

terminal S-methyl and the rest of the chain. However, since two very long C-S bonds separate them, the

steric clash is minimized. Hence, these two conformations, rotated about χ3, are roughly isoenergetic.

Overall, there are very few steric interactions built into the methionine side chain.

The unusually flexible methionine sidechain plays a special role in biology. One illustrative

example shows how nature uses this flexibility to allow a single host protein to recognize many different

guest partners. When proteins are synthesized that function outside of the cell, the cell has to solve the

problem of exporting that protein across the plasma membrane. It does so by equipping the protein with a

“signal peptide,” which consists of about 20 hydrophobic amino acids. In the plasma membrane is a

protein complex called the signal recognition particle (SRP), which binds the signal peptide and helps

export the protein. However, while there are many different signal peptide sequences, there is only one


type of receptor. How can a single receptor recognize these peptides based on their hydrophobicity rather

than their precise sequence? This question was answered when the sequence of the receptor protein was

determined. The amazing feature of this protein is that 40% of its amino acid residues are methionine!

Methionine is very hydrophobic, and undoubtedly all of these methionines project inwards; hence, the

receptor protein can be viewed as a hydrophobic channel with very flexible side chains. Thus, when any

hydrophobic peptide is inserted into it, the channel will be able to rotate its many methionine sidechains

to accommodate the signal peptide. If the peptide is hydrophilic, it will be expelled from the hydrophobic

channel.

Propene

The final model molecule for conformational analysis is propene (propylene). In particular we

will consider the sp2-sp3 bond similarly to the sp3-sp3 bond in ethane.

[Figure: Newman projections of eclipsed and staggered propene; eclipsed to staggered, +2.0 kcal/mol]

Imagine looking at a Newman projection down this bond. There are two conformations with

respect to the C=C double bond: staggered and eclipsed, just as in ethane. It turns out that the eclipsed

conformation is more stable by about 2.0 kcal/mol. Again, the source of the energy difference is still

controversial, but it is generally believed to result from electron repulsion between the π orbital system of

the double bond and two of the methyl C-H σ-bonds in the staggered conformation. Note that in the

eclipsed conformation, the C-H bond facing the double bond is projected into the nodal plane of the π-

bond, where there is no electron cloud to repel.

[Figure: repulsion between the π-system and the σ-orbitals in staggered propene, +2.0 kcal/mol]

Now consider a substituted form of propene.

[Figure: two eclipsed conformations of a methyl-substituted propene, differing by +3.5 kcal/mol]

In the eclipsed conformation, there are now two different positions that the methyl groups on the

allylic (sp3) carbon can occupy. The conformation on the right is reminiscent of a syn-pentane interaction

or a 1,3-diaxial methyl-methyl interaction. This form is energetically unfavorable and is therefore not

observed. Strain that exists in such an allylic moiety is called A1,3 (or allylic) strain, since the sterically

interacting substituents are on atoms 1 and 3.

How does this model relate to amino acids? Rather than looking at side chains, we will now

focus more on the main polypeptide chain. Amides have a lone pair on the nitrogen that can delocalize

into the adjacent carbonyl as illustrated in the resonance structure:

[Figure: amide resonance structures, showing delocalization of the nitrogen lone pair into the carbonyl]

Hence, there is a certain similarity between the amide functionality and the trimethyl-substituted

propylene. Both are, to a first approximation, isosteric. This substitution pattern is found in every single

amide linkage in a polypeptide chain. Thus, in every amide linkage, the main chain rotatable bonds will

adopt conformations that minimize A1,3 strain. If the only important consideration in peptide chain

conformation were allylic strain, minimizing it would produce one of two recurring motifs in peptides, the

β conformation. This conformation places the hydrogen in the same plane as the carbonyl. This

conformation repeats itself in the long extended strand. An important aspect of β-strands is that two β-

strands, aligned in either parallel or antiparallel fashion, completely satisfy the hydrogen-bonding

propensity of the amide carbonyl and the amide NH. Note that minor rotations away from the

conformation minimizing A1,3 strain are permissible and will produce additional peptide structures such as

α-helices.

There are additional considerations with respect to the amide bond resonance. In order for

delocalization to occur, the amide moiety has to be planar. Therefore, the dihedral angle ω, which

corresponds to the C-N bond, has two allowed conformations of 0° and 180°, with the angle

corresponding to the angle between the two largest groups on the carbon and nitrogen. These angles

correspond to cis and trans stereochemistry, respectively. Note that the nomenclature is with respect to

the two R groups. If they are opposed to each other, they are trans; if they are on the same side of the C-N

bond, the stereochemistry is cis. However, while the energetic difference between cis- and trans-2-butene

is rather small (about 1 kcal/mol in favor of the trans form), the energetic difference between

the cis- and trans-amide is very large. This is due to the fact that the R groups in this case are much

larger and the steric clash is more pronounced. Hence, the cis conformation is rarely found in polypeptide

chains. Occasionally, however, the cis conformation is found in proline and glycine residues. It tends to

be very common in proline residues, and much less so in glycine, though it is still observed. It is

interesting that there are several disease states in which the cis conformation of glycine is important.

For example, there is a fiber found in the brains of Alzheimer’s patients called the amyloid fiber.

It has been a contentious issue for years as to whether this is a cause or an effect of the disease, though

there is evidence on the side of it being a cause. The one difference between the fibrous form of the

protein found in patients and the non-fibrous natural form is that the glycine amides have rotated into a cis

conformation.

It is easily understood why proline readily adopts both cis and trans conformations. Proline is a

unique amino acid in that the three atoms attached to nitrogen are all carbon atoms — there is no

hydrogen. However, the methylene is still smaller than the methine (-CHR2) bearing the acyl group.

Therefore, the trans conformation is still the more common one, in a ratio of approximately 85:15 trans to

cis conformers. In light of the propene model, proline is interesting in that one of the main chain bonds,

whose dihedral angle is denoted Ψ, becomes fixed in a peptide chain in the conformation that minimizes

the A1,3 interaction. The hydrogen of this allylic carbon is usually found in the same plane as either the

methylene or the methine in a proline peptide. Thus, even though Ψ is a cylindrically symmetric and

rotatable σ bond, it is essentially frozen in proline.

[Figure: trans and cis conformers of a proline amide with the ω and ψ bonds labeled; the two differ by +1.0 kcal/mol]
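The 85:15 trans to cis ratio quoted for proline can be converted into an energy difference by inverting the Boltzmann relation, ΔG = RT ln(K). The sketch below is an added illustration, assuming 298.15 K and a simple two-state equilibrium; the function name `delta_g_from_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def delta_g_from_ratio(major, minor, temp_k=298.15):
    """Free energy difference (kcal/mol) implied by a major:minor population ratio."""
    return R_KCAL * temp_k * math.log(major / minor)


proline_dg = delta_g_from_ratio(85, 15)
print(f"trans favored over cis by about {proline_dg:.1f} kcal/mol")
```

Under these assumptions the ratio corresponds to a difference of about 1 kcal/mol, which is small enough that both conformers are populated at room temperature.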

PROTEIN FOLDING

One of the miracles of proteins is that although their polypeptide chains are floppy, they manage

to find one single global conformation in the folded state of the protein. Four major forces act on these


polypeptides to achieve this folded state. The first is hydrogen bonding. This is an essential element in

the previously described β-sheet structures, formed by hydrogen bonding between two strands. In the

case of α-helices in peptide chains, the main elements that hold the helix together are hydrogen bonds

between the amide NH and carbonyl.

Recall that with any amide moiety (other than proline) there is an NH on one side and a carbonyl

with the oxygen lone pairs on the other side. These groups would like to form hydrogen bonds, with the

NH group donating a proton and the carbonyl donating the lone pair electrons. Proteins fold in a way to

maximize this propensity for hydrogen bonding. Helices and sheets are common motifs for

accomplishing this. Another common folding motif in polypeptide chains is the turn. Turns form when a

carbonyl and an NH that are 10 atoms apart share a hydrogen bond. In a turn, the main chain comes in

one direction and exits in another — it is a way to change the directionality of the chain.

The second important effect is electrostatic in nature: the formation of salt bridges between

negatively and positively charged groups. Salt bridges often form between a positively charged arginine

residue and a nearby negative charge, such as a glutamate. Such salt bridges are very common, but they


are probably most energetically critical in the rare cases in which they are found in the interior core of the

protein. This core usually has a low dielectric constant and is nonpolar; hence, any highly charged polar

residues must be neutralized. Salt bridges are also found on the surface of proteins, but since the

dielectric constant in the surrounding water is high, this electrostatic interaction is of less consequence to

the overall energy of the protein.
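
The influence of the dielectric constant can be illustrated with Coulomb's law. The following is a minimal sketch rather than a calculation from these notes: the 3 Å charge separation, the dielectric constants of 4 (protein core) and 80 (bulk water), and the function name `coulomb_energy` are all illustrative assumptions.

```python
# Coulomb energy of a salt bridge in two environments.
# 332.06 converts e^2/Å to kcal/mol (a standard electrostatics constant).
COULOMB_KCAL_ANGSTROM = 332.06


def coulomb_energy(q1, q2, distance_angstrom, dielectric):
    """Return the Coulomb interaction energy in kcal/mol.

    q1, q2 are charges in units of the elementary charge.
    """
    return COULOMB_KCAL_ANGSTROM * q1 * q2 / (dielectric * distance_angstrom)


# Assumed values (not from the notes): 3 Å separation,
# dielectric constant ~4 in the protein core and ~80 in water.
core = coulomb_energy(+1, -1, 3.0, 4.0)
surface = coulomb_energy(+1, -1, 3.0, 80.0)

print(f"core:    {core:.1f} kcal/mol")
print(f"surface: {surface:.1f} kcal/mol")
print(f"ratio:   {core / surface:.0f}x")
```

Under these assumptions the same pair of charges interacts about twenty times more strongly in the core than at the surface, which is the sense in which buried salt bridges matter more energetically.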

The third important component for protein folding is the disulfide bond. These are formed when

two thiols are oxidized to release two electrons and two protons, and form a bond between the two sulfur

atoms.

[Figure: oxidation of two cysteine thiols to a disulfide: 2 R-SH → R-S-S-R + 2 e- + 2 H+]

If cysteines are close in space, they will form such a bond. Clearly, this bond is much stronger

than hydrogen bonds and imposes a major constraint on protein structure. One important aspect of

disulfide bonds is that the lowest energy conformation is that with a 90° dihedral angle between the S-R bonds.

Note that the inside of a cell is a reducing environment. Disulfide bonds are therefore not observed

frequently in intracellular proteins. However, proteins that are secreted from the cell, such as hormones,

often have many such bonds.

The fourth force is the hydrophobic effect. There is no way to do justice to such a very

complicated phenomenon in a brief description, but it can be illustrated in the following manner.

Consider a polypeptide chain, as it goes from an unfolded to a folded state. One of the problems with the

unfolded state is that there are hydrophobic amino acids like phenylalanine, valine, and leucine in an

aqueous milieu — surrounded and solvated by water molecules. Upon folding, these hydrophobic side

chains pack together inside the protein and solvate each other. The result is that water molecules

previously solvating the side chains are now released into the bulk media.

Consider the energetics of this process. The unfolded state will be called the first state, and the

folded the second state. The energetics of the process can be determined by monitoring the changes in

entropy (the amount of disorder) and enthalpy (the amount of “heat”) as the protein goes from the first to

the second state. A hypothetical phenylalanine side chain can be used as a model for the protein. In the

unfolded state, the phenylalanine is solvated by water. However, the phenylalanine side chain is

hydrophobic and cannot directly hydrogen bond to water. Rather, the water molecules form extensive

hydrogen bonds with each other, thereby creating a lattice of hydrogen-bonded water molecules around

the phenylalanine. The water molecules become very ordered, and form a three dimensional ice-like

structure. They form six-membered rings using three water molecules held together by three hydrogen

bonds. This shell is probably several molecules thick and continues around the phenylalanine side chain.

The formation of an ice-like water lattice has interesting consequences. The unfolded (state 1)

form, with the lattice of water molecules, allows each of the negatively charged lone pairs on oxygen to

feel a positively charged proton. This electrostatic effect lowers the enthalpy (H) of the water molecules.

In other words, H1 is lower than H2. Therefore, ΔH for the folding reaction is positive. This is

counterintuitive; the folding of a protein increases the enthalpy of the system because water loses these

very strong hydrogen bonds. By folding the protein, the water molecules are released into the bulk media,

where they are moving around rapidly and cannot form stable hydrogen bonds.

Due to this lattice of water molecules, state 1 is more ordered — it has a lower entropy (S).

Hence, S1 is smaller than S2 and therefore ΔS for the folding process is positive. Recall the equation ΔG =

ΔH - T ΔS, where ΔG is the free energy of the process. A process is favored when there is a decrease in

the free energy; hence, ΔG must be negative. If ΔH and ΔS for protein folding are both positive, then ΔG

can only be negative if T (temperature) is large enough to make the TΔS term larger than the ΔH term.
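
The temperature condition can be written out explicitly. This is a short worked rearrangement that assumes ΔH and ΔS are both positive and do not themselves change with temperature; the symbol T* is introduced here for convenience and does not appear in the original notes.

```latex
\begin{align*}
\Delta G = \Delta H - T\,\Delta S &< 0 \\
T\,\Delta S &> \Delta H \\
T &> \frac{\Delta H}{\Delta S} \equiv T^{*} \qquad (\Delta S > 0)
\end{align*}
```

Above T* the entropic term dominates and folding is favored; below T* the unfavorable enthalpy wins.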

A simple experiment highlights these principles. Years ago a whole series of heat of transfer

reactions were carried out using calorimetry to measure the change in energy and using temperature

dependency to study the role of S and H. In this experiment, water was used as the solvent in one case

and carbon tetrachloride (CCl4) in the other. Methane will be a model for valine and phenylalanine since

it is a very hydrophobic molecule. Water as a solvent is a model for the unfolded state, while carbon

tetrachloride, being nonpolar, is a model for the folded state, mimicking the hydrophobic interior of the

folded protein. This experimental system can be used to measure the change in ΔG calorimetrically.

Varying the temperature and monitoring its effect on ΔG allows the calculation of ΔH and ΔS according

to the previous equation.
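
The extraction of ΔH and ΔS from the temperature dependence of ΔG can be sketched as a straight-line fit, since ΔG = ΔH - TΔS has slope -ΔS and intercept ΔH when plotted against T. The data below are synthetic, generated from assumed values purely to show the procedure; they are not measurements from the experiments described, and the function names are hypothetical. The fit also assumes ΔH and ΔS are constant over the temperature range.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def enthalpy_entropy_from_dg(temps_k, dg_values):
    """Return (ΔH, ΔS) from ΔG measured at several temperatures."""
    slope, intercept = fit_line(temps_k, dg_values)
    return intercept, -slope  # intercept = ΔH, slope = -ΔS


# Synthetic illustration only: assumed ΔH (kcal/mol) and ΔS (kcal/(mol*K)).
ASSUMED_DH = 3.0
ASSUMED_DS = 0.02
temps = [278.0, 288.0, 298.0, 308.0, 318.0]
dg = [ASSUMED_DH - t * ASSUMED_DS for t in temps]

dh_fit, ds_fit = enthalpy_entropy_from_dg(temps, dg)
print(f"ΔH = {dh_fit:.3f} kcal/mol, ΔS = {ds_fit:.4f} kcal/(mol*K)")
```

With real data the points would scatter around the line, and any curvature would indicate that ΔH and ΔS vary with temperature.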

The ΔG for the movement of methane from water to carbon tetrachloride is -2.5 kcal/mol: the

reaction is exergonic (it releases free energy and is therefore favorable). This is expected intuitively since methane prefers to be in

carbon tetrachloride rather than water. The surprising finding is that ΔH for this reaction is +2.5

kcal/mol: the process is endothermic. In terms of the enthalpy, methane would rather

be in water! Why? Because when methane is in water, the shells of water molecules form and there are

very strong hydrogen bonds. This lowers the energy of the system. In order to balance the contribution

of enthalpy and still give a negative ΔG, -T ΔS must be strongly negative. Experimentally, it was found

to be –5.0 kcal/mol. Hence, ΔS is a positive number, as expected. This is a very simple model for protein

folding.
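
The arithmetic behind these numbers can be checked directly. The sketch below uses the ΔG and ΔH values quoted above; the temperature of 298 K is an assumption, since the notes do not state the temperature of the measurement, and the function names are hypothetical.

```python
# Values quoted in the text for methane moving from water to CCl4 (kcal/mol).
DELTA_G = -2.5
DELTA_H = 2.5

# Assumption: the temperature is not given in the notes; 298 K is used here.
TEMPERATURE_K = 298.0


def entropy_term(delta_g, delta_h):
    """Return -T*ΔS by rearranging ΔG = ΔH - T*ΔS."""
    return delta_g - delta_h


def entropy_change(delta_g, delta_h, temperature_k):
    """Return ΔS in kcal/(mol*K)."""
    return (delta_h - delta_g) / temperature_k


minus_t_delta_s = entropy_term(DELTA_G, DELTA_H)
delta_s_cal = 1000.0 * entropy_change(DELTA_G, DELTA_H, TEMPERATURE_K)

print(f"-TΔS = {minus_t_delta_s:.1f} kcal/mol")
print(f"ΔS   = {delta_s_cal:.1f} cal/(mol*K) at {TEMPERATURE_K:.0f} K (assumed)")
```

The entropy term of -5.0 kcal/mol follows from the quoted values alone; only the conversion to ΔS depends on the assumed temperature.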

This phenomenon can be observed in biological systems in a process called cold denaturation.

Folded proteins are known that will unfold at lower temperatures. This demonstrates that entropy plays

an important role in the hydrophobic effect. With a positive ΔS, lowering the temperature decreases the

favorable contribution to ΔG. The lower temperature shifts the equilibrium to the unfolded state.
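
The shift in equilibrium can be made concrete with a two-state model, in which the fraction of folded protein follows from K = exp(-ΔG/RT). This is a deliberately simplified sketch: the ΔH and ΔS values are hypothetical, chosen only to place the crossover at 250 K, and they are treated as temperature-independent, which real proteins do not obey.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Hypothetical folding parameters (not from the notes), both positive
# as argued in the text for the hydrophobic effect.
DELTA_H = 10.0   # kcal/mol
DELTA_S = 0.04   # kcal/(mol*K)


def delta_g_folding(temperature_k):
    """ΔG of folding, assuming temperature-independent ΔH and ΔS."""
    return DELTA_H - temperature_k * DELTA_S


def fraction_folded(temperature_k):
    """Two-state fraction folded: K/(1+K) with K = exp(-ΔG/RT)."""
    dg = delta_g_folding(temperature_k)
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temperature_k)))


for t in (200.0, 250.0, 300.0):
    print(f"T = {t:.0f} K  ΔG = {delta_g_folding(t):+.1f} kcal/mol  "
          f"folded = {fraction_folded(t):.3f}")
```

In this toy model the protein is mostly folded above the crossover temperature and mostly unfolded below it, which is the qualitative behavior described as cold denaturation.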

A fluorescent reagent can be used to visualize the cellular cytoskeleton under the microscope. It

is the microtubule component of the cytoskeleton that forms a rigid lattice, allowing the cell to maintain

its physical shape. Microtubules are made of two proteins, α-tubulin and β-tubulin. Not surprisingly, to

make such a three-dimensional structure, these tubulin proteins fold together. If the temperature is

lowered, it is possible to see the unfolding of the microtubules under the microscope — the fluorescence

disappears. This is a completely reversible process; increasing the temperature, thereby increasing the

entropic component to folding, allows the cytoskeleton to refold and form the skeletal architecture.

There is a small organic molecule called taxol that is one of the few major new anti-cancer drugs.

It binds to tubulin and stabilizes its folded state. In the presence of taxol, at low temperature, the

cytoskeleton persists. Taxol binds to the cytoskeleton and stabilizes it. This is a problem for cancer cells

because the process of segregation of chromosomes during cell division requires that the spindle apparatus,

composed of microtubules, disassemble. When this process is prevented by taxol, replication is arrested

and the cell cannot divide.