Structure, functions and folding problems of protein

By Saurav K. Rawat, M.Sc. Chem. (Physical Special)

Structure of Proteins & Protein Folding Problems

Presentation by Saurav K. Rawat, Department of Chemistry, St. John’s College, Agra

Sat., Dec. 14th, 2013

• What the Proteins Are?
• Importance and Biological Functions
• Classification
• Molecular Masses of Some Proteins
• Amino Acids as Monomers of Proteins
• 20 Types of Amino Acids
• 4 Levels of Protein Structure, viz. Primary, Secondary, Tertiary, and Quaternary Structures
• Corey-Pauling Rules
• Structure of Peptide Bond
• Ramachandran Plot
• α-Helix and β-Pleated Sheet Structures
• Stability and Folding of Protein
• Anfinsen’s Experiment, Levinthal Paradox and Kinetics
• Hsp and Molecular Chaperones in Protein Folding
• Probes for Conformational Detection
• Do You Know?
• Disorders Due to Conformational Change
• Quick and Hot Review
• References
• University Questions

WHAT THE PROTEINS ARE..!

• Proteins (Gr. proteios: first or foremost): Berzelius (1837) and Mulder (1838) coined the term protein.
• Proteins are macronutrients that support the growth and maintenance of body tissues.
• Chemical composition: C 51%, O 25%, N 16%, H 7%, S 0.4%; P is sometimes also present in traces.
• Amino acids are the basic building blocks of proteins and are classified as essential or non-essential.
• Essential amino acids are obtained from protein-rich foods such as meat, legumes and poultry, while non-essential ones are synthesized naturally in the body.
• According to the Centers for Disease Control and Prevention, you should obtain 10 percent to 25 percent of your daily calorie needs from proteins.

Importance of Proteins and Their Biological Functions

Type, with examples and their occurrence/function:

Contractile proteins
• Actin: thin filaments in myofibril
• Myosin: thick filaments in myofibril
• Dynein: cilia and flagella

Enzymes
• Hexokinase: phosphorylates glucose
• Lactate dehydrogenase: dehydrogenates lactate
• Cytochrome c: transfers electrons
• DNA polymerase: replicates and repairs DNA

Hormones
• Insulin: regulates glucose metabolism
• Adrenocorticotrophic hormone: regulates corticosteroid synthesis
• Growth hormone: stimulates growth of bones

Receptors
• Ion channel receptors
• G protein-linked receptors
• Tyrosine kinase receptors
Present on the cell membrane and in the cytoplasm; they receive stimuli from the outer environment so that the cell can respond to them.

Toxins
• Clostridium botulinum toxin: causes bacterial food poisoning
• Diphtheria toxin: bacterial toxin
• Snake venom: enzymes that hydrolyze phosphoglycerides
• Ricin: toxic protein of castor bean
• Gossypin: toxic protein of cottonseed

Storage proteins
• Ovalbumin: egg-white protein
• Casein: a milk protein
• Ferritin: iron storage in spleen
• Gliadin: seed protein of wheat
• Zein: seed protein of corn

Defensive proteins
• Antibodies: form complexes with foreign proteins
• Fibrinogen: precursor of fibrin in blood clotting
• Thrombin: component of clotting mechanism

Transport proteins
• Hemoglobin: transports O2 in blood of vertebrates
• Hemocyanin: transports O2 in blood of some invertebrates
• Myoglobin: transports O2 in muscle cells
• Serum albumin: transports fatty acids in blood
• Ceruloplasmin: transports copper in blood

Structural proteins
• Viral coat protein: sheath around nucleic acid
• Glycoprotein: cell coats and walls
• α-Keratin: skin, feathers, nails, hoofs etc.
• Sclerotin: exoskeletons of insects
• Fibroin: silk of cocoons, spider webs
• Collagen: fibrous connective tissues (tendons, bone, cartilages)
• Elastin: elastic connective tissue (ligaments)
• Mucoprotein: mucous secretions, synovial fluid

Classification of Proteins

Based on Conformation
• Fibrous (insoluble in H2O): α-keratin, β-keratin, collagen
• Globular (soluble in H2O): myoglobin, hemoglobin, lysozyme, ribonuclease, chymotrypsin, cytochrome c, lactate dehydrogenase, subtilisin

Based on Composition
• Simple: albumin, globulin, glutelins, prolamins, protamines, histones, scleroproteins
• Conjugated: nucleoprotein, lipoprotein, phosphoprotein, metalloprotein, glycoprotein, flavoprotein, hemoprotein, chromoproteins
• Derived: proteoses, peptones, small peptides, fibrin, metaproteins, coagulated proteins

Based on Nature of Molecules
• Acidic: blood proteins
• Basic: histones

Molecular Mass of Some Proteins

Protein: relative molecular mass

• Insulin: 5,700
• Hemoglobin: 64,500
• Myoglobin: 16,900
• Hexokinase: 102,000
• Glycogen phosphorylase: 370,000
• Glutamine synthetase: 592,000
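The masses above can be turned into a rough chain length. The sketch below assumes an average residue mass of about 110, a common textbook approximation that is not stated on the slides; the constant and the function name `estimate_residues` are illustrative only. For proteins with several subunits the result is the total over all chains.

```python
# Rough estimate of the number of amino acid residues from relative molecular mass.
# ASSUMPTION: an average residue mass of ~110 (textbook rule of thumb, not from the slides).
AVERAGE_RESIDUE_MASS = 110.0

# Relative molecular masses as listed on the slide.
MOLECULAR_MASSES = {
    "Insulin": 5_700,
    "Hemoglobin": 64_500,
    "Myoglobin": 16_900,
    "Hexokinase": 102_000,
    "Glycogen phosphorylase": 370_000,
    "Glutamine synthetase": 592_000,
}


def estimate_residues(molecular_mass, average_residue_mass=AVERAGE_RESIDUE_MASS):
    """Return the approximate residue count (all chains together)."""
    return round(molecular_mass / average_residue_mass)


if __name__ == "__main__":
    for name, mass in MOLECULAR_MASSES.items():
        print(f"{name}: about {estimate_residues(mass)} residues")
```

For insulin this gives a figure close to the 21 + 30 residues of its two chains, which suggests the approximation is reasonable for small proteins.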

[Video: protein synthesis (DNA transcription, translation and folding)]

Proteins are Linear Polymers of Amino Acids

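[Figure: two amino acids, NH3+–CH(R1)–COO− and NH3+–CH(R2)–COO−, condense with release of H2O to form a peptide bond; repeated condensation gives the chain NH3+–CH(R1)–CO–NH–CH(R2)–CO–NH–CH(R3)–CO–…, with the peptide bonds marked]

The amino acid sequence is called the primary structure (shown in the figure as a string of one-letter codes).

A carboxylic acid group condenses with an amino group with the release of a water molecule.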

Amino Acid: Basic Unit of Protein

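Different side chains, R, determine the properties of the 20 amino acids.

[Figure: general structure of an amino acid: a central carbon bearing an amino group (NH3+), a carboxylic acid group (COO−), a hydrogen (H) and a side chain (R)]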

Facts About Amino Acids

Although approximately 300 amino acids occur in nature, only 20 make up proteins.

All amino acids, apart from the simplest one (glycine), show optical isomerism.

This can result in two different arrangements, viz. D-amino acid and L-amino acid.

With a few minor exceptions (e.g., bacterial cell walls contain D-amino acids), only the L-forms are found in living organisms.

Gamma-aminobutyric acid (GABA), ornithine, citrulline and β-alanine are amino acids that are not found in proteins; histamine and serotonin are amines derived from amino acids and are likewise not found in proteins.

20 Types of Amino Acids

Glycine (G)
Glutamic acid (E)
Aspartic acid (D)
Methionine (M)
Threonine (T)
Serine (S)
Glutamine (Q)
Asparagine (N)
Tryptophan (W)
Phenylalanine (F)
Cysteine (C)
Proline (P)
Leucine (L)
Isoleucine (I)
Valine (V)
Alanine (A)
Histidine (H)
Lysine (K)
Tyrosine (Y)
Arginine (R)

[Figure colour key: White: Hydrophobic, Green: Hydrophilic, Red: Acidic, Blue: Basic]

Hierarchical Nature of Protein Structure

Primary structure (amino acid sequence)
↓
Secondary structure (α-helix, β-sheet)
↓
Tertiary structure (three-dimensional structure formed by assembly of secondary structures)
↓
Quaternary structure (structure formed by more than one polypeptide chain)

Definitions of the Four Levels of Structure

Primary structure refers to the covalent backbone of the polypeptide chain and the sequence of its amino acid residues. The enzyme ribonuclease and the protein myoglobin each consist of a single polypeptide chain, which must still fold into its tertiary structure to function.

Secondary structure refers to a regular recurring arrangement in space of the polypeptide chain along one dimension. Secondary structures are stabilized by H-bonds. Keratin (a fibrous protein found in skin) is composed almost entirely of α-helices, while fibroin (silk protein) is composed almost entirely of β-sheets.

Tertiary structure refers to how the polypeptide chain is bent or folded in three dimensions to form the compact, tightly folded structure of globular proteins. The interactions involved in folding include weak ionic bonds, H-bonds, hydrophobic interactions and strong disulphide bonds between neighbouring cysteine residues. Enzymes are functional only once they have adopted their tertiary structure.

Quaternary structure refers to how the individual polypeptide chains of a protein having two or more chains are arranged in relation to each other. Most larger proteins contain two or more polypeptide chains, between which there is usually no covalent linkage.

Primary Structure of Protein

• The example shown (insulin) is a globular protein

• It contains two polypeptide chains

• The alpha chain has 21 amino acid residues

• The beta chain has 30 amino acid residues

• Neighbouring cysteines are linked by disulphide bonds

Introduction to Structure of Proteins

• Unlike most organic polymers, protein molecules adopt a specific 3-dimensional conformation in the aqueous solution.

• This structure is able to fulfill a specific biological function

• This structure is called the native fold

• The native fold has a large number of favorable interactions within the protein

• There is a cost in conformational entropy of folding the protein into one specific native fold

Corey-Pauling Rules

A set of rules, formulated by Robert Corey and Linus Pauling in 1951, that govern the secondary structure of proteins. The Corey-Pauling rules are concerned with the stability of structures provided by hydrogen bonds associated with the –CO–NH– peptide link. The Corey-Pauling rules state that:

(1) All the atoms in the peptide link lie in the same plane. The planarity of the link is due to delocalization of pi electrons over the O, C and N atoms and the maintenance of maximum overlap of their p-orbitals.

(2) The N, H, and O atoms in a hydrogen bond are approximately on a straight line.

(3) All the CO and NH groups are involved in bonding.

Two important structures in which the Corey-Pauling rules are obeyed are the alpha helix and the beta sheet.

http://www.answers.com/topic/pauling-s-rules-1

http://www.answers.com/topic/hydrogen-bond

http://www.answers.com/topic/beta-sheet-1

Scheme Showing Peptide Structure

Structure of the Peptide Bond

Structure of the protein is partially dictated by the properties of the peptide bond

The peptide bond is a resonance hybrid of two canonical structures

The resonance causes the peptide bonds to:

• be less reactive compared to, e.g., esters

• be quite rigid and nearly planar

• exhibit a large dipole moment in the favored trans configuration

The Rigid Peptide Plane and the Partially Free Rotations

• Rotation around the peptide bond is not permitted

• Rotation around bonds connected to the alpha carbon is permitted

• φ (phi): angle around the bond between the α-carbon and the amide nitrogen

• ψ (psi): angle around the bond between the α-carbon and the carbonyl carbon

• In a fully extended polypeptide, both ψ and φ are 180°

Distribution of φ and ψ Dihedral Angles

• Some φ and ψ combinations are very unfavorable because of steric crowding of backbone atoms with other atoms in the backbone or side chains

• Some φ and ψ combinations are more favorable because of the chance to form favorable H-bonding interactions along the backbone

• A Ramachandran plot shows the distribution of φ and ψ dihedral angles that are found in a protein
– shows the common secondary structure elements
– reveals regions with unusual backbone structure

Ramachandran Plot

PROTEIN SECONDARY STRUCTURES

Secondary structure refers to a local spatial arrangement of the polypeptide chain

Two regular arrangements are common:

• The α helix: stabilized by hydrogen bonds between nearby residues

• The β sheet: stabilized by hydrogen bonds between adjacent segments that may not be nearby

Irregular arrangement of the polypeptide chain is called the random coil

Basic structural units of proteins: Secondary structure

α-helix β-sheet

Secondary structures, α-helix and β-sheet, have regular hydrogen-bonding patterns.

The α helix

The backbone is more compact, with the ψ dihedral (N–Cα–C–N) in the range 0° to −70°

The helical backbone is held together by hydrogen bonds between the nearby backbone amides

Right-handed helix with 3.6 residues (5.4 Å) per turn

Peptide bonds are aligned roughly parallel with the helical axis

Side chains point out and are roughly perpendicular with the helical axis

[Video: alpha helix]

The α helix: Top View

• The inner diameter of the helix (no side chains) is about 4 – 5 Å
• Too small for anything to fit “inside”
• The outer diameter of the helix (with side chains) is 10 – 12 Å
• Happens to fit well into the major groove of dsDNA
• Residues 1 and 8 align nicely on top of each other
• What kind of sequence gives an α helix with one hydrophobic face?
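The question about a hydrophobic face can be explored with a small helical-wheel calculation. This is a minimal sketch using only the figures quoted on the slides (3.6 residues and 5.4 Å per turn); the function names `wheel_angle` and `angular_offset` are illustrative, not taken from any library.

```python
# Helical-wheel sketch for an ideal alpha helix, viewed down the helix axis.
RESIDUES_PER_TURN = 3.6   # from the slide
PITCH_ANGSTROM = 5.4      # from the slide (length of one turn along the axis)

RISE_PER_RESIDUE = PITCH_ANGSTROM / RESIDUES_PER_TURN   # about 1.5 angstrom
ANGLE_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN            # about 100 degrees


def wheel_angle(position):
    """Angle in degrees (0 to 360) of residue `position` (1-based) on the wheel."""
    return ((position - 1) * ANGLE_PER_RESIDUE) % 360.0


def angular_offset(i, j):
    """Smallest angle in degrees between residues i and j on the wheel."""
    diff = abs(wheel_angle(i) - wheel_angle(j)) % 360.0
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    print(f"Rise per residue: {RISE_PER_RESIDUE:.2f} angstrom")
    for other in (2, 4, 5, 8):
        print(f"Residues 1 and {other}: {angular_offset(1, other):.0f} degrees apart")
```

Under these assumptions residues 1 and 8 are only about 20° apart, which matches the slide's remark that they stack on top of each other, while residues 1 and 2 are about 100° apart. A sequence that places hydrophobic residues at positions spaced three, four and seven apart would therefore tend to put them on one face of the helix.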

Sequence Affects Helix Stability

• Not all polypeptide sequences adopt α-helical structures

• Small hydrophobic residues such as Ala and Leu are strong helix formers

• Pro acts as a helix breaker because the rotation around the N–Cα bond is impossible

• Gly acts as a helix breaker because the tiny R-group supports other conformations

The Helix Macro-Dipole

The peptide bond has a strong dipole moment: the carbonyl O is negative and the amide H is positive

All peptide bonds in the helix have a similar orientation

The helix has a large macroscopic dipole moment

Negatively charged residues often occur near the positive end of the helix dipole

β Sheets

The backbone is more extended, with the ψ dihedral (N–Cα–C–N) in the range 90° < ψ < 180°

The planarity of the peptide bond and the tetrahedral geometry of the α-carbon create a pleated sheet-like structure

Sheet-like arrangement of backbone is held together by hydrogen bonds between the more distal backbone amides

Side chains protrude from the sheet alternating in up and down direction

[Video: beta sheet]

Parallel and Antiparallel β Sheets

• Parallel or antiparallel orientation of two chains within a sheet is possible

• In parallel β sheets the H-bonded strands run in the same direction

• In antiparallel β sheets the H-bonded strands run in opposite directions

Structure of α-Keratin in Hair

Chemistry of Curly Hair

Structure of Collagen

• Collagen is an important constituent of connective tissue: tendons, cartilage, bones, cornea of the eye

• Each collagen chain is a long Gly- and Pro-rich left-handed helix

• Three collagen chains intertwine into a right-handed superhelical triple helix

• The triple helix has higher tensile strength than a steel wire of equal cross section

• Many triple helices assemble into a collagen fibril

Collagen Fibrils

Silk Fibroin

• Fibroin is the main protein in silk from moths and spiders

• Antiparallel β sheet structure

• Small side chains (Ala and Gly) allow the close packing of sheets

• Structure is stabilized by
– hydrogen bonding within sheets
– London dispersion interactions between sheets

β Turns (Hairpins)

• β-turns occur frequently whenever strands in β sheets change direction

• The 180° turn is accomplished over four amino acids

• The turn is stabilized by a hydrogen bond from a carbonyl oxygen to an amide proton three residues down the sequence

• Proline in position 2 or glycine in position 3 is common in β-turns

PROTEIN TERTIARY STRUCTURE

• Tertiary structure refers to the overall spatial arrangement of atoms in a polypeptide chain or in a protein

• One can distinguish two major classes
– fibrous proteins
¤ typically insoluble; made from a single secondary structure
– globular proteins
¤ water-soluble globular proteins
¤ lipid-soluble membranous proteins

Favorable Interactions in Proteins

• Hydrophobic effect
– Release of water molecules from the structured solvation layer around the molecule as the protein folds increases the net entropy

• Hydrogen bonds
– Interaction of N-H and C=O of the peptide bond leads to local regular structures such as α-helices and β-sheets

• London dispersion
– Medium-range weak attraction between all atoms contributes significantly to the stability in the interior of the protein

• Electrostatic interactions
– Long-range strong interactions between permanently charged groups
– Salt bridges, especially those buried in the hydrophobic environment, strongly stabilize the protein

Motifs (folds)

Arrangements of several secondary structure elements

Three-dimensional structure of proteins


Tertiary structure

Quaternary structure

PROTEIN QUATERNARY STRUCTURE

• Quaternary structure is formed by spontaneous assembly of individual polypeptides into a larger functional cluster. Proteins with two or more polypeptide chains are known as oligomeric proteins.

Close relationship between protein structure and its function

[Figure: example of an enzyme reaction, with substrates A and B; the enzyme matches the shape of A, binds to A and digests A. Further panels: hormone receptor, antibody]

[Video: the four levels of protein structure]

Protein Stability and Folding

• A protein’s function depends on its three-dimensional structure.

• Loss of structural integrity with accompanying loss of activity is called denaturation

• Proteins can be denatured by
– heat or cold; pH extremes; organic solvents
– chaotropic agents: urea and guanidinium hydrochloride

Ribonuclease Refolding/Anfinsen’s Experiment

• Ribonuclease is a small protein that contains 8 cysteines linked via four disulfide bonds

• Urea in the presence of 2-mercaptoethanol fully denatures ribonuclease

• When urea and 2-mercaptoethanol are removed, the protein spontaneously refolds, and the correct disulfide bonds are reformed

• The sequence alone determines the native conformation

• Quite a “simple” experiment, but so important that it earned Christian Anfinsen the 1972 Nobel Prize in Chemistry

How Can Proteins Fold So Fast?

• Proteins fold to the lowest-energy fold on the microsecond to second time scale. How can they find the right fold so fast?

• Protein folding is a very finely tuned process. Hydrogen bonding between different atoms provides the force required. Hydrophobic interactions between hydrophobic amino acids pack the hydrophobic residues.

• It is mathematically impossible for protein folding to occur by randomly trying every conformation until the lowest-energy one is found (Levinthal’s paradox)

• The search for the minimum is not random because the direction toward the native structure is thermodynamically most favorable

The Levinthal Paradox and Kinetics

Levinthal's paradox is a thought experiment, also constituting a self-reference in the theory of protein folding. In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. An estimate of $3^{300}$ or $10^{143}$ was made in one of his papers.

The Levinthal paradox observes that if a protein were folded by sequentially sampling all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale). Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur, and the protein must, therefore, fold through a series of meta-stable intermediate states.

If we assume that a protein molecule has $n$ amino acid residues, that each residue has 2 bonds capable of rotation, and that there are 3 possible conformations (ϕ or ψ angles) for each rotatable bond in the backbone, the maximum number of possible conformations is $3^{2n}$, which is approximately equal to $10^{n}$. Since each single bond can rotate completely in about $10^{-13}$ s, the total time required for every formal single bond in the backbone to rotate once is about $2n \times 10^{-13}$ s. Therefore the time required for a peptide chain to try out every possible conformation it can assume is $t = 10^{n} \times (2n \times 10^{-13})$ s. For a polypeptide chain of 6 residues $t$ is in the range of microseconds; for a chain of 11 residues, about 0.2 s; but for a chain of 100 residues it would be about $2 \times 10^{89}$ s, far longer than the age of the earth. Yet staphylococcal nuclease, which has 149 residues, requires at most 0.1 to 0.2 s. How? Why does the chain fold so quickly into its native conformation?
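The estimate above is easy to reproduce numerically. The sketch below uses only the assumptions stated in the text (2 rotatable bonds per residue, 3 conformations per bond, about 1e-13 s per bond rotation, and the approximation of 3^(2n) by 10^n); the function names are illustrative.

```python
# Levinthal-style estimate of the time needed to sample every backbone conformation.
import math

BONDS_PER_RESIDUE = 2        # rotatable backbone bonds per residue (from the text)
CONFORMATIONS_PER_BOND = 3   # allowed conformations per bond (from the text)
ROTATION_TIME_S = 1e-13      # time for one bond rotation in seconds (from the text)


def exact_conformations(n):
    """Exact count 3^(2n) for a chain of n residues."""
    return CONFORMATIONS_PER_BOND ** (BONDS_PER_RESIDUE * n)


def search_time_seconds(n):
    """t = 10^n * (2n * 1e-13) s, using the text's approximation 3^(2n) ~ 10^n."""
    return 10.0 ** n * (BONDS_PER_RESIDUE * n * ROTATION_TIME_S)


if __name__ == "__main__":
    for n in (6, 11, 100, 149):
        print(f"n = {n:3d}: t is about {search_time_seconds(n):.1e} s")
    # Order of magnitude of 3^300 (the text quotes 10^143).
    print(f"3^300 is about 10^{math.floor(math.log10(exact_conformations(150)))}")
```

The approximation replaces 9 conformations per residue with 10, so the times are slight overestimates, but the conclusion is unchanged: an exhaustive search becomes hopeless well before the chain reaches the size of a real protein.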

Why does it not try out all its possible conformations? This question is a major problem in biochemistry, and research is ongoing.

One hypothesis is that folding works on the principle of cooperativity: once weak bonds (hydrogen bonds or hydrophobic interactions) have correctly formed in one part of the polypeptide chain, they greatly increase the probability of the formation of further correct bonds, without requiring the chain to try out all possible conformations.

MOLECULAR CHAPERONES

• Heat shock proteins (Hsp): these proteins are synthesized vigorously when the cell is exposed to heat or to a high-temperature environment.

• High heat can trigger the translation of more and more Hsp.

• Hsp help proteins to fold properly.

• There are two major classes of Hsp, viz.
– Hsp 70, also called chaperones (DnaJ-DnaK)
– Hsp 60, also called chaperonins (GroEL-GroES)

Chaperones Assisted Protein folding

Chaperones Prevent Misfolding

Chaperonins Facilitate Folding

Probes of Protein Conformation

• X-ray analysis
• ORD: optical rotatory dispersion
• CD: circular dichroism
• Fluorescence
• Fluorescence polarization
• NMR: nuclear magnetic resonance spectroscopy

Protein Structure Methods: X-Ray Crystallography

Steps needed:
• Purify the protein
• Crystallize the protein
• Collect diffraction data
• Calculate electron density
• Fit residues into density

Pros:
• No size limits
• Well-established

Cons:
• Difficult for membrane proteins
• Cannot see hydrogens

Circular Dichroism (CD) Analysis

• CD measures the molar absorption difference Δε of left- and right-circularly polarized light: Δε = εL – εR

• Chromophores in the chiral environment produce characteristic signals

• CD signals from peptide bonds depend on the chain conformation

Proton NMR spectrum of a protein

Amides Aromatics Alphas Aliphatics Methyls

Structure Methods: Biomolecular NMR

Steps needed:
• Purify the protein
• Dissolve the protein
• Collect NMR data
• Assign NMR signals
• Calculate the structure

Pros:
• No need to crystallize the protein
• Can see many hydrogens

Cons:
• Difficult for insoluble proteins
• Works best with small proteins

Do You Know…?

• Collagen is the most abundant protein in the animal world, and ribulose bisphosphate carboxylase oxygenase (RuBisCO) is the most abundant protein in the whole biosphere.

• Monellin, a protein, is the sweetest chemical obtained from an African berry.

• 3 billion base pairs => 6 G letters, and 1 letter => 1 byte, so the whole genome can be recorded on just 10 CD-ROMs!

• The genome is the complete set of genes of a living thing.

• In 2003, the human genome sequencing was completed.

• The human genome contains about 3 billion base pairs.

• The number of genes is estimated to be between 20,000 and 25,000.

• The difference between the genome of a human and that of a chimpanzee is only 1.23%!
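The CD-ROM figure can be checked with a few lines of arithmetic. This sketch assumes that the slide's "6 G letters" counts both copies of the genome and that a CD-ROM holds 650 to 700 MB; neither assumption is stated on the slides, and the function name `discs_needed` is illustrative.

```python
# Back-of-the-envelope storage estimate for the human genome.
BASE_PAIRS = 3_000_000_000   # from the slide
COPIES = 2                   # ASSUMPTION: both genome copies, giving the slide's 6 G letters
BYTES_PER_LETTER = 1         # from the slide


def discs_needed(total_bytes, disc_capacity_bytes):
    """Number of whole discs needed to hold total_bytes (integer ceiling division)."""
    return -(-total_bytes // disc_capacity_bytes)


TOTAL_BYTES = BASE_PAIRS * COPIES * BYTES_PER_LETTER

if __name__ == "__main__":
    # ASSUMPTION: common CD-ROM capacities of 650 MB and 700 MB.
    for capacity_mb in (650, 700):
        count = discs_needed(TOTAL_BYTES, capacity_mb * 10**6)
        print(f"{capacity_mb} MB discs: {count}")
```

With these assumptions the answer comes out at 9 or 10 discs, in line with the slide's round figure of 10.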

Some Common Diseases Caused by Conformational Change in Protein Structure

Proteopathy (proteo-: prefix, protein; -pathy: suffix, disease) refers to a class of diseases in which certain proteins become structurally abnormal and thereby disrupt the function of cells, tissues and organs of the body. Often the proteins fail to fold into their normal configuration; in this misfolded state, the proteins can become toxic in some way (a gain of toxic function) or they can lose their normal function. The proteopathies (also known as proteinopathies, protein conformational disorders, or protein misfolding diseases) include such diseases as Alzheimer’s disease, Parkinson's disease, prion disease, type 2 diabetes, amyloidosis, and a wide range of other disorders.

Such abnormalities can arise from mutations.

Sickle cell disease: in sickle cell hemoglobin (Hb-S) the glutamic acid residue in the 6th position of the β-chains is replaced by valine.

Sodium cyanate injections have been given to treat sickle cell anemia.

Proteopathy: major aggregating protein

• Alzheimer's disease: amyloid β peptide (Aβ); tau protein
• Prion diseases (multiple): prion protein
• Parkinson's disease and other synucleinopathies (multiple): α-synuclein
• Familial British dementia: ABri
• Familial Danish dementia: ADan
• Type II diabetes: islet amyloid polypeptide (IAPP; amylin)
• Cataracts: crystallins
• Retinitis pigmentosa with rhodopsin mutations: rhodopsin

REFERENCES

• Harper’s Illustrated Biochemistry
• Biochemistry by Albert L. Lehninger
• Biophysical Chemistry by Gurtu & Gurtu
• Principles of Physical Chemistry by Puri, Sharma & Pathania
• Atkins’ Physical Chemistry
• Molecular Biology by Dr. Virbala Rastogi
• Competitive Biology by K.N. Bhatia & K. Bhatia
• Text Book of Biology by S. Chakrabarty
• NCERT text books of Chemistry and Biology

Frequently Asked University Questions

• Explain the structure of proteins.
• Describe the folding problems in proteins.
• How do proteins fold?

The truth shall make you free….!!!

Tribute to Deptt. Of Chemistry

Thanks a lot to:
• Our HOD Sir
• Dr. Susan Ma’m, who gave me this opportunity
• All respected teachers

Special thanks go to:
• Dr. Girish Maheshwary Sir
• Dr. Jyoti Zack Ma’m (Deptt. of Zoology, St. John’s College)
