Part II : Sequence Analysis
Paul Tan Thiam Joo
paul@bic.nus.edu.sg
Department of Biochemistry, Medicine Faculty, NUS
Institute for Infocomm Research
What is sequence analysis?
• Nucleic acids: DNA and RNA
• Proteins: amino acid composition, pI, molecular weight, hydrophobicity.
Why do sequence analysis?
• Assessing potential allergenicity (Gendel, 2002)
• Parkinson's disease (neurodegenerative) (Uversky and Fink, 2002)
• Human Genome Project (draft sequence published in 2001; declared complete in 2003)
Sequence analysis of proteins
• Backtranslation
• Amino acid composition
• Molecular weights, pIs
• Hydropathy profile
http://kr.expasy.org
Backtranslation
• Protein -> DNA
• Used for cloning a protein of interest that may be present only in low amounts.
• Beware of codon bias and degeneracy of codons.
UUU-Phe  UCU-Ser  UAU-Tyr   UGU-Cys
UUC-Phe  UCC-Ser  UAC-Tyr   UGC-Cys
UUA-Leu  UCA-Ser  UAA-Stop  UGA-Stop
UUG-Leu  UCG-Ser  UAG-Stop  UGG-Trp

CUU-Leu  CCU-Pro  CAU-His   CGU-Arg
CUC-Leu  CCC-Pro  CAC-His   CGC-Arg
CUA-Leu  CCA-Pro  CAA-Gln   CGA-Arg
CUG-Leu  CCG-Pro  CAG-Gln   CGG-Arg

AUU-Ile  ACU-Thr  AAU-Asn   AGU-Ser
AUC-Ile  ACC-Thr  AAC-Asn   AGC-Ser
AUA-Ile  ACA-Thr  AAA-Lys   AGA-Arg
AUG-Met  ACG-Thr  AAG-Lys   AGG-Arg

GUU-Val  GCU-Ala  GAU-Asp   GGU-Gly
GUC-Val  GCC-Ala  GAC-Asp   GGC-Gly
GUA-Val  GCA-Ala  GAA-Glu   GGA-Gly
GUG-Val  GCG-Ala  GAG-Glu   GGG-Gly
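To make the degeneracy warning concrete, the following minimal sketch rebuilds the standard codon table above and counts how many distinct RNA sequences could encode a given peptide. The names `codons_for` and `count_backtranslations` and the example peptide are illustrative choices, not part of the original slides.

```python
from itertools import product
from math import prod

BASES = "UCAG"
# Standard genetic code in UCAG order (first, second, third base).
# '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(aa):
    """Return all codons that encode one amino acid (one-letter code)."""
    return sorted(codon for codon, a in CODON_TABLE.items() if a == aa)


def count_backtranslations(peptide):
    """Number of distinct RNA sequences that translate to the peptide."""
    return prod(len(codons_for(aa)) for aa in peptide)


if __name__ == "__main__":
    print(codons_for("L"))                 # the six leucine codons
    print(count_backtranslations("MKFF"))  # 1 * 2 * 2 * 2 candidates
```

Even a short peptide has many candidate coding sequences, which is why probe design usually targets regions rich in low-degeneracy residues such as Met and Trp.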
Biased codon usage
Amino acid  Codon  Bacteria   Yeast      Fruit fly  Human
Leu         UUA
Leu         UUG               Preferred
Leu         CUU
Leu         CUC
Leu         CUA
Leu         CUG    Preferred             Preferred  Preferred
Val         GUU    Preferred  Preferred
Val         GUC
Val         GUA
Val         GUG                          Preferred  Preferred
Amino Acid Composition
• Determine the percentages of amino acid residues present in a protein molecule.
• Uses:
– infer the lifestyles of organisms: hyperthermophiles show high percentages of glutamate (negatively charged) and of both lysine and arginine (positively charged), an enrichment that is absent in mesophiles (Tekaia et al., 2002).
– predict structural class (Luo et al., 2002).
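A minimal sketch of the composition calculation, assuming a one-letter protein sequence as input. The function names and the toy sequence are illustrative; `combined_percentage` simply sums the percentages of the charged residues (E, K, R) mentioned above.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return {residue: percentage} for a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def combined_percentage(sequence, residues="EKR"):
    """Summed percentage of the given residues (default: Glu, Lys, Arg)."""
    composition = amino_acid_composition(sequence)
    return sum(composition.get(aa, 0.0) for aa in residues)


if __name__ == "__main__":
    toy = "MKKE"  # toy sequence, not a real protein
    print(amino_acid_composition(toy))
    print(combined_percentage(toy))
```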
Nonpolar amino acids (FILMWAV)
Polar uncharged (SQTNY)
Polar charged (KHERD)
Unique Properties
Protein functions from specific residues
• C Disulphide-rich, zinc fingers
• G Collagens
• H Histidine-rich glycoprotein
• KR Nuclear proteins, nuclear localisation
• P Collagen, filaments
Molecular weights, pIs
• Aid in designing purification experiments, e.g. SDS-PAGE, IEF, 2-D gel electrophoresis, column chromatography.
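Both quantities can be estimated directly from the sequence. The sketch below sums average residue masses for the molecular weight and finds the pI by bisection on the net charge. The pKa values are an assumption: they are one commonly used set, and different tools use different sets, so computed pIs will vary slightly between programs. All names are illustrative.

```python
WATER = 18.01528  # one water is added back for the free N- and C-termini

# Average (not monoisotopic) residue masses in daltons.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

# ASSUMPTION: illustrative pKa set; published sets differ between tools.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def molecular_weight(seq):
    """Average molecular weight of a linear, unmodified peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


def net_charge(seq, ph):
    """Net charge at a given pH (Henderson-Hasselbalch, independent sites)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero, found by bisection."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0
```

The bisection works because net charge decreases monotonically as pH rises. Post-translational modifications and disulphide bonds are ignored here.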
Hydropathy Profiles
• Hydropathy: describes the hydrophobicity and hydrophilicity of a protein sequence.
• A graph in which hydropathy values are calculated within a sliding window and plotted for each residue in a protein sequence.
A sliding window
M K F F L M C L I I F P I M G V L G
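A sliding-window profile for the example sequence above can be sketched as follows. The slides do not name a particular scale, so the Kyte-Doolittle scale and the window size of 7 used here are assumptions chosen for illustration. The function name is also illustrative.

```python
# ASSUMPTION: Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i .. i + window - 1.
    """
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    signal = "MKFFLMCLIIFPIMGVLG"  # sequence from the slide
    for start, value in enumerate(hydropathy_profile(signal), start=1):
        print(f"window starting at residue {start}: {value:.2f}")
```

Plotting these values against the window position gives the hydropathy graph. Larger windows smooth the curve and are usually preferred when looking for membrane-spanning segments.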
Figure: A schematic representation of the 3-D structure of a scorpion toxin (labels: signal region, alpha-helix, beta-sheet)
Hydropathy Profiles
• Hydropathy scale - each amino acid is assigned a value reflecting its relative hydrophobicity and hydrophilicity.
• 2 broad classes of scales:
– Environmental characteristics of protein residues.
– Experimental measurements of amino acid physicochemical properties.
Figure: Venn diagram of the physicochemical properties of the 20 amino acids
Hydropathy Profiles
• Basic ranking: internal {FILMV}, external {DEHKNQR}, ambivalent {ACGPSTWY}
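The three-way ranking can be applied residue by residue. In this small sketch the single-letter labels `i`, `e`, and `a` and the function names are illustrative choices.

```python
# Residue sets taken from the basic ranking above.
INTERNAL = set("FILMV")
EXTERNAL = set("DEHKNQR")
AMBIVALENT = set("ACGPSTWY")


def burial_class(residue):
    """Return 'i' (internal), 'e' (external), or 'a' (ambivalent)."""
    if residue in INTERNAL:
        return "i"
    if residue in EXTERNAL:
        return "e"
    if residue in AMBIVALENT:
        return "a"
    raise ValueError(f"unknown residue: {residue!r}")


def burial_string(seq):
    """Label every residue of a one-letter protein sequence."""
    return "".join(burial_class(aa) for aa in seq)


if __name__ == "__main__":
    print(burial_string("MKFFLMCL"))
```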
Hydropathy Profiles
• Detect possible transmembrane domains (runs of 20-25 consecutive hydrophobic amino acids).
• Identify hydrophobic protein cores.
• Predict neurotoxicity in snake phospholipases A2 (Kini and Iwanaga, 1986).
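The transmembrane criterion above can be sketched as a scan for long uninterrupted hydrophobic runs. This is a deliberately simplified illustration: the choice of the nonpolar set FILMWAV as "hydrophobic" is an assumption, and practical predictors tolerate interruptions by averaging a hydropathy scale over a window. The function name and test sequence are made up.

```python
from itertools import groupby

# ASSUMPTION: the nonpolar residues listed earlier stand in for "hydrophobic".
HYDROPHOBIC = set("FILMWAV")


def hydrophobic_runs(seq, min_length=20):
    """Return (start, end) index pairs of uninterrupted hydrophobic runs.

    Indices are 0-based and end-exclusive, so seq[start:end] is the run.
    """
    runs = []
    position = 0
    for is_hydrophobic, group in groupby(seq, key=lambda aa: aa in HYDROPHOBIC):
        length = len(list(group))
        if is_hydrophobic and length >= min_length:
            runs.append((position, position + length))
        position += length
    return runs


if __name__ == "__main__":
    toy = "KK" + "L" * 22 + "DD" + "AVAV"  # artificial sequence
    print(hydrophobic_runs(toy))
```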
References
• Kini RM, Iwanaga S. (1986) Toxicon 24(6):527-541.
• Rehm BH. (2001) Appl Microbiol Biotechnol. 57(5-6):579-92.
• Weir M, Swindells M, Overington J. (2001) Trends Biotechnol 19(10 Suppl):S61-6.
• Gendel SM. (2002) Ann N Y Acad Sci 964:87-98.
• Luo RY, Feng ZP, Liu JK. (2002) Eur J Biochem 269(17):4219-4225.
• Tekaia F, Yeramian E, Dujon B. (2002) Gene 297:51-60.
• Uversky VN, Fink AL. (2002) FEBS Lett 522(1-3):9-13.
• EXPASY http://cn.expasy.org/