Part II : Sequence Analysis Paul Tan Thiam Joo paul@bic.nus.edu.sg Department of Biochemistry, Medicine Faculty, NUS Institute for Infocomm Research

Part II : Sequence Analysis

Paul Tan Thiam Joo

paul@bic.nus.edu.sg

Department of Biochemistry, Medicine Faculty, NUS

Institute for Infocomm Research

What is sequence analysis?

• Nucleic acids: DNA and RNA

• Proteins: amino acid composition, pI, molecular weight, hydrophobicity.

Why do sequence analysis?

• Assessing potential allergenicity (Gendel, 2002)

• Parkinson's disease, a neurodegenerative disorder (Uversky and Fink, 2002)

• Human Genome Project (draft sequence published in 2001; completed in 2003)

Sequence analysis of proteins

• Backtranslation

• Amino acid composition

• Molecular weights, pIs

• Hydropathy profile

http://kr.expasy.org

Backtranslation

• Protein -> DNA

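• Used for cloning the gene that encodes a protein of interest, especially when the protein is present only in low amounts.

• Beware of codon bias and the degeneracy of the genetic code: most amino acids are encoded by more than one codon.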

UUU-Phe UCU-Ser UAU-Tyr UGU-Cys
UUC-Phe UCC-Ser UAC-Tyr UGC-Cys
UUA-Leu UCA-Ser UAA-Stop UGA-Stop
UUG-Leu UCG-Ser UAG-Stop UGG-Trp

CUU-Leu CCU-Pro CAU-His CGU-Arg
CUC-Leu CCC-Pro CAC-His CGC-Arg
CUA-Leu CCA-Pro CAA-Gln CGA-Arg
CUG-Leu CCG-Pro CAG-Gln CGG-Arg

AUU-Ile ACU-Thr AAU-Asn AGU-Ser
AUC-Ile ACC-Thr AAC-Asn AGC-Ser
AUA-Ile ACA-Thr AAA-Lys AGA-Arg
AUG-Met ACG-Thr AAG-Lys AGG-Arg

GUU-Val GCU-Ala GAU-Asp GGU-Gly
GUC-Val GCC-Ala GAC-Asp GGC-Gly
GUA-Val GCA-Ala GAA-Glu GGA-Gly
GUG-Val GCG-Ala GAG-Glu GGG-Gly
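The degeneracy in the codon table above can be made concrete with a short sketch. It rebuilds the standard genetic code, lists the candidate codons for each residue of a peptide, and counts how many distinct RNA sequences could encode it. The function names `backtranslate` and `count_codings` are illustrative, not taken from any particular tool.

```python
from itertools import product

# Standard genetic code in the same U, C, A, G order as the table above.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: aa
    for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO)
}

# Reverse mapping: one-letter amino acid code -> list of codons.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def backtranslate(protein):
    """Return, for each residue, the list of codons that could encode it."""
    return [CODONS_FOR[aa] for aa in protein.upper()]


def count_codings(protein):
    """Number of distinct RNA sequences that translate to `protein`."""
    total = 1
    for codons in backtranslate(protein):
        total *= len(codons)
    return total


if __name__ == "__main__":
    peptide = "MKF"
    for aa, codons in zip(peptide, backtranslate(peptide)):
        print(aa, codons)
    print("possible coding sequences:", count_codings(peptide))
```

Even a short peptide rich in Leu, Ser or Arg (six codons each) has a large number of possible coding sequences, which is why codon bias for the target organism matters when choosing among them.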

Biased codon usage

Table columns: Amino acid, Codon, Bacteria, Yeast, Fruit Fly, Human

• Leu: UUA, UUG (preferred in one organism), CUU, CUC, CUA, CUG (preferred in three organisms)

• Val: GUU (preferred in two organisms), GUC, GUA, GUG (preferred in two organisms)

Amino Acid Composition

• Determine the percentages of amino acid residues present in a protein molecule.

• Uses:

– Infer the lifestyles of organisms: hyperthermophiles show high percentages of glutamate (negatively charged) together with lysine and arginine (positively charged), a bias that is absent in mesophiles (Tekaia et al., 2002).

– Predict structural class (Luo et al., 2002).
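A minimal sketch of the composition calculation follows. The function names are illustrative, and the combined E+K+R percentage is only a simple stand-in for the charged-residue bias mentioned above, not the method of the cited paper.

```python
from collections import Counter


def aa_composition(protein):
    """Percentage of each amino acid (one-letter code) in `protein`."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def residue_percentage(protein, residues="EKR"):
    """Combined percentage of the given residues (default: Glu, Lys, Arg)."""
    comp = aa_composition(protein)
    return sum(comp.get(aa, 0.0) for aa in residues)


if __name__ == "__main__":
    seq = "MKFFLMCLIIFPIMGVLG"
    for aa, pct in sorted(aa_composition(seq).items()):
        print(f"{aa}: {pct:.1f}%")
    print(f"E+K+R: {residue_percentage(seq):.1f}%")
```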

Nonpolar amino acids (FILMWAV)

Polar uncharged (SQTNY)

Polar charged (KHERD)

Unique Properties

Protein functions from specific residues

• C Disulphide-rich, zinc fingers

• G Collagens

• H Histidine-rich glycoprotein

• KR Nuclear proteins, nuclear localisation

• P Collagen, filaments

Molecular weights, pIs

• Aid in designing purification and separation experiments, e.g. SDS-PAGE, isoelectric focusing (IEF), 2-dimensional gel electrophoresis and column chromatography.
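Both quantities can be estimated from sequence alone. The sketch below sums average residue masses for the molecular weight and finds the pI by bisection on the net charge computed with the Henderson-Hasselbalch equation. The pKa values are one illustrative set assumed here; published sets differ slightly and give slightly different pI values. Post-translational modifications and disulphide bonds are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01528

# Assumed side-chain and terminal pKa values (illustrative set).
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_NTERM, PKA_CTERM = 8.6, 3.6


def molecular_weight(protein):
    """Average molecular weight of an unmodified linear chain, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER


def net_charge(protein, ph):
    """Estimated net charge of the chain at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))    # N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))   # C-terminus
    for aa in protein.upper():
        if aa in PKA_POS:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return charge


def isoelectric_point(protein, tol=1e-4):
    """pH at which the estimated net charge is zero (bisection on 0-14)."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(protein, mid) > 0:
            lo = mid   # still positive: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    seq = "MKFFLMCLIIFPIMGVLG"
    print(f"MW: {molecular_weight(seq):.1f} Da")
    print(f"pI: {isoelectric_point(seq):.2f}")
```

Bisection works here because the estimated net charge decreases monotonically as pH rises.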

Hydropathy Profiles

• Hydropathy describes the hydrophobicity and hydrophilicity of a protein sequence.

• A hydropathy profile is a graph in which hydropathy values are calculated within a sliding window and plotted for each residue in a protein sequence.

A sliding window

M K F F L M C L I I F P I M G V L G
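The sliding-window calculation can be sketched on the example sequence above. The slides do not name a scale, so the Kyte-Doolittle scale is assumed here, and the window width of 7 is an arbitrary illustrative choice. Each window's mean hydropathy is assigned to the residue at the window centre.

```python
# Kyte-Doolittle hydropathy scale (assumed; the slides do not name a scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7, scale=KYTE_DOOLITTLE):
    """Return (centre_position, mean_hydropathy) for each full window.

    Positions are 1-based. Residues closer than window // 2 to either end
    get no value because no full window is centred on them.
    """
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in seq]
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(values[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


if __name__ == "__main__":
    for position, value in hydropathy_profile("MKFFLMCLIIFPIMGVLG"):
        print(f"{position:3d} {value:6.2f}")
```

Wider windows smooth the profile and suit long features such as membrane-spanning stretches; narrower windows preserve local detail.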

[Figure: hydropathy profile with the signal region and an alpha-helix labelled]

[Figure: A schematic representation of the 3-D structure of a scorpion toxin, with alpha-helix and beta-sheet regions labelled]

Hydropathy Profiles

• Hydropathy scale - each amino acid is assigned a value reflecting its relative hydrophobicity and hydrophilicity.

• Two broad classes of scales:

– Environmental characteristics of protein residues.

– Experimental measurements of amino acid physicochemical properties.

Venn diagram of the physicochemical properties of the 20 amino acids

Hydropathy Profiles

• Basic ranking: internal {FILMV}, external {DEHKNQR}, ambivalent {ACGPSTWY}
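The three-class ranking above can be applied directly to a sequence. This sketch labels each residue as internal (`i`), external (`e`) or ambivalent (`a`); the one-letter labels and the function name are illustrative choices.

```python
# Residue classes taken from the basic ranking above.
INTERNAL = set("FILMV")
EXTERNAL = set("DEHKNQR")
AMBIVALENT = set("ACGPSTWY")


def classify_residues(protein):
    """Map each residue to 'i', 'e' or 'a'; unknown symbols become '?'."""
    labels = []
    for aa in protein.upper():
        if aa in INTERNAL:
            labels.append("i")
        elif aa in EXTERNAL:
            labels.append("e")
        elif aa in AMBIVALENT:
            labels.append("a")
        else:
            labels.append("?")
    return "".join(labels)


if __name__ == "__main__":
    seq = "MKFFLMCLIIFPIMGVLG"
    print(seq)
    print(classify_residues(seq))
```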

Hydropathy Profiles

• Detect possible transmembrane domains (runs of about 20-25 consecutive hydrophobic amino acids).

• Hydrophobic protein cores

• Predict neurotoxicity in snake venom phospholipases A2 (Kini and Iwanaga, 1986)
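A candidate transmembrane stretch can be flagged by scanning a list of hydropathy values for long runs above a cutoff. In this sketch the input is any sequence of per-position values (for example, window-averaged hydropathy). The `threshold` parameter and the default `min_len` of 20 are assumptions to be tuned for the scale in use, and the result is a list of candidates, not a prediction.

```python
def hydrophobic_segments(values, threshold, min_len=20):
    """Find runs of at least `min_len` consecutive values >= `threshold`.

    Returns a list of (start, end) positions, 1-based and inclusive.
    """
    segments = []
    run_start = None
    for i, value in enumerate(values):
        if value >= threshold:
            if run_start is None:
                run_start = i          # a new run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                segments.append((run_start + 1, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(values) - run_start >= min_len:
        segments.append((run_start + 1, len(values)))
    return segments


if __name__ == "__main__":
    # Synthetic values: one 22-position hydrophobic run and one short run.
    demo = [-1.0] * 5 + [2.0] * 22 + [-1.0] * 5 + [2.0] * 10
    print(hydrophobic_segments(demo, threshold=1.5))
```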

References

• Kini RM, Iwanaga S. (1986) Toxicon 24(6):527-541.

• Rehm BH. (2001) Appl Microbiol Biotechnol. 57(5-6):579-92.

• Weir M, Swindells M, Overington J. (2001) Trends Biotechnol 19(10 Suppl):S61-6.

• Gendel SM. (2002) Ann N Y Acad Sci 964:87-98.

• Luo RY, Feng ZP, Liu JK. (2002) Eur J Biochem 269(17):4219-4225.

• Tekaia F, Yeramian E, Dujon B. (2002) Gene 297:51-60.

• Uversky VN, Fink AL. (2002) FEBS Lett 522(1-3):9-13.

• EXPASY http://cn.expasy.org/