Bioinformatics, Genomics, and Proteomics (Part II)


Page 2: Bioinformatics, Genomics, and Proteomics (Part II)

• The comprehensive study of all the proteins of a cell, tissue, body fluid, or organism from a variety of perspectives, including structure, function, expression, profiling, and protein-protein interactions.

• Insight into the proteins that are present in a cell or tissue under particular biological conditions can aid in our understanding of the cell’s activities.

• Genomic sequence alone has limitations.

Proteomics

Page 3: Bioinformatics, Genomics, and Proteomics (Part II)

• Some annotated open reading frames (ORFs) are subsequently found not to encode proteins.

• Others encode proteins whose functions cannot be predicted from the sequence.

• Post-translational modifications that influence protein function and cellular localization often cannot be predicted from the sequence.

• mRNA levels do not always correlate with protein levels, and interactions between proteins cannot be assessed by genomics.

Limitations of Genomics

Page 4: Bioinformatics, Genomics, and Proteomics (Part II)

• On the other hand, a protein’s function can sometimes be inferred by determining the condition under which it is expressed and active.

• From a practical standpoint, proteomics can be used to track clinical disorders and detect targets for therapeutic treatments.

Proteomics

Page 5: Bioinformatics, Genomics, and Proteomics (Part II)

• In eukaryotes, there are many more proteins than genes because of alternative splicing, post-translational modifications, and post-transcriptional modifications of RNA (RNA editing).

• It is impossible to account experimentally for every member of a proteome with a single technique because proteins are susceptible to degradation; have different properties, including solubilities; and range considerably in abundance.

Proteomics - Complications

Page 6: Bioinformatics, Genomics, and Proteomics (Part II)

• First dimension – isoelectric focusing is performed to first separate the proteins in a mixture on the basis of their net charge.

• The protein mixture is applied to a pH gradient gel. When an electric current is applied, the proteins migrate toward either the anode (+) or the cathode (-), depending on their net charge.

• As proteins move through the pH gradient, they will gain or lose protons until they reach a point in the gel where their net charge is zero.

• The pH at this position of the gel is the protein's isoelectric point (pI).

2D PAGE
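The focusing step can be illustrated numerically. The sketch below estimates the net charge of a peptide at a given pH with the Henderson-Hasselbalch relation and then finds the pH of zero net charge by bisection. The pKa values are one commonly quoted approximate set and are an assumption here (published scales differ), and the function names are illustrative only.

```python
# Approximate pKa values for ionizable groups (assumed; published scales differ).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    groups = list(sequence.upper()) + ["Nterm", "Cterm"]
    charge = 0.0
    for group in groups:
        if group in POSITIVE_PKA:
            # Fraction of the basic group that is protonated (+1).
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[group]))
        elif group in NEGATIVE_PKA:
            # Fraction of the acidic group that is deprotonated (-1).
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[group] - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """pH at which the net charge is zero, found by bisection on pH 0-14."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at higher pH
        else:
            high = mid  # negatively charged: the pI lies at lower pH
    return (low + high) / 2.0
```

Bisection works here because the net charge only decreases as the pH rises, which mirrors a protein losing protons as it moves through the gradient until it stops at its pI.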

Page 7: Bioinformatics, Genomics, and Proteomics (Part II)

• Second dimension – separate by molecular weight.

• Several proteins in a sample may have the same isoelectric point and therefore migrate to the same position in the gel.

• Proteins are further separated on the basis of differences in their molecular weights (MW) by electrophoresis, at a right angle to the first dimension, through a sodium dodecyl sulfate (SDS) polyacrylamide gel. The gel is visualized with Coomassie blue or silver protein stain.

• A 2D polyacrylamide gel can resolve up to 2,000 different proteins.

2D PAGE

Page 8: Bioinformatics, Genomics, and Proteomics (Part II)

2D PAGE

Page 9: Bioinformatics, Genomics, and Proteomics (Part II)

• The pattern of stained spots is captured by densitometric scanning.

• Databases have been established with images of 2D PAGE from different cell types.

• Software packages are available for detecting spots, matching patterns between gels, and quantifying the protein content of the spots.

• The next task after separation is to excise the individual proteins from the gel and to identify as many of them as possible using mass spectrometry (MS).

2D PAGE

Page 10: Bioinformatics, Genomics, and Proteomics (Part II)

• Proteins with either low or high molecular weights, those that are found in cellular membranes, and those that are present in small amounts are not readily resolved by 2D PAGE.

• Highly charged proteins, such as ribosomal proteins and histone proteins, are not separated under standard conditions.

2D PAGE - Limitations

Page 11: Bioinformatics, Genomics, and Proteomics (Part II)

MALDI - MS

• A spot is excised from the gel and treated with trypsin.

• Purified tryptic peptides are separated by MALDI – time of flight (TOF) MS.

• The set of peptide masses from the unknown protein is used to search a database, and the best match is determined.
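The database search can be sketched as follows: each candidate protein is digested in silico with the trypsin rule (cleave after K or R unless the next residue is P), and the theoretical peptide masses are compared with the observed ones. The residue masses are standard monoisotopic values. The function names, the 0.2 Da tolerance, and the comparison of neutral peptide masses (rather than the [M+H]+ ions an instrument reports) are simplifying assumptions, and real search engines use more elaborate scoring.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini


def tryptic_digest(protein: str) -> list[str]:
    """Cleave after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_score(observed: list[float], protein: str, tol: float = 0.2) -> int:
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed: list[float], database: dict[str, str],
               tol: float = 0.2) -> str:
    """Name of the database entry that explains the most observed masses."""
    return max(database, key=lambda name: match_score(observed, database[name], tol))
```

Counting matched masses is the simplest possible score. It is used here only to show why a set of peptide masses can act as a fingerprint for one protein.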

Page 12: Bioinformatics, Genomics, and Proteomics (Part II)

ESI – MS - MS

• A spot is excised from the gel and treated with trypsin.

• Purified tryptic peptides are separated according to their mass/charge (m/z) ratios, and the amino acid sequence of a selected peptide is determined by tandem MS (MS-MS).

• The unknown protein is identified by searching a protein database with the amino acid sequences from two or more peptides.
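The requirement for two or more peptides can be sketched as a simple containment search. The function name, the dictionary layout of the database, and the exact-substring matching are assumptions made for illustration. Real searches also tolerate sequencing ambiguities such as I/L, which have identical masses.

```python
def identify_protein(peptides: list[str], database: dict[str, str],
                     min_peptides: int = 2) -> dict[str, list[str]]:
    """Return database entries that contain at least `min_peptides` of the
    sequenced peptides, together with the peptides that matched."""
    hits = {}
    for name, sequence in database.items():
        matched = [p for p in peptides if p in sequence]
        if len(matched) >= min_peptides:
            hits[name] = matched
    return hits


# Hypothetical toy database and peptide sequences.
database = {
    "protein_1": "MKTAYIAKQRQISFVK",
    "protein_2": "MSTAYIAKLLDE",
}
peptides = ["TAYIAK", "QISFVK"]
print(identify_protein(peptides, database))
```

A single short peptide can occur in several proteins (here TAYIAK is in both entries), which is why a second, independent peptide is needed to make the identification specific.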

Page 13: Bioinformatics, Genomics, and Proteomics (Part II)

• 2D differential in-gel electrophoresis (DIGE) is a method for quantitative analysis of protein expression.

• The proteins of two proteomes are labeled with the fluorescent dyes Cy3 and Cy5, respectively.

• The samples are combined and run on 2D PAGE.

• The gel is scanned for each fluorescent dye, and the relative levels of the two dyes in each protein spot are recorded.

• The gel is stained with a protein dye, and an unknown spot is excised and treated with trypsin.

• The peptides are separated by ESI-MS-MS, and the amino acid sequences are determined.

Protein Expression profiling
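The per-spot comparison of the two dye signals can be sketched as a log2 ratio with a fold-change cutoff. The spot identifiers, the intensities, and the twofold cutoff are hypothetical, and the normalization between dye channels that real analyses apply first is left out.

```python
import math


def compare_spots(spots: dict[str, tuple[float, float]],
                  fold_cutoff: float = 2.0) -> dict[str, tuple[float, str]]:
    """spots maps a spot id to its (Cy3 intensity, Cy5 intensity).

    Returns, per spot, the log2(Cy5/Cy3) ratio and a simple call.
    """
    threshold = math.log2(fold_cutoff)
    calls = {}
    for spot_id, (cy3, cy5) in spots.items():
        ratio = math.log2(cy5 / cy3)
        if ratio >= threshold:
            call = "higher in Cy5 sample"
        elif ratio <= -threshold:
            call = "higher in Cy3 sample"
        else:
            call = "unchanged"
        calls[spot_id] = (ratio, call)
    return calls


# Hypothetical scanner readings for three spots.
spots = {"spot_1": (100.0, 400.0), "spot_2": (300.0, 100.0), "spot_3": (200.0, 220.0)}
for spot_id, (ratio, call) in compare_spots(spots).items():
    print(spot_id, round(ratio, 2), call)
```

The log scale treats a twofold increase and a twofold decrease symmetrically (+1 and -1), which a raw ratio does not.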

Page 15: Bioinformatics, Genomics, and Proteomics (Part II)

• Proteins from two proteomes are labeled with light and heavy ICAT reagent.

• The samples are combined and treated with trypsin.

• The peptides are captured by affinity chromatography using avidin, and fractionated by LC.

• The ratio of light:heavy is determined by MS.

• Amino acid sequences are determined by ESI-MS-MS.

ICAT- LC - MS - MS
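The light:heavy ratio can be sketched as a search for pairs of peaks separated by the mass difference between the two reagents. The function name and the peak list are hypothetical. The mass difference is passed in as a parameter because it depends on the reagent, and the sketch assumes one labeled cysteine per peptide.

```python
def pair_icat_peaks(peaks: list[tuple[float, float]], mass_shift: float,
                    charge: int = 1, tol: float = 0.05
                    ) -> list[tuple[float, float, float]]:
    """peaks is a list of (m/z, intensity).

    Returns (light m/z, heavy m/z, light:heavy intensity ratio) for every
    pair of peaks separated by mass_shift / charge within the tolerance.
    """
    delta = mass_shift / charge  # spacing on the m/z axis
    pairs = []
    for light_mz, light_intensity in peaks:
        for heavy_mz, heavy_intensity in peaks:
            is_pair = abs((heavy_mz - light_mz) - delta) <= tol
            if is_pair and heavy_intensity > 0:
                pairs.append((light_mz, heavy_mz, light_intensity / heavy_intensity))
    return pairs


# Hypothetical singly charged peaks; the 8.0 Da shift is an assumed value.
peaks = [(1000.0, 500.0), (1008.0, 250.0), (1200.5, 90.0)]
print(pair_icat_peaks(peaks, mass_shift=8.0))
```

Because the two labeled forms of a peptide are chemically almost identical, they elute together from the LC column, so the intensity ratio within a pair reflects the relative abundance of the protein in the two samples.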

Page 16: Bioinformatics, Genomics, and Proteomics (Part II)

• Conceptually, protein microarrays are similar to DNA microarrays.

• They consist of large numbers of proteins individually immobilized in known positions on the coated surface of a glass slide or silicon chip.

• The proteins arrayed can be antibodies specific for each protein in an organism, purified recombinant proteins, or short synthetic peptides.

• There are many ways of attaching a protein to a support surface.

• The major objective of any coupling system is maintenance of protein structure and function.

Protein Microarray

Page 17: Bioinformatics, Genomics, and Proteomics (Part II)

• Some systems bind proteins to chemical groups that coat the surface of the support.

• With other protocols, recombinant proteins are prepared with a short amino acid sequence (tag) at the N or C terminus that binds to a recognition sequence on the support. In this case, all the protein molecules are uniformly oriented.

• Instead of spotting proteins on a flat surface, some microarrays are engineered with tiny depressions (nanowells) that keep each protein moist and prevent mixing with adjacent proteins.

Protein Microarray

Page 18: Bioinformatics, Genomics, and Proteomics (Part II)

• The purpose of protein microarray analyses is to detect, on a large scale, the molecules that a protein interacts with.

• These interacting molecules can be other proteins, nucleic acid sequences, or low molecular-weight compounds.

• Protein populations from different samples can be compared, for example, in control versus treated samples or in normal versus diseased tissues.

Protein Microarray

Page 19: Bioinformatics, Genomics, and Proteomics (Part II)

• Direct labeling – the test samples are labeled directly with a fluorescent dye, and the labeled molecules that bind to the proteins of a microarray are detected with a laser scanner. A two-dye strategy (e.g., Cy3 and Cy5) can be used to compare proteins in two different samples on a single array.

• Sandwich style – the sample molecules are biotinylated, and after the initial incubation, a streptavidin-fluorescent-dye conjugate is applied that binds to biotin and facilitates the detection of the sample molecules.

Protein Microarray - Visualizing

Page 20: Bioinformatics, Genomics, and Proteomics (Part II)

Protein array detection method

Page 21: Bioinformatics, Genomics, and Proteomics (Part II)

• Analytical protein microarray. Different types of ligand, including antibodies, antigens, DNA or RNA aptamers, carbohydrates or small molecules, with high affinity and specificity, are spotted down onto a surface.

• These chips can be used for monitoring protein expression level, protein profiling and clinical diagnostics.

• Similar to the procedure in DNA microarray experiments, protein samples from two biological states to be compared are separately labeled with red or green fluorescent dyes, mixed, and incubated with the chips.

• Spots in red or green color identify an excess of proteins from one state over the other.

Analytical VS Functional

Page 22: Bioinformatics, Genomics, and Proteomics (Part II)

• Functional protein microarray. Native proteins or peptides are individually purified or synthesized using high-throughput approaches and arrayed onto a suitable surface to form the functional protein microarrays.

• These chips are used to analyze protein activities, binding properties and post-translational modifications.

• With the proper detection method, functional protein microarrays can be used to identify the substrates of enzymes of interest.

• Consequently, this class of chips is particularly useful in drug and drug-target identification and in building biological networks.

Analytical VS Functional

Page 23: Bioinformatics, Genomics, and Proteomics (Part II)

http://www.nature.com/nature/journal/v422/n6928/images/nature01512-f1.2.jpg

Page 24: Bioinformatics, Genomics, and Proteomics (Part II)

• Analytical microarrays are used for protein profiling, that is, detection and quantification of proteins present in a sample.

• An analytical microarray can be an antibody microarray or an antigen microarray.

• Antibody microarrays are often probed with proteins from biological sources, such as plasma or serum, or proteins that are secreted from cells in culture to determine disease-specific profiles.

• For example, antibody microarrays that specifically detect cytokines have been formulated.

Analytical microarray

Page 25: Bioinformatics, Genomics, and Proteomics (Part II)

• Cytokine antibody microarrays are used to examine cytokines in both normal and diseased states, and from a variety of sources after various treatments.

• A sandwich immunoassay is often used to detect cytokines that bind to immobilized antibodies.

• After the microarray is treated with the sample, biotinylated cytokine antibodies are added and bind to the corresponding captured cytokine.

• For visualization, a streptavidin-fluorescent-dye conjugate attaches to the biotin of the secondary antibody.

• The signals are detected with a laser scanner.

Analytical microarray

Page 26: Bioinformatics, Genomics, and Proteomics (Part II)

Cytokine antibody microarray

Page 27: Bioinformatics, Genomics, and Proteomics (Part II)

• Plasma samples from individuals with Alzheimer disease and those from individuals with no dementia were applied to a microarray made up of antibodies against 120 cytokines.

• Eighteen cytokines were found to be associated with Alzheimer disease.

• The levels of 7 of these were higher and 11 were lower in individuals with Alzheimer disease than in the subjects without dementia.

• Possibly, the Alzheimer disease-specific cytokine signature may provide the basis for a diagnostic test.

Analytical microarray

Page 28: Bioinformatics, Genomics, and Proteomics (Part II)

• Another type of analytical microarray is the protein (antigen) microarray. Proteins are attached to a solid support and then probed with antibodies, mostly in serum samples.

• The purpose of these studies is to discover whether the production of antibodies against specific proteins correlates with particular diseases or biological processes.

• A microarray of 5,000 different human proteins was created and used to determine if serum from ovarian cancer patients has a distinctive set of antibodies in comparison to the antibody population of healthy individuals.

Analytical microarray – Antigen

Page 29: Bioinformatics, Genomics, and Proteomics (Part II)

• The initial results revealed 94 proteins that were specifically recognized by antibodies in the sera from the ovarian cancer patients.

• With further testing, three proteins were consistently found to be specific for ovarian cancer.

• The ovarian-cancer-specific proteins may help in the early detection of the disease.

• The earlier ovarian cancer is diagnosed, the better the chance of survival.

Analytical microarray – Antigen

Page 30: Bioinformatics, Genomics, and Proteomics (Part II)

• Analytical antibody microarrays are also used to detect whether post-translational modifications, such as phosphorylation of tyrosine or glycosylation, are associated with specific diseases.

• Proteins are first captured by primary antibodies immobilized on a microarray.

• Then, the microarray is flooded with biotinylated anti-phosphotyrosine antibody.

• Next, streptavidin conjugated with a fluorescent dye is added, and the fluorescent protein spots are detected.

• Detection of glycan groups is performed in a similar manner.

Analytical microarray

Page 32: Bioinformatics, Genomics, and Proteomics (Part II)

• A multiprotein sample, for example, from a cell lysate or tissue specimen, is immobilized in a single spot on a support.

• Several such multiprotein samples are spotted on the microarray.

• Then, the microarray is probed with a single target molecule.

• The advantage is that a large number of samples can be compared at one time.

• With a reverse-phase microarray, the presence of specific proteins in multiple complex samples can be readily determined.

Analytical microarray – Reverse phase

Page 33: Bioinformatics, Genomics, and Proteomics (Part II)

Reverse-phase microarray format

Page 34: Bioinformatics, Genomics, and Proteomics (Part II)

• Functional protein microarrays feature large sets of individual proteins that are used predominately to determine interactions with other proteins or low molecular-weight compounds, such as lipids, drugs, and metabolites.

• Ideally, the functional protein array should consist of all possible proteins of a proteome under study.

• To obtain comprehensive representation of a proteome, a library containing all of the protein coding sequences is first constructed.

• A library of cloned protein-encoding ORFs has been dubbed an ORFeome.

Functional protein microarray

Page 36: Bioinformatics, Genomics, and Proteomics (Part II)

• The starting point for producing an ORFeome is usually PCR amplification of the coding sequences for cloning into a vector.

• For prokaryotic organisms, the protein-coding sequences can often be readily identified from genomic sequences.

• On the other hand, full-length cDNA libraries are the primary sources of the coding sequences of a eukaryotic proteome.

Functional protein microarray

Page 37: Bioinformatics, Genomics, and Proteomics (Part II)

Integration and excision of bacteriophage λ into and from the E. coli genome via recombination between attachment (att) sites in the bacteria and bacteriophage DNA.

Page 38: Bioinformatics, Genomics, and Proteomics (Part II)

Primer pair used to amplify ORFs for recombinational cloning to generate an ORFeome.

Page 39: Bioinformatics, Genomics, and Proteomics (Part II)

Recombinational cloning

• A primer pair is used to amplify ORFs, resulting in PCR-amplified ORFs with attachment sites (attB1 and attB2).

• Recombination between a PCR-amplified ORF and a donor vector with attP1 and attP2 sites on either side of the ccdB gene results in an entry clone in which the ORF is flanked by attL1 and attL2 sites.

• The selectable marker (SM1) selects transformed cells with an entry clone.

• The protein encoded by ccdB is toxic to transformed cells that carry a non-recombined donor vector molecule.

Page 40: Bioinformatics, Genomics, and Proteomics (Part II)

Recombinational cloning

• The next step is the expression of each cloned ORF.

• Recombination between the entry clone with attL1 and attL2 sites and a destination vector with attR1 and attR2 results in an expression clone with attB1 and attB2 sites flanking the ORF.

• The selectable marker (SM2) selects transformed cells with an expression clone.

• Cells with an intact destination vector that did not undergo recombination are killed by CcdB protein.

• For construction of a microarray, each protein encoded by an ORF is isolated by affinity purification.
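The two recombination steps and their selection logic can be sketched as a small bookkeeping model. The class and function names are hypothetical, "ORF_X" is a placeholder, and the model only tracks which att sites flank the insert and which marker the backbone carries. It assumes, as the slides imply, that both the donor and the destination vector carry ccdB between their att sites.

```python
from dataclasses import dataclass
from typing import Optional

# Site conversions described on the slides:
# attB x attP -> attL (entry clone); attL x attR -> attB (expression clone).
PRODUCT_SITE = {("attB", "attP"): "attL", ("attL", "attR"): "attB"}


@dataclass(frozen=True)
class Construct:
    site: str              # att site type flanking the insert, e.g. "attB"
    insert: str            # "ccdB" or the name of an ORF
    marker: Optional[str]  # selectable marker on the backbone, if any

    def flanks(self) -> tuple[str, str]:
        return (f"{self.site}1", f"{self.site}2")


def recombine(orf_source: Construct, vector: Construct) -> Construct:
    """Move the ORF into the vector backbone, replacing ccdB."""
    key = (orf_source.site, vector.site)
    if key not in PRODUCT_SITE:
        raise ValueError(f"no recombination between {key[0]} and {key[1]} sites")
    return Construct(PRODUCT_SITE[key], orf_source.insert, vector.marker)


def survives(construct: Construct, selection: str) -> bool:
    """A transformed cell grows only with the right marker and without ccdB."""
    return construct.marker == selection and construct.insert != "ccdB"


pcr_product = Construct("attB", "ORF_X", None)
donor = Construct("attP", "ccdB", "SM1")
destination = Construct("attR", "ccdB", "SM2")

entry_clone = recombine(pcr_product, donor)
expression_clone = recombine(entry_clone, destination)
print(entry_clone.flanks(), expression_clone.flanks())
```

The model makes the double selection explicit: the marker selects for cells that took up a vector backbone, and ccdB removes those in which the backbone never received an ORF.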

Page 42: Bioinformatics, Genomics, and Proteomics (Part II)

• Proteins seldom act alone. On average, one protein interacts with five others.

• The two-hybrid method is used to determine pairwise protein-protein interactions.

• The underlying principle of this assay is that the physical connection between two proteins reconstitutes an active transcription factor that initiates the expression of a reporter gene.

• Generally, transcription factors have two domains, a DNA binding domain and an activation domain.

• These two domains need not be part of the same protein to function.

Protein – protein interaction mapping

Page 43: Bioinformatics, Genomics, and Proteomics (Part II)

• The availability of complete genome sequences makes it possible to use the yeast two-hybrid system to screen for all possible interactions between the proteins in an organism rather than to test one bait at a time.

• The ORFs from an organism’s genome are cloned into two plasmid vectors, one that expresses the bait (target) and another that produces the prey (the interacting proteins to be identified).

• Each is introduced into yeast cells by transfection.

• A high-throughput mating method is then used to introduce each bait plasmid into yeast cells with each prey plasmid, and the hybrids are screened for expression of the reporter gene.

Yeast Two-Hybrid System
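The outcome of such a screen is a table of reporter readouts for bait-prey combinations, which can be turned into an interaction map. The sketch below assumes a hypothetical readout format (a dictionary keyed by bait and prey names), and the protein names are placeholders.

```python
from collections import defaultdict


def build_interaction_map(readouts: dict[tuple[str, str], bool]) -> dict[str, set[str]]:
    """readouts maps (bait, prey) to whether the reporter gene was expressed.

    Returns an undirected interaction map: protein -> set of partners.
    """
    network = defaultdict(set)
    for (bait, prey), reporter_on in readouts.items():
        if reporter_on:
            network[bait].add(prey)
            network[prey].add(bait)
    return dict(network)


def reciprocal_pairs(readouts: dict[tuple[str, str], bool]) -> set[frozenset]:
    """Pairs detected in both orientations (A as bait with B as prey, and
    B as bait with A as prey)."""
    return {
        frozenset((bait, prey))
        for (bait, prey), reporter_on in readouts.items()
        if reporter_on and bait != prey and readouts.get((prey, bait), False)
    }


# Hypothetical readouts for three proteins.
readouts = {
    ("A", "B"): True, ("B", "A"): True,
    ("A", "C"): True, ("C", "A"): False,
    ("B", "C"): False,
}
print(build_interaction_map(readouts))
print(reciprocal_pairs(readouts))
```

Pairs that give a signal in both orientations are a common first filter, since a single positive readout may be an artifact of one fusion construct.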

Page 45: Bioinformatics, Genomics, and Proteomics (Part II)

http://www.sumanasinc.com/webcontent/animations/content/yeasttwohybrid.html

Page 46: Bioinformatics, Genomics, and Proteomics (Part II)

Complementation assay for detecting pairwise protein interactions in mammalian cells.

Page 47: Bioinformatics, Genomics, and Proteomics (Part II)

Large-scale screens for protein interactions using the yeast two-hybrid system. Two libraries are prepared, one containing genomic DNA fragments or cDNAs fused to the coding sequence for the DNA binding domain (bait library) and the other containing them fused to the coding sequence for the activation domain (prey library).

Page 48: Bioinformatics, Genomics, and Proteomics (Part II)

Protein interaction map of calcium signaling protein clusters of D. melanogaster.

Page 49: Bioinformatics, Genomics, and Proteomics (Part II)

• The yeast two-hybrid system is powerful but it has some shortcomings.

• The assay is based on transcription, so the bait and the prey proteins must enter the nucleus and interact in a cellular location very different from their normal environment.

• Protein microarrays can also be used to screen for protein-protein interactions.

Protein Arrays

Page 50: Bioinformatics, Genomics, and Proteomics (Part II)

• All the ORFs from the yeast genome are used to express each yeast protein tagged with a glutathione-S-transferase (GST) epitope.

• GST-tagged proteins are purified and spotted onto glass slides to generate protein microarrays.

• The protein under investigation is labeled and added to the array under gentle conditions that allow the proteins to interact.

• The spots on microarrays are then analyzed for the intensity of the signal from the labeled interacting test protein.

Protein Arrays
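The analysis of spot intensities can be sketched as a comparison against negative-control spots. The spot names, the intensities, and the cutoff of three standard deviations above the control mean are assumptions made for illustration, not a prescribed analysis.

```python
from statistics import mean, stdev


def call_hits(signals: dict[str, float], controls: list[float],
              k: float = 3.0) -> list[str]:
    """Return spots whose signal exceeds mean(controls) + k * stdev(controls),
    ordered from strongest to weakest signal."""
    cutoff = mean(controls) + k * stdev(controls)
    hits = [spot for spot, value in signals.items() if value > cutoff]
    return sorted(hits, key=lambda spot: signals[spot], reverse=True)


# Hypothetical intensities from a scanned array.
controls = [100.0, 110.0, 90.0, 105.0, 95.0]  # spots with no arrayed protein
signals = {"spot_1": 900.0, "spot_2": 120.0, "spot_3": 130.0}
print(call_hits(signals, controls))
```

Ranking the hits by intensity gives a rough ordering of candidate interactors for follow-up, although signal strength also depends on how much protein was deposited in each spot.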

Page 52: Bioinformatics, Genomics, and Proteomics (Part II)

• Two DNA sequences (tag1 and tag2), each encoding a short amino acid sequence with high affinity for a specific molecule, are cloned together and fused in frame to the 3’ end of a cDNA.

• The tagged cDNA construct is introduced into a host cell, where it is transcribed and translated.

• Other cellular proteins bind to the protein encoded by cDNA X. The complex of interacting proteins is isolated by the binding of tag1 to its affinity partner.

• The cluster is eluted from the affinity partner by cleaving off tag 1.

TAP tag procedure for protein interactions

Page 53: Bioinformatics, Genomics, and Proteomics (Part II)

• A second purification step is carried out with tag 2 and its affinity partner.

• The proteins of the cluster are separated by one-dimensional PAGE.

• Single bands are excised from the gel and treated with trypsin.

• Peptide amino acid sequences are obtained with ESI-MS-MS and searched against a protein database.

TAP tag procedure for protein interactions
