
Ch10 Genomics


DESCRIPTION

Structural genomics, genome sequencing, genome sequencing using a mapping approach, genome mapping, physical mapping, generating the sequence of the genome, shotgun sequencing, bacterial genomes, eukaryotic genomes, archaeon genomes, Human Genome Project


NTU Department of Agronomy, Genetics 601 20000 Chapter 9 slide 1

CHAPTER 10 Genomics

Peter J. Russell

edited by Yue-Wen Wang, Ph.D., Dept. of Agronomy, NTU

A Molecular Approach, 2nd Edition


Structural Genomics

1. The advent of DNA sequencing techniques changed experimental biology, and automation has enhanced the rate of change.

2. Genomics is the development and application of techniques for:

a. Mapping chromosomes.

b. Sequencing genomes.

c. Computational analysis of entire genomes.

3. Subfields of genomics are:

a. Structural genomics, the genetic and physical mapping and sequencing of chromosomes.

b. Functional genomics, comprehensive analysis of gene functions and of non-gene sequences in entire genomes.

c. Comparative genomics, comparison of entire genomes across species, looking at functions and evolutionary relationships.

4. This section focuses on structural genomics, specifically genome sequencing.


Sequencing Genomes

1. Genome projects use two general approaches:

a. The mapping approach divides the genome into segments with genetic and physical mapping, refines the map of each segment, and finally sequences the DNA.

b. A “shotgun” approach breaks the genome into random, overlapping fragments, and sequences each fragment. Based on overlaps, the sequences are assembled by computer. An advantage is that physical mapping is not required.
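The overlap-based assembly step in the shotgun approach can be illustrated with a toy example. The following is only a minimal sketch of a greedy overlap merge, assuming error-free reads and no repeats; the function names and the example reads are hypothetical, and real assemblers are far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No overlaps left: whatever remains are separate contigs (gaps).
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical overlapping fragments of one short sequence.
reads = ["ATGGCGTAC", "GTACCTTAG", "TTAGGACCA"]
print(greedy_assemble(reads))
```

If a fragment needed to bridge two others is missing from the input, the function returns more than one contig, which mirrors the gaps left by sequences absent from a library.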


Genome Sequencing Using a Mapping Approach

1. Genetic and physical maps are made first to provide markers for sequencing. Examples illustrate the logic of this approach in the human genome project.


Genetic Mapping of a Genome

1. Genetic maps are constructed for each chromosome using genetic crosses and pedigree analysis. Any detectable allele can mark a locus on the chromosome, and crossing over indicates the distance between marker genes.

2. High-density genetic mapping has been important in the Human Genome Project (HGP). Some aspects of this procedure:

a. A sequence tagged site (STS) is a unique genomic DNA sequence used as a genetic marker. Short tandem repeats (STRs) are used extensively for STS mapping, but nonpolymorphic markers are also used.

b. Polymorphic STRs are the best DNA markers for generating genetic maps of STSs.
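The idea that crossing over indicates the distance between markers can be made concrete with the usual recombination-frequency calculation. The sketch below assumes two closely linked markers, where the percentage of recombinant offspring approximates the map distance in map units (centimorgans); the function name and the offspring counts are hypothetical.

```python
def map_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Approximate map distance as percent recombinant offspring."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must be between 0 and total_offspring")
    return 100.0 * recombinant_offspring / total_offspring


# Hypothetical cross: 120 recombinants among 1,000 scored offspring.
print(map_distance_cm(120, 1000))
```

For widely separated markers this simple ratio underestimates the true distance, because multiple crossovers between the markers go undetected.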


Physical Mapping of a Genome

1. Genetic maps generated for some species (e.g., E. coli) are sufficient to begin sequencing, but in humans even the detailed genetic map described above lacks the required resolution. Therefore, a physical map derived directly from genomic DNA rather than from analysis of recombinants has been generated.

2. In humans there are 24 physical maps for the autosomes plus X and Y. Types of physical maps are presented in order of increasing resolution:

a. Cytogenetic maps of chromosomal banding patterns (Chapter 16)

b. Fluorescent in situ hybridization (FISH) maps (Chapter 16)

c. Restriction maps

i. Restriction enzymes that cut rarely are used. They cut rarely due either to a large (7–8bp) recognition sequence or to scarcity of the recognition sequence in the DNA under study.

ii. The map for even a rarely cutting restriction enzyme is very complex, and so far has been obtained for only the smallest human chromosome (chromosome 21 was mapped with NotI).

d. Radiation hybrid maps (Chapter 16)


e. Clone contig maps (contig is a shortened form of contiguous)

i. A partial restriction digest produces a set of large, overlapping DNA fragments, which are cloned into a YAC vector cut with a compatible restriction enzyme. Shearing may also be used to make high-molecular-weight DNA that is blunt-end cloned into a YAC.

ii. An entire genome or a single chromosome may be represented in a YAC clone library.

iii. YAC clones are then assembled into a map either by matching with a FISH-generated chromosome map or by DNA fingerprinting and assembly based on overlaps. Nonpolymorphic STSs are especially useful for YAC contig mapping (Figure 10.1).

iv. A complete library should yield a complete contig map that indicates the order in which the cloned fragments occur in the chromosome.

v. Problems arise when some of the YAC inserts contain DNA from more than one chromosomal location. This has complicated efforts at generating a YAC contig map of human chromosomes.

vi. Many labs have switched to bacterial artificial chromosome (BAC) vectors with a capacity of 300kb and the ability to replicate in E. coli as a resource for their sequencing projects.
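The STS-based contig assembly described above can be sketched in a few lines. This is a simplified illustration that assumes the order of the STS markers along the chromosome is already known, which in practice is inferred together with the clone order; the clone names, marker names, and function names are all hypothetical.

```python
def order_clones_by_sts(clones: dict[str, set[str]], sts_order: list[str]) -> list[str]:
    """Sort clones by the leftmost, then rightmost, STS marker they carry."""
    position = {sts: i for i, sts in enumerate(sts_order)}

    def span(name: str) -> tuple[int, int]:
        hits = [position[s] for s in clones[name]]
        return (min(hits), max(hits))

    return sorted(clones, key=span)


def is_single_contig(clones: dict[str, set[str]], ordered: list[str]) -> bool:
    """True if every neighbouring pair of clones shares at least one STS."""
    return all(clones[a] & clones[b] for a, b in zip(ordered, ordered[1:]))


# Hypothetical STS content of three YAC clones.
clones = {
    "YAC-C": {"STS4", "STS5"},
    "YAC-A": {"STS1", "STS2"},
    "YAC-B": {"STS2", "STS3", "STS4"},
}
sts_order = ["STS1", "STS2", "STS3", "STS4", "STS5"]

ordered = order_clones_by_sts(clones, sts_order)
print(ordered, is_single_contig(clones, ordered))
```

A clone whose insert combines DNA from two chromosomal locations would carry markers from distant positions and distort this ordering, which is the problem noted for YAC libraries.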

Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.

Fig. 10.1 A representative YAC contig map assembled by STS mapping


Generating the Sequence of a Genome

1. When a high-resolution map is available, sequencing is possible. Briefly:

a. Dideoxy sequencing is used. DNA is synthesized from a template, and terminates with incorporation of a fluorescently labeled ddNTP.

b. All four reactions (ddA, ddG, ddC, and ddT) occur in the same tube. Each ddNTP carries a different fluorescent label.

c. Products are separated electrophoretically, colored bands are detected with lasers, and the data are converted to a computer sequence file.

d. PCR-based sequencing uses one oligonucleotide primer and thermostable DNA polymerase. The advantages of this approach are:

i. Double-stranded DNA is sequenced directly.

ii. Only a small amount of template DNA is required.

2. One sequencing reaction is limited to about 500 nucleotides, and for accurate sequences both strands must be sequenced several times.

3. Progress on the human genome and other projects has been accelerated by improved technologies for sequencing and analysis.
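The logic of dideoxy sequencing (chain termination, separation by size, reading the terminal label) can be mimicked with a small simulation. This is a toy sketch only: it assumes exactly one terminated fragment of every possible length and perfect size separation, and the function name and template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dideoxy_read(template_3to5: str) -> str:
    """Read the newly synthesized strand from chain-terminated fragments."""
    # The new strand is complementary to the template and grows 5' to 3'.
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    # One fragment per termination point: (length, labeled ddNTP at its 3' end).
    fragments = [(n, new_strand[n - 1]) for n in range(1, len(new_strand) + 1)]
    # The reaction tube holds the fragments in no particular order.
    fragments.reverse()
    # Electrophoresis separates by size; the detector reads labels shortest first.
    by_size = sorted(fragments)
    return "".join(label for _, label in by_size)


print(dideoxy_read("TACGGT"))
```

The output is the sequence of the synthesized strand, so the template sequence follows by complementarity.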


4. Human genome sequencing by the mapping approach used BACs, but a BAC insert is far too large to sequence in one reaction. Instead, the inserts were each sequenced using a shotgun approach:

a. Each insert is cut from the vector, sheared into fragments that will be partially overlapping, and cloned into a plasmid vector.

b. Each subclone is sequenced, and overlaps are used by a computer to assemble the data into one contiguous sequence representing the BAC insert.

c. Using the chromosomal map for BAC clones, the BAC insert sequences are put in order to yield the complete chromosome sequence.

5. In theory, sequencing contigs for a total length of 6.5–8 times the genome will span more than 99.8 percent of the genomic sequence.

6. In practice, the HGP did its sequencing seven-times over, and has obtained 97 percent of the genome.
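The theoretical figure in item 5 is consistent with a simple random-sampling model. As an assumption not stated on the slide, suppose read start positions are uniformly random, so that the number of reads covering a base is approximately Poisson with mean equal to the fold coverage c; the expected fraction of bases covered at least once is then 1 - e^(-c). The function name below is hypothetical.

```python
import math


def expected_fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of the genome covered at least once (Poisson model)."""
    if fold_coverage < 0:
        raise ValueError("fold_coverage must be non-negative")
    return 1.0 - math.exp(-fold_coverage)


for c in (5, 6.5, 7, 8):
    print(f"{c}-fold coverage: {100 * expected_fraction_covered(c):.2f}% expected")
```

Under this model 6.5-fold coverage already exceeds 99.8 percent. The model ignores regions that cannot be cloned or assembled, which may be one reason real projects recover less than the theoretical fraction.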


Genome Sequencing Using a Direct Shotgun Approach

Animation: Direct Shotgun Sequencing of Genomes

1. The shotgun approach obtains a genomic sequence by breaking the genome into overlapping fragments for cloning and sequencing. A computer is then used to assemble the genomic sequence.

2. Advances that have made this approach practical for large genomes include:

a. Better computer algorithms for assembling sequences.

b. Automation in the actual sequencing.


3. A pioneer of this approach is J. Craig Venter, whose Celera Genomics has also sequenced (5-fold) the human genome to 97%, with complete assembly of the fragments except for gaps caused by the missing 3%.

4. Direct shotgun sequencing involves (Figure 10.2):

a. Mechanical shearing and cloning of small (about 2 kb) genomic DNA fragments.

b. Sequencing about 500 bp on each end of the insert DNA. Sequences in the center of the cloned DNA are obtained from an overlapping clone rather than directly.

c. Computer analysis gives the sequence of most of the genome, with gaps caused by sequences missing from the library.

d. A second library is made with larger (about 10 kb) random fragments, allowing resolution of repeated sequences.

5. Advances in automated DNA sequencing and computer algorithms for sequence analysis allow the whole-genome approach to be used with even large genomes. BAC maps are often also part of these projects.

6. Assembling and finishing genome sequences requires arranging sequences in the order they are found in the genome and then finishing the details of the sequence (<1 error/10,000 bases).


Fig. 10.2 The direct shotgun approach to obtaining the genomic DNA sequence of an organism


Selected Examples of Genomes Sequenced

1. Following is a discussion of some genomes that have been sequenced, with the rationale for their selection.


Bacterial Genomes

1. Haemophilus influenzae, the first cellular organism to have its genome sequenced, was selected for its typical bacterial genome size and its GC content close to that of humans (Figure 10.3).

a. No genetic or physical map existed, so a shotgun approach was used.

b. The H. influenzae genome is 1.83Mb.

c. Annotation of the sequence involved computer analysis to find significant sequences, including:

i. 1,743 open reading frames (ORFs), regions with no stop codon in a particular reading frame. Arbitrarily, ORFs that are over 100 codons are considered likely to encode proteins.

ii. Repeated sequences.

iii. Operons.

iv. Transposable elements.

d. 736 of the predicted genes have no “role assignment,” meaning that no function is yet verified for them.


Fig. 10.3 The annotated genome of H. influenzae


2. Escherichia coli was selected because it is an important model system for molecular biology, genetics, and biotechnology, as well as a common bacterium in animal intestines and the environment.

a. A shotgun approach was used.

b. The genome is 4.64Mb with a GC content of 50.8 percent.

c. Analysis of the genome sequence shows that 87.8 percent of the genome is made up of ORFs.

d. Of 4,288 ORFs, 38 percent are of unknown function.


Archaeon Genomes

1. Methanococcus jannaschii is an anaerobic, hyperthermophilic methanogen that reduces CO2 to methane.

a. A shotgun approach was used.

b. The genome has three parts:

i. A large circular chromosome of about 1.66Mb, with 1,682 ORFs.

ii. A circular extrachromosomal element (ECE) of about 58kb, with 44 ORFs.

iii. A smaller circular ECE of about 17kb, with 12 ORFs.

2. Analysis of the sequence confirms Archaea’s unique taxonomic position, showing that:

a. Most M. jannaschii genes involved in energy production, metabolism, and cell division are similar to those of eubacteria.

b. Most of the genes involved in DNA replication, transcription, and translation are similar to those of eukaryotes.


Eukaryotic Genomes

1. Saccharomyces cerevisiae is a model eukaryote for many types of research. It was the first eukaryotic genome to be completely sequenced (Figure 10.4).

a. The mapping approach was used.

b. The 16-chromosome genome is 12Mb. An estimated 969kb of repeated sequences are missing from the published sequence.

c. Analysis reveals 6,183 ORFs, 233 with introns.

d. ORFs make up about 70 percent of the total genome, and about 1⁄3 have no known function.


2. Caenorhabditis elegans, a nematode, has been important in both genetic and molecular study of embryogenesis, morphogenesis, development, nerve development and function, aging, and behavior (Figure 10.5).

a. The nearly complete genome sequence spans 97Mb distributed between six chromosomes (five autosomes and an X chromosome).

b. Analysis shows:

i. The genome is 100.3Mb

ii. There are 20,443 genes with 1,270 that do not encode proteins.


3. Drosophila melanogaster, the fruit fly, has been important in both classical genetics and the molecular genetics of development.

a. Sequencing used the direct shotgun approach, supported by clone‑based sequencing and a BAC-derived physical map.

b. The genome is 118.4Mb. Another 1⁄3 (60Mb) is currently unclonable heterochromatin located near centromeres.

c. There are 14,015 genes. Comparison with genomic sequences from other species indicates:

i. Drosophila has about twice the number of genes found in S. cerevisiae.

ii. Of 289 genes known to be involved in human disease, Drosophila has homologs for 177.


4. Arabidopsis thaliana was the first flowering plant to be sequenced, and is an important model for genetic and molecular biology of plants.

a. The genome is 120Mb with about 25,900 genes.

b. Arabidopsis has about twice the number of genes as Drosophila.

c. The number of genes in Arabidopsis is near the lower estimates of the human gene number.

d. About 100 Arabidopsis genes have human homologs, including genes for breast cancer and cystic fibrosis.

e. Ongoing work is focused on defining functions of all genes, determining gene regulation, and understanding the fates of gene product proteins.


5. Homo sapiens DNA from a variety of anonymous donors has been sequenced. The “human genome sequence” does not exactly match the genome of any human being.

a. A “working draft” of the human genome was announced in June 2000 jointly by:

i. Francis Collins for the National Human Genome Research Institute.

ii. J. Craig Venter of Celera Genomics.

b. By June 2000, the sequencing effort had generated 7-fold coverage of the genome, with about 50 percent of the genome sequence considered to be near finished, and 24 percent completely finished.

c. The sequencing approaches:

i. The Human Genome Sequencing Project Consortium focused on sequencing the gene-rich euchromatin regions, ignoring the generally unclonable heterochromatin, using existing genetic and physical maps.

ii. Celera Genomics used shotgun sequencing followed by a very large computer calculation looking for overlaps in the random DNA fragments (enough to represent 4.6-fold coverage of the human genome). Shotgun assembly results were verified by comparison with BAC clone sequences available in public databases.

d. The next step in the Human Genome Project is annotating the sequence, analyzing its genes and other features.


6. Mus musculus (mouse) and Rattus norvegicus (rat) genomes have also been sequenced (Figure 10.6).

a. The human genome is largest, followed by rat and then mouse.

b. All three have about the same number of genes.

c. Rodents serve as models for mammalian physiology, and about 99 percent of the genes in mouse and rat have direct human counterparts.


Insights from Genome Analysis: Genome Sizes and Gene Densities

1. Genomes can be compared for genes and intergenic regions.

2. The C value paradox says there is no relationship between the amount of haploid DNA and the complexity of the organism.

3. Gene density (number of genes per length of DNA) varies between organisms.


Genomes of Bacteria

1. The range of sequenced bacterial genomes is 0.58Mb (Mycoplasma genitalium) to 9.11Mb (Bradyrhizobium japonicum) (Table 10.1).

2. Gene densities in bacterial genomes are similar, about one gene per 1–2kb. Examples:

a. Mycoplasma genitalium has one gene per 1.15kb.

b. E. coli has a gene density of one gene per 1.05kb.

3. Bacterial genes are packed densely in the chromosome. In both Bacteria and Archaea 85–90 percent of the genome is typically coding DNA.
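Gene density figures of this kind follow from dividing genome size by gene count. The sketch below applies that arithmetic to the genome sizes and ORF counts quoted earlier for E. coli and H. influenzae; the function name is hypothetical, and using ORF counts as a stand-in for gene counts is an assumption.

```python
def kb_per_gene(genome_size_mb: float, gene_count: int) -> float:
    """Average kilobases of genome per gene."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return genome_size_mb * 1000.0 / gene_count


# Genome sizes (Mb) and ORF counts as quoted earlier in this chapter.
print(round(kb_per_gene(4.64, 4288), 2))  # E. coli
print(round(kb_per_gene(1.83, 1743), 2))  # H. influenzae
```

The E. coli value computed this way comes out slightly above the quoted 1.05kb per gene. The quoted density presumably rests on a somewhat different gene count, though that explanation is a guess.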


Genomes of Archaea

1. Archaea are generally extremophiles with regard to conditions such as temperature, pressure, pH, metal ions, or salt.

2. Similarities with Bacteria include:

a. Morphology (rods, spheres, spirals).

b. Lack of introns in protein-coding genes.

c. High gene density.

3. Similarities with Eukarya include:

a. Genes for replication, transcription, and translation.

b. Introns in tRNA genes.

4. Archaea genomes range widely, from 1.56Mb (Thermoplasma acidophilum) to 5.75Mb (Methanosarcina acetivorans).


Genomes of Eukarya

1. Increasing genomic DNA content tends to correlate with increasing complexity, but there is not a direct relationship. For example:

a. The insects Drosophila melanogaster (fruit fly) and Locusta migratoria (locust) have similar complexity, but the locust genome size is 50× that of fruit fly.

b. The locust genome is twice the size of the mouse genome.

2. Differences in gene density account for many differences in genome size. For example:

a. Fruit fly has an average of one gene per 13kb of genome.

b. Locust has an average of one gene per 365kb of genome.

3. Eukarya generally have lower gene density and more variability than Bacteria or Archaea. The range is large, with a trend of decreasing gene density with increasing complexity (Figure 10.7).


Fig. 10.7 Regions of the chromosome of E. coli, yeast, fruit fly, and human, showing the differences in gene density


4. Genes are not distributed evenly in the chromosome. Some regions are gene rich, while others are gene deserts (≥1Mb without a gene).

5. The majority of the eukaryotic genome is intergenic regions, and in humans these are mostly repetitive DNA.

a. Finding genes in this gene-sparse genome is often difficult.

b. The pufferfish (Fugu rubripes) is used as a model vertebrate because it has a gene density 8-fold higher than humans and many of its genes are homologous to human genes (Figure 10.8).


Functional Genomics

1. Functional genomics analyzes all genes in genomes to determine their functions and their gene control and expression.

2. Research questions about gene expression, physiology, and development can now be answered at the genomic level.

3. Current functional genomics relies on molecular biology lab research and sophisticated computer analysis by bioinformatics researchers.

4. This fusion of biology with math and computer science is used for many things. Examples:

a. Finding genes within a genomic sequence.

b. Aligning sequences in databases to determine matching.

c. Predicting structure and function of gene products.

d. Describing interactions between genes and gene products in the cell, between cells and between organisms.

e. Considering phylogenetic relationships.


Identifying Genes in DNA Sequences

1. Annotation begins the process of assigning functions to genes, especially protein-coding genes, using computer algorithms to search both strands for ORFs. Introns complicate analysis of eukaryotic genes.

2. ORFs exist in all sizes, and not all encode proteins. To focus on sequences most likely to encode proteins, a minimum ORF size is arbitrarily set and shorter sequences are not analyzed.
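The ORF search described above (both strands, all reading frames, a minimum size cutoff) can be sketched as follows. This minimal version assumes an ORF runs from an ATG to the next in-frame stop codon and ignores introns, so it only suits intron-free sequences; the function names and the example sequence are hypothetical, and the default cutoff of 100 codons follows the arbitrary threshold mentioned earlier for H. influenzae.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> list[tuple[str, int, str]]:
    """Return (strand, frame, orf) for ATG-to-stop ORFs of at least min_codons codons."""
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # codons before the stop
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, s[start:i + 3]))
                    start = None
    return orfs


# Hypothetical short sequence; the cutoff is lowered so the toy ORF is reported.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=4))
```

Raising `min_codons` discards short ORFs that are unlikely to encode proteins, at the cost of missing genuinely short genes.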


Sequence Similarity Searches to Assign Gene Function

1. Computers are used to find homology between sequences in a database (e.g., a BLAST search). Similarity reflects evolutionary relationships and often shared functions.

2. Either DNA or amino acid sequences can be searched, but amino acids yield more specific information, since there are 20 possible matches, rather than just four. Often no convincing match is found, due in part to the limitations of current databases.

3. Sometimes matches are found only at the domain level, when a region in the new protein matches protein domains in the database. This provides clues to the new protein’s function and the evolution of its gene.

4. As databases grow, so does our knowledge of gene functions. The current distribution of knowledge about the genes of yeast is (Figure 10.9):

a. About 30% of the genes have known functions.

b. Of the remaining 70% of ORFs:

i. 30% encode a protein that either has homology to protein(s) of known function, or has domains related to functionally characterized domains.

ii. 10% are FUN (function unknown) genes. They have homologs in databases, but function(s) of the homologs are unknown. Groups of homologous genes of unknown function are orphan families.

iii. 30% of ORFs have no homologs in the databases. These include 6–7% that may not actually encode proteins. The remainder may represent genes known only in yeast, the single orphans.

5. Every genome sequenced contains “function unknown” genes, but as databases are expanded the problem should decrease.
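The point in item 2, that amino acid comparisons are more specific than DNA comparisons, can be illustrated with a rough calculation. The sketch assumes all residues are equally frequent and independent, which real sequences are not, so the numbers only indicate the order of magnitude; the function name is hypothetical.

```python
def chance_identity_probability(alphabet_size: int, length: int) -> float:
    """Probability that two random sequences match at every aligned position."""
    return (1.0 / alphabet_size) ** length


# Compare a 4-letter (DNA) and a 20-letter (protein) alphabet.
for k in (5, 10):
    dna = chance_identity_probability(4, k)
    protein = chance_identity_probability(20, k)
    print(f"{k} positions: DNA {dna:.2e}, protein {protein:.2e}")
```

A run of identical amino acids is therefore much less likely to arise by chance than an equally long run of identical nucleotides, so a protein match carries more evidence of a real relationship.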


Fig. 10.9 The distribution of predicted ORFs in the genome of yeast


Assigning Gene Function Experimentally

1. One approach to determining gene function is to delete the gene, and observe the phenotype when that gene’s function is knocked out. PCR may be used to produce and screen a gene knockout (Figure 10.10):

a. Using known genome sequences, PCR primers are designed to construct an artificial linear DNA deletion module. It consists of:

i. The gene sequence upstream and through the start codon.

ii. A kanR (kanamycin) marker gene conferring resistance to a chemical, G418.

iii. The gene sequence downstream of and including the stop codon.

b. The amplified linear DNA is transformed into yeast, and G418-resistant colonies selected. These are generated when the new DNA replaces the gene of interest in the genome by homologous recombination.

c. They now express kanR instead of the gene under study, producing a loss-of-function (null) mutation.
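The structure of the deletion module and the outcome of the replacement can be modeled with plain strings. This is a toy sketch, not a model of the recombination machinery: it assumes each flank occurs once in the target sequence, and the function names, the toy sequences, and the `[kanR]` placeholder are all hypothetical.

```python
def build_deletion_module(upstream_flank: str, marker: str, downstream_flank: str) -> str:
    """Linear module: upstream flank, selectable marker, downstream flank."""
    return upstream_flank + marker + downstream_flank


def knock_out(genome: str, upstream_flank: str, marker: str, downstream_flank: str) -> str:
    """Replace the region between the two flanks with the marker."""
    left = genome.find(upstream_flank)
    if left == -1:
        raise ValueError("upstream flank not found in genome")
    right = genome.find(downstream_flank, left + len(upstream_flank))
    if right == -1:
        raise ValueError("downstream flank not found after upstream flank")
    module = build_deletion_module(upstream_flank, marker, downstream_flank)
    return genome[:left] + module + genome[right + len(downstream_flank):]


# Toy locus: flank through the start codon, coding region, flank from the stop codon.
upstream = "AACCATG"
coding = "CCCCCC"
downstream = "TAAGGTT"
genome = "GGGG" + upstream + coding + downstream + "TTTT"

print(knock_out(genome, upstream, "[kanR]", downstream))
```

Because the flanks are identical in the module and in the genome, only the sequence between them changes, which is why the flanking regions must be chosen from the known genome sequence.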


Fig. 10.10 Creating a gene knockout in yeast


2. Molecular screening with specific primers and PCR is used to confirm that a deletion occurred in the ORF of interest. A deletion results in no priming with primers directed toward that region, and may be confirmed by showing insertion of a selectable marker (e.g., kanR).

3. The yeast knockout (YKO) project systematically deleted each yeast gene. Some results:

a. Essential genes give a lethal phenotype.

b. About 4,200 of yeast’s 6,200 genes are nonessential, and yield viable knockout mutants.

c. Of the viable knockouts, about half show a detectable phenotype and half do not.

4. Null alleles are widely used to investigate gene functions. Mouse knockouts are used to study genes with human analogs.


Describing Patterns of Gene Expression

1. Genomic sequencing makes it possible to determine all genes that are expressed in a cell by analyzing the total RNA transcripts of the cell, its transcriptome. The transcriptome is an indicator of cell phenotype and function. Similarly, the complete set of proteins in a cell is its proteome.


The Transcriptome

Animation: Analysis of Gene Expression Using DNA Microarrays

1. The transcriptome changes as the cell responds to stimulus and moves through its cell cycle, and so is a tool for understanding cellular function.

2. Probe arrays are used to study gene expression. Yeast sporulation is one example:

a. Yeast sporulation produces four haploid spores, and involves four stages, each associated with its own transcripts (Figure 10.11).

i. DNA replication and recombination.

ii. Meiosis I.

iii. Meiosis II.

iv. Spore maturation.

iv. Spore maturation.


Fig. 10.11 Global gene expression analysis of yeast sporulation using a DNA microarray


b. Samples of mRNA taken at intervals during sporulation were converted to cDNAs and analyzed on microarrays of PCR-amplified ORF sequences. The results were correlated with cellular events.

c. Control cDNA was made from preinduction mRNAs, and labeled green. The cDNAs from postinduction mRNAs were labeled red. Microarrays were probed with a mix of both, and results were interpreted as follows:

i. Red spots indicate a gene induced during sporulation.

ii. Green spots indicate a gene repressed during sporulation.

iii. Yellow spots mark genes whose expression is unchanged during sporulation.

d. Results show more than 1,000 genes with altered expression during sporulation, about 1⁄2 induced and the other 1⁄2 repressed. Patterns of expression over time become apparent in this type of experiment.
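The red, green, and yellow interpretation of microarray spots can be expressed as a comparison of the two fluorescence intensities. The sketch below classifies a spot by the log2 ratio of red (postinduction) to green (preinduction) signal. The two-fold cutoff is an assumed, commonly used convention rather than a value from this experiment, and the function name and intensities are hypothetical.

```python
import math


def classify_spot(red: float, green: float, fold: float = 2.0) -> str:
    """Classify a spot from its red (post) and green (pre) intensities."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(red / green)
    cutoff = math.log2(fold)
    if log_ratio >= cutoff:
        return "induced"    # mostly red signal
    if log_ratio <= -cutoff:
        return "repressed"  # mostly green signal
    return "unchanged"      # similar red and green signal appears yellow


# Hypothetical (red, green) intensity pairs for three spots.
for red, green in [(800, 100), (100, 800), (500, 450)]:
    print(red, green, classify_spot(red, green))
```

Working with log ratios treats an eight-fold increase and an eight-fold decrease symmetrically, which makes induced and repressed genes easier to compare.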


3. DNA microarrays are now widely used, although still expensive. Examples of studies that currently use this technology:

a. Changes in Drosophila gene expression during morphogenesis.

b. Human cancers and their characteristic patterns of gene expression (transcriptional fingerprints) that reveal distinctions between different types of cancer.

c. Screening for genetic diseases, especially those resulting from one of many alleles. A patient’s blood, for example, can be screened for hundreds of possible mutations in the BRCA1 and BRCA2 genes associated with breast cancer.

4. Pharmacogenomics studies how the individual’s genome affects the body’s response to medication, with the hope of eventually tailoring treatment to the patient’s genetic factors.

a. Based in biochemistry, pharmacogenomics develops drugs associated with RNA molecules and proteins associated with genes and diseases.

b. This new approach has few successes to date, but one example is in developing tests to detect patients with deficient cytochrome P450 (CYP) liver enzymes, who are susceptible to drug overdose.


The Proteome

1. Proteomics is the cataloging and analysis of the proteome, or complete set of expressed proteins in a cell at a given time. Proteomics focuses on which proteins are made and in what quantities, and their interactions with other proteins.

2. Goals of proteomics are to:

a. Identify every protein in the proteome.

b. Develop a database with the sequence of each protein.

c. Analyze protein levels in different cell types and stages of development.

3. Protein identification and sequencing is very complex. Celera Genomics is involved in identification, sequencing, and computer analysis of the data.

4. Proteome complexity far exceeds genome complexity, due to:

a. Alternative RNA splicing.

b. Posttranslational modifications of proteins.


5. Conventional proteome analysis uses 2-D acrylamide gel electrophoresis and mass spectrometry, but is neither sensitive enough to detect low levels nor able to analyze many proteins at once.

6. Protein arrays, similar to DNA microarrays, are used to detect, quantify, and characterize proteins on a large scale. Automation allows large numbers of measurements in parallel.

a. Proteins are fixed on a solid substrate (glass, membrane, or microtiter plate).

b. Target proteins are labeled fluorescently.

c. Binding to immobilized probe array is detected by laser, and data are analyzed via computer.

d. Two types of protein arrays are commonly used:

i. A capture array is a set of antibodies bound to a surface and used to detect labeled target molecules from cells. Capture arrays are used in diagnosis and in protein expression profiling.

ii. A large-scale protein array uses purified proteins from an expression library, spotted onto a substrate and used to detect labeled target molecules for biological functions including protein-protein or drug-target interactions.


Comparative Genomics

iActivity: Personalized Prescriptions for Cancer Patients

1. Comparative genomics provides a way to study functions of human genes by working with non-human homologs. Genes and their arrangement also provide valuable clues to evolutionary relationships between organisms.


Ethics and the Human Genome Project

1. The ability to identify human genes raises complex ethical issues involving the right to information about one’s own genome, access to genomic information by employers, insurance companies and government agencies, and concerns about the ability to diagnose but not treat genetic disorders.

2. Federal agencies funding the HGP devote 3–5% of their budgets to study of ethical, legal and social issues (ELSI), producing the world’s largest bioethics program. Areas currently emphasized by the ELSI program:

a. Privacy of genetic information.

b. Appropriate use of genetic information in the clinical setting.

c. Fair use of genetic information.

d. Professional and public education.