Molecular Basis of Inheritance

CHAPTER 6

MOLECULAR BASIS OF INHERITANCE

RNA

RNA, though it also acts as the genetic material in some viruses, mostly functions as a messenger (mRNA). RNA has additional roles as well: it functions as an adapter (tRNA), as a structural molecule (rRNA), and in some cases as a catalytic molecule (ribozymes) or a regulatory molecule (riboswitches and small RNAs such as miRNA and siRNA).

DNA

DNA is a long polymer of deoxyribonucleotides.

The length of DNA is usually defined as the number of nucleotides (or base pairs) it contains.

Bacteriophage φX174 has 5386 nucleotides.

Bacteriophage lambda has 48502 base pairs (bp).

Escherichia coli has 4.6 × 10^6 bp.

The haploid content of human DNA is 3.3 × 10^9 bp.

Structure of DNA

A nucleotide has three components: a nitrogenous base, a pentose sugar (ribose in RNA, deoxyribose in DNA), and a phosphate group.

There are two types of nitrogenous bases: purines (adenine and guanine) and pyrimidines (cytosine, uracil and thymine).

A nitrogenous base is linked to the pentose sugar through an N-glycosidic linkage to form a nucleoside, such as adenosine, guanosine, cytidine and uridine, or their DNA counterparts deoxyadenosine, deoxyguanosine, deoxycytidine and deoxythymidine.

A phosphate group is linked to the 5'-OH of a nucleoside through a phosphoester linkage to form a nucleotide.

Two nucleotides are linked through a 3'-5' phosphodiester linkage to form a dinucleotide.

Polynucleotides are formed through further phosphodiester linkages of the same kind.

In RNA, every nucleotide residue has an additional -OH group at the 2'-position of the ribose.

Also, DNA contains thymine (5-methyluracil), the methylated form of uracil, in place of uracil.

DNA was first identified in 1869 by Friedrich Miescher as an acidic substance present in the nucleus, which he named 'Nuclein'.

In 1953, James Watson and Francis Crick, working from the X-ray diffraction data produced by Maurice Wilkins and Rosalind Franklin, proposed the famous Double Helix model for the structure of DNA. The main hallmark of their proposition was complementary base pairing between the two strands, a property suggested by Erwin Chargaff's observation that, for double-stranded DNA, the ratios of adenine to thymine and of guanine to cytosine are constant and equal to one.

This property allows each of the two strands of parental DNA to act as a template for the synthesis of a new daughter strand, giving daughter molecules that are identical to the parental DNA molecule. Because of this, the genetic implications of the structure of DNA became very clear.

Key features of the double-helix structure

It is made of two polynucleotide chains, where the backbone is constituted by sugar-phosphate, and the bases project inside.

The two chains have anti-parallel polarity.

The bases in the two strands are paired through hydrogen bonding. Adenine forms two hydrogen bonds with thymine, while guanine forms three hydrogen bonds with cytosine. This, together with the fact that only a uniform distance between the two strands of the helix is energetically feasible, ensures that a purine always lies opposite a pyrimidine. This is also the molecular reason for complementarity.

The two chains are coiled in a right-handed fashion. The pitch of the helix is 3.4 nm and there are roughly 10 bp in each turn, so the distance between adjacent base pairs is approximately 0.34 nm.

The diameter of the B-DNA helix is roughly 2 nm.

Figure: The central dogma of molecular biology (extended)
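The pairing rules and anti-parallel polarity described above can be illustrated with a short Python sketch. The example sequence and the function names are assumptions made for illustration only; the sketch simply builds the partner strand and checks that Chargaff's ratios hold for the resulting duplex.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5'->3'.

    Because the two chains are anti-parallel, the partner is obtained by
    complementing every base and reading the result in reverse order.
    """
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3))


def base_counts(strand_a: str, strand_b: str) -> dict:
    """Count each base over both strands of the duplex."""
    duplex = strand_a + strand_b
    return {base: duplex.count(base) for base in "ATGC"}


strand = "ATGCCGTTAGC"  # hypothetical example sequence, written 5'->3'
partner = complementary_strand(strand)
counts = base_counts(strand, partner)

print("strand :", strand)
print("partner:", partner)
print("counts :", counts)
print("A/T =", counts["A"] / counts["T"], " G/C =", counts["G"] / counts["C"])
```

For any sequence passed in, the A/T and G/C ratios of the duplex come out as one, because every base on one strand is matched by its partner on the other. A single strand taken alone need not satisfy the ratios.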

PACKAGING OF THE DNA HELIX

In prokaryotes such as E. coli, which do not have a defined nucleus, the DNA is nevertheless not scattered throughout the cell. DNA (being negatively charged) is held by some proteins (that have positive charges) in a region termed the nucleoid. The DNA in the nucleoid is organised in large loops held by proteins.

In eukaryotes, this organisation is much more complex. There is a set of positively charged, basic proteins called histones. A protein acquires its charge from the abundance of amino acid residues with charged side chains. Histones are rich in the basic amino acid residues lysine and arginine, both of which carry positive charges in their side chains. Histones are organised into a unit of eight molecules called the histone octamer. The negatively charged DNA is wrapped around the positively charged histone octamer to form a structure called a nucleosome. A typical nucleosome contains 200 bp of DNA helix. Nucleosomes constitute the repeating unit of chromatin, the thread-like stained (coloured) bodies seen in the nucleus. The nucleosomes in chromatin appear as a beads-on-string structure when viewed under an electron microscope.
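To give a sense of scale for the packaging problem, the following Python sketch combines the figures quoted in these notes: 0.34 nm between adjacent base pairs, about 200 bp per nucleosome, and the genome sizes listed earlier. The function names are illustrative, and the nucleosome count is a rough estimate that assumes the whole genome is packaged at the typical repeat length.

```python
RISE_PER_BP_NM = 0.34     # distance between adjacent base pairs in B-DNA
BP_PER_NUCLEOSOME = 200   # typical nucleosome repeat quoted in the text


def contour_length_m(base_pairs: float) -> float:
    """Length of the stretched-out double helix, in metres."""
    return base_pairs * RISE_PER_BP_NM * 1e-9


def nucleosome_count(base_pairs: float) -> float:
    """Rough number of nucleosomes, assuming a uniform 200 bp repeat."""
    return base_pairs / BP_PER_NUCLEOSOME


ecoli_bp = 4.6e6
human_haploid_bp = 3.3e9

print(f"E. coli DNA length: {contour_length_m(ecoli_bp) * 1e3:.2f} mm")
print(f"Human haploid DNA length: {contour_length_m(human_haploid_bp):.2f} m")
print(f"Human haploid nucleosomes: {nucleosome_count(human_haploid_bp):.2e}")
```

On these figures the haploid human genome is a little over one metre long and would need on the order of ten million nucleosomes, which is why several further levels of coiling are required to fit it into a nucleus. The nucleosome estimate is not applied to E. coli, which has a nucleoid rather than chromatin.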

The beads-on-string structure in chromatin is packaged to form chromatin fibres, which are further coiled and condensed at the metaphase stage of cell division to form chromosomes. Packaging of chromatin at higher levels requires an additional set of proteins, collectively referred to as non-histone chromosomal (NHC) proteins.

In a typical nucleus, some regions of chromatin are loosely packed (and stain light) and are referred to as euchromatin. Chromatin that is more densely packed and stains dark is called heterochromatin. Euchromatin is said to be transcriptionally active chromatin, whereas heterochromatin is inactive.

THE SEARCH FOR GENETIC MATERIAL

By 1926, the quest to determine the mechanism for genetic inheritance had reached the molecular level. Previous discoveries by Gregor Mendel, Walter Sutton, Thomas Hunt Morgan and numerous other scientists had narrowed the search to the chromosomes located in the nucleus of most cells. But the question of what molecule was actually the genetic material had not been answered.

Transforming Principle

In 1928, Frederick Griffith, in a series of experiments with Streptococcus pneumoniae, witnessed a remarkable transformation in the bacteria (a literal change in the physical form of a living organism).

There are two kinds of strains of Streptococcus pneumoniae: a smooth, shiny strain (S) with a mucous polysaccharide coat (which makes it virulent) and a rough strain (R) which lacks the coat.

Observations:

Mice infected with the S strain (virulent) died from pneumonia.

Mice infected with the R strain did not die.

Heat-killed S strain bacteria injected into mice did not kill them either.

But when he injected a mixture of heat-killed S and live R bacteria, the mice died. Moreover, he recovered living S bacteria from the dead mice.

Conclusion: the R strain bacteria had somehow been transformed by the heat-killed S strain bacteria. Some 'transforming principle', transferred from the heat-killed S strain, had enabled the R strain to synthesize a smooth polysaccharide coat and become virulent. This must have been due to the transfer of genetic material. However, the biochemical nature of the genetic material was not defined by his experiments.

Biochemical Characterization of Transforming Principle

Oswald Avery, Colin MacLeod and Maclyn McCarty were the first to determine the biochemical nature of Griffith's transforming principle.

They purified biochemicals (proteins, DNA and RNA) from heat-killed S cells to see which ones could transform live R cells into S cells. They discovered that DNA alone from S bacteria caused R bacteria to become transformed.

For greater credibility, they used protein-digesting enzymes (proteases), RNA-digesting enzymes (RNases) and DNA-digesting enzymes (DNases), and found that only DNases inhibited transformation, again suggesting that DNA is the hereditary material.

The unequivocal proof: the Genetic Material is DNA

This proof came in 1952 from the blender experiments of Alfred Hershey and Martha Chase. They worked with viruses that infect bacteria, called bacteriophages. The bacteriophage attaches to the bacterium and its genetic material then enters the bacterial cell. The bacterial cell treats the viral genetic material as if it were its own and subsequently manufactures more virus particles. The bacteriophage has two components: a protein coat and DNA.

Hershey and Chase worked to discover whether it was protein or DNA from the viruses that entered the bacteria.

They grew some viruses on a medium that contained radioactive phosphorus and others on a medium that contained radioactive sulfur.

It was known that DNA contains phosphorus but no sulfur, while protein contains sulfur but no phosphorus. So bacteriophages grown in the presence of radioactive phosphorus contained radioactive DNA but not radioactive protein; similarly, viruses grown on radioactive sulfur contained radioactive protein but not radioactive DNA.

Radioactive phages were allowed to attach to E. coli bacteria. Then, as the infection proceeded, the viral coats were removed from the bacteria by agitating them in a blender. The virus particles were separated from the bacteria by spinning them in a centrifuge. The lighter bacteriophage coats remained in the supernatant while the infected bacteria settled as sediment.

Radioactivity was detected in the supernatant for the culture grown on radioactive sulfur, while in the batch grown on radioactive phosphorus the radioactivity was associated with the sediment.

This result clearly indicated that the genetic material passed on from virus to bacteria is DNA.

Properties of Genetic Material (DNA versus RNA)

Though protein was displaced by DNA as the true candidate for the genetic material, it subsequently became clear that in some viruses, such as Tobacco Mosaic Virus and Qβ bacteriophage, RNA rather than DNA is the genetic material. Why does DNA act as the predominant genetic material, whereas RNA mostly performs the dynamic functions of a messenger and an adapter? Or, to paraphrase the question, why does DNA seem a better molecule than RNA with which to build a genome?

A molecule that can act as a genetic material must fulfil the following criteria:

It should be able to generate its replica.

It should be chemically and structurally stable.

It should provide the scope for slow changes (mutation) that are required for evolution.

It should be able to express itself in the form of 'Mendelian characters'.

Since complementary base pairing applies to both DNA and RNA, both are capable of replication.

But when the stability criterion is examined, it becomes clear that DNA surpasses RNA in both physical and chemical stability. The naturally double-stranded nature of DNA gives it great stability against physical influences such as heat or mechanical stress. Chemically, two key features of RNA make it more labile and reactive.

1. The 2'-hydroxyl group makes RNA susceptible to base-catalyzed hydrolysis and hence to easier degradation.

2. The replacement of uracil by its methylated form, thymine, makes a specific kind of repair possible in DNA. [Cytosine deaminates at a perceptible rate to become uracil in all nucleic acids. This change can have deleterious effects on genomic information: since cytosine base-pairs with guanine while uracil base-pairs with adenine, every such change is equivalent to a point mutation. In DNA, a repair mechanism involving the enzyme uracil-DNA glycosylase removes uracil residues, which are then replaced with cytosine, while leaving thymine, the methylated form, untouched. In effect, the methyl group on thymine is a tag that distinguishes thymine from deaminated cytosine. In RNA, the original uracil and the deaminated product of cytosine are indistinguishable, so the damage cannot be repaired.]

Both DNA and RNA are able to mutate. In fact RNA, being unstable, mutates at a faster rate. This is why viruses that have an RNA genome and a shorter life span mutate and evolve faster.

DNA, and not RNA, forms the chromosomes. So, only DNA is able to express itself in the form of Mendelian characters.

From these reasons it is clear that while both DNA and RNA can act as the genetic material, DNA is the better candidate molecule for storing genomic information, especially in higher organisms.
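The uracil-repair argument in this section can be made concrete with a toy Python sketch. It is a deliberate simplification: real repair excises the uracil and restores cytosine by reading the guanine on the opposite strand, whereas the hypothetical `repair_uracil` function below just treats every U as a deaminated C. The sequences and function names are assumptions for illustration.

```python
def deaminate(seq: str, position: int) -> str:
    """Model spontaneous deamination: the cytosine at `position` becomes uracil."""
    if seq[position] != "C":
        raise ValueError("only cytosine deaminates to uracil")
    return seq[:position] + "U" + seq[position + 1:]


def repair_uracil(seq: str) -> str:
    """Toy repair rule: treat every U as a deaminated C and restore it."""
    return seq.replace("U", "C")


# In DNA, U is never a legitimate base, so the rule restores the original.
dna = "ATGCCGTA"            # hypothetical sequence
damaged_dna = deaminate(dna, 3)
print(damaged_dna, "->", repair_uracil(damaged_dna))

# In RNA, U is a normal base, so the same rule also destroys genuine uracils.
rna = "AUGCCGUA"            # hypothetical sequence
damaged_rna = deaminate(rna, 3)
print(damaged_rna, "->", repair_uracil(damaged_rna))
```

In the DNA case the repaired sequence equals the original. In the RNA case it does not, because the rule has no way to tell the one damaged position from the uracils that were there all along, which is the point made in the bracketed note above.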

RNA WORLD

Since RNA as well as DNA is found as genetic material in organisms, with RNA genomes being more frequently observed in primitive species such as some bacteriophages, a logical question arises as to whether DNA replaced RNA in the course of evolution because of its selective advantages.

The phrase "The RNA World" was coined by Walter Gilbert in 1986, following the then recent observations of the catalytic properties of various RNAs. RNA, which was initially thought to be only a passive messenger molecule, was found in many active roles: as adapters, ribozymes, riboswitches, regulatory molecules and so on. As RNA was found to be associated with such a variety of functions and essential life processes, such as metabolism, translation and splicing, it was hypothesized that this molecule was the carrier, executor and maintainer of life as it began. Some of its many roles in the sustenance of life could then have been handed over to DNA (a better candidate for information storage) and to proteins (far better at folding and catalysis) as evolution progressed.

The RNA World refers to this hypothetical stage in the origin of life on Earth, during which proteins were not yet engaged in biochemical reactions and RNA carried out both the storage of genetic information and the full range of catalytic roles necessary in a very primitive self-replicating system. This is the main logic behind the hypothesis.

DNA REPLICATION

The copying mechanism for the molecule is inherent in its property of complementary base pairing: the two strands separate and each acts as a template for the synthesis of a new complementary strand. After the completion of replication, each DNA molecule has one parental and one newly synthesized strand. This scheme is termed semi-conservative DNA replication.

The Experimental Proof for semi-conservative replication

Matthew Meselson and Franklin Stahl performed the following experiment in 1958:

A culture of E. coli bacteria (a prokaryote) grown for several generations on a heavy nitrogen (15N) source was transferred to a medium having only a 14N source, and the density of the DNA at each subsequent generation (replication) was studied.

A similar experiment was performed on the faba bean, Vicia faba (a eukaryote), by Taylor and colleagues in 1958 using radioactive thymidine.

These experiments showed that both free DNA and DNA packed within chromosomes replicate semi-conservatively.
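What the semi-conservative scheme predicts for the Meselson and Stahl experiment can be worked out with a small Python simulation. This is a sketch of the model only, not of their protocol: each duplex is represented as a pair of strand labels, every new strand is assumed to be made from 14N, and the names `replicate`, `density_class` and `band_fractions` are illustrative.

```python
from collections import Counter


def replicate(molecules):
    """One round of semi-conservative replication in 14N medium.

    Each duplex separates and each old strand pairs with a new 14N strand.
    """
    daughters = []
    for strand_a, strand_b in molecules:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters


def density_class(duplex):
    """Classify a duplex by how many of its two strands are heavy."""
    heavy_strands = duplex.count("15N")
    return {2: "heavy", 1: "hybrid", 0: "light"}[heavy_strands]


def band_fractions(generations):
    """Fraction of DNA in each density band after the given number of generations."""
    population = [("15N", "15N")]  # start with fully heavy DNA
    for _ in range(generations):
        population = replicate(population)
    counts = Counter(density_class(d) for d in population)
    total = len(population)
    return {band: counts.get(band, 0) / total for band in ("heavy", "hybrid", "light")}


for generation in range(4):
    print(generation, band_fractions(generation))
```

Under this model the DNA is entirely hybrid after one generation and half hybrid, half light after two, with the hybrid fraction halving in every later generation. A fully conservative scheme would instead keep a heavy band at every generation, so the two models are distinguishable by density.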

The Molecular Machinery of Bacterial DNA Replication

Although a prokaryotic genome of a few million base pairs is short compared with eukaryotic genomes of billions of base pairs, DNA replication involves a sophisticated, highly coordinated series of molecular events, even in bacteria. These events can be divided into four major stages: initiation, unwinding, primer synthesis and elongation.

Unwinding and separating the entire length of DNA at once would be energetically very costly, if not impossible, so at any instant replication occurs only within a small opening of the DNA helix, referred to as the replication fork. Replication also does not initiate randomly at any place in the DNA but at specific sequences called ori (which contain A-T rich regions held by two hydrogen bonds per base pair, and are hence easier to separate than G-C rich regions held by three). Such a point in the DNA is known as an origin of replication. While prokaryotes generally have only one such point, eukaryotes have hundreds of them.

Initiation and Unwinding

During initiation, initiator proteins bind to the stretch of DNA called the replication origin, thus triggering events that unwind the DNA double helix into two single-stranded DNA molecules.

DNA helicases are responsible for breaking the hydrogen bonds that join the complementary nucleotide bases to each other. Because the newly unwound single strands have a tendency to rejoin, another group of proteins, the single-strand-binding proteins, keep the single strands stable and separated until elongation begins. A third family of proteins, the topoisomerases, which includes gyrase, reduces the torsional strain caused by the unwinding of the double helix.

The entire process of replication is possible only because of the concerted activity of a team of many such proteins, which form a multifunctional complex or replication machine.

Primer synthesis and Elongation

At the heart of the replication machine is an enzyme called DNA polymerase III, often referred to as a DNA-dependent DNA polymerase (since it uses a DNA template to catalyze the polymerization of deoxyribonucleotides). This enzyme is very fast (it joins about 2000 nucleotides per second) and very accurate (it has proof-reading activity). Mg2+ is the cofactor for the enzyme.

Deoxyribonucleoside triphosphates serve dual purposes here. They are the substrates as well as the energy source for the reaction (their high-energy phosphate bonds play the same role as those of ATP).

Even so, DNA polymerases on their own can only add deoxyribonucleotides to the 3'-OH group of an existing chain and cannot begin synthesis de novo. So an enzyme called DNA primase (essentially an RNA polymerase) first lays down a temporary primer (a short stretch of RNA) for the polymerase to extend. Later, these primers are replaced with DNA by DNA polymerase I, and the sugar-phosphate backbone is sealed by DNA ligase.

The DNA-dependent DNA polymerases catalyze polymerization in only one direction, 5'→3', by extending the 3'-OH of the growing polymer. This directionality is a consequence of the need for proof-reading activity. It also creates additional complications at the replication fork.

On one strand (the template with polarity 3'→5'), replication is continuous, while on the other (the template with polarity 5'→3'), it is discontinuous. The discontinuously synthesized (Okazaki) fragments are later joined by the enzyme DNA ligase.
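The polymerization rate quoted above can be combined with the E. coli genome size given earlier to estimate how long one round of replication takes. The Python sketch below is back-of-the-envelope arithmetic only. The `forks` parameter is an assumption added here: the notes state that bacteria have a single origin, and two forks moving away from it in opposite directions is the usual picture for bidirectional replication.

```python
ECOLI_GENOME_BP = 4.6e6      # genome size quoted in these notes
POLYMERASE_RATE = 2000       # nucleotides joined per second, per fork


def replication_time_min(genome_bp: float, rate_nt_per_s: float, forks: int = 1) -> float:
    """Minutes needed to copy the genome if `forks` forks work in parallel."""
    seconds = genome_bp / (rate_nt_per_s * forks)
    return seconds / 60


one_fork = replication_time_min(ECOLI_GENOME_BP, POLYMERASE_RATE, forks=1)
two_forks = replication_time_min(ECOLI_GENOME_BP, POLYMERASE_RATE, forks=2)

print(f"one fork : {one_fork:.1f} min")
print(f"two forks: {two_forks:.1f} min")
```

With a single fork the estimate is a little under 40 minutes, and with two forks it is roughly half that. The calculation ignores initiation, primer replacement and ligation, so it is a lower bound rather than a measured replication time.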

*In eukaryotes, the replication of DNA takes place at the S-phase of the cell cycle.

TRANSCRIPTION and TRANSLATION

Determination of the structure of DNA gave unmistakable clues that the hereditary information in cells must be encoded in DNA's sequence of nucleotides. This code would be the basis for the production of the molecules which make biological life possible: protein and RNA. Thus DNA was given the title 'the blueprint of life'. Replication is the means by which this coded information is safely transferred to subsequent generations, while transcription and translation are the two steps by which the cell decodes and uses this information to direct the formation, development and sustenance of every form of life, be it a bacterium, a fruit fly, or a human.

Transcription and translation stand between the genotype and the phenotype of a living organism. Since proteins are the principal constituents of cells, they determine the structure as well as the functions of the cell at the immediately visible level and are in fact the molecular basis of phenotype. As proteins are the ultimate form of expression of the DNA code, the genetic instructions carried by the DNA must specify the amino acid sequences of proteins and somehow direct their production.

DNA does not direct protein synthesis itself, but acts rather like a manager, delegating the various tasks to a team of workers. When a particular protein is needed by the cell, the nucleotide sequence of the appropriate section of the immensely long DNA molecule in a chromosome is first copied into a more versatile type of nucleic acid, RNA (transcription). These RNA copies of short segments of the DNA may then be used to direct the synthesis of the protein (translation). In some cases, the RNA molecule itself is a "finished product" that serves some important function within the cell.

All cells, from bacteria to humans, express their genetic information in this way, a principle so fundamental that it has been termed the central dogma of molecular biology.

History: Discovering the Relationship between DNA and Protein Production

Gene-protein connection

In 1902, Archibald Garrod recorded observations of alkaptonuria patients, whose urine turned black owing to the build-up of a chemical called homogentisate. Knowledge of the biochemical pathway of phenylalanine metabolism made it clear to him that homogentisate is one of the intermediates through which the amino acid is ultimately degraded to maleylacetoacetate. This led him to surmise that the enzyme homogentisate oxidase, which metabolizes homogentisate, must be defective in his patients.

Combining this with the information that alkaptonuria follows a recessive Mendelian inheritance pattern, Garrod made an even bolder prediction: that a defective gene must be responsible for the defective enzyme. Garrod's proposition, attributing a defective enzyme to a defective gene, was the first to suggest a direct link between genes and proteins. All that was known at that stage was that the genetic material (DNA) was housed in the nucleus within chromosomes. The proposition led investigators to suggest that the nucleus could also be the site of protein synthesis.

Site of protein synthesis

The site of protein synthesis was confirmed as the cytoplasm only after careful investigations involving an alga called Acetabularia, whose life cycle includes a stage at which the nucleus can be removed without causing major damage to the cell. After the removal of the nucleus, the cell's protein production was measured over time. The unexpected discovery was that the enucleated alga could still live for months. Protein production did not stop instantaneously, showing that the nucleus was not the direct site of synthesis as previously thought. But production ceased within 2 weeks, indicating that the nucleus did have some role in the long-term production of proteins.

Missing link between DNA and protein

Although Garrod and several other scientists had demonstrated a clear association between genes (which were known to be on chromosomes in the nucleus) and proteins, the precise nature of this link remained mysterious for some time. Researchers wondered whether chromosomes participated directly in protein production. If so, one would expect that some DNA would be found beyond the nucleus, in the cytoplasm, at least some of the time. However, no evidence of DNA outside the nucleus had ever been found. Thus, the exclusive localization of DNA to the nucleus could only be linked to protein synthesis in the cytoplasm if there were some kind of intermediate messenger, a substance "between" the DNA in the nucleus and the protein production machinery in the cytoplasm. The early work of Brachet et al. with dyes predicted that another type of nucleic acid, ribonucleic acid (RNA), might be the intermediary. Several pieces of evidence implicated RNA in protein production, including the following:

RNA is found in both the nucleus and the cytoplasm.

RNA concentration correlates with protein production.

Cells that produce large amounts of protein have cytoplasmic dye- and radiation-absorbing regions indicative of the presence of nucleic acids, and treating such cells with ribonuclease diminishes these regions.

The process of copying genetic information from one of the strands of the DNA into RNA is termed transcription. Here also, the principle of complementarity governs, except that adenine now base-pairs with uracil instead of thymine. However, unlike replication, in which the total DNA of an organism is duplicated once the process sets in, in transcription only a segment of DNA, and only one of its strands, is copied into RNA. Moreover, the strand of DNA that serves as the template for one gene may be the non-template strand for other genes within the same chromosome. This makes it necessary to define the boundaries that demarcate the region and the strand of DNA to be transcribed.

Transcription Unit

A transcription unit in DNA is defined primarily by the three regions in the DNA:

(i) A Promoter

(ii) The Structural gene

(iii) A Terminator

There is a convention for defining the two strands of the DNA in the structural gene of a transcription unit. Since the two strands have opposite polarity, and the DNA-dependent RNA polymerase catalyzes polymerization in only one direction, 5'→3', the strand that has the polarity 3'→5' acts as the template and is referred to as the template strand. The other strand, which has the polarity 5'→3' and the same sequence as the RNA (except for thymine in place of uracil), is displaced during transcription. Strangely, this strand (which does not code for anything) is referred to as the coding strand. All reference points in defining a transcription unit are made with respect to the coding strand.

The promoter and terminator flank the structural gene in a transcription unit. The promoter is located towards the 5'-end (upstream) of the structural gene (with reference to the polarity of the coding strand). It is a DNA sequence that provides the binding site for RNA polymerase, and it is the presence of a promoter in a transcription unit that defines the template and coding strands. If the positions of promoter and terminator were switched, the definitions of the coding and template strands would be reversed. The terminator is located towards the 3'-end (downstream) of the coding strand and usually defines the end of the process of transcription. There are additional regulatory sequences that may be present further upstream or downstream of the promoter.
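The strand convention described above can be checked with a short Python sketch. The twelve-base coding sequence is a hypothetical example, and the function names are illustrative. The template is written 3'→5' so that it lines up position by position with the coding strand written 5'→3'.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3: str) -> str:
    """Return the template strand written 3'->5', aligned with the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5_to_3)


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA 5'->3' by pairing a ribonucleotide with each template base.

    Adenine in the template pairs with uracil in the RNA.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


coding = "ATGCATGCATGC"                  # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # 3'->5'
rna = transcribe(template)               # 5'->3'

print("coding   5'-" + coding + "-3'")
print("template 3'-" + template + "-5'")
print("RNA      5'-" + rna + "-3'")
```

The RNA comes out identical to the coding strand with every T replaced by U, which is why reference points in a transcription unit are conventionally given on the coding strand even though the polymerase reads the template.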

Transcription Unit and the Gene

A gene is defined as the functional unit of inheritance. Though there is no ambiguity that genes are located on the DNA, it is difficult to define a gene literally in terms of DNA sequence. A DNA sequence coding for a tRNA or rRNA molecule also defines a gene. However, since a cistron is defined as a segment of DNA coding for a polypeptide, the structural gene in a transcription unit can be described as monocistronic (mostly in eukaryotes) or polycistronic (mostly in bacteria or prokaryotes). In eukaryotes, the monocistronic structural genes have interrupted coding sequences; that is, the genes in eukaryotes are split. The coding or expressed sequences are called exons, while the intervening sequences, which do not appear in the mature or processed RNA, are called introns. The split-gene arrangement further complicates the definition of a gene in terms of a DNA segment.

Inheritance of a character is also affected by the promoter and regulatory sequences of a structural gene. Hence, the regulatory sequences are sometimes loosely defined as regulatory genes, even though these sequences do not code for any RNA or protein.

Types of RNA and the process of Transcription

In bacteria, there are three major types of RNA: mRNA (messenger RNA), tRNA (transfer RNA), and rRNA (ribosomal RNA). All three are needed to synthesize a protein in a cell. The mRNA provides the template, tRNA brings amino acids and reads the genetic code, and rRNAs play structural and catalytic roles during translation. A single DNA-dependent RNA polymerase catalyses transcription of all types of RNA in bacteria. RNA polymerase binds to the promoter and initiates transcription (initiation). It uses nucleoside triphosphates as substrates and polymerizes in a template-dependent fashion, following the rule of complementarity. It also facilitates opening of the helix and continues elongation. Only a short stretch of RNA remains bound to the enzyme. Once the polymerase reaches the terminator region, both the nascent RNA and the RNA polymerase fall off. This results in termination of transcription.

An intriguing question is how RNA polymerase is able to catalyze all three steps: initiation, elongation and termination. On its own, the RNA polymerase is only capable of catalyzing elongation. It associates transiently with an initiation factor (σ) and a termination factor (ρ) to initiate and terminate transcription, respectively.

In bacteria, since the mRNA does not require any processing to become active, and since transcription and translation take place in the same compartment (there is no separation of cytosol and nucleus in bacteria), translation can often begin well before the mRNA is fully transcribed. Consequently, transcription and translation can be coupled in bacteria.

In eukaryotes, there are two additional complexities:

(i) There are at least three RNA polymerases in the nucleus (in addition to the RNA polymerase found in the organelles), with a clear-cut division of labour. RNA polymerase I transcribes ribosomal RNA (rRNA). RNA polymerase II transcribes the precursor of mRNA, the heterogeneous nuclear RNA (hnRNA). RNA polymerase III transcribes tRNA, one small ribosomal RNA (5S rRNA), and small nuclear RNAs (snRNAs), which are involved in RNA splicing and gene regulation.

(ii) The second complexity is that the primary transcripts contain both exons and introns and are non-functional. Hence, they are subjected to a process called splicing, in which the introns are removed and the exons are joined in a defined order. The hnRNA also undergoes two additional processing steps, called capping and tailing. In capping, an unusual nucleotide (methyl guanosine triphosphate) is added to the 5'-end of the hnRNA. In tailing, adenylate residues (200-300) are added at the 3'-end in a template-independent manner. The fully processed hnRNA, now called mRNA, is transported out of the nucleus for translation.
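The three processing steps (splicing, capping and tailing) can be sketched in Python as simple string operations. Everything specific here is an assumption for illustration: the toy transcript, the exon coordinates, and the text marker "cap-" standing in for the methyl guanosine cap. Real splice sites are recognized by the splicing machinery rather than supplied as coordinates.

```python
def splice(primary_transcript: str, exons) -> str:
    """Join the exons in order; `exons` holds (start, end) half-open index pairs."""
    return "".join(primary_transcript[start:end] for start, end in exons)


def process_hnrna(hnrna: str, exons, tail_length: int = 200) -> str:
    """Return a toy mature mRNA: 5' cap marker + spliced exons + poly(A) tail."""
    spliced = splice(hnrna, exons)
    cap = "cap-"                     # stands in for methyl guanosine triphosphate
    poly_a_tail = "A" * tail_length  # added without a template; 200-300 residues
    return cap + spliced + poly_a_tail


# Hypothetical primary transcript: exon 1, one intron, exon 2.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUUCAG", "UUCUGA"
hnrna = exon_1 + intron + exon_2
exon_coordinates = [(0, 6), (18, 24)]

print("spliced:", splice(hnrna, exon_coordinates))
print("mature :", process_hnrna(hnrna, exon_coordinates, tail_length=10))
```

The spliced product contains only the two exons joined in their original order, and the cap and tail are added to the ends without reference to any DNA template. This matches the description of tailing above as template-independent.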

The significance of such complexities is now beginning to be understood. The split-gene arrangement probably represents an ancient feature of the genome. The presence of introns is reminiscent of antiquity, and the process of splicing represents the dominance of the RNA world. In recent times, the understanding of RNA and RNA-dependent processes in living systems has assumed more importance.