12
1 Genomics and Its Impact on Medicine and Society A 2001 Primer U.S. Department of Energy Human Genome Program: www.ornl.gov/hgmis The Basics Cells are the fundamental working units of every living system. All the instructions needed to direct their activities are contained within the chemical DNA (deoxyribonucleic acid). DNA from all organisms is made up of the same chemical and physical components. The DNA sequence is the particular side-by- side arrangement of bases along the DNA strand (e.g., ATTCCGGA). This order spells out the exact instructions required to create a particular organism with its own unique traits. The genome is an organism’s complete set of DNA. Genomes vary widely in size: the smallest known genome for a free-living organ- ism (a bacterium) contains about 600,000 DNA base pairs, while human and mouse genomes have some 3 billion. Except for mature red blood cells, all human cells contain a com- plete genome. DNA in the human genome is arranged into 24 distinct chromosomes—physically separate molecules that range in length from about 50 million to 250 million base pairs. A few types of major chromosomal abnormalities, including missing or extra copies or gross breaks and rejoinings (translocations), can be detected by microscopic examination. Most changes in DNA, however, are more subtle and require a closer analysis of the DNA molecule to find perhaps single-base differences. Each chromosome contains many genes, the basic physical and functional units of heredity. Genes are specific sequences of bases that encode instructions on how to make proteins. Genes comprise only about 2% of the human genome; the remainder consists of noncoding regions, whose functions may include providing chromosomal structural integrity and regulating where, when, and in what quantity proteins are made. The human genome is estimated to contain 30,000 to 40,000 genes. Although genes get a lot of attention, it’s the proteins that perform most life functions and even make up the majority of cellular structures. Proteins are large, complex molecules made up of smaller subunits called amino acids. Chemical properties that distinguish the 20 different amino acids cause the protein chains to fold up into specific three-dimensional structures that define their particular functions in the cell. The constellation of all proteins in a cell is called its proteome. Unlike the relatively unchanging genome, the dynamic proteome changes from

Genetic Primer.pdf

Embed Size (px)

Citation preview

1Genomics and Its Impacton Medicine and SocietyA 2001 PrimerU.S. Department of Energy Human Genome Program: www.ornl.gov/hgmisThe BasicsCells are the fundamental working unitsof every living system. All the instructions neededto direct their activities are contained withinthe chemical DNA (deoxyribonucleic acid).DNA from all organisms is made up ofthe same chemical and physical components.The DNA sequence is the particular side-by-side arrangement of bases along the DNAstrand (e.g., ATTCCGGA). This order spells outthe exact instructions required to create aparticular organism with its own unique traits.The genome is an organisms completeset of DNA. Genomes vary widely in size: thesmallest known genome for a free-living organ-ism (a bacterium) contains about 600,000DNA base pairs, while human and mousegenomes have some 3 billion. Except for maturered blood cells, all human cells contain a com-plete genome.DNA in the human genome is arrangedinto 24 distinct chromosomesphysicallyseparate molecules that range in length fromabout 50 million to 250 million base pairs. Afew types of major chromosomal abnormalities,including missing or extra copies or grossbreaks and rejoinings (translocations), can bedetected by microscopic examination. Mostchanges in DNA, however, are more subtle andrequire a closer analysis of the DNA moleculeto find perhaps single-base differences.Each chromosome contains many genes,the basic physical and functional units ofheredity. Genes are specific sequences of basesthat encode instructions on how to makeproteins. Genes comprise only about 2% of thehuman genome; the remainder consists ofnoncoding regions, whose functions mayinclude providing chromosomal structuralintegrity and regulating where, when, and inwhat quantity proteins are made. The humangenome is estimated to contain30,000 to 40,000 genes.Although genes get a lot ofattention, its the proteins thatperform most life functions andeven make up the majority ofcellular structures. Proteins arelarge, complex molecules madeup of smaller subunits calledamino acids. Chemical propertiesthat distinguish the 20 differentamino acids cause the proteinchains to fold up into specificthree-dimensional structures thatdefine their particular functions inthe cell.The constellation of allproteins in a cell is called itsproteome. Unlike the relativelyunchanging genome, thedynamic proteome changes from2moment to moment in response to tens ofthousands of intra- and extracellular envi-ronmental signals. A proteins chemistry andbehavior are specified by the gene sequenceand by the number and identities of otherproteins made in the same cell at the sameThe Human Genome ProjectA Little Bit of Historytime and with which it associates and reacts.Studies to explore protein structure andactivities, known as proteomics, will be thefocus of much research for decades to comeand will help elucidate the molecular basis ofhealth and disease.Though surprising to many, the HumanGenome Project (HGP) traces its roots to an initia-tive in the U.S. Department of Energy (DOE). Since1945, DOE and its predecessor agencies have beencharged by Congress to develop new energyresources and technologies and to pursue a deeperunderstanding of potential health and environ-mental risks posed by their production and use.Such studies have since provided the scientific basisfor individual risk assessments, for example, ofnuclear medicine technologies.In 1986, DOE took a bold step in announc-ing its Human Genome Initiative, convinced thatDOEs missions would be well served by a refer-ence human genome sequence. Shortly there-after, DOE and the National Institutes of Healthdeveloped a plan for a joint HGP that officiallybegan in 1990.Ambitious Goals . . .From the outset, the HGPs ultimate goalhas been to generate a high-quality referencesequence for the entire human genome andidentify all human genes. Other important goalsare to sequence the genomes of model organ-isms to help interpret human DNA, enhancecomputational resources to support futureresearch and commercial applications, andexplore gene function through mouse-humancomparisons. Potential applications are numer-ous and include customized medicines,improved agriculture products, new energyresources, and tools for environmental cleanup.The HGP also aims to train future scientists,study human variation, and address criticalsocietal issues arising from the increased avail-ability of personal human genome data andrelated analytical technologies.. . . and Exciting ProgressAlthough the HGP originally was plannedto last 15 years, rapid technological advancesand worldwide participation have acceleratedthe expected completion date to 2003. InJune 2000, scientists announced biologys moststunning achievement: the generation of aworking draft sequence of the entire humangenome. In addition to serving as a scaffold forthe finished version, the draft provides a roadmap to an estimated 90% of genes on everychromosome and already has enabled genehunters to pinpoint genes associated with morethan 30 disorders.HGP resources have spurred a boom inspin-off sequencing programs on the humanand other genomes in both the private andpublic sectors. To stimulate further research, alldata generated in the public sector are madeavailable rapidly and free of charge via the Web.HGP Spinoff ProjectsMicrobial Genome Projectwww.sc.doe.gov/ober/microbial.htmlwww.ornl.gov/microbialgenomesMicrobial Cell Projectmicrobialcellproject.orgGenomes to Lifedoegenomestolife.orgEnvironmental Genome Projectwww.niehs.nih.gov/envgenom/home.htmCancer Genome Anatomy Projectwww.ncbi.nlm.nih.gov/ncicgapSNP Consortiumsnp.cshl.org3A Major MilestoneAchieving theWorking Draft Human Genome SequenceIn February 2001, HGP andCelera Genomics scientists pub-lished the long-awaited details ofthe working-draft DNA sequence.Although the draft is filled withmysteries, the first panoramic viewof the human genetic landscape hasrevealed a wealth of information andsome early surprises. Papers describ-ing research observations in the journals Nature(Feb. 15, 2001) andScience (Feb. 16, 2001) arefreely accessible (see www.ornl.gov/hgmis/project/journals/journals.html).Although clearly not a Holy Grail orRosetta Stone for deciphering all of biologytwo early metaphors commonly used todescribe the coveted prizethe sequence is amagnificent and unprecedented resource thatwill serve as a basis for research and discoverythroughout this century and beyond. It willhave diverse practical applications and a pro-found impact upon how we view ourselves andour place in the tapestry of life.One insight already gleaned from thesequence is that, even on the molecular level,we are more than the sum of our 35,000 or sogenes. Surprisingly, this newly estimated numberof genes is only one-third as great as previouslythought and only twice as many as those of atiny transparent worm, although the numbersmay be revised as more computational andexperimental analyses are performed. At oncehumbled and intrigued by this finding, scientistssuggest that the genetic key to human com-plexity lies not in the number of genes but inhow gene parts are used to build differentproducts in a process called alternative splicing.Other sources of added complexity are thethousands of chemical modifications made toproteins and the repertoire of regulatorymechanisms controlling these processes.The draft encompasses 90% of the humangenomes euchromatic portion, which containsthe most genes. In constructing the workingdraft, the 16 genome sequencing centersproduced over 22.1 billion bases of rawsequence data, comprising overlap-ping fragments totaling 3.9 billionbases and providing sevenfoldcoverage (sequenced seven times) ofthe human genome. Over 30% ishigh-quality, finished sequence, witheight- to tenfold coverage, 99.99%accuracy, and few gaps.High-Quality VersionExpected in 2003The entire working draft will be finished tohigh quality by 2003. Coincidentally, that year alsowill be the 50th anniversary of Watson and Crickspublication of DNA structure that launched the eraof molecular genetics (see www.nature.com/genomics/human/watson-crick). Much willremain to be deciphered even then. Somehighlights follow from Nature, Science, and TheWellcome Trust philanthropy (an HGP funder).Bioinformatics Boom:Managing the DataMassive quantities of genomic data and high-throughput technologies are now enabling studieson a vastly larger scale than ever before, forexample, in monitoring and comparing the activityof tens of thousands of genes simultaneously incancerous and noncancerous tissue. Advancedcomputational tools and interdisciplinary expertsare needed to capture, represent, store, integrate,distribute, and analyze all the data.Bioinformatics is the term coined for the new fieldthat merges biology, computer science, andinformation technology to manage and analyzethe data, with the ultimate goal of understandingand modeling living systems. Computing andinformation demands will continue to rise with theexplosive torrent of data from large-scale studiesat the molecular, cellular, and whole organismlevels (tour of the public DNA sequence databaseGenBank: www.ncbi.nlm.nih.gov/Tour/).4What Does the Working DraftTell Us?By the Numbers The human genome contains 3164.7 millionchemical nucleotide bases (A, C, T, and G). The average gene consists of 3000 bases, butsizes vary greatly, with the largest known humangene being dystrophin at 2.4 million bases. The total number of genes is estimated at30,000 to 40,000, much lower than previousestimates of 80,000 to 140,000 that had beenbased on extrapolations from gene-rich areasas opposed to a composite of gene-rich andgene-poor areas. The order of almost all (99.9%) nucleotidebases is exactly the same in all people. The functions are unknown for more than50% of discovered genes.The Wheat from the Chaff About 2% of the genome encodes instruc-tions for the synthesis of proteins. Repeated sequences that do not code forproteins (junk DNA) make up at least 50%of the human genome. Repetitive sequences are thought to have nodirect functions, but they shed light on chro-mosome structure and dynamics. Over time,these repeats reshape the genome by rearrang-ing it, thereby creating entirely new genes ormodifying and reshuffling existing genes. During the past 50 million years, a dramaticdecrease seems to have occurred in the rate ofaccumulation of these repeats.How Its Arranged The human genomes gene-dense urbancenters are predominantly composed of theDNA building blocks G and C. In contrast, the gene-poor deserts are richin the DNA building blocks A and T. GC- andAT-rich regions usually can be seen through amicroscope as light and dark bands on thechromosomes. Genes appear to be concentrated in randomareas along the genome, with vast expansesof noncoding DNA between. Stretches of up to 30,000 C and G basesrepeating over and over often occur adja-cent to gene-rich areas, forming a barrierbetween the genes and the junk DNA.These CpG islands are believed to helpregulate gene activity. Chromosome 1 has the most genes (2968),and the Y chromosome has the fewest (231).How the Human Genome Compares withThose of Other Organisms Unlike the humans seemingly randomdistribution of gene-rich areas, many otherorganisms genomes are more uniform, withgenes evenly spaced throughout. Humans have on average three times asmany kinds of proteins as the fly or wormbecause of mRNA transcript alternativesplicing and chemical modifications to theproteins. This process can yield differentprotein products from the same gene.5 Humans share most of the same proteinfamilies with worms, flies, and plants, but thenumber of gene family members hasexpanded in humans, especially in proteinsinvolved in development and immunity. The human genome has a much greaterportion (50%) of repeat sequences than themustard weed (11%), the worm (7%), andthe fly (3%). Although humans appear to have stoppedaccumulating repetitive DNA over 50 millionyears ago, there seems to be no such declinein rodents. This may account for some of thefundamental differences between hominidsand rodents, although estimates of genenumbers are similar in both species. Scien-tists have proposed many theories to explainevolutionary contrasts between humans andother organisms, including those of life span,litter sizes, inbreeding, and genetic drift.Variations and Mutations Scientists have identified about 1.4 millionlocations where single-base DNA differences(SNPs, see Sequence Variation, p. 11) occurin humans. This information promises torevolutionize the processes of finding chro-mosomal locations for disease-associatedsequences and tracing human history. The ratio of germline (sperm or egg cell)mutations is 2:1 in males vs females.Researchers point to several reasons for thehigher mutation rate in the male germline,including the greater number of cell divisionsrequired for sperm formation than for eggs.Applications, Future ChallengesDeriving meaningful knowledge from theDNA sequence will define research through thecoming decades to inform our understanding ofbiological systems. This enormous task willrequire the expertise and creativity of tens ofthousands of scientists from varied disciplines inboth the public and private sectors worldwide.The draft sequence already is having animpact on finding genes associated with disease.Genes have been pinpointed and associated withnumerous diseases and disorders including breastcancer, muscle disease, deafness, and blindness.Additionally, finding the DNA sequences underly-ing such common diseases as cardiovasculardisease, diabetes, arthritis, and cancers is beingaided by the human SNP maps generated in theHGP in cooperation with the private sector. Thesegenes and SNPs provide focused targets for thedevelopment of effective new therapies.One of the greatest impacts of having thesequence may well be in enabling an entirelynew approach to biological research. In the past,researchers studied one or a few genes at a time.With whole-genome sequences and new auto-mated, high-throughput technologies, they canapproach questions systematically and on agrand scale. They can study all the genes in agenome, for example, or all the gene productsin a particular tissue or organ or tumor, or howtens of thousands of genes and proteins worktogether in interconnected networks to orches-trate the chemistry of life.More on the Working DraftPapers from Science and Naturewww.ornl.gov/hgmis/project/journals/journals.htmlSequence Access Siteswww.ornl.gov/hgmis/project/journals/sequencesites.html6After the HGP, the Next Steps . . .The words of Winston Churchill, spoken in1942 after 3 years of war, capture well the HGPera: Now this is not the end. It is not even thebeginning of the end. But it is, perhaps, theend of the beginning.The avalanche of genome data growsdaily. The new challenge will be to use this vastreservoir of data to explore how DNA andproteins work with each other and the environ-ment to create complex, dynamic living sys-tems. Systematic studies of function on a grandscalefunctional genomicswill be the focusof biological explorations in this century andbeyond. These explorations will encompassstudies in transcriptomics, proteomics, struc-tural genomics, new experimental methodolo-gies, and comparative genomics. Transcriptomics involves large-scale analysisof messenger RNAs (molecules that aretranscribed from active genes) to determinewhen, where, and under what conditionsgenes are expressed. Proteomicsthe study of protein expressionand functioncan bring researchers closerthan gene-expression studies to whatsactually happening in the cell. Structural genomics initiatives are beinglaunched worldwide to generate the 3-Dstructures of one or more proteins from eachprotein family, thus offering clues to theirfunction and providing biological targets fordrug design. Knockout studies are one experimentalmethod for understanding the function ofDNA sequences and the proteins they encode.Researchers inactivate genes in living organ-isms and monitor any changes that couldreveal the function of specific genes. Comparative genomicsanalyzing DNAsequence patterns of humans and well-studied model organisms side by sidehasbecome one of the most powerful strategiesfor identifying human genes and interpretingtheir function.Today,scientistshave inhand the com-plete DNAsequences ofgenomes formany organismsfrom microbes to mice to humans. For the firsttime, we can begin to explore the operatingsystems of life written into these genetic codesand put them to use. At the leading edge of thisgreat scientific frontier is the DOE Genomes to Lifeprogram, whose overarching goal is to use thesebiological tools to target critical mission challengesin energy independence, global climate change,toxic waste cleanup, and human health protection.Future applications of this knowledge promise far-reaching benefits to the nation: Independence from foreign oil Significant savings in toxic waste cleanup anddisposal Stabilization of atmospheric carbon dioxide tocounter global warming Enhanced biowarfare agent detection and responseGenomes to life will build on the Human GenomeProject, both by exploiting its data and by extendingthe whole-genome approach to biology to the nextlevelattaining a comprehensive understanding ofhow entire biological systems work. This will lead tomodels that enable scientists to understand livingsystems well enough to predict their behavior underdifferent environmental conditions. Applications ofthis level of understanding will be revolutionary.7Medicine and the New Genetics:Gene Testing, Pharmacogenomics, Gene Therapycolon cancers. Although they havelimitations, these tests sometimesare used to make risk estimatesin presymptomatic individualswith a family history of thedisorder. One potential benefitto using these gene tests is thatthey could provide information thathelps physicians and patients managethe disease or condition more effec-tively. Regular colonoscopies for thosehaving mutations associated withcolon cancer, for instance, couldprevent thousands of deaths each year.Some scientific limitations are thatthe tests may not detect every muta-tion associated with a particular condi-tion (many are as yet undiscovered),and the ones they do detect maypresent different risks to differentpeople and populations. Anotherimportant consideration in gene testingis the lack of effective treatments orpreventive measures for many diseasesand conditions now being diagnosed orpredicted.Revealing information about risk of futuredisease can have significant emotional andpsychological effects as well. Moreover, theabsence of privacy and antidiscrimination legalprotections can lead to discrimination inemployment or insurance or other misuse ofpersonal genetic information. Additionally,because genetic tests reveal information aboutindividuals and their families, test results canaffect family dynamics. Results also can poserisks for population groups if they lead togroup stigmatization.Other issues related to gene tests includetheir effective introduction into clinical practice,the regulation of laboratory quality assurance,the availability of testing for rare diseases, andthe education of healthcare providers andpatients about correct interpretation andattendant risks.DNA underlies every aspect ofour health, both in function anddysfunction. Obtaining adetailed picture of howgenes and other DNAsequences function togetherand interact with environmentalfactors ultimately will lead to thediscovery of pathways involved innormal processes and in diseasepathogenesis. Such knowledge willhave a profound impact on the waydisorders are diagnosed, treated,and prevented and will bring aboutrevolutionary changes in clinical andpublic health practice. Some of thesetransformative developments aredescribed below.Gene TestsDNA-based tests are among thefirst commercial medical applicationsof the new genetic discoveries. Genetests can be used to diagnose disease,confirm a diagnosis, provide prognosticinformation about the course of disease, con-firm the existence of a disease in asymptomaticindividuals, and, with varying degrees ofaccuracy, predict the risk of future disease inhealthy individuals or their progeny.Currently, several hundred genetic testsare in clinical use, with many more underdevelopment, and their numbers and varietiesare expected to increase rapidly over the nextdecade. Most current tests detect mutationsassociated with rare genetic disorders thatfollow Mendelian inheritance patterns. Theseinclude myotonic and Duchenne musculardystrophies, cystic fibrosis, neurofibromatosistype 1, sickle cell anemia, and Huntingtonsdisease.Recently, tests have been developed todetect mutations for a handful of more com-plex conditions such as breast, ovarian, and8side effects than many current medicines.Ideally, the new genomic drugs could be givenearlier in the disease process. As knowledgebecomes available to select patients most likelyto benefit from a potential drug,pharmacogenomics will speed the design ofclinical trials to bring the drugs to marketsooner.Gene Therapy, EnhancementThe potential for using genes themselvesto treat disease or enhance particular traits hascaptured the imagination of the public and thebiomedical community. This largely experimen-tal fieldgene transfer or gene therapyholdspotential for treating or even curing suchgenetic and acquired diseases as cancers andAIDS by using normal genes to supplement orreplace defective genes or bolster a normalfunction such as immunity.More than 500 clinical gene-therapy trialsinvolving about 3500 patients have beenidentified worldwide (June 2001). The vastmajority (78%) take place in the United States,followed by Europe (18%). Most trials focus onvarious types of cancer, although monogenic,infectious, vascular, and other multigenicdiseases are being studied as well. Protocolsgenerally are aimed at establishing thesafety of gene-delivery procedures ratherthan effectiveness, and no cures as yetcan be attributed to these trials.Gene transfer still faces many scien-tific obstacles before it can become apractical approach for treatingdisease. According to the AmericanSociety of Human Genetics State-ment on Gene Therapy, effectiveprogress will be achieved onlythrough continued rigorousresearch on the most fundamentalmechanisms underlying genedelivery and gene expression inanimals.Pharmacogenomics: MovingAway from One-Size-Fits-AllTherapeuticsWithin the next decade, researchers willbegin to correlate DNA variants with individualresponses to medical treatments, identify particu-lar subgroups of patients, and develop drugscustomized for those populations. The disci-pline that blends pharmacology with genomiccapabilities is called pharmacogenomics.More than 100,000 people die each yearas the result of adverse responses to medicationsthat are beneficial to others. Another 2.2 mil-lion experience serious reactions, whileothers fail to respond at all. DNA variants ingenes involved in drug metabolism, particularlythe cytochrome P450 multigene family, are thefocus of much current research in this area.Enzymes encoded by these genes are respon-sible for metabolizing most drugs used today,including many for treating psychiatric, neuro-logical, and cardiovascular diseases. Enzymefunction affects patients responses to both thedrug and the dose. Future advances will enablerapid testing to determine the patients geno-type and drastically reduce hospitalizationresulting from adverse reactions.Genomic data and technologiesalso are expected to make drug devel-opment faster, cheaper, and moreeffective. Most drugs today are based onabout 500 molecular targets; genomicknowledge of the genes involved indiseases, disease pathways, anddrug-response sites will lead tothe discovery of thousands of newtargets. New drugs, aimed atspecific sites in the body and atparticular biochemical eventsleading todisease, probably will cause fewer9Other Anticipated Benefits of Genetic ResearchRapid progress in genome science and aglimpse into its potential applications havespurred observers to predict that biology willbe the foremost science of the 21st century.Technology and resources generated by theHuman Genome Project and other genomicresearch already are having major impacts onresearch across the life sciences. Doubling insize between 1993 and 1999, according toErnst & Young,1 the biotechnology industrygenerated 151,000 direct jobs and 287,000indirect jobs. Revenues totaled $20 billiondirectly and $27 billion indirectly.Some Current and PotentialApplications of GenomeResearchMolecular Medicine Improve diagnosis of disease Detect genetic predispositions to disease Create drugs based on molecular information Use gene therapy and control systems as drugs Design custom drugs based on individualgenetic profilesMicrobial Genomics Rapidly detect and treat pathogens (disease-causing microbes) in clinical practice Develop new energy sources (biofuels) Monitor environments to detect pollutants Protect citizenry from biological and chemi-cal warfare Clean up toxic waste safely and efficientlyRisk Assessment Evaluate the health risks faced by individualswho may be exposed to radiation (includinglow levels in industrial areas) and to cancer-causing chemicals and toxinsBioarchaeology, Anthropology, Evolution,and Human Migration Study evolution through germline mutationsin lineages Study migration of different populationgroups based on maternal genetic inheritance Study mutations on the Y chromosome to tracelineage and migration of males Compare breakpoints in the evolution ofmutations with ages of populations andhistorical eventsDNA Identification Identify potential suspects whose DNA maymatch evidence left at crime scenes Exonerate persons wrongly accused of crimes Identify crime, catastrophe, and other victims Establish paternity and other family relationships Identify endangered and protected species asan aid to wildlife officials (could be used forprosecuting poachers) Detect bacteria and other organisms that maypollute air, water, soil, and food Match organ donors with recipients intransplant programs Determine pedigree for seed or livestock breeds Authenticate consumables such as caviar andwineAgriculture, Livestock Breeding, andBioprocessing Grow disease-, insect-, and drought-resistantcrops Breed healthier, more productive, disease-resistant farm animals Grow more nutritious produce Develop biopesticides Incorporate edible vaccines into food products Develop new environmental cleanup uses forplants like tobacco1. The Economic Contributions of the BiotechnologyIndustry to the U.S. Economy, May 2000, Ernst & Young.10Societal Concerns Arising from the New Genetics Uncertainties associated with gene testsfor susceptibilities and complex conditions(e.g., heart disease, diabetes, and Alzheimersdisease). Should testing be performed when notreatment is available or when interpretation isunsure? Should children be tested for suscepti-bility to adult-onset diseases? Conceptual and philosophical implicationsregarding human responsibility, free will vsgenetic determinism, and concepts of healthand disease. Do our genes influence our behav-ior, and can we control it? What is consideredacceptable diversity? Where is the line drawnbetween medical treatment and enhancement? Health and environmental issues concern-ing genetically modified (GM) foods andmicrobes. Are GM foods and other products safefor humans and the environment? How willthese technologies affect developing nationsdependence on industrialized nations? Commercialization of products includingproperty rights (patents, copyrights, andtrade secrets) and accessibility of data andmaterials. Will patenting DNA sequences limittheir accessibility and development into usefulproducts?Since its inception, the Human GenomeProject has dedicated funds toward studyingthe ethical, legal, and social issues (ELSI) sur-rounding the availability of the new data andcapabilities. Examples of such issues follow. Privacy and confidentiality of geneticinformation. Who owns and controls geneticinformation? Is genetic privacy different frommedical privacy? Fairness in the use of genetic informationby insurers, employers, courts, schools,adoption agencies, and the military, amongothers. Who should have access to personalgenetic information, and how will it be used? Psychological impact, stigmatization, anddiscrimination due to an individuals geneticdifferences. How does personal genetic informa-tion affect self-identity and societys perceptions? Reproductive issues including adequate andinformed consent and the use of geneticinformation in reproductive decision making.Do healthcare personnel properly counselparents about risks and limitations? What arethe larger societal issues raised by new repro-ductive technologies? Clinical issues including the educationabout capabilities, limitations, and socialrisksof doctors and other health-serviceproviders, people identified with geneticconditions, and the general public; and theimplementation of standards and qualitycontrol measures. How do we prepare healthprofessionals for the new genetics? How do weprepare the public to make informed choices?How will genetic tests be evaluated and regu-lated for accuracy, reliability, and usefulness?(Currently, there is little regulation at thefederal level.) How do we as a society balancecurrent scientific limitations and social risk withlong-term benefits? Fairness in access to advanced genomictechnologies. Who will benefit? Will there bemajor worldwide inequities?11Human Genome Project Goals 19982003Human DNA SequencingThe HGPs continued emphasis is on obtainingby 2003 a complete and highly accurate referencesequence (1 error in 10,000 bases) that is largelycontinuous across each human chromosome.Scientists believe that knowing this sequence iscritically important for understanding humanbiology and for applications to other fields.A working draft of the sequence wascompleted 18 months ahead of schedule, in June2000. The achievement has provided scientistsworldwide with a road map to an estimated 90%of genes on every chromosome. Although the draftcontains gaps and errors and does not yet meetthe criterion of 1 error in 10,000 bases outlinedabove, it provides a valuable scaffold for generatinga high-quality reference genome sequence. HGPscientists make human DNA sequence availablebroadly, rapidly, and free of charge via the Web.Sequencing TechnologyAlthough current sequencing capacity is fargreater than at the inception of the HGP, furtherincremental progress in sequencing technologies,efficiency, and cost-reduction are needed. Forfuture sequencing applications, planners emphasizethe importance of supporting novel technologiesthat may be 5 to 10 years in development.Sequence VariationAlthough more than 99% of human DNAsequences are the same across the population,variations in DNA sequence can have a majorimpact on how humans respond to disease; to suchenvironmental insults as bacteria, viruses, toxins,and chemicals; and to drugs and other therapies.Methods are being developed to detect differenttypes of variation, particularly the most commontype called single-nucleotide polymorphisms (SNPs),which occur about once every 100 to 300 bases.Scientists believe SNP maps will help them identifythe multiple genes associated with such complexdiseases as cancer, diabetes, vascular disease, andsome forms of mental illness. These associations aredifficult to establish with conventional gene-hunting methods because a single altered genemay make only a small contribution to disease risk.Functional GenomicsEfficient interpretation of the functions ofhuman genes and other DNA sequences requiresthat strategies be developed to enable large-scaleinvestigations across whole genomes. A first priorityis to generate complete sets of full-length cDNAclones and sequences for human and model-organism genes. Other functional-genomics goalsinclude studies into gene expression and controland the development of experimental and compu-tational methods for understanding gene function.Comparative GenomicsThe functions of human genes and otherDNA regions often are revealed by studying theirparallels in nonhumans. HGP researchers haveobtained complete genomic sequences for thebacterium Escherichia coli, the yeast Saccharomycescerevisiae, the fruit fly Drosophila melanogaster, andthe roundworm Caenorhabditis elegans. Sequenc-ing continues on the laboratory mouse. Theavailability of complete genome sequences gener-ated both inside and outside the HGP is driving amajor breakthrough in fundamental biology asscientists compare entire genomes to gain newinsights into evolutionary, biochemical, genetic,metabolic, and physiological pathways.Ethical, Legal, and Social Implications (ELSI)Rapid advances in the science of genetics andits applications present new and complex ethicaland policy issues for individuals and society. ELSIprograms that identify and address these implica-tions have been an integral part of the U.S. HGPsince its inception. These programs have resultedin a body of work that promotes education and helpsguide the conduct of genetic research and thedevelopment of related medical and public policies.Bioinformatics and Computational BiologyContinued investment in current and newdatabases and analytical tools is critical to thesuccess of the HGP and to the future usefulness ofthe data it produces. Databases must adapt to theevolving needs of the scientific community andmust allow queries to be answered easily. Plannerssuggest developing a human genome database,analogous to model organism databases, that willlink to phenotypic information. Also needed aredatabases and analytical tools for studying theexpanding body of gene-expression and functionaldata, for modeling complex biological networksand interactions, and for collecting and analyzingsequence-variation data.TrainingFuture genome scientists will require training ininterdisciplinary areas including biology, computerscience, engineering, mathematics, physics, andchemistry. Also, scientists with management skills willbe needed for leading large data-production efforts.12A wall poster depicting the 24 humanchromosomes and many mapped genesmay be ordered from www.ornl.gov/hgmis/posters/chromosome/ or from HGMIS.Informative sidebars explain genetic termsand provide URLs for finding more detailedinformation.For More InformationOctober 2001. Produced by the HumanGenome Management InformationSystem (HGMIS) at Oak Ridge NationalLaboratory, Oak Ridge, Tennessee, forthe U.S. Department of Energy HumanGenome Program.For additional copies, contact:[email protected], 865/574-0597,www.ornl.gov/hgmis.A PDF version of this document andaccompanying PowerPoint slides maybe downloaded from www.ornl.gov/hgmis/publicat/primer/intro.html. Human Genome Project Informationwww.ornl.gov/hgmis Online Version of this Primerwww.ornl.gov/hgmis/publicat/primer/intro.html Medicine and the New Geneticswww.ornl.gov/hgmis/medicine/medicine.html Ethical, Legal, and Social Issueswww.ornl.gov/hgmis/elsi/elsi.html Genomes to Life (post-HGP research)DOEGenomesToLife.orgWall Poster Available Free of Charge Publication of Genome Draft Sequencewww.ornl.gov/hgmis/project/journals/journals.html Image Gallerywww.ornl.gov/hgmis/education/images.html Resources for Teacherswww.ornl.gov/hgmis/education/education.html Resources for Studentswww.ornl.gov/hgmis/education/students.html Careers in Genomicswww.ornl.gov/hgmis/education/careers.html