33
AMERICAN GASTROENTEROLOGICAL ASSOCIATION American Gastroenterological Association Future Trends Committee Report: The Application of Genomic and Proteomic Technologies to Digestive Disease Diagnosis and Treatment and Their Likely Impact on Gastroenterology Clinical Practice KONSTANTINOS N. LAZARIDIS and BRIAN D. JURAN Center for Basic Research in Digestive Diseases, Division of Gastroenterology and Hepatology, Mayo Clinic College of Medicine, Rochester, Minnesota T he American Gastroenterological Association (AGA) Future Trends Committee was created in 2004 to further the AGA Strategic Plan by identifying and char- acterizing important trends in clinical practice and sci- entific-technological developments in the world in gen- eral and medicine and gastroenterology in particular that potentially will impact the AGA and/or its members in the coming 3–5 years or beyond and to make strategic recommendations to the Governing Board on how AGA should deal with those trends and developments. These trends and developments may be economic, demo- graphic, practice-based, scientific/technological or polit- ical in nature. Specifically, the committee is charged with preparing a report (or reports) for the AGA Governing Board that describes the trends or developments it has identified, pos- tulates their impact on gastroenterology practice and/or research as appropriate, and presents specific recommenda- tions for action by the AGA in terms of policy and pro- grams. The committee is also asked to monitor these trends and technologies as they play out over time. In July 2004, the AGA Leadership Cabinet suggested several topics that the Future Trends Committee should address. Realizing that the Future Trends Committee could not realistically consider all of them, criteria were developed to prioritize the topics and others that might be added in the future. These criteria were as follows: Time variable, that is, “when will gastroenterology be affected?” Scale and magnitude Does the trend or development represent a threat or opportunity (or both) to gastroenterology? Effect on patient care quality and safety Effect on AGA members and the AGA per se Implications to reimbursement Impact on gastroenterologists’ training and education. In October 2004, a crude Delphi process was used to determine the trends and developments that should be the focus of the committee’s work. Committee members were asked to assign priority scores to the items in the following list, which was based on the suggestions of the Leadership Cabinet and supplemented by AGA staff and others. This process was done via the mail. The application of genomic and proteomic technol- ogies to digestive disease diagnosis and treatment Major changes in the US health care system and reimbursement Increased median age of the population Changes in the ethnic and racial makeup of the US population Patients’ involvement in their own care New colorectal cancer screening and diagnostic technologies Biomedical research funding changes Changes in academic health centers Changes in physician education and training Obesity-related disease incidence and prevalence Computerization and digitization of gastroenterol- ogy practice Abbreviations used in this paper: AGA, American Gastroenterologi- cal Association; BE, Barrett’s esophagus; CS, celiac sprue; CUC, chronic ulcerative colitis; FAP, familial adenomatous polyposis; GI, gastroin- testinal; HGP, Human Genome Project; IBS, irritable bowel syndrome; kb, kilobase; LD, linkage disequilibrium; MALDI-TOF, matrix-assisted laser description ionization-time-of-flight; MS, magnetic sectors; NH- GRI, National Human Genome Research Institute; PCR, polymerase chain reaction; PSC, primary sclerosing cholangitis; SNP, single nucle- otide polymorphism. © 2005 by the American Gastroenterological Association 0016-5085/05/$30.00 doi:10.1053/j.gastro.2005.06.047 GASTROENTEROLOGY 2005;129:1720 –1752

AMERICAN GASTROENTEROLOGICAL ASSOCIATION - Twin Cities

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

A

ACTa

KCM

Tfaeeptrstgi

rdtrtga

santt

GASTROENTEROLOGY 2005;129:1720–1752

MERICAN GASTROENTEROLOGICAL ASSOCIATION

merican Gastroenterological Association Future Trendsommittee Report: The Application of Genomic and Proteomicechnologies to Digestive Disease Diagnosis and Treatmentnd Their Likely Impact on Gastroenterology Clinical Practice

ONSTANTINOS N. LAZARIDIS and BRIAN D. JURANenter for Basic Research in Digestive Diseases, Division of Gastroenterology and Hepatology, Mayo Clinic College of Medicine, Rochester,

innesota

dtwfLo

cutklGco

he American Gastroenterological Association (AGA)Future Trends Committee was created in 2004 to

urther the AGA Strategic Plan by identifying and char-cterizing important trends in clinical practice and sci-ntific-technological developments in the world in gen-ral and medicine and gastroenterology in particular thatotentially will impact the AGA and/or its members inhe coming 3–5 years or beyond and to make strategicecommendations to the Governing Board on how AGAhould deal with those trends and developments. Theserends and developments may be economic, demo-raphic, practice-based, scientific/technological or polit-cal in nature.

Specifically, the committee is charged with preparing aeport (or reports) for the AGA Governing Board thatescribes the trends or developments it has identified, pos-ulates their impact on gastroenterology practice and/oresearch as appropriate, and presents specific recommenda-ions for action by the AGA in terms of policy and pro-rams. The committee is also asked to monitor these trendsnd technologies as they play out over time.

In July 2004, the AGA Leadership Cabinet suggestedeveral topics that the Future Trends Committee shouldddress. Realizing that the Future Trends Committee couldot realistically consider all of them, criteria were developedo prioritize the topics and others that might be added inhe future. These criteria were as follows:

● Time variable, that is, “when will gastroenterologybe affected?”

● Scale and magnitude● Does the trend or development represent a threat or

opportunity (or both) to gastroenterology?● Effect on patient care quality and safety● Effect on AGA members and the AGA per se

● Implications to reimbursement

● Impact on gastroenterologists’ training andeducation.

In October 2004, a crude Delphi process was used toetermine the trends and developments that should behe focus of the committee’s work. Committee membersere asked to assign priority scores to the items in the

ollowing list, which was based on the suggestions of theeadership Cabinet and supplemented by AGA staff andthers. This process was done via the mail.

● The application of genomic and proteomic technol-ogies to digestive disease diagnosis and treatment

● Major changes in the US health care system andreimbursement

● Increased median age of the population● Changes in the ethnic and racial makeup of the US

population● Patients’ involvement in their own care● New colorectal cancer screening and diagnostic

technologies● Biomedical research funding changes● Changes in academic health centers● Changes in physician education and training● Obesity-related disease incidence and prevalence● Computerization and digitization of gastroenterol-

ogy practice

Abbreviations used in this paper: AGA, American Gastroenterologi-al Association; BE, Barrett’s esophagus; CS, celiac sprue; CUC, chroniclcerative colitis; FAP, familial adenomatous polyposis; GI, gastroin-estinal; HGP, Human Genome Project; IBS, irritable bowel syndrome;b, kilobase; LD, linkage disequilibrium; MALDI-TOF, matrix-assisted

aser description ionization-time-of-flight; MS, magnetic sectors; NH-RI, National Human Genome Research Institute; PCR, polymerasehain reaction; PSC, primary sclerosing cholangitis; SNP, single nucle-tide polymorphism.

© 2005 by the American Gastroenterological Association0016-5085/05/$30.00

doi:10.1053/j.gastro.2005.06.047

Caurrsr

N

OAG

cm

pitmptcpwAdpcrcarho

fipio

prgfhg

cps(o(mdtot(gsl

aafimhDansdtact

gtrtcgstepwipctpni

w

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1721

ommittee members were asked to score each itemgainst each of the priority criteria noted previouslysing a scale in which 1 represents large effect and 3epresents small effect (on gastroenterology practice andesearch). The total scores of each topic were thenummed and ranked. The 4 highest priority scores thatesulted from this ranking were as follows:

ew colorectal cancer screening and diagnostic technol-ogies

besity-related diseaseging of the populationenomic and proteomic technologies.

Because the AGA was already investigating the ramifi-ations of the obesity epidemic, the Future Trends Com-ittee decided to concentrate on the other 3 topics.The committee determined that preparing the 3 re-

orts on its own was not feasible. Hence, it decided thatt would solicit proposals from potential qualified au-hors to draft the reports and would modify and supple-ent the drafts as necessary. A request for proposal was

repared and disseminated in December 2004. The au-hors, who were paid for their work, were chosen by theommittee from among the responses to the request forroposal. The manuscripts submitted by the authorsere reviewed by the committee in February 2005.mong the changes to the draft reports were recommen-ations for action by the AGA; these were developedrimarily by the committee. At its review meeting, theommittee also developed a uniform format for the 3eports. Revised manuscripts based on the committee’sritiques were completed in March 2005. The committeelso had each report evaluated by an outside experteviewer for completeness and to ensure that the authorsad not made any egregious error that may have beenverlooked.

This report represents the committee’s recommendationor action by the AGA on this important topic. However, its not the committee’s final word on the topic. Genomic androteomic technologies will advance rapidly over the com-ng years, and the committee will revisit this subject peri-dically.

Executive Summary

Medicine is on the verge of an unprecedented pros-ect as a result of advances in the discipline of genomics andecent progress in the emerging field of proteomics. Humanenomics is the study not just of single genes but also of theunctions and interactions among all genes in the genome ofumans. The significant evolution that has occurred in

enomic science over the past 5 years holds promise to c

hange our ability to better understand, diagnose, treat, andotentially prevent human illness. This unique opportunitytems from the completion of the Human Genome ProjectHGP) and the development of novel technologies, severalf which involve high-throughput automated assay systemsie, genotyping platforms) and the application of bioinfor-ation science (ie, bioinformatics). In a similar vein, the

iscipline of human proteomics is the study of the interac-ions among the various constituents of the entire proteomef humans. Human proteomics represents an extension ofraditional biochemistry coupled with novel technologiesie, tandem mass spectrometry) that seeks to take a morelobal approach to the assessment of the library of proteinspecific to humans and understanding these proteomic re-ationships to health and disease.

Because of the inevitable interconnection of the genomend proteome, genomics and proteomics should be vieweds complementary, rather than antagonizing, scientificelds. Nevertheless, the structure and function of the hu-an proteome are far more complicated than those of the

uman genome. Simply stated, the genome (ie, genomicNA) is a static, unwavering entity regarding its sequence

nd own duplication. In contrast, the proteome is a dy-amic, ever-changing unit. For instance, the protein expres-ion profiles in different human cells are highly variable,ependent on external or internal stimuli, not to mentionhe unique protein expression during the distinct stages ofperson’s life cycle. In contrast, the genome of each human

ell remains, in general, steady and unaltered over genera-ions.

To date, genomic science has mainly dissected the single-ene inherited diseases known as Mendelian disorders. Yet,he greatest promise and impact of genomic and proteomicesearch lie in their future application to complex or mul-ifactorial diseases. In antithesis to Mendelian disorders,omplex diseases arise due to the interplay of multipleenetic variants with environmental factors. Currently, thecience and technology of human genomics greatly exceedhe discipline of proteomics with respect to research discov-ries and applications on science and influence on medicalractice. The ultimate question addressed in this report ishether genomics, along with proteomics, will have an

mpact on future gastroenterology and hepatology clinicalractice. Although we are unable to easily articulate aomprehensive answer to this vital and multifaceted ques-ion, we believe that genomics and proteomics hold vastotential and almost certainly will shape the way we diag-ose and treat patients with digestive and hepatic disordersn years to come.

For eons, humans have understood that heredity, alongith the environment, shape our phenotypic diversity and

ontribute to disease. Yet, it was not until 2003 that the

eHitrvgienscb

phptsgotcn(nwsscpii

oephwcag

ofgsteivd

ptniapdoa

moagataaiivicr

ctatItkpe“i2

tGabflhfet

1722 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

ntire sequence of the human genome was elucidated via theGP, providing the biologic basis for a better understand-

ng of heredity and likely the interaction of environment onhe genetic material. To date, the enormity of informationegarding our own blueprint of life awaits meticulous in-estigation, the overall aim of which is to identify theenetic components of disease susceptibility, thus improv-ng diagnosis, therapy, and disease prevention. To thisxtent, well-designed translational studies in gastrointesti-al (GI) and liver diseases are urgently needed to transformuch colossal genomic knowledge into a meaningful out-ome that can be incorporated into clinical practice andenefit the sick.

Since the early 1990s, we have made momentousrogress in unraveling the genes causing digestive andepatic Mendelian diseases such as familial adenomatousolyposis (FAP), hereditary hemochromatosis, and heredi-ary pancreatitis, to name a few. Although the prevalence ofuch diseases is low, the understanding of the relevantenetic defects has unquestionably shed light on the biologyf the GI system and liver. In the genome era, however, theools are becoming available to begin dissecting commonomplex diseases such as irritable bowel syndrome (IBS),onalcoholic fatty liver disease, inflammatory bowel diseaseIBD), and many others. Recognizing the unparalleled ge-etic and environmental intricacy of those complex diseases,e have to realize that greater challenges lie ahead. In this

cientific struggle, technological innovations will be atrong ally. Newly available genotyping platforms nowompete to provide greater data quality at higher through-ut. Importantly, federally funded initiatives are currentlyn place to develop state-of-the-art whole genome sequenc-ng technologies at relatively affordable cost.

Dr. Harold Varmus, Nobel Laureate and former directorf the National Institutes of Health, wrote in a 2002ditorial in the New England Journal of Medicine that, “theublication last year of nearly complete sequences of theuman genome, did not mean that the practice of medicineould be abruptly and radically transformed. . . Still,

hanges in medical practice are already occurring at anccelerating pace under the influence of the elucidation ofenomes.”1

The clinical impact of genomic and proteomic technol-gies in gastroenterology and hepatology will become pro-ound as we better define the environmental influence andenetic predilection to related complex diseases. This isimply because of the high prevalence of complex diseases inhe population. From a pragmatic view, we predict that thexisting disparity between gathering information on thenherited human material and its application toward pre-enting, diagnosing, and treating complex GI and liver

iseases will close in the coming decades. To achieve such i

rogress, however, we need first to develop high-qualityranslational clinical studies and to test hypotheses perti-ent to the pathogenesis and therapy of the diseases ofnterest. This endeavor will require the coordinated effort ofpplying the knowledge and technologies of genomics,roteomics, and related disciplines. The ultimate goal is toissect the genetic variants that function not as direct causesf but as predisposing factors to development of digestivend hepatic illnesses.

In closing, we bear in mind that the proposed transfor-ation toward genomic medicine is not solely dependent

n scientific discovery. What has to be clearly understoodnd acted on is the concern that the assumed impact ofenomics and proteomics in clinical practice will not bettained without educating gastroenterology trainees, gas-roenterologists and hepatologists, health care professionals,nd our patients not only about the clinical applications butlso the limitations and threats of using genetic informationn medical practice. To this end, confidentiality and equal-ty of genetic information, accuracy of genetic testing, pre-ention of genetic discrimination, and the psychologicalmpact of knowing the genetic susceptibility to disease willontinue to confront healthcare providers, patients and theirelatives, and society alike.

Literature Review and Background

About 50 years ago, when James Watson and Fran-is Crick reported the discovery of the double helical struc-ure of DNA, perhaps hardly any gastroenterologist paidttention to their article in the journal Nature.2 Since then,his seminal finding has transformed the biologic sciences.n the past 2 decades, this scientific breakthrough providedhe cornerstone of an international scientific collaborationnown as the HGP, which aimed at sequencing the com-lete genome of Homo sapiens.3,4 This large-scale biologicndeavor occurred over 13 years (1990–2003).3,4 If thepregenomic era” was concluded by the complete sequenc-ng of the 3.2 billion nucleotides of the human genome in003, we have already entered the “genome era.”5

As stated provocatively by Francis Collins, Director ofhe National Human Genome Research Institute (NH-RI), “all disease—aside from most cases of trauma—havegenetic component.”6 Genetic and environmental contri-utions have been proposed in several GI and liver diseasesor more than a century. To date, we have observed pub-ished work on the genetic contributions in colon cancer,ereditary hemochromatosis, IBD, and IBS, to mention aew. This trend will only continue to increase in the genomera, with implications for practicing or experimental gas-roenterologists and hepatologists alike, regardless of our

ntellectual prejudicial approaches to understand the patho-

gd

updaadrTftotTacTtfgtebcc

agfmihetrus

piGt“efa

aga

sic

hdgatopgti

ieultoo

kdttcilbs

rt

lkepfqtsttaa

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1723

enesis of, diagnose, treat, and prevent digestive and hepaticisorders.

To successfully treat illness, it is imperative to firstnderstand how a disease state is caused. Unraveling theathogenesis of a disorder provides a means to intersect withisease processes and hopefully to alter the natural history ofn illness toward cure or prevention. As we know it, therere 2 elements, whose interaction leads to many humaniseases. The first is the inevitable environmental exposures/isks to which we are constantly exposed, even before birth.he second is the genetic predisposition we have inherited

rom our parents. These 2 components have to be dissectedo shed light on disease pathogenesis if we wish to improveur current diagnostic methods and treatments. To evaluatehe environmental component of an illness is challenging.o assess the genetic inclination to disease is a daunting butttainable goal, because the genetic material is generallyonsidered steady and today can be tested in the laboratory.o this end, the genetic information derived from comple-

ion of the HGP will facilitate such efforts by thrustingorward the discipline of genomics and aiding the emer-ence of the field of proteomics. These 2 interrelated scien-ific subjects and other associated disciplines such as geneticpidemiology and bioinformatics will definitely advanceasic studies and translational clinical investigations to elu-idate the genetic predilection and environmental factorsausing disease.

Recognizing the need to exemplify the mission of thebove scientific fields, a few definitions are offered. Humanenomics is a scientific field that examines the structure,unction, and interaction of all the genes and genetic ele-ents in the human genome.7 Human proteomics is an emerg-

ng discipline that aspires to study the human proteome inealth and disease using a variety of methodologies.8 Geneticpidemiology is the study of the role of genetic factors andheir interaction with environmental elements in the occur-ence of disease in human populations.9 Bioinformatics is these of computer science and methods for the purpose ofpeeding up and enhancing biologic research.10

Amid the current exhilarating scientific progress, weose 2 questions. (1) Will genomics and proteomics have anmpact on the manner that we will use to diagnose and treatI and liver disorders in the future? (2) If so, when will this

ransformation occur? A short answer to the first question isalmost positively,” but a reply to the second query is notasy. Most likely this revolution will steadily ensue in theorthcoming decades, although variations in lead-time arenticipated among different digestive and hepatic diseases.

In the following pages of this report, an effort is made tossess the application of genomic and proteomic technolo-ies to the diagnosis and treatment of GI and liver diseases

nd their proposed impact on the clinical practice of phy- t

icians who treat such patients. Our given task is challeng-ng. To this extent, we would like to make 3 statementsoncerning the structure and scope of this report.

First, the genomic era and the related scientific fields ofuman genomics and proteomics are currently at theirevelopmental stage. A plethora of genomic data have beenenerated and analyzed as a result of the HGP. However,dditional appealing scientific questions have risen fromhis effort. To inform the reader concerning the status quof genomic and emerging proteomic science, the followingages provide a review of the facts regarding the humanenome and proteome, including an overview of the con-emporary scientific theories that attempt to explain humannheritance in health and disease.

Second, before exploring the possible applications andmpact of genomic and proteomic technologies on gastro-nterology and hepatology clinical practice, paradigms andpdates of genomic discoveries in selected digestive andiver diseases are presented. This background knowledge iso serve as a framework on the discussion about the impactf genomic and proteomic technologies in the practice ofur subspecialty.

Third, we believe that educating the human factor is aey element to successfully applying such unprecedentedata on genetic information to better diagnose and treathose with digestive and hepatic disorders. To this direc-ion, we discuss the need to educate health professionalsoncerning the application(s) and limitations of geneticnformation and relevant testing. An outline of the ethical/egal and social issues of genomic medicine that pertain tooth the patient and the health care provider is also pre-ented.

Finally, a basic glossary of genomics, proteomics, andelated fields is available as a reference aid regarding theerminology used in this report (see Appendix 1).

The Human Genome

Simply put, the genome is the program that encodesife. This program, embedded in strings of nucleic acidsnown as DNA, takes shape in the double helix.2 Thislegant configuration provides the means for accurate du-lication of the DNA during each cell division and allowsor vertical transmission of the heritable information re-uired to perpetuate the species. While the DNA encodinghe genome takes on an incredibly complex 3-dimensionaltructure when packed into the nucleus of a cell, we are ableo understand and analyze this molecule as the sequence ofhe 4 nucleo bases (A [adenine], C [cytosine], G [guanine],nd T [thymine]) that comprise it. Ascertainment of thepproximate sequence of the 3.2 billion nucleic acid bases of

he human genome was the extraordinary accomplishment

otsbod

TsucphtsH3gseTfcap3

goshcprsslsguaitfitenvu

d

msiftggcmsu0cmtiirt

ibaiotkabpvtc

Fcrsd

1724 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

f the HGP, heralding the genome era.5 The knowledge ofhis “blueprint” of the human genome sequence provides aolid basis for the study of genes and genetic differencesetween individuals and will likely have a dramatic impactn the way we approach the study of both health andisease.

HGP: facts learned about the human genome.he sequence of the human genome has been determined by

equencing the genomic DNA from a handful of individ-als.3,4 This sequence has been mapped to the specific 23hromosomes comprising the human genome and thusrovides man with an “instruction code” of the prototypicaluman. While this overall genomic structure is what dis-inguishes us from other forms of life, variation of thisequence differentiates us from each other. As a result of theGP, it is now estimated that there are approximately

0,000 protein-coding genes in the human genome.3 Theseenes are comprised of coding regions (ie, exons) inter-persed with noncoding regions (ie, introns) and flanked bynhancer, promoter, initiation, and termination elements.he presence of intervening introns within each gene allows

or “alternative splicing” of exons to occur, which is aommon mechanism to derive several protein isoforms fromn individual gene. In fact, alternative splicing seems to berevalent in humans, with lower estimates of approximately5% of genes undergoing this phenomenon.3

The protein-coding genes comprise �2% of the humanenome; the remaining 98% has traditionally been thoughtf as “junk DNA” with no function. Interestingly, theequence of many of these intergenic regions seems to beighly conserved, and as much as half of the genomeonsists of repeated sequences.3 These genomic repeats maylay an important, yet undefined, structural and functionalole. The genome is highly condensed in the cell andequestered in the nucleus. This nuclear packaging is pos-ible by close association of the DNA with various proteinsike histones and allows for tight regulation of processesuch as DNA replication and gene expression. Hence, theenome exists in vivo as a vibrant molecule constantlyndergoing conformational change in response to internalnd external stimuli, allowing the individual cell to react tots environment and perform its specialized functions. Athe same time, the genome is a stable entity that is faith-ully duplicated in the trillions of cells making up eachndividual and in transmitting the genetic information tohe next generation. Thus, the genome acts as the programncoding life; within the structural “blueprint” of our ge-ome are the instructions to build a human, and within theariation of its sequence lay the differences making each ofs a unique individual.

Human genetic variation. Humans are a relatively

iverse species, as is easily noticed when looking around d

ost any street corner. Kruglyak and Nickerson eloquentlyummarized this observation by stating, “genetic variations the spice of life.”11 The variation in facial features, bodyorm, and certain mannerisms as well as the predilectionoward development of disease are driven in part by theenomic sequence differences between individuals. Thisenetic variation has arisen through a combination of nu-leotide substitutions, recombination events, and randomatings that have occurred throughout the history of our

pecies.12 The extent of genetic variation between any 2nrelated individuals is estimated to be approximately.1%.7 Although this number may seem small and hardlyapable of producing such diversity, it still accounts forillions of genetic sequence variants. Most of these variants

ake the form of single nucleotide polymorphisms (SNPs),n which the sequence differs at a single base pair.7 Othermportant variations in the genome sequence include mic-osatellites and insertions/deletions. A brief overview ofhese types of genetic variation is provided.

SNPs. Stated succinctly, a SNP is a specific positionn the genome where alternate nucleotides can (and do) existetween 2 individuals or a population (Figure 1). For ex-mple, a SNP such as A/C (adenine or cytosine) is a positionn the genome that can harbor one of 2 nucleotide alleles (Ar C). The allele appearing more frequently in the popula-ion is known as the major allele and the less frequent isnown as the minor allele. An arbitrary cutoff for minorllele frequency of 1% has been used to define SNPs,ecause this rate is the traditional definition of polymor-hism. We would like to clarify that SNPs should not beiewed as point mutations. SNPs are simply normal varia-ions of sequence and do not necessarily cause disease; inontrast, point mutations are directly capable of causing

igure 1. (1) An A/C SNP is shown in red within a DNA stretch ofhromosome 9. (2) Microsatellite D22S423 consisting of 24 CAepeats is shown in blue within a segment of sequence from chromo-ome 22. (3) An insertion of 3 bases (ATA) is shown in red. (4) Aeletion of 2 bases (GT) is shown in red.

isease.

mngtfobabntottmprioc

tocafSsaq(aap�5fp

iima

aibyiSpoapbthimicvt

mseMt(Tipmcbr

T

T(

NI

NSR

II

M

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1725

SNPs are stable, greatly abundant, and comprise the vastajority (�90%) of polymorphic loci in the human ge-

ome.3 It is estimated that more than 107 SNPs exist in theenome of the human population.11 SNPs are evenly dis-ributed across the genome, generally dependent on therequency of the minor allele. For example, we expect tobserve one SNP with 1% minor allele frequency every 300ases of genomic sequence and one SNP with 40% minorllele frequency only every 3300 bases.11 These estimates areased on the observed differences between 2 “haploid ge-omes” and extrapolated by the neutral theory of popula-ion genetics, which assumes a randomly mating populationf constant size, with no selectivity of alleles. While thisheory is simplistic and does not take into account popula-ion expansion or natural selection, more detailed treat-ents arrive at similar estimates,11 suggesting that a large

ercentage of the observed SNPs are rather innocuous inegard to selective pressures and perhaps functional signif-cance. This finding is not surprising considering that 99%f SNPs are located in non–gene-coding regions ofhromosomes.13

The location of each SNP within the genome may de-ermine its potential functional significance and relative riskn disease impact (Table 1). For instance, SNPs within theoding region of a gene (ie, coding SNPs) may result in anltered protein likely causing a significant impact on geneunction and thus influencing the risk of disease.14 CodingNPs can be further subclassified as synonymous and non-ynonymous, whether they alter or do not alter the specificmino acid of a protein. Of interest, the structural conse-uence of nonsynonymous coding SNPs can be predictedie, changed amino acid); therefore, these are best suited forssociation-based candidate gene studies (see discussion ofssociation studies in the following text). However, therevalence of coding SNPs is extremely low, estimated at0.1% of all SNPs (it is anticipated that there are about

0,000 coding SNPs in the human genome).11 Despite thisact, some have hypothesized that rare minor alleles may

able 1. SNP-Derived Variation and Risk to Phenotype

ype of SNPfrom high to low risk to phenotype) Location

onsense Exon, codingnsertion/deletion Exon, coding

onsynonymous Exon, codingynonymous Exon, codingegulatory (expression, alternativesplicing)

Promoter, 5= or 3=, UTR intnear exon

ntronic Intronntergenic Between genes

odified and reprinted with permission from Tabor et al.14

lay an important role in disease,15,16 such that investigat- i

ng only SNPs with minor allele prevalence �1% may missmportant disease associations (see discussion of the com-on disease/common allele versus the common disease/rare

llele hypotheses in the following text).Somewhat more prevalent are SNPs that may potentially

ffect the expression or alternative splicing of genes. Thesenclude SNPs in the promoter region and exon/intron spliceoundaries. Although potentially informative, we are notet able to predict a priori if such SNPs will actually resultn altered expression or splicing. While this category ofNPs is still useful for gene mapping studies, real-timeolymerase chain reaction (PCR) and/or “isoform profiling”f messenger RNA (mRNA) may provide a more efficientpproach for candidate gene studies involving altered ex-ression. By far, the most prevalent SNPs are those locatedetween genes (ie, intergenic regions). Based on calcula-ions, there are more than 7 � 106 intergenic SNPs in theuman genome. These SNPs may affect disease risk bynterfering with enhancer functions or some other unknownechanisms, but the likelihood of this is considered exceed-

ngly rare.14 These SNPs are most useful for positionalloning studies to map the disease-causing gene or geneticariants (see discussion of linkage analysis in the followingext).

Microsatellites. Among the many repeat sequenceotifs of the genome, microsatellites are the most well

tudied and have been successfully used as genetic mark-rs for positional cloning of disease-associated genes.icrosatellites are brief strings of tandem dinucleotide,

rinucleotide, or tetranucleotide repeats of DNA sequenceeg, CACACACACACACACACACACACA) (Figure 1).17

he number of these repetitive motifs in each microsatellites variable, with multiple alleles generally detected in aopulation. There are estimated to be approximately 105

icrosatellites in the human genome, spread evenly acrosshromosomes.18 Of those, only approximately 104 haveeen studied.18 While microsatellites generally do not di-ectly affect gene function, they have been successfully used

Affect on function(from low to high frequency in genome)

Premature termination of protein sequenceChanges the frame of the protein coding region, can drastically

alter the sequenceChanges an amino acid in the protein, can alter functionDoes not change protein, can alter splicing or expression levelDoes not change protein, can alter splicing or expression level,

timing, locationUnknown, could alter splicing or stabilityUnknown, could affect expression

ron

n gene mapping strategies. With the advent of PCR,

masl

lgn1frstridIph

r1cytt5poaw

edmrtta2gtre

sitc“3i3tgotroosuti

FassvgnbR

Fbodcehtt

1726 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

icrosatellites have been used as genetic markers and typedcross the entire human genome, creating whole genomecans that helped successfully identify chromosomal regionsinked to specific diseases.

Insertions and deletions. Insertions and deletions areess frequent variants and account for �5% of humanenome variation.3 Insertions take place when one or moreucleotides are introduced into a sequence of DNA (Figure). Deletions occur when one or more nucleotides are lostrom the sequence (Figure 1). When located in the codingegion of a gene, insertions and deletions are apt to haveerious consequences on its structure (ie, frameshift, earlyermination), resulting in gross alteration of protein. Aelevant paradigm of a frameshift insertion causing diseases the case of CARD15 as a susceptibility gene of Crohn’sisease (CD; see discussion of IBDs in the following text).19

nsertions and deletions within the regulatory region (ie,romoter) and intron/exon boundaries of a gene may alsoave an effect on protein expression and function.

Haplotypes of genetic variation. Humans haveelatively limited genetic diversity, even though more than0 million unique genetic variants may be present in oururrent population. This limited diversity relates to theoung age (approximately 100,000 years) of our species andhe fact that our genetic material has only been transmittedhrough a small number of generations (approximately000) from our ancestral origin. Genetic variation in theopulation is introduced via nucleotide substitution, mei-tic recombination, and random mating. Recombination isnatural phenomenon that occurs in gametogenesis, during

igure 2. Many thousands of years ago, a SNP (arrowhead) arose onn ancestral chromosome. Five contemporary chromosomes arehown, the products of random meiotic recombination over thou-ands of generations. Each of the contemporary chromosomes hasariable length domains of the common ancestral chromosome (re-ions shown in white), which hold the original SNP (arrowheads), whileew chromosomal segments (shown in gray) are the result of recom-ination. Modified and reprinted with permission from Ardlie et al. Natev Genet 2002;3:299–309 (http://www.nature.com/).

hich regions between pairs of equivalent chromosomes are G

xchanged through the process of crossing over, generatingiscrete differences between the parental and offspring chro-osomes. Due to recombination, ancestral and contempo-

ary chromosomes will share domains of different length. Tohis end, alleles of adjacent loci on a chromosome have aendency to stay together despite recombination; thus, thelleles are said to be in linkage disequilibrium (LD) (Figure). In contrast, as we humans go through many moreenerations and thus recombinations, we expect our ances-ral and contemporary alleles to approach a state of equilib-ium in which no linkage between ancestral alleles willxist.

The relatively limited genetic diversity of humans canimplify the study of the human genome and our quest todentify disease-causing genetic variants. For example, if weake a neighboring set of 5 SNPs located on the samehromosome, the number of possible combinations, orhaplotypes,” of these is 32 (5 SNPs of 2 alleles each; 25 �2). However, due to LD, assessment of these same 5 SNPsn a population will often show that only a few (ie, 3–5) of2 expected haplotypes are present in a majority (�90%) ofhe population. Moreover, the pattern of LD across theenome is not uniform because recombination appears toccur more readily in certain regions known as recombina-ion hot spots,20 thus creating regions of low LD (ie, highecombination frequency) interspersed between the regionsf high LD (ie, low recombination frequency). These regionsf high LD take the form of block-like structures and, asuch, have aptly been termed “haplotype blocks”20–22 (Fig-re 3). The size and distribution and the allelic diversity ofhese blocks are highly variable between races and seem-ngly compatible with ancestral recombination throughout

igure 3. Patterns of LD are not uniform across the genome. Recom-ination of chromosomes appears to be localized in short “hot spots”f low LD (red segments), positioned between larger chromosomalomains of high LD (green sections) exhibiting low haplotype diversityalled “haplotype blocks.” Specific SNPs (colored triangles) withinach haplotype block that are strongly associated with the majority ofaplotypes in the population are termed “tag-SNPs” and can be usedo detect a haplotype associated with a disease. Non–tag-SNPs (whiteriangles) are also shown. Reprinted with permission from Trends in

enetics, Vol 18, Stumpf pages 226–228, 2002 from Elsevier.20

tspEvofp

stuubScsftpMdr“(tkEernttuhpdwpbleps

mdwm5pt

nuithpdct$bzeg

vuwitqvwscNdtqmv

st(dldooroseipsfi

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1727

he migratory history of our population.21,22 For instance,tudies have estimated that the haplotype blocks found inopulations of African ancestry are much shorter than inuropean, Japanese, and Chinese populations.21 This obser-ation is consistent with the theory that human migrationut of Africa resulted in “population bottlenecks” of limitedounders, leading to reduced genetic diversity in theseopulations.

Initial efforts using SNPs to identify and assess thetructure of human haplotype blocks have shown that whilehese regions may contain many SNPs, a few of these can besed to describe most of the genetic variation in the pop-lation. This limited allelic diversity found in haplotypelocks has the great potential of minimizing the number ofNPs required for comprehensive association studies ofomplex diseases, greatly facilitating the undertaking ofuch studies (see discussion of association studies in theollowing text). In fact, an international effort is underwayo define SNP-based haplotype blocks in a number ofopulations, known as the Human Haplotype Map (Hap-ap) Project. The goals of this $100� million effort are to

escribe the common haplotype blocks found in the humanace and its subpopulations and identify the informativehaplotype-tag SNPs” for discernment of these haplotypesFigure 3). The early findings of this project have estimatedhat the average size of haplotype blocks ranges from 11ilobases (kb) in African populations to 22 kb in Westernuropean populations, with more than half of the genomexisting in larger blocks of approximately 22 and 44 kb,espectively.21 Additionally, within these blocks, a smallumber of haplotypes (3–5) will typically describe morehan 90% of all chromosomes in each particular popula-ion.21,22 Using these data, it has been estimated that anpper limit of 300,000 to 1 million (non-African/African)aplotype-tag SNPs will be required to perform high-owered, whole genome association studies for complexiseases.21,22 The number of SNPs required for such studiesill also significantly decrease if disease is to be studied inopulations that have undergone more recent populationottlenecks (such as isolated Norwegian or Icelandic popu-ations). As haplotype blocks become larger due to greaterxtent of LD in a population, it becomes a greater task toinpoint the disease-causing alleles via fine mappingtudies.

Current progress and future direction. To date,ore than 10 million SNPs have been reported in the

bSNP public database. These can be accessed at http://ww.ncbi.nlm.nih.gov/SNP/. Of these 10 million SNPs, 5illion have been validated via various means and only

00,000 have any population frequency estimations (ie,ercent of minor or major allele in a population). Although

hese numbers are impressive, the disparity between the m

umber of validated SNPs and those with estimated pop-lation frequencies reflects the dichotomy between our abil-ty to detect SNPs and assess them. Increased knowledge ofhe location, type, and extent of genomic variation in theuman population will no doubt lead to effective ap-roaches to further understand the causes and risk factors ofisease. The recent past has seen exponential progress inapacity afforded to genotyping of SNPs, greatly reducinghe cost of each genotype from $1.00–$2.00 to less than0.10 using high-throughput approaches primarily amena-le to disease-gene mapping studies. Additionally, customi-able medium- to high-density SNP typing assays haventered the market, which are more applicable to candidateenes and “functional” (coding) SNP association studies.

Although our current efforts to identify and describe theariation found in the human genome will certainly beseful in the identification of genetic components involvedith disease, the complexity of handling the data regarding

ndividual variants is a challenge for many researchers andhe future of genomics. To this end, whole genome rese-uencing comes as the ideal proposal to understanding howariation of the genome affects human health, because itould provide a global platform (not dependent on the

election of individual SNPs) for such studies. However,urrent costs prohibit this approach. In this regard, theHGRI has launched an aggressive 15-year program to

evelop the necessary technologies to dramatically reducehese costs to $1000 per complete human genome se-uenced. The success of such an effort will likely be theajor factor behind advances in the study of human genetic

ariation and its association with disease in years to come.

Human Genomics and Proteomics

Genomics. Human genomics is the discipline thateeks to understand how the genome directs cellular activityhrough the expression of genes in response to internalother genes) and external (environmental) stimuli to pro-uce and maintain human life. From the earliest stages ofife, the human genome is selectively activated, guiding theevelopment and function of a multitude of cell typesrganized into higher-order structures such as tissues andrgan systems. Understanding such processes requires a vastepository of knowledge not just regarding the genome butf all the potential mRNA gene transcripts (ie, the tran-criptome) and the subsequent proteins (ie, the proteome)ncoded within. Advancements in computer technologies todentify and map expressed sequences to the genome haverovided us with a framework for understanding the tran-criptome, and technologies are available to globally “pro-le” these expressed sequences. However, the function of

any of these messages remains either not annotated or

uhof

Hbimfttmvgt

uohooagrgtpaconfn

poiccthgs

gummclc

tglgppa

tibbtaoietadtobaaHs

taagDmicids

iEuehcre(L5

1728 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

nknown. Elucidation of the proteome is even farther be-ind due to the historical lack of high-throughput technol-gies for the study of proteins and our inability to predictorm or function from sequence a priori.

Recent accomplishments, such as completion of theGP and current efforts in this field, are quite impressive

ut amount to only the “tip of the iceberg”; indeed, we aren the infancy of a discipline with the potential to changeedicine as we know it. Interpretation of the biologic

unction and meaning coded in the genome is a dauntingask that may never be fully complete. However, the po-ential of this knowledge to society is overwhelming andust be explored. In addition to cataloging human genetic

ariation and its association with disease, a number ofenomics “off-shoots” such as comparative genomics, func-ional genomics, and proteomics are being used.

Comparative genomics. Comparative genomicsses the knowledge gained from sequencing the genomesf many species to better understand the function of theuman genome and the genomes of a wide variety ofrganisms that have a medical and an economic impactn human society. This bioinformatics approach gener-lly uses powerful computers and elegant statistical al-orithms to identify and compare conserved genomicegions representing evolutionary favorable genes andene motifs involved with various cellular functionshroughout the spectrum of species. To date, the com-lete genomes have been published for 200 bacteria, 20rchaea, and 17 eukaryotes (including 7 vertebrate spe-ies such as mouse, rat, and chimpanzee). Complete listsf genome sequencing projects and simple tools for ge-ome viewing are freely available via the National Centeror Biotechnology Information Web site (http://www.cbi.nlm.nih.gov/entrez/query.fcgi?db�genomeprj).

An example of one potential medical application of com-arative genomics would be to search for conserved homol-gous regions in the genomes of pathogenic bacteria todentify potential target sites for powerful new antibiotics oronversely to identify pathogen-specific sequences for spe-ific bacterial targeting. Genomic comparison also facilitateshe functional classification of newly discovered or predicteduman gene products through comparison with homolo-ous regions to known function in well-studied modelystems.

This bioinformatics-based, comparative subfield ofenomics offers to greatly advance our efforts toward a fullernderstanding of the human genome and offers numerousedical and socioeconomic rewards. To reach these ends, weust first expand our catalog of genome sequences, espe-

ially those of vertebrate mammals, because these are evo-utionarily closest to humans. In this regard, the NHGRI

urrently supports a number of large-scale sequencing cen- t

ers, which are working to add an additional 20 mammalianenomes to our growing collection. Additionally, techno-ogical advances in computing power and comparative al-orithms as well as analytical innovation will need to keepace with our ability to generate the vast quantities of dataroduced by these projects for the full potential of compar-tive genomics to be realized.

Functional genomics. Functional genomics seekso identify, locate, and characterize all functional elementsn the genome and further attempts to apply associationsetween such elements. The scope of this approach is to goeyond traditional protein-coding gene function, exploringhe intergenic DNA sequences for potential function (suchs regulation of structure, temporal control of expression, orther cell behavior). This said, functional genomics alsoncludes the study of the relationships of these functionallements, including the flow of information through tradi-ional, protein-encoded gene networks. Past efforts in thisrena have often focused on the observation of cellularynamics in response to experimental perturbation (ie, drugreatment, gene product modifications, transgenics, and son) of model systems and have delivered the bulk of theiologic knowledge we know today. Global means to suchpproaches, including gene expression profiling via DNArrays, have advanced recently due to the completion of theGP and subsequent mapping of thousands of expressed

equence tags to the genome.One example of functional nonprotein coding sequences

hat have been discovered is the noncoding RNAs. Thesere genes that are transcribed to RNA from the genome butre not translated into protein.23,24 Instead, they can act asene regulatory elements through binding the genomicNA in cis, effectively silencing expression.24 To date,ore than 2500 of these noncoding RNAs have been

dentified.23 One well-studied mechanism involving non-oding RNAs is X-chromosome inactivation, the process ofnactivation of one X chromosome in females to provide forosage compensation of X-linked genes between theexes.25

The NHGRI has recently launched 2 programs to facil-tate the study of the functional genome and its networks:NCODE (ENCyclopedia of DNA Elements) and “Molec-lar Libraries.” The ENCODE program seeks to perform anxhaustive determination of all functional elements in theuman genome. The first effort of this program will be toharacterize and improve the tools necessary for such explo-ation and to perform a pilot study to identify all functionallements in a 30 million–base pair region of the genomeapproximately 1% of the human genome). The Molecularibraries program aims to provide access to a library of00,000 small molecules for use as chemical probes toward

he study of molecular pathways. This program is expected

ttdeh

bpttatsgcwubmadvgtitgGssbaHc

tamfb(pdrttpAti

s

iatmpaTdapqmmpnmtmpuaoatrtb

fipbsafscaaefvspTeaf

gm

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1729

o present a paradigm shift for academic investigators whoo date have not had such a powerful research tool at theirisposal. Both of these programs offer the potential ofxciting new insights into the function and behavior of theuman genome and the gene networks encoded within.

Although hundreds of thousands of scientific papers haveeen published in the past 50 years regarding the genomichenomenon, our ability to make global connections be-ween them is limited due to the sheer numbers, inconsis-ent gene nomenclature, and lack of consistent computer-ccessible formats. Freely available bioinformaticsechnology with the ability to leverage this enormous re-ource would be a great boom to the study of functionalenomics. A first step toward addressing these inconsisten-ies was the creation of the Human Genome Organization,hich has been given the task of providing and approvingnique gene symbols for all human genes. This process iseing performed by the Human Gene Nomenclature Com-ittee of the Human Genome Organization, which can be

ccessed at http://www.gene.ucl.ac.uk/nomenclature/. Toate, the Human Gene Nomenclature Committee has pro-ided unique gene symbols for more than 20,000 humanenes. The Human Genome Organization has undertakenhe creation of additional tools for the study of the genome,ncluding a comprehensive annotation of human/mouse or-hologues and a comprehensive annotation of all humanenes; these can be accessed through links on the Humanene Nomenclature Committee home page. An additional

ource for gene annotation is available through GeneCardsponsored by the Weizman Institute and accessed at http://ioinfo.weizmann.ac.il/cards/index.shtml. This site is freelyvailable to academic users and provides information for alluman Genome Organization–approved genes automati-

ally extracted from numerous databases.Proteomics. The discipline of proteomics is simply

he study of the proteome (ie, the sum of human proteinsnd how they interact with themselves and the environ-ent).8 While considered by some a field of study separate

rom genomics, we (and others) believe this is not the caseecause proteins are but one “means” of the genomes “end”albeit, the ends do not always justify the means). In ourerspective, genomics seeks to understand how the genomeirects cellular activity through the expression of genes inesponse to internal (other genes) and external (environmen-al) stimuli to produce and maintain life. Thus, logically,he agents of the genome (proteins) and their numerousotential interactions fall under the auspices of genomics.rguments aside, proteomic investigation yields great po-

ential for increased knowledge and is necessary to unlock-ng the secrets of the genome.

Although the genome sequence is stable, essentially the

ame in each cell of an organism at all times, the proteome fi

s in constant flux as a reaction to its milieu. This producesn exponentially increased level of complexity to the pro-eome compared with the genome. Complicating theseatters is the transcriptome, which is the sum of all ex-

ressed mRNA messages in a cell at a given time, which islso in constant flux mediated by the same environment.hese mRNA messages are the templates from which in-ividual proteins making up the proteome are translatednd are thus intermediaries between the genome and theroteome. However, transcription of mRNA and subse-uent translation into protein are controlled by numerousechanisms, with the result that increased transcription ofRNA does not necessarily lead to increased translation of

rotein and, conversely, increased translation of a protein isot necessarily dependent on increased transcription ofRNA. Furthermore, the activity of each individual pro-

ein in the proteome is dependent on its subsequent 3-di-ensional structure and often on its association with other

rotein molecules. Thus, increased presence of any individ-al protein does not necessarily indicate increased function-lity of such protein. Because life has persisted for millionsf years, eventually evolving to beings capable of ponderingnd elucidating these very mechanisms, we must assumehat this occurs in a highly organized way and for a specificeason. It is in understanding this multitude of interactionshat the secrets of the genome, proteome, and life itself wille unlocked.

The technology used to study protein is much differentrom that used to study DNA and RNA and to this pointn time has not proven nearly as amenable to high-through-ut approaches.26 Thus, the study of the proteome is farehind that of the genome. Although we have successfullyequenced the human genome (and many others), providing

solid basis on which to study structure variation andunction, no such luxury is afforded the proteome. A firsttep toward better understanding the proteome will be toatalog the complete transcriptome, including all commonlternative splice variations. Even though translation is notlways dependent on transcription, this is a good initialffort toward understanding gene expression because theunctionality of these messages is generally known (ie, pro-ide a template for protein translation); thus, completion ofuch a project is foreseeable. This will provide a map of theotential protein products that we should expect to find.his map will allow us to assess the variation of proteinxpression, structure, and functions encoded in each genend eventually ascribe associations and function to eachorm.

Many strides have been made of late to provide forreater throughput in the study of proteins. Innovations inass spectroscopy have proved the greatest advance in the

eld, providing a means to distinguish specific proteins

wstoaritbckopa

nHTtadcTcihta

hc

opiDdfipemtdttsetttfceamitao

t

ps

F(atdf

Fbemfm

1730 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

ithout the need of specific ligands.27 Additionally, masspectroscopy has the potential to identify and quantitatehousands of biomarkers at once (although this is a long wayff and is dependent on the creation of advanced spectrallgorithms and prior knowledge of potential spectra). Cur-ent progress with mass spectroscopy has had the greatestmplications on its application to 2-dimensional gel elec-rophoresis in identifying proteins differentially expressedetween 2 samples. However, the identification of the spe-ific differential proteins is dependent on previous spectralnowledge of them. Moreover, protein microarrays are an-ther emerging field of proteomic technologies with pro-osed broad applications for discovery and quantitativenalysis of protein-based biomarkers in cancer.28

As with the genome, standardization of proteomic data isecessary for useful future endeavors. To this regard, theuman Proteome Organization was established in 2001.29

he Proteomics Standards Initiative of the Human Pro-eome Organization has been tasked with the developmentnd implementation of these standards. Included are stan-ards for reporting molecular interactions, mass spectros-opy data, and creation of a general proteomics standard.he Proteomics Standards Initiative Web site can be ac-essed at http://psidev.sf.net. In addition to the standardsnitiative, members of the Human Proteome Organizationave undertaken large-scale proteomics efforts to elucidatehe proteomes of the liver, plasma, and brain. These projectsre nearing the end of the pilot phase.

Human Genomics in Disease

For millennia, humans have recognized the role oferedity in disease. In 1865, Gregor Mendel introduced the

igure 4. In functional cloning, the biochemical defect of a diseaseie, sickle cell anemia) is known before the causative gene is foundnd mapped on the genome. In contrast, positional cloning seeks firsto map the causative gene of the disease (ie, chronic granulomatousisease) in the genome without prior knowledge of its structure orunction.

oncept that an element is passed down from the parent to &

ffspring in an untouched form, causing an observablehenotype. However, it was not until 1953 that the chem-cal base of heredity was discovered and the double helix ofNA was elucidated.2 Yet, until the early 1980s, theiscovery of a disease-causing gene was solely based on itsunctional characteristics. “Functional cloning” refers todentification of a gene causing a disease by knowing orostulating its biochemical function or defect, without ref-rence to its structure and position in the genome (ie,apping) (Figure 4). Functional cloning was used to clone

he gene of sickle cell anemia. Nonetheless, for most humanisorders, the biochemical origin or deficit, is unclear, andhus the process of disease pathogenesis remains elusive. Tohis end, in the mid-1980s, it was proposed that the gene ofpecific disease could be cloned and subsequently identifiedxclusively on the basis of its chromosomal position withinhe genome (ie, mapping).30 This revolutionary approach,ermed positional cloning, assumes no functional informa-ion regarding the disease of interest and was used success-ully for the first time in 1986 to clone the gene causinghronic granulomatous disease (Figure 4).31 In the genomera, however, a novel strategy called position candidatepproach is expected to surpass the positional cloningethod. This new method relies on positional cloning to

dentify a possible chromosomal area, followed by evalua-ion and screening of the candidate genes in the genomicrea of interest using the publicly available, annotated mapsf human genes.

In the following section, the fundamental concepts per-aining to the genomics of human disease are described.

Mendelian and complex diseases. Due to com-letion of the HGP, interest in translational genetic re-earch has recently increased. There are 3 classes of genetic

igure 5. The distribution of genetic disease classification changesy aging. Chromosomal diseases peak before birth. Mendelian dis-ases are more common before puberty. Complex diseases affectainly the adult population. Modified and reprinted with permission

rom Gelehrter T, Collins FS, Ginsburg D, eds. The role of genetics inedicine. Principles of medical genetics. 2nd ed. Media, PA: Williams

Wilkins, 1998:1–8.

dfe51rdldih

totacpamttphdpsscndlIY

tape

ftdghcvctcpmcsaMIrtrrbdbTg

ag

Fopamr2

Fvtap

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1731

isorders: Mendelian (ie, single gene), complex (ie, multi-actorial), and chromosomal. Interestingly, the prevalence ofach disease class changes from birth to adulthood (Figure). Although chromosomal disorders affect approximately% of live-born deliveries, we only discuss the conceptselated to understanding the genetic contribution in Men-elian and complex diseases in this report due to spaceimitations. In fact, as illustrated in Figure 5, the complexiseases are the most prevalent among adults. As well, theynclude many of the disorders that gastroenterologists andepatologists encounter daily in clinical practice.

Mendelian diseases display familial patterns of inheri-ance, including autosomal recessive, autosomal dominant,r X-linked transmission of the disease-related alleles. Inhe case of dominant inheritance, only a single trait-causingllele is necessary to express the disease phenotype. Inontrast, an autosomal recessive phenotype requires botharental alleles to express the disease or trait. Generally, inMendelian disorder, the disease is caused by a few rareutations of a single gene, although exceptions of more

han one gene have been described. In a specific pedigree,he same mutation is accountable for causing the diseasehenotype. However, among affected families, several orundreds of different mutations of the same gene may beetected. Overall, Mendelian diseases are uncommon in theopulation. The most frequent, hereditary hemochromato-is, affects 1 in every 300 individuals. From a genetictandpoint, Mendelian diseases are considered simple be-ause of the direct correspondence of a genotype to a phe-otype (Figure 6). To date, more than 1000 Mendelianiseases have been elucidated; the catalog of human genesinked to these disorders is available online at Mendeliannheritance in Man (http://www.ncbi.nlm.nih.gov/omim).

igure 6. In Mendelian disorders, a single gene is usually the causef the disease. There is a direct correspondence of a genotype to ahenotype. The disease trait is inherited in a predicted pattern (ie,utosomal dominant, autosomal recessive, or X-linked). All affectedembers in a pedigree carry the same gene mutation. Modified and

eprinted with permission from Peltonen et al. Science 2001;91:1224–1229.

et, additional modifier gene(s) may exist that likely affect s

he Mendelian disease “penetrance” (ie, the likelihood thatperson with a specific genotype will express a certain

henotype) or “expressivity” (ie, the degree of phenotypicxpression of a gene).

Complex diseases such as colon cancer, IBD, nonalcoholicatty liver disease, and IBS are believed to have a multifac-orial pathogenesis. It is generally accepted that complexiseases develop as a result of the interplay between severalenes or genetic variants with environmental exposures,ence, the idiom “complex” (Figure 7). For the most part,omplex diseases are caused by relatively common geneticariants in a number of genes, each of which has a smallontribution on the disease trait or phenotype. As a result,he direct correspondence of a genotype to a phenotype thatharacterizes the Mendelian diseases does not exist in com-lex disorders. This fundamentally conceptual distinctionay explain the observed heterogeneity of complex disease

oncerning clinical manifestations, progression, and re-ponse to treatment. Thus, although complex diseases havegenetic component, which in essence is different than inendelian diseases, they demonstrate familial aggregation.

n fact, the risk of developing a complex disease amongelatives of a proband is greater than the estimated risk ofhe disease in the general population. To this end, the termelative risk ratio of a sibling (�s) was coined to define theisk of a sibling presenting with a disease if a biologicalrother or sister is already affected. The �s is calculated byividing the prevalence of a complex disease among siblingsy the prevalence of the disease in the general population.he higher the value of �s, the greater the evidence for aenetic component to a complex disease.

Mendelian and complex disorders operate in differentspects. Mendelian diseases are usually the result of a singleene with high penetrance of the genotype and low preva-

igure 7. In complex disorders, the interplay of multiple geneticariants with the environment determines risk of the disease pheno-ype. The contribution of each factor to phenotype is small and variesmong patients. Thus, complex diseases may be heterogeneous inathogenesis and progression. Modified and reprinted with permis-

ion from Peltonen et al. Science 2001;291:1224–1229.

lcphetrpt

rpOs“dtcpdcsgpisu(ldid

edepwitpwwt

cmtaTTitsipearppccccffiphtidegdg

pchptt

Fvwite1

1732 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

ence in the population. In contrast, complex diseases areaused by modest effects of several genetic variants with lowenetrance of the genotype, of which (ie, diseases) may haveigh prevalence in the population. We believe the apparentpidemiologic and conceptual pathogenetic differences be-ween Mendelian and complex diseases have importantamifications in the discovery, application, and overall im-act of genomic and proteomic technologies on gastroen-erology clinical practice.

The interplay of genes and environment. Envi-onmental risks are important for any disease state butrobably have a higher contribution in complex diseases.n the other hand, the genetic factors in complex diseases

hould be viewed as “susceptibility gene(s)” rather thancausative gene(s),” because such presence in an individualoes not imply development of the disease. Dissection be-ween genetic susceptibility and environmental risk(s) inomplex diseases is the future challenge of genomics androteomics. Recognizing the complexity of some Mendelianiseases such as cystic fibrosis, where hundreds of mutationsould cause or affect the severity of the disease phenotype,omeone could realize that identifying the susceptibilityenes of complex disease will be a slow and challengingrocess. A putative model that describes the continuousnteraction of genetic variants with the environment ishown in Figure 8. In this model, it is postulated that thenique susceptibility genotypes of 2 unrelated individualsA and B) along with the environmental history (ie, diet,ifestyle, physical activity, smoking, and so on) could pre-ict their future outcome between health and illness. To ben a position to predict whether individual A or B willevelop disease, we have to assess both their genetic and

igure 8. The genetic susceptibility of 2 nonbiologically related indi-iduals A and B and the unique environmental history will definehether they will remain healthy or develop a complex disease. Mod-

fied and reprinted with permission from Sing et al. Genetic architec-ure of common multi-factorial diseases. In: Chadwick D, Cardew G,ds. Variation in the human genome. Chichester, England: Wiley,

e996:211–232.

nvironmental risk. To date, we can easily and accuratelyetect genotypes across the entire human genome. How-ver, it remains extremely difficult to assess the past,resent, and future environmental risks. Even if the latterere possible, we still lack robust methods to examine the

nterplay of genes and environment and how their interac-ion leads to disease. Nonetheless, this hypothetical modelrovides the stage to understand the scope and challengese have to overcome as investigators and physicians beforee can make significant progress to better diagnose and

reat complex diseases in the genome era.The common disease/common allele versus the

ommon disease/rare allele hypotheses. At present,any logistic difficulties of genomic science stem from

he theoretical challenges of associating the genetic vari-tion of humans with the phenotype of complex diseases.o address this issue, 2 hypotheses have been postulated.he first, “the common disease/common allele hypothesis,”

s proposed given the fact that the present human popula-ion of approximately 6 billion represents a global expan-ion from the single sub-Saharan African founding inhab-tants of relatively small size (�10,000 people) that tooklace approximately 100,000 years ago. In this regard, it isxpected that contemporary humans share a number oflleles with this small group of founders (Figure 2). As aesult, the common disease/common allele hypothesis pro-oses that common alleles did exist before the global ex-ansion and divergence of humans and contribute signifi-antly to predisposition (ie, susceptibility) to commonomplex disease. Such alleles may confer moderate risk toomplex disease and should occur at relatively high frequen-ies (ie, �1%) in the present human population.32 At thisrequency of alleles, it is implied that association studies (seeollowing text) using large patient cohorts will result indentification of the susceptibility alleles of common com-lex diseases. The fact that a limited number of commonaplotypes account for the majority of haplotype blocks ofhe genome21 supports the optimism that association stud-es using tag-SNPs will identify common haplotypes pre-isposing to common complex disease. The common dis-ase/common allele hypothesis was the basis to develop aenome-wide human haplotype map (ie, HapMap) that willescribe the major haplotypes and tag-SNPs of the humanenome.33

On the other hand, the common disease/rare allele hy-othesis proposes that most common complex diseases areaused not by common but rather rare alleles.15,16 Thisypothesis proposes that more than 99% of genetic variantsredisposing to common complex diseases arose followinghe global expansion and divergence of the human popula-ion.16 Additionally, it predicts that common complex dis-

ases will demonstrate allelic heterogeneity (ie, different

dgcdrmwldhoo

tpobi

sttwgptmwogsicst

tactaaiovhg

cbth

ddvcas

gtgotcowgaisdvida

roedgSrc

F(cser(lpw

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1733

isease-causing alleles at the same locus) and locus hetero-eneity (ie, disease-causing alleles at separate loci), furtheromplicating the discovery of genetic elements that causeisease. If the common disease/rare allele hypothesis is cor-ect, then genome-wide association studies utilizing com-on alleles to interrogate a heterogeneous populationould prove insufficient to identify the genetic variant

eading to increased susceptibility to common complexiseases. To this extent, the ongoing construction of theuman HapMap would be inadequate to define the variantsf common complex diseases simply because it was devel-ped based on common alleles.

While these 2 hypotheses are certainly at odds regardinghe commonality of susceptibility alleles in common com-lex diseases, the heterogeneity of presentation and outcomef most such diseases has led investigators to believe thatoth rare and common genetic variants will prove to play anmportant role in the development of these diseases.

Identifying disease-causing genetic variants:tudy designs and strategies. Linkage analysis. This es-ablished method has proven successful to confine and iden-ify genes causing Mendelian disorders.34 Linkage analysisas developed on the basis that alleles of disease-causingenes and genetic markers (which are evaluated for) ifresent on the same chromosomes should segregate simul-aneously, meaning that they are physically linked. Duringeiotic recombination, however, chromosomes do not al-ays remain intact. Crossing over between a pair of homol-gous chromosomes may lead to the separation of diseaseenes from genetic markers. Thus, the interlocus chromo-omal distance is proportionally related to the probability ofndependent transmission of alleles (in other terms, theloser the alleles are located, the higher the chance they willegregate together because meiotic recombination betweenhem becomes more rare).

Practically, linkage analysis searches for the cosegrega-ion of polymorphic genetic markers, such as microsatellitesmong affected family members. In a family, the sharedhromosomal region(s) of the affected members should con-ain the gene(s) causing the disease. When linkage between

disease and a genetic marker(s) is documented, thendditional markers (ie, SNPs) within this genomic region ofnterest can be evaluated to more precisely map the locationf the disease-causing gene.34 Linkage analysis has provenictorious for identifying genes of Mendelian diseases withigh penetrance; nevertheless, its applicability to detectenetic variants of complex diseases has been unfulfilling.

Association studies. Association analysis is a case-ontrol study design that seeks a statistical correlationetween particular genetic variant(s) and a disease orrait.35 Large association studies (ie, �1000 participants)

ave greater statistical power than linkage methods to p

etect causative genetic variants of small effect on theisease phenotype of complex diseases.36 The geneticariants (more likely SNPs) could be located on genes (ie,andidate genes) or distributed throughout the genomend thus can be used in 2 separate approaches as de-cribed below.

The approach of direct association (Figure 9) seeksenetic variant(s) of plausible candidate genes and obeyshe following procedure14,37: (1) selection of candidateenes that credibly may be involved in the pathogenesisf the disease of interest; (2) identification of the func-ional genetic variants with or in close proximity tooding regions of the candidate genes; (3) ascertainmentf subjects, including careful phenotyping in cases andell matching of controls; (4) genotyping of chosenenetic variants in cases and controls; and (5) statisticalnalysis to determine whether significant association ex-sts between the examined variants and the disease. De-pite the statistical power of association studies, repro-uction of published disease associations to geneticariants is uncommon. The reasons for this disparitynclude small study sample size, stratification biases, andisease locus heterogeneity (ie, presence of causal variantst different loci among patients).

The strategy of indirect association (Figure 9) is cur-ently not possible. However, it will become achievablence the human HapMap38 is complete. This approachntails a genome-wide association study in which hun-reds of thousands of tag-SNPs that cover the entireenome are genotyped in both patients and controls.ubsequently, LD analysis is applied to map the genomicegion(s) that are indirectly associated with disease sus-eptibility genes or variants. This methodology is un-

Evaluating SNPs in Association Studies

Direct association Indirect association

igure 9. On the left, a SNP (red triangle) located within an exonrectangular box) of a candidate gene will be directly tested for asso-iation to a disease phenotype. The proposed causative SNP iselected for testing given prior experimental evidence regarding itsffect on the function or expression of the candidate gene. On theight, 3 SNPs (blue triangles) that flank the proposed causative SNPred triangle) will be genotyped to test indirectly the association of theatter to disease phenotype. The 3 SNPs were selected based on LDatterns in the vicinity of the gene of interest. Modified and reprintedith permission from Hirschhorn et al.38

rejudiced regarding specific genes or genomic regions.

Nu

stlaTetcaaweatrcc

ftdd

FefaToWpa

a1�Agtcpbdht

eptat

aacr3mteAila

mtetescuutHaHdcvhpacp

tistmttddI

1734 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

evertheless, biases can still take place because of pop-lation stratification.39

Transmission disequilibrium test. To tackle thetratification bias of case-control association studies, sta-istical geneticists developed the transmission disequi-ibrium test.40 The transmission disequilibrium test isctually an association study based on a family design.he statistical method uses data on typed genetic mark-rs derived from the probands and both parents. Theransmission disequilibrium test evaluates the frequen-ies of parental alleles that are transmitted to theirffected offspring compared with the frequencies of thelleles that are not transmitted. If a disease is associatedith a high-risk allele (ie, causative variant), then it is

xpected that the frequency of this allele will be highermong the alleles transmitted compared with the non-ransmitted alleles. Because the transmission disequilib-ium test eliminates the concern of population stratifi-ation bias, it can be used to verify positive findings ofase-control association studies.40

Genetics/Genomics in Digestive andLiver Diseases

To provide a stage for our discussion about theuture impact of genomics and proteomics on gastroen-erology and hepatology clinical practice, selected para-igms of Mendelian and complex digestive and liveriseases are presented.

Mendelian diseases. FAP. The prevalence ofAP is 2–3 cases per 100,000 individuals, and the dis-ase phenotype is transmitted in an autosomal dominantashion. The FAP syndrome includes related entities suchs attenuated FAP, Gardner’s syndrome, and sometimesurcot’s syndrome. At age 35 years, approximately 95%f patients with FAP have thousands of colon polyps.

ithout early diagnosis of FAP and subsequent thera-eutic colectomy, colon cancer is expected by the meange of 39 years.

The cause of FAP, the APC gene located on the longrm of chromosome 5 (5q21-q22), was discovered in991 using positional cloning approaches.41 Although1% of colon cancers are due to FAP, discovery of thePC gene has shed light on understanding the patho-enesis of sporadic colon cancer. Indeed, we now knowhat more than 80% of all adenomas and cancers of theolonic mucosa exhibit early inactivation of the APCrotein.42 APC is a tumor-suppressor gene, and loss ofoth alleles is necessary to lead to carcinogenesis. Toate, more than 800 different germline APC mutationsave been described and the vast majority (�90%) cause

runcation of APC, resulting in loss of function. Inter- a

stingly, the location of the APC gene mutation canredict the number and extracolonic manifestations ofhe syndrome.43 Moreover, identical mutations amongffected individuals could cause distinct FAP pheno-ypes, suggesting the presence of modifier gene(s).

In addition to APC gene sequencing and linkagenalysis in affected families, protein truncation testing isn in vitro assay to assure that a proposed mutationauses FAP. Sequencing of the APC gene is more accu-ate than protein truncation testing. Of interest, up to0% of newly diagnosed cases represent de novo APCutations. Thus, failure to identify a known APC mu-

ation does not exclude FAP. Although much has to belucidated about the genotype-phenotype associations ofPC and FAP, genetic testing in affected families has an

mpact on early disease diagnosis among relatives and canead to prevention and treatment of colon cancer and thessociated extracolonic manifestations of the syndrome.

Hereditary hemochromatosis. Hereditary hemochro-atosis is the most common genetic disease in popula-

ions of European ancestry, with a prevalence of 1 invery 300 individuals. Hereditary hemochromatosis isransmitted as an autosomal recessive trait. This is anlusive disease because of the nonspecific nature of itsymptoms, making diagnosis difficult at times. The dis-overy of the hemochromatosis gene (HFE) in 1996sing positional cloning techniques has been critical tonderstanding the pathogenesis and natural history ofhe disease and in devising diagnostic strategies.44 TheFE gene is located on the short arm of chromosome 6

nd exhibits primarily 2 mutations (ie, C282Y and63D) that can cause hereditary hemochromatosis. In-

eed, C282Y homozygotes and H63D homozygotes ac-ount for �80%–85% and �1% of the affected indi-iduals, respectively. Approximately 4% of hereditaryemochromatosis cases are due to C282Y/H63D com-ound heterozygotes. Nevertheless, in the past 5 years,dditional rare non-HFE genes have been identified thatan predispose to familial iron overload, including ferro-ortin 1,45 transferrin receptor 2,46 and hemojuvelin.47

Hereditary hemochromatosis was considered the pro-otype genetic disease for widespread population screen-ng. First, the disease is common and preventable usingimple phlebotomies once the diagnosis is made. Second,here is available genetic testing consisting of only 2utations. Despite these favorable facts, the initial en-

husiasm of population screening was dissipated givenhe fact that significant numbers of C282Y homozygoteso not express the disease and the lack of evidence of liverisease progression in many C282Y homozygotes.48,49

nterestingly, collected data from 14 studies showed that

bout 50% of C282Y homozygotes do not exhibit iron

opnthoitfcfi

rlbaldTcsc(tp

remaHpapstgdsAhagl

ai1ruIc

GadcgfmCCf

srwctpts

tpTCaoCLaNtdros2atsachh

pvvciwr

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1735

verload.50 Nevertheless, long-term observation of suchatients indicated accumulation of iron, suggesting theeed for follow-up of these individuals, including repeatesting for transferrin saturation. These cases of C282Yomozygosity with lack of penetrance imply the presencef modifier gene(s) that interact with HFE to produce theron overload phenotype. Once hereditary hemochroma-osis is diagnosed, however, genetic testing of the af-ected pedigrees is crucial. Siblings have the greatesthance of carrying the HFE gene compared with otherrst-degree relatives.

Wilson’s disease. Wilson’s disease is an autosomalecessive disorder of copper metabolism with a preva-ence of 1 in 100,000 individuals. The disease is causedy defective biliary excretion of copper, leading to itsccumulation in the liver, brain, and cornea. The corol-ary of copper buildup in these organs is the eventualevelopment of severe hepatic and neurologic disease.he gene for Wilson’s disease, located on the long arm ofhromosome 13 (13q14.3-q21.1), was identified by po-itional cloning approaches in 199351,52 and codes for aopper transporting P-type adenosine triphosphataseATP7B). The ATP7B gene product delivers copper tohe apoceruloplasmin and mediates the excretion of cop-er into bile.53

More than 200 distinct genetic defects have beeneported in patients with Wilson’s disease. The inheritedrrors of the ATP7B gene include nonsense and missenseutations, insertions, and deletions. Some mutations are

ssociated with more severe, early-onset disease. Theis1069Gln missense mutation exists in 30%–60% of

atients with Wilson’s disease and of eastern, central,nd northern European ancestry but is rarely seen inatients of non-European origin. The diagnosis of Wil-on’s disease is made by clinical examination, laboratoryests, and liver biopsy. Mutation analysis of the ATP7Bene is limited given the large number of mutations. Toate, gene screening is important for suspected and un-uspected relatives of a proband when the causativeTP7B mutation of the index case is known. Otherwise,aplotype analysis around the disease locus can be used inn affected family to identify carriers of the defectiveene. Therefore, the clinical impact of genetic testing isimited to affected pedigrees.

Complex diseases. IBDs. CD and chronic ulcer-tive colitis (CUC) are chronic inflammatory diseases thatnvolve the GI tract. The prevalence of CD and CUC is0–100 cases and 35–100 cases per 100,000 individuals,espectively. A genetic epidemiologic approach to eval-ate the heritability of a complex disease is a twin study.n such a design, the concordance rate of a disease is

ompared between monozygotic and dizygotic twins. t

iven that monozygotic and dizygotic twins share 100%nd 50% of their genetic material, respectively, if theisease of interest has a strong genetic element, a greateroncordance rate of the disease is anticipated in monozy-otic compared with dizygotic twins. In a twin studyrom Sweden, the index case concordance rate amongonozygotic twins was 6.3% for CUC and 58.3% forD54 indicating that heredity is stronger in CD than inUC. In another study, IBD was found to aggregate in

amilies.55

Since the mid-1990s, several whole genome linkagetudies have been performed to dissect the genomicegions causing susceptibility to IBD. The first genome-ide linkage study for IBD was reported in 1996, indi-

ating 16q as the specific locus for CD (IBD1).56 Sincehe first report, 6 other whole genome studies have beenerformed and identified additional loci that fulfill cri-eria for significant linkage to IBD, including chromo-omal regions 1p, 5q, 6p, 12q, 14q, and 19p.57

CD represents one of the successful scientific endeavorso identify a susceptibility gene of a complex disease byursuing fine mapping approaches of the IBD1 locus.hese strategies led to the simultaneous discovery of theARD15 gene (N-terminal caspase recruitment domain)s the first susceptibility gene of CD by 2 separate groupsf investigators.19,58 Three common variants ofARD15, namely, Arg702Trp, Gly908Arg, and aeu1007fsinC, which truncate the CARD15 protein, aressociated with CD. The CARD15 gene, also known asOD2, possesses apoptosis and nuclear factor �B activa-

ion regions (CARD1 and CARD2), a nuclear bindingomain, and a bacterial recognition region. The relativeisk of developing CD having one susceptibility variantf CARD15 is 1.5–3.0. However, the relative riskharply increases to 20–40 when an individual possesses

susceptibility variants of CARD15. Nevertheless, thebsolute risk of developing CD is about 3% (in othererms, �1 in 25) even for homozygous individuals,upporting the concept that putative modifier gene(s)nd environmental exposures do interact with the sus-eptibility variants of CARD15 to cause CD. To this end,ealthy homozygous carriers of the CARD15 variantsave been reported.59

Interestingly, there are significant differences in therevalence of CARD15 variants among patients of di-erse ethnic origin with CD. The association of CARD15ariants with CD is strong in European, North Ameri-an, and Australian patients but less prominent in Finn-sh, Irish, and Scottish cohorts.60 In contrast, patientsith CD from Japan, Korea, and China lack any of the 3

eported CARD15 variants.60 These observations reflect

he known genetic heterogeneity (ie, different variants

cvt

nIch(CcCd2

hcnwbth

pt2oGalFdg.HalcclvHapd

Bowapm

vtBraotrB

hcuggmpsbsd

toptisdppi

Dmt4pss1nlKmlfesl

1736 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

ausing a similar phenotype) and locus heterogeneity (ie,ariants at separate loci causing a comparable phenotype)hat describe the pathogenesis of complex diseases.

The second whole genome scan in IBD revealed sig-ificant linkage within a domain of chromosome 12 (ie,BD2 locus),61 which is stronger among CUC pedigreesompared with CD families. Moreover, a European studyas dissected another IBD locus on chromosome 6pIBD3), which confers susceptibility to both CD andUC.62 In 1999, a US study identified the IBD5 locus onhromosome 5 (5q31-33) that is specific for CD and notUC.63 Additional fine mapping of the IBD5 locusemonstrated a single, highly conserved haplotype of50 kb in length that was associated with CD.64

The above family-based whole genome studies in IBDave proven, as it was expected, that likely many lociontribute to CD or CUC. As we move forward, we willeed to compile all these data and apply them in large,ell-designed whole genome association studies beforeeing able to better define and verify the several suscep-ibility variants of IBD. Only then can such observationsave an impact on gastroenterology clinical practice.

Helicobacter pylori/peptic ulcer disease. Helicobacterylori is likely one of the most prevalent infectious agentshat populates humans. Even in developed countries,5%–50% of the population is infected. Nevertheless,nly 10%–20% of H pylori–infected individuals developI diseases (ie, gastritis, gastroduodenal ulcer) and have

n elevated risk of gastric cancer. Evidence suggests theikelihood of genetic susceptibility to H pylori infection.or example, twin studies have shown that the concor-ance rate for H pylori infection was 81% for monozy-otic twins compared with 63% for dizygotic twins (P �001).65 To start dissecting the genetic component(s) of

pylori infection in humans, a genome-wide linkagenalysis was performed in 143 infected Senegalese sib-ings.66 This study showed linkage on the long arm ofhromosome 6, an area where the gene that encodes forhain 1 of the interferon gamma receptor (IFNGR1) isocated. Subsequently, sequencing of IFNGR1 gene re-ealed 3 polymorphisms, [56 C ¡ T (promoter),318P (exon 7), and L450P (exon 7)], that were associ-

ted with H pylori infection.66 This study suggested thatolymorphisms of interferon gamma play a role in pre-isposing to H pylori infection in humans.

Barrett’s esophagus and esophageal adenocarcinoma.arrett’s esophagus (BE) is characterized by replacementr metaplasia of the distal squamous esophageal mucosaith a columnar epithelium. When the latter is defined

s specialized intestinal metaplasia, there is potentialrogression to esophageal adenocarcinoma through a

etaplasia-dysplasia-carcinoma sequence. 1

In addition to environmental determinants, geneticariants are likely risk factors for developing BE. First,here are multiple case reports of familial aggregation ofE involving several generations of relatives.67 Second,

eports of BE inheritance in sibling pairs suggest anutosomal recessive inheritance.68 Third, only a portionf patients with reflux symptoms develop BE.69 Fourth,he degree of reflux exposure and a prior diagnosis ofeflux esophagitis do not predict the development ofE.70

Interestingly, first-degree relatives of patients with BEave a higher prevalence (41%) of heartburn symptomsompared with the spouses (12%) of the latter who weresed as controls.71 Moreover, reflux symptoms displayreater concordance in monozygotic compared with dizy-otic twins, with an estimated heritability of approxi-ately 30%.72 Finally, in a large pedigree with severe

ediatric reflux disease, which was inherited in an auto-omal dominant fashion, investigators found a suscepti-ility locus on chromosome 13q14 using linkage analy-is. However, it is not known whether these patients willevelop BE in the future.73

Celiac disease. Celiac sprue (CS) or gluten-sensi-ive enteropathy is characterized by inflammatory injuryf the small intestine mucosa due to ingestion of cerealrolamins. Once believed to be rare in the United States,he prevalence of CS is now estimated to be 1 in 250ndividuals.74 Genetic susceptibility to CS is stronglyupported by family studies. For instance, the concor-ance rate for the disease is 75% for monozygotic com-ared with 11% for dizygotic twins.75 Moreover, therevalence of CS among first-degree relatives of affectedndividuals is approximately 10%.76

CS is strongly associated with HLA-DQ277 and HLA-Q8.78 Despite this evidence, additional non-HLA lociay exert susceptibility to CS. Studies have estimated

hat the HLA-associated risk for developing CS is 20%–0%.79 To date, several genome-wide scans have beenerformed in CS-affected families. A study from Irelandhowed strong linkage for CS with the HLA region anduggestive linkage in chromosomal regions 6p, 7q31,1p11, 15q26, 19q, and 22cen.80 Two independent ge-ome-wide analyses from Italy reported evidence forinkage in chromosome 5q.81,82 A study from the Unitedingdom showed indicative evidence for linkage in chro-osome domains 10q and 16q and less evidence for

inkage in 6q, 11p, and 19q.83 Moreover, in Finnishamilies with CS, a whole genome scan found strongvidence for linkage in the HLA region at 6p21.3 anduggestive evidence for linkage in 6 other chromosomalocations: 1p36, 4p15, 5q31, 7q21, 9p21-23, and

6q12.79 CS is the prime paradigm of a GI disease in

wpp

tIotsftst

go3pcroI1tmIct

IvtTtdta5stLsttfpscp

p

tGa-di0

vmrwtTtow

pisccattcptarMhipaPlteDseacptp

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1737

hich the genetic and environmental (ie, gluten) com-onents exist and interact in causing the diseasehenotype.

IBS. IBS is perhaps the most common disorderhat gastroenterologists encounter in clinical practice.BS is characterized by recurrent abdominal discomfortr pain and is associated with diarrhea and/or constipa-ion. Although pathogenetic abnormalities from visceralensitivity to abnormal gut motility and putative riskactors such as psychological causes have been postulated,he molecular mechanisms underlying IBS remain elu-ive. Lately, genetic epidemiology approaches suggesthat IBS may have a hereditary component.

Several epidemiologic studies indicate that IBS aggre-ates in families. Investigators have reported in a groupf 100 consecutive outpatient IBS visits to the clinic that3% of their patients had a family history of IBS com-ared with 2% of age-, sex-, and social class–matchedontrols.84 Moreover, in a population-based study it waseported that a first-degree relative with abdominal painr bowel problems was associated with a proband statingBS symptoms (odds ratio, 2.5; 95% confidence interval,.5–4.2).85 Additionally, in a twin study from Australia,he concordance of functional GI symptoms was 33% foronozygotic compared with 13% for dizygotic twins.86

n the largest twin study from the United States, theoncordance rate for IBS was 17.2% for monozygoticwins compared with 8.4% for dizygotic twins.87

Several candidate genes have also been examined inBS using association studies. One investigated geneticariant is an insertion/deletion within the promoter ofhe serotonin transporter gene (5-HTT or SLC6A4).88

his polymorphism causes a short (S) and a long (L)ranscript of 5-HTT, in which the former results iniminished transcription and thus lower uptake of sero-onin, which may have functional consequences (ie, di-rrhea).88 In fact, physiologic studies using either a-HT4 receptor agonist or a 5-HT3 receptor antagonistupport the contribution of serotonin in gut motili-y.89,90 In a small study, no significant differences of the- and S-alleles of 5-HTT were found between IBSubphenotypes (ie, diarrhea-dominant IBS and constipa-ion-dominant IBS).91 In contrast, another study fromhe Middle East reported that the LS genotype was morerequent in patients with diarrhea-dominant IBS com-ared with controls (P � .05).92 Nevertheless, anothertudy has shown that the SS genotype was 2 times moreommon in patients with diarrhea-dominant IBS com-ared with healthy controls.93

Other candidate genes investigated in IBS include the2-adrenergic receptors, the norepinephrine transporter

rotein, interleukin-10, transforming growth factor �1, w

umor necrosis factor , and the �3 subunit of the-coupled protein. In a study of 276 patients with IBS

nd 120 controls, the 2c del322-325 and the 2a1291C/G variants were associated with constipation-ominant IBS, with odds ratios of 2.48 (95% confidencenterval, 0.98–6.28) and 1.66 (95% confidence interval,.94–2.92), respectively.94

Beyond identifying susceptibility genetic variants, in-estigators studying IBS have started to perform phar-acogenomic studies to better understand individual

esponse to treatment. One group genotyped 30 patientsith symptoms of diarrhea-dominant IBS for the inser-

ion/deletion within the promoter of the 5-HTT gene.hose patients were taking alosetron (ie, a 5-HT3 recep-

or antagonist) and were found to have a greater declinef colonic transit studies in LL homozygotes comparedith LS heterozygote patients.91

Technologies

Many technical breakthroughs were necessary toropel forward the completion of the HGP, the mostmportant of which were the invention of capillary-basedequencing and the vast improvements to PCR. Beforeompletion of the HGP, it was realized that new low-ost, high-throughput technological platforms for thessessment of nucleic acids would be necessary to cataloghe genomic variation between individuals. This realiza-ion has spurred great innovation in the applicability oflassic nucleic acid techniques to high-throughput ap-roaches and inspired completely new techniques towardhese ends. A plethora of platforms for both genotypingnd expression profiling have entered the market inecent years, too many for complete review in this report.ost of these use the unique property of nucleic acid

ybridization to achieve their goals, be it SNP genotyp-ng or gene expression profiling, although other ap-roaches such as mass spectroscopy have been taken. Thebility to easily and rapidly reproduce nucleic acids withCR and detect them using hybridization techniques has

ed to considerable throughput in these genomics-basedechnologies. The study of proteins, however, is not asfficient. Proteins cannot be reproduced as easily asNA, and detection techniques have previously required

pecific ligands to be produced (quite laboriously) forach protein. These facts and the heavy focus on nucleiccids in the recent past (the HGP) have resulted in aonsiderable lag in the throughput capabilities of theroteomic technologies. However, recent strides in pro-eomic technology have been made and are a vast im-rovement over traditional biochemical approaches. Here

e offer an overview of some of the technologies used for

STsrntlrtc

vatp

smTDcropnsttppSffTwmpoiat

uflMTwmSdaitnt

fmuGfm1TSaeTteefgddhcwili

T

PSA

I

P nome

1738 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

NP genotyping, expression profiling, and proteomics.his is not meant to be an exhaustive list but provides

ome paradigms of the current technological platformsegarding methods, flexibility, throughput, and useful-ess. Costs related to these technologies are prohibitiveo the traditional single investigator wet bench academicaboratory, mainly due to the significant investmentequired to acquire the equipment. Thus, large institu-ions tend to offer access to these platforms throughentrally administered core facilities.

Genotyping platforms. Here we provide an over-iew of select platforms for SNP genotyping. Includedre companies providing laboratory equipment as well ashose offering a genotyping service. An outline of theselatforms is provided in Table 2.

Pyrosequencing. Pyrosequencing is a quantitative,equencing-based genotyping platform developed andarketed by Biotage (http://www.pyrosequencing.com).his technology effectively sequences short lengths ofNA via detection of light emissions from a luciferase-

atalyzed reaction to quantitatively detect pyrophosphateelease during incorporation of individually applied de-xynucleoside triphosphates. Although this assay can beerformed on 96 templates at a time, parallel analysis ofumerous SNPs is impractical. Thus, this platform is notuitable for approaches requiring large numbers of SNPso be typed, such as whole genome association. In addi-ion, the low-throughput results in a relatively higher-SNP cost. However, the pyrosequencing technologyrovides for quantitation of allele frequency and presentsNPs in the context of the surrounding gene sequence,eatures that are not common to high-throughput plat-orms, which generally offer an “allele call” for each SNP.he quantitative feature of pyrosequencing lends itselfell to the quantification of gene copies and extent ofethylation studies,95 while the sequence component

rovides assay quality control and allows for the typingf consecutive or closely based SNPs. Thus, pyrosequenc-ng is an important tool when high quality or uniquepplications are needed.96 It is an excellent choice for the

able 2. Examples of Genotyping Platforms

Platform SNP determination

yrosequencing Sequencing based Small SNequenom Mass spectroscopy Small SNffymetrix/ParAllele Hybridization to GeneChip Larger SN

whole gllumina Hybridization to beads Larger SN

whole gerlegen Hybridization to GeneChip Whole ge

yping of a small number of SNPs. c

Sequenom. Sequenom (http://www.sequenom.com)ses a matrix-assisted laser desorption/ionization time-of-ight (MALDI-TOF) mass spectroscopy platform known asassARRAY. SNPs are genotyped using the MassEX-END assay, a label-free primer extension assay coupledith specific extension terminators, to produce DNA frag-ents of differing size depending on the allele of the assayed

NP and discriminated based on mass.97 This assay isesigned for a 384-well format, allowing for a number ofssays to be prepared simultaneously. The MassARRAYnstrument can also be used for SNP discovery and quan-itative gene expression. Mass spectroscopy offers an alter-ative to the fluorescent-based technologies used to geno-ype SNPs.

Affymetrix DNA chips/ParAllele platform. Af-ymetrix (http://www.afyymetrix.com), the company pri-arily known for their gene expression profiling prod-

cts, also offers genotyping arrays based on theeneChip platform (which is described in detail in the

ollowing text). Currently, Affymetrix offers 2 gene-apping assays, allowing for the simultaneous analysis of

0,000 and 100,000 SNPs per sample, respectively.hese assays overcome a primary bottleneck of large-scaleNP genotyping, the need for locus-specific SNP primersnd/or probes, by using a generic single primer to gen-rate the fragments necessary for allelic discrimination.his is accomplished through restriction digestion of the

arget genomic DNA with consequent ligation of adapt-rs to the DNA ends, PCR is performed using an adapt-r-specific primer, and products are size fragmented be-ore hybridization with the chip. Although this approachreatly simplifies the preparation of targets for alleliciscrimination, it is restricted by an inability to customesign assays. To address these shortcomings, Affymetrixas partnered with ParAllele (http://www.parallelebio.om) to provide standard and customizable assays for useith the Affymetrix system. ParAllele uses molecular

nversion probe technology, which is basically an oligoigation assay with a unimolecular mode of action allow-ng for amplification of targeted SNPs by PCR with

Suitable applications Access

s, candidate SNPs, special situations Equipment onlys, candidate SNPs Equipment onlyts, candidate genes, fine mapping,e mapping

Equipment only

ts, candidate genes, fine mapping,e mapping

Equipment and service

mapping Service only

P setP setP seenomP seenom

ommon primers and analyzed by universal sequence tag

D1i3Pd

agmtsaavdattctssAwlllhpwte

spttalra

Coippcutp

otbmtoimrmmmottSsunmwacc

pooWwdtticostnchosedtnhihl

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1739

NA arrays.98 ParAllele currently offers a single-tube0,000 coding SNP assay99 and the MegAllele Genotyp-ng 10K cSNP kit in addition to their user-customizableK, 5K, and 10K SNP assays. With such products,arAllele is an excellent choice for functional SNP can-idate gene studies.

Illumina. Illumina (http://www.illumina.com) is“full-service” genotyping company offering both a

enotyping service and a commercially available equip-ent platform based on their proprietary BeadArray

echnology. This technology uses oligo-coated beads thatelf-assemble into microwells chemically etched into anrray substrate, resulting in a randomly ordered DNArray.100 This array undergoes a decoding process pro-iding a map used in the downstream analysis of theata.101 The end result is a highly parallel, customizablessay capable of simultaneously interrogating tens ofhousands of SNPs on each sample. The strength of thisechnology is its flexibility, because it provides a practi-al platform to affordably assess hundreds to hundreds ofhousands of SNPs for each sample. As such, it is auitable platform for fine mapping and candidate genetudies as well as whole genome association studies.lthough incredibly low per-SNP costs can be achievedith this technology, the high cost of the equipment

imits its practicality to core genotyping facilities andarge genotyping programs. Additionally, the setup ofarge genotyping panels is only cost effective if manyundreds to thousands of samples will be typed. Someredesigned assays have been announced, including ahole genome association “bead chip” capable of simul-

aneously assessing 100,000 highly informative SNPs onach sample.

Perlegen. Perlegen (http://www.perlegen.com) is aervice-only genotyping company that primarily servicesharmaceutical companies and large well-funded geno-yping efforts. They use proprietary assays developed onhe Affymetrix gene chip platform that are not publiclyvailable, with the current capability to assess 1.5 mil-ion SNPs per sample. This company’s services are di-ectly applied toward whole genome association studiesnd are not readily suitable to candidate gene approaches.

Expression profiling platforms. Affymetrix Gene-hips. Affymetrix (http://www.affymetrix.com), makerf the GeneChip, is the leader in gene expression profil-ng technologies; to date, more than 3000 scientificapers have been published on the use or review of thislatform. The process they use to manufacture theirhips was adapted from the semiconductor industry andses photolithography coupled with solid-phase chemis-ry to “build” oligonucleotides directly onto the chips,

roviding an extremely accurate, high-density array of g

ligonucleotide probes. Gene expression profiling withhe Affymetrix GeneChip is conceptually simple. La-eled complementary RNA targets are prepared from theRNA of the experimental samples and hybridized to

he oligonucleotides on the GeneChip. The level of flu-rescence associated with each location of the GeneChips assessed using a laser-scanning fluorescent confocalicroscope scanner (sold by Affymetrix) and directly

eflects the amount of mRNA in the sample. Affymetrixanufactures GeneChips to assess gene expression forore than 20 species, both plant and animal. The Hu-an Genome U133 Plus 2.0 Array is the current flagship

f the Affymetrix product line. This GeneChip allows forhe simultaneous interrogation of more than 47,000ranscripts selected from the GenBank, dbEST, and Ref-eq databases, representing the complete known tran-criptome and including many expressed sequences withnknown function. Affymetrix also offers a Human Ge-ome Focus Array based on the U133 chip, representingore than 8500 verified genes as a tool for investigatorsith limited resources. Additionally, Affymetrix offers

n array customization program for those investigators orompanies wishing to produce arrays with a focusedontent, such as pathway-specific or cell-specific genes.

Spotted arrays. Spotted arrays are produced by de-ositing PCR-produced complementary DNA (cDNA)r long (generally 40–100 base pairs) oligonucleotidesnto glass slides for use in gene expression analysis.

hile potentially more economical than GeneChipshen produced en masse, the production of these arraysoes require specialized equipment and expertise withhe handling of large cDNA or oligonucleotide collec-ions. Quality control of these collections is of utmostmportance, because misidentification or contaminationan have drastic effects on experimental results. The usef cDNA on spotted arrays is substantially labor inten-ive due to the necessity of PCR amplification, purifica-ion, and verification for thousands of products. Oligo-ucleotides can be readily produced or procuredommercially, somewhat reducing labor. Shorter oligosave less potential for cross-hybridization than longerligos or cDNAs but tend to give weaker signals. Probeet selection can be a daunting task for the individual,specially if a broad assessment of the transcriptome isesired, because not only do thousands of sequences needo be identified but potential cross-hybridization alsoeeds to be minimized. Array-to-array reproducibilityas been traditionally hard to control, but improvementsn the equipment to spot the arrays has improved this;owever, spotted arrays are not often used to compare aarge number of samples. Instead, spotted arrays are

enerally used to analyze a mixture of 2 differentially

ltdAstapeaofe

pecfbsdapapidaqlppitmsaoS

iwaoddsodcoi

cmlafpmtmwisBssrfgpdttpmPF

ad

FtsCI

1740 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

abeled samples, providing a ratio of expression betweenhe 2 samples. There are a few companies offering pre-esigned and custom-made cDNA spotted arrays, such asgilent (http://www.agilent.com) and others offering

potters and scanners, including Affymetrix. The poten-ial pitfalls of spotted arrays coupled with the widevailability and affordable customization of GeneChipsreclude spotted arrays as a practical approach to globalxpression profiling. However, the potential for spottedrrays to be much more sensitive (detect smaller amountsf RNA in a sample) than GeneChips may prove to haveuture applicability to the study of subtle variation inxpression and genes regularly expressed at low levels.

Proteomics platforms. Two-dimensional gel electro-horesis. Two-dimensional gel electrophoresis is a pow-rful method for the separation of proteins by bothharge (first dimension) and mass (second dimension)rom complex mixtures such as tissues, culture media, oriologic fluids. Once stained, the product of 2-dimen-ional gel electrophoresis is a 2-dimensional matrix ofots representing different proteins that can be isolatednd characterized by mass spectroscopy. Its primary ap-lications are in the global assessment of the proteome ofsample and in the identification of differentially ex-

ressed proteins between samples. Proteome assessmentnvolves the isolation and characterization of all proteinots for a single sample. This is a potentially powerfulpproach but is reliant on staining techniques that can beuite insensitive and will miss many proteins that are atow concentration and thus does not provide a completeroteome assessment. Identification of differentially ex-ressed proteins is accomplished by (1) comparing spotntensities between 2 samples using dyes that stain pro-eins in a quantitative manner or (2) comparing theatrix for variation in layout (ie, dots present in one

ample, missing in another). A number of publicly avail-ble databases of 2-dimensional gel results can be foundn the Internet, the most complete of which isWISS-2D PAGE, found at http://us.expasy.org/ch2d/.

Mass spectroscopy. Mass spectroscopy is a tool allow-ng for accurate and sensitive determination of moleculareight and structural information from biomolecules such

s proteins and nucleic acids. A mass spectrometer consistsf 3 parts: the ionization source, the mass analyzer, and theetection system.102 Most common mass spectrometers to-ay use electrospray or matrix-assisted laser desorption ionources because these are the most amenable to the analysisf large biomolecules such as proteins.102 A number ofifferent mass analyzers are available to separate the mole-ules based on their mass-to-charge ratio, the most commonf which are quadrupoles, TOF, magnetic sectors (MS), and

on traps. These analyzers vary on their detectable mass-to- m

harge range and compatibility with different ionizationethods. Tandem mass spectrometers using multiple ana-

yzers such as quadrupoles-TOF, MS-MS, and TOF-TOFre widely available and generally used where additional ionragmentation is necessary, such as in the generation ofrotein structural information. A photomultiplier, electronultiplier, or microchannel plate detector is used, based on

he type of detector, to monitor the ion current and trans-it the signal (mass spectra) to the computer. The mostidely applied proteomic application of mass spectroscopy

s the identification of proteins recovered from 2-dimen-ional gel electrophoresis via peptide mass fingerprinting.102

riefly, the recovered proteins are subjected to cleavage withpecific proteases to produce products of varying mass andubjected to mass spectroscopy. The resulting masses rep-esent a fingerprint of the protein, based on the sizes of theragments generated by the protease cleavage.102 These fin-erprints are compared with measured and/or calculatedrotein fingerprints from a number of publicly availableatabases to identify the isolated protein. The Swiss Insti-ute of Bioinformatics offers many tools and links for pro-ein analysis via their Expert Proteome Analysis Systemroteomics server at http://us.expasy.org/. Another com-only used protein sequence database mining tool, Proteinrospector, is offered by the University of California Sanrancisco and can be accessed at http://prospector.ucsf.edu/.

Implications for GastroenterologyClinical Practice

Genomics and proteomics promise unparalleleddvancements in our knowledge of human health andisease. With time, this accumulated information re-

igure 10. The proposed impact of genomics in medicine. Knowinghe genetic contribution to disease will facilitate prevention, diagno-is, and novel therapies. Modified and reprinted with permission fromollins and McKusick (the AMA does not hold copyright to this article.

t was written by an employee of the United States Federal Govern-

ent and thus is in the public domain.)107

gaattalwrbpod

sgbtttdgaimpct3pit(ff

iaodwiladptNDtv

ctoHpendttgptpalaplasobdrbrCfrCdttltd

aotaCbfyCgpk

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1741

arding the structure and function of the human genomend proteome will alter our approach to the diagnosisnd treatment of disease and may dramatically changehe practice of medicine (Figure 10). Advances in geneticesting for disease risk and early disease detection as wells pharmaceutical responsiveness will ultimately increaseife expectancy and quality of life for many individualshile decreasing the overall cost of health care. Down the

oad, the lessons learned from the HGP will result inetter therapies for all diseases, whether through newharmaceuticals modified specifically to the individual(s)r direct manipulation of the genome to achieve theesired result.In the previous sections of this report, we have de-

cribed concepts, facts, and methods regarding humanenomics and the emerging field of proteomics. Weelieve this introduction will facilitate understanding ofhe potential applications, current limitations, and an-icipated impact of these scientific fields and relatedechnologies in the clinical practice of GI and liveriseases. At the present, despite ongoing enthusiasm andreat expectations, the clinical application of genomicsnd proteomics in digestive and hepatic diseases is lim-ted to Mendelian diseases such as hereditary hemochro-atosis, adenomatous polyposis coli, and other inherited

olyposis syndromes. There are many challenges to over-ome before these technologies will be integrated intohe practice of medicine. From our perspective, there arekey components that are essential to move genomic androteomic research forward, including implementationn clinical practice: (1) development and execution ofranslational clinical studies on GI diseases of interest,2) continued technological innovation, and (3) sufficientunding. These important issues are discussed in theollowing paragraphs of this section.

Mainstreaming the sciences of genomics and proteom-cs into clinical practice will require the developmentnd execution of translational clinical trials in GI diseasesf interest. First, we must identify and prioritize the GIiseases in which application of genomics and proteomicsill be most beneficial, taking into account a number of

ssues such as (1) burden of the GI disease in the popu-ation, (2) GI diseases that currently lack efficient ther-pies, (3) likely impact on GI practice, and (4) model GIisease with strong genetic component (ie, CD, CUC,rimary sclerosing cholangitis [PSC]). Second, we needo develop translational studies in partnership with theational Institute of Diabetes and Digestive and Kidneyiseases for research in genomic and proteomic applica-

ions focused on these prioritized GI diseases and rele-

ant projects. p

Common complex GI diseases such as IBS or sporadicolon cancer have high prevalence in the population;hus, genomic/proteomic-based discoveries for these dis-rders will have a strong impact on clinical practice.owever, recognizing that biologic phenomena such as

enetrance, variable expressivity, genetic and locus het-rogeneity, imprinting, and epigenetics cause a highoise-to-signal ratio, we have to accept that it will beifficult to dissect the causative genetic and environmen-al contribution of common complex diseases. To addresshese issues, the current generation of physician-investi-ators will have to develop large (ie, �1000 studyarticipants) research resources of well-phenotyped pa-ients linked to biospecimens (ie, genomic DNA, serum,lasma, tissue of interest) to apply the tools of genomicsnd dissect the genetic variants predisposing to GI andiver diseases. The overarching goal of such an effort ismbitious and will require many resources, including theossibility of collaboration among medical centers foress common complex diseases. To this extent, we maylso have to focus on rare complex GI diseases with atrong genetic component as models because dissectionf the susceptibility genetic variants in these is exected toe less challenging than that of common complex GIiseases. Subsequently, the application of findings de-ived from rare complex GI diseases can be significant inoth. For example, in a family of an affected proband, theelative risk of a sibling to develop sporadic colon cancer,UC, or PSC is 2- to 3-fold, 10- to 20-fold, and �100-

old, respectively.57,103,104 Because of their higher relativeisk, understanding the genetic contribution to PSC orUC (although not an easy task) will be relatively lessemanding compared with dissecting the genetic suscep-ibility to sporadic colon cancer. Nonetheless, unravelinghe genetic predisposition of PSC or CUC could shedight on sporadic colon cancer pathogenesis given the facthat both of the former (ie, CUC and PSC) are indepen-ent risk factors for development of colon carcinoma.Presently, technological innovation is moving rapidly

nd clinically applicable genomic-based approaches aren the horizon. Technology today provides the first stepoward the goal of “personalized medicine” with Foodnd Drug Administration approval of the AmpliChipYP450 array (Roche Molecular Diagnostics) in Decem-er 2004.105 Based on the Affymetrix GeneChip plat-orm, the CYP450 array performs a comprehensive anal-sis of 2 genes of the CYP450 gene family, namelyYP2D6 (29 alleles) and CYP2C19 (2 alleles). These 2enes are likely responsible for the metabolism of ap-roximately 25% of medications currently on the mar-et, including a number of antidepressants, �-blockers,

roton pump inhibitors, and antipsychotics. Genetic

vdwamAt$stt

pnteepeamctwslkstApgptm

psgpNrtodbmtaeas

ievgtc

cfeIsdspmeptrghid

mmda

Foddmqced

1742 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

ariation of these genes determines how well the relevantrugs are metabolized and hence may provide physiciansith a useful tool in choosing the optimum medication

nd/or dose. Variable response to pharmaceutical treat-ent remains a significant hurdle in medical practice.dverse drug reactions are the cause of 2 million hospi-

alizations and 100,000 deaths annually, at a cost of100 billion dollars each year.106 The ability to predictuch adverse reactions and correctly identify optimalherapies thus has the potential to save human lives andhe economy billions of dollars per year in medical costs.

The AmpliChip CYP450 array offers a “predictivehenotype” for each of the 2 genes based on the combi-ation of alleles present in the individual. For CYP2C19,here are 2 such classifications: poor metabolizer (ie, nonzyme activity) and normal metabolizer (ie, normalnzyme activity). There are 4 classifications for CYP2D6:oor (ie, no enzyme activity), intermediate (ie, reducednzyme activity), extensive (ie, normal enzyme activity),nd ultrarapid (ie, higher than normal enzyme activity)etabolizer. Those with poor metabolic capacity for

ertain drugs are most at risk for developing complica-ions due to the toxic buildup of relevant medications,hereas drugs may not reach beneficial levels when pre-

cribed at recommended doses in the ultrarapid metabo-izer. Currently, the high cost of the test, the lack ofnowledge to interpret it by physicians, and the neces-ity for sophisticated laboratory setup to perform theesting ensure that it will be some time before thempliChip CYP450 array is widely used in medicalractice. However, the potential exists. For instance,enotyping patients with IBS for CYP2D6 may be im-ortant to recommend dosage compensation for nortrip-yline or amitriptyline so that the benefit of treatment isaximal.Generous funding is critical for the research enter-

rise. The federal government and private sector haveupported and are committed to continue promotingenomic and proteomic technologies because of antici-ated clinical applications. In fact, the main goal of theHGRI is to sponsor and advance genome-oriented

esearch. For instance, the NHGRI has recently commit-ed more than $38 million in grants to spur developmentf innovative technologies designed to dramatically re-uce the cost of whole genome sequencing, aimed atroadening the applications of genomic information inedical research and health care. The strategic goal of

he NHGRI is to initially reduce the cost of sequencingn entire mammalian-sized genome to $100,000. It isxpected that such technologies will reach commercialvailability 5 years from now, enabling researchers to

equence the genomes of people as part of studies to 2

dentify genes that contribute to common complex dis-ases such as colon cancer and diabetes. Ultimately, theision of the NHGRI is to decrease the price of wholeenome sequencing to $1000 or less, which would allowhe sequencing of individual genomes as part of medicalare.

The ability to sequence each human’s genome canost-effectively give rise to more individualized strategiesor diagnosing, treating, and preventing disease. To thisxtent, the AGA, in collaboration with the Nationalnstitute of Diabetes and Digestive and Kidney Diseases,hould delineate the current and future needs of researchirection and initiatives for digestive diseases and sub-equently sponsor translational patient-oriented studiesoised to leverage these promising technologies as theyature in the near future. Beyond federal funding, how-

ver, the private sector has also invested in genomic androteomic technologies and initiatives. In fact, many ofhe technological innovations have come from privateesearch and development companies. More importantly,enotyping companies such as Illumina and Perlegenave been actively involved in and are currently perform-ng genomic projects in collaboration with major aca-emic medical centers in a variety of diseases.From a theoretical perspective, new technologies (ie,icroarrays, sequencing) will shed light on the abnormalolecular pathways and cellular events that occur in

igestive and liver diseases. Presently, it is difficult toccurately predict the future impact of genomics/pro-

igure 11. The putative clinical impact of genomics and proteomicsver the next 25 years. Genomics and proteomics are currently in aiscovery and research phase, during which time there will be littleirect impact on clinical practice. Nevertheless, expected improve-ents in genomic technology (ie, P450 array, whole genome se-uence cost at $100,000 or $1000 per person) and data analysis,oupled with the knowledge gained in the initial period, will result in anxponential increase of clinical impact; use of genetic testing andesigner drugs are proposed to be introduced into medical practice in

015 and 2020, respectively.

tNtwpampaewp

ccdCC(avtbnbrpmdtg

otpwCattiactn

idmihc

cLdttpedmaa

agd

ahMIgotttamgtiormbese

anposht

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1743

eomics in diagnosing and treating GI and liver diseases.evertheless, it has been proposed that by 2015 genetic

esting for predisposition to selected complex diseasesill become available in clinical practice107 and by 2020redicting drug responsiveness will be standard of carend gene-based designer drugs will be introduced toedical practice (Figure 11).107 We expect genomic and

roteomic technologies to positively affect Mendeliannd complex digestive and hepatic diseases alike. How-ver, as stated earlier, the clinical and economic impactill be greater for the latter because of the higher overallrevalence in the general population.Among complex GI and liver diseases, CD is one that

an be used as a paradigm to envision how genomicsould be implemented into clinical practice. After aecade of research effort, the first susceptibility gene ofD is a reality.19,58 We now know that mutations of theARD15 gene are responsible for a small part of the risk

�5%) of developing CD in the Western hemisphere. Itppears that the penetrance of CARD15 susceptibilityariants, even in homozygotes, is at best 10%. Indeed,here are reports that carriers of the CARD15 suscepti-ility alleles have no CD.59 Collectively, these findingsot only illustrate the complexity of CD pathogenesisut also indicate the need for additional studies to un-avel the yet unknown CD susceptibility gene(s), theutative modifier gene(s), and the environmental ele-ents that are critical to the pathogenesis of this disor-

er. Hence, this case exemplifies the various challengeshat we face before implementing the knowledge ofenomics into clinical practice.The importance of identifying the susceptibility genes

f complex diseases relates not only to diagnosis but alsoo better understanding of the pathogenesis, clinicalresentation, and natural history of disease. For instance,e now begin to appreciate that patients who carry theARD15 susceptibility alleles are diagnosed with CD atyounger age, usually present with ileitis, and have a

endency to develop strictures more frequently duringhe course of disease. For a chronic disease like CD,dentifying patients who will develop complications suchs intestinal strictures or colon cancer has importantlinical implications. To this end, we need to developranslational studies to better define the genotype/phe-otype associations of CD.Beyond complex diseases, however, better understand-

ng of genomics of Mendelian digestive and hepaticiseases is also important to shed light on yet unknownolecular and cellular pathways of relevant diseases. For

nstance, we now know that more than one gene causesereditary hemochromatosis, and this fact has signifi-

antly improved our knowledge of iron absorption and o

ellular processing. In another example, the discovery ofDL receptor, as the result of studying an extremely rareisease (ie, homozygote hypercholesteremia),108 led tohe development of statins, which have revolutionizedhe medical treatment of hyperlipidemias. This is arime paradigm of an exceptionally rare Mendelian dis-ase becoming the basis for prevention of coronary arteryisease, a common complex disorder. Indeed, the esti-ated target population for statins in the United States

lone is 25%–50% of adults older than 45 years ofge.109

Clearly, there are many challenges ahead before we canpply the plethora of genomic and proteomic technolo-ies and science to patients with digestive and hepaticiseases.

Implications for Education andTraining of GastroenterologistsIn the coming years, discovery of genetic vari-

nts that influence predisposition to disease will likelyave an impact on the diagnosis and therapy of severalendelian and complex digestive and hepatic diseases.

n the appropriate setting, use of tests to screen forenetic variants will become an important componentf clinical practice in the future (Figure 11). At thatime, medical practitioners will need to correctly in-erpret the test findings and thoroughly understandhe implications of such testing to effectively diagnosend treat the patient and to prevent illness. To date,ost physicians have no formal training in medical

enetics; therefore, they have difficulty with the in-erpretation of genetic test findings and lack counsel-ng skills. In fact, a study to evaluate the clinical usef APC gene testing reported that genetic testingesults were misinterpreted by physicians in approxi-ately 30% of patients with FAP.110 This finding

ecomes important in patients at risk for highly pen-trant diseases, where incorrect interpretation of re-ults may lead to false-negative diagnosis for an oth-rwise vastly preventable disease.

Previous sections of this report focused on scientificdvances in genomics and proteomics. Nevertheless, weeed to remember that established approaches to medicalractice, such as a gathering and comprehensive analysisf the family history, will remain important towardcreening and treatment of patients with digestive andepatic diseases and their families. Instructions on ob-aining the family history have traditionally been focused

n Mendelian diseases. However, in the near future,

ftsnbatbap

patDaptpsdd

saccutocfiCipt

iplaairbtpCase

ssp

oropttcAA

aeatipwnafvaaa

1744 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

amily history will be applied more often in the con-ext of complex disorders.111 As we gain better under-tanding of the genetic and environmental compo-ents of these diseases, family information willecome increasingly essential. Health care profession-ls need to retain the skill of collecting and assessinghe family history in addition to understanding theasics of genetics before new genetic testing becomes widely used tool for the diagnosis, treatment, andrevention of complex disease.Beyond addressing the genetic susceptibility to com-

lex diseases, new clinical tests will likely focus onssessing individual variation to drug response. Recently,he US Food and Drug Administration approved the firstNA microarray for genotyping 2 genes (ie, CYP2D6

nd CYP2C19) of the cytochrome P450 system (seerevious text). Variants of these 2 genes can predict pooro ultrarapid metabolizers of medications before evenrescribing them. The use of this test could affect deci-ion making on a variety of prescribed medications byefining the optimal treatment dose and/or preventingangerous drug side effects and interactions.

As a result of all these exciting discoveries, medicaltudents to practicing gastroenterologists alike will besked to be able to interpret genetic test results and toommunicate these findings to patients and relatives,olleagues, and referring physicians. Realizing these ed-cational needs, medical genetics is now incorporated inhe curriculum of medical schools. However, to the bestf our knowledge, there is no current medical geneticsurriculum during internal medicine residency or GIellowship training programs. In fact, an Internal Med-cine Residency Training Program Genetics Curriculumommittee was recently formed and a curriculum outline

s now published.112 Nevertheless, to date, there are nolans on educating the practicing gastroenterologist onhe concepts and future applications of genomics.

The educational requirements of the medical student,nternal medicine resident, and GI trainee on genomics androteomics should be broad. The medical genetics curricu-um should include the fundamental concepts, principles,pplications, and shortcomings of both Mendelian diseasend complex disease genetics. Emphasis should be given tonterpreting relevant genetic test results and delivering theesults to and counseling of the patient and family mem-ers. The learning objectives can be achieved through directeaching in the classroom and on medical rounds, reading ofertinent textbooks and interactive multimedia such asD-ROMs, elective clinical experience in medical genetics,nd participation in clinical genetics research. The medicaltudent, resident, and GI fellow should understand the

ssential genetic concepts and terminology, be able to con-

truct a 3-generation pedigree, perform basic genetic coun-eling, and be familiar with on-line genetics, genomics, androteomics resources (see Appendix 2).

The educational needs of the practicing gastroenterol-gist on genomics and proteomics should be directlyelated to clinical practice. An appreciation of the basicsf genetics will be required, along with ability to inter-ret genetic tests and to effectively counsel patients andheir relatives. These learning objectives can be achievedhrough self-directed reading, continuing medical edu-ation courses, attendance of professional meetings (ie,merican Association for the Study of Liver Diseases,GA), and special courses.At the beginning of the 21st century, medicine enters

new era. Despite progress in human genomics and themergence of proteomics, the practicing physician hasnd will have a pivotal role in assessing and dissectinghe phenotype of disease and in applying the technolog-cal advances to provide the best possible care to theatient. Postulating that these laboratory technologiesill begin to enter clinical practice in the next decade(s),ow is the time to begin addressing the needs to educatend train current and future gastroenterologists in theundamentals of medical genetics and the clinical rele-ance of genomic medicine. To this end, the AGA is inunique position to initiate, guide, and supervise such

n action that will greatly benefit each of its membersnd patients suffering from digestive and liver diseases.

Suggested Key Research Questions

Suggested key research questions are as follows.

● What is the current and expected significance ofgenomic and proteomic technologies in gastroenter-ology clinical practice?

● Are the available genomic and proteomic technol-ogies a research tool and/or currently in use forclinical applications?

● Which element(s) will affect the future clinicalimplementation and overall impact of genomic andproteomic technologies?

● Which are the current challenges of genomic andproteomic technologies; how can we overcomethem?

● What will influence the clinical usefulness of theAmpliChip CYP450 array and other similar tech-

nologies in years to come?

pstamstgotcin

iprfimhictdrngbdfpitwtqnewp

gmsdrbn

cceRAcmccmbWne

saptso3psebbtgmrpat

mglfkobprosmalii

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1745

Ethical, Legal, and Social IssuesThe study of genomics and proteomics offers un-

recedented potential toward the understanding and as-essment of disease, especially in the identification ofhose at increased risk of contracting illness, in turnllowing us to tailor our surveillance and treatmentethods to the individual and resulting in a health care

ystem that is markedly more efficient and effective thanhat of today. However, with these advancements comereat social and economic implications with the potentialf slowing or perhaps limiting the successful implemen-ation of these medically beneficial insights. Although aomplete treatise on these ethical, legal, and social issuess outside of the scope of this report, we think it isecessary to briefly address a few key topics.The main concerns of the public regarding genetic test-

ng are currently focused on 2 issues: insurability and em-loyment. A first step in addressing the societal concernsegarding genetic information and insurability was putorth in the Health Insurance Portability and Accountabil-ty Act of 1996. This legislation provided the first federallyandated protections against genetic discrimination in

ealth insurance, most importantly by stating that geneticnformation in the absence of a clinical diagnosis of illnessould not be considered a preexisting condition. However,he Health Insurance Portability and Accountability Actoes not prohibit insurance providers from charging higherates to individuals based on genetic information and doesot prohibit insurers from requiring applicants to undergoenetic testing. Furthermore, the Health Insurance Porta-ility and Accountability Act places no restrictions on theisclosure of genetic information to insurance companies, aact that lends credence to the public’s concerns. For exam-le, individuals potentially benefiting from genetic screen-ng for disease risk may be reluctant to take advantage ofhis screening for fear that their already-high insurance ratesill be increased. Additionally, those identified as at risk

hrough genetic screening may be reluctant to receive ade-uate follow-up disease surveillance during periods whenot insured for fears of clinical diagnosis and thus a “pre-xisting” condition. The concerns regarding insurabilityill become exponentially more important as genomic androteomic knowledge advances.

The second primary concern of the public regardingenetic information is whether employers will be per-itted to consider genetic information in making per-

onnel decisions and therefore unjustifiably rule out in-ividuals from employment for reasons that are notelated to their ability to perform the tasks of their jobut for motive of future economic liability to the orga-

ization. Such concern may be warranted, because some r

ompanies have adopted discriminatory policies towardertain at-risk segments of the population such as smok-rs or the obese, who are not protected under the Civilights Act of 1964 or the Americans with Disabilitiesct of 1990. With the increasing costs of health care, one

an imagine that a business focused on the “bottom line”ay well see a genetic prescreen as a way to reduce these

osts. In turn, these potentially considerable cost savingsould be passed on to their employees such that theyight not mind such an intrusive policy and willingly

ecome participants in de facto genetic discrimination.hile the employers’ role in genetic discrimination is

ot now a major factor, it may become so as our knowl-dge of genomics and proteomics grows.

An important legal aspect of genomic and proteomictudy relates to the patenting of newly discovered genesnd proteins. These natural products have been easilyatentable following isolation, purification, and charac-erization, although recent acceptance guidelines neces-itate that some level of feasible utility for use of the gener protein be shown. To date, there have been more thanmillion genome-related patent applications filed. The

atenting of genes or proteins is a strategy to rewardcientists for their novel discoveries and to protect sci-ntific inventions without secrecy. While this has beeneneficial, providing the means of existence for countlessiotechnology companies and contributing significantlyo the economy, the current accumulation of patentedenes may begin to discourage new investigators, whoay consider that all of the “good” genes are “taken.” In

ealistic terms, genes and proteins have and will beatented. While there are arguments both for andgainst the practice, biologic patents are now inter-wined with the economy; there is no turning back.

Translational clinical studies are necessary to prove theedical applicability of genomic and proteomic technolo-

ies. To this extent, investigators have stored human bio-ogic samples for decades. Such collections can be obtainedor diagnostic and research purposes and could aid ournowledge regarding diseases in humans. Current technol-gies in biology provide effective approaches to use suchiospecimen resources for research, diagnostic, and thera-eutic purposes. Nevertheless, ethical issues have beenaised about these sample repositories and related technol-gies. Important issues regarding the storage of humanpecimens include consent of individuals who provided suchaterials, methods and duration of stored biologic materi-

ls, identification of the material and its source, materialinkage to medical data, transfer and exchange of biomed-cal material, and transformation of genetic material (ie,mmortal cell lines) for future use. It is important that

esearchers should pursue their scientific goals without com-

pracstr

b

srogkrag

AA

A

A

A

AB

C

C

C

C

DD

E

E

EEEGG

GHHH

H

HHHI

III

1746 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

romising the rights of the human subjects involved in suchesearch. To this direction, institutional review boards,long with investigators and research sponsors, must exer-ise great care and social sensitivity in applying the profes-ional guidelines and governmental regulations to protecthe subjects of research whose biologic samples are used foresearch purposes.

The ethical, legal, and social issues pertaining to

iomedical research evolve as science, medicine, and p

ociety move forward and past practices change oredefine. In the genome era, topics related to privacyf the genetic information, fairness in the use ofenetic information, the psychological impact ofnowing the genetic predisposition to disease, accu-acy of genetic testing, and education of physiciansnd the public about the capacities and limitations ofenetic information will continue to challenge health

rofessionals, patients, and society alike.

Appendix 1. Glossary

llele One or more alternate forms of a gene at a specific locuslternative splicing The mechanism by which variation in the use of a gene’s exons leads to the production of multiple mRNA

species, or isoforms, potentially coding for functionally different proteinsssociation study A case-control study design that searches for a statistical correlation between particular genetic variant(s)

and a disease or disease traitutosomal dominant A pattern of inheritance in which one dominant allele is capable of exerting its phenotypic effect,

regardless of the other alleleutosomal recessive A pattern of inheritance in which 2 copies of the same allele are necessary to exert a phenotypic

effectutosomes Any chromosome except for the sex chromosomes (X and Y)ioinformatics The use of computer science and methods for the purpose of speeding up and enhancing biologic

researchandidate gene A gene potentially harboring genetic variants unique to or more prevalent in those with a particular

diseasehromatin The observable (via staining) material in the interphase nuclei, composed of genomic DNA, chromosomal

proteins, and some RNAodon A sequence of 3 DNA or RNA nucleotides coding for a certain amino acid or providing a signal to

terminate translation of a protein (ie, stop codon)omparative genomics Uses the knowledge gained through genomic study of other species to infer the likely function of

unstudied genes in humanseletion Loss of one or more nucleotides from a DNA or RNA sequenceizygotic twins Twins resulting from the fertilization of separate eggs by separate sperm and thus sharing the same

amount of genetic material as nontwin siblings (ie, 50%)pigenetic A nonmutational phenomenon that alters the phenotype without alteration of genotype (ie, promoter

methylation or histone modification)STs Expressed sequence tags—short stretches of DNA sequence expressed at the RNA level and detected by

means that are not gene specific; ESTs are primarily used to identify new genes and with thecompletion of the HGP are being mapped to chromosomes

uchromatin The genetically active, gene-rich part of the chromatinxpressivity The degree of phenotypic expression of a genexon A segment of DNA that is present in fully processed mRNAene expression profiling The large-scale quantitative assessment of gene expression at the mRNA levelenetic epidemiology The study of genetic factors and their interaction with environmental factors in the distribution and

determination of disease in the populationenotype The genetic makeup of an individualaploid genome Genome with only a single chromosome complement, such as are found in gametesaplotype A combination of alleles at 2 or more closely linked genetic lociaplotype block Block-like structures of the genome that are in LD with each other and can be described by a few of the

many potential haplotypeseredity The passage of genetic material from parents to offspring, resulting in the inheritance of similar

traitseterochromatin The genetically inactive, gene-poor regions of the chromatin composed of repetitive DNA sequenceseterozygous Having 2 different alleles at a specific gene locusomozygous Having 2 identical alleles at a specific gene locus

mprinting Differential expression of a gene or chromosome segment dependent on parental origin, often throughepigenetic means

nsertion Addition of one or more nucleotides to a DNA or RNA sequencentergenic region(s) The regions of the genome located between genesntron A nontranscribed region of a gene that does not code for a protein

L

L

L

LMM

MMM

MM

MMMMN

O

PP

P

P

PPP

PR

R

R

R

S

S

SSS

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1747

Appendix 1. (continued)

inkage The tendency of genetic loci on the same chromosome to be inherited together as a consequence oftheir physical proximity

inkage analysis A method to trace and measure the cosegregation of a disease or disease traits in a family using markerloci

inkage disequllibrium Particular alleles at 2 or more adjacent loci that occur together more frequently than expected based ontheir individual frequencies; often termed “allelic association”

oci Plural of locus—the physical location of a gene.ajor allele The most prevalent allele for a given locus in a populationicrosatellites Small string of tandemly repeated DNA sequence, commonly of 1–4 base pairs; represent approximately

105 uniformly distributed loci in the human genomeinor allele The least prevalent allele for a given locus in a populationodifier gene A genetic factor that modifies the expression or phenotypic consequence of expressing another geneonozygotic twins Twins resulting from the fertilization of one egg with one sperm, thus sharing 100% of their genetic

materialutation A permanent, potentially heritable modification in the sequence of genomic DNAutation, frameshift Mutation caused by deletion or insertion of a nucleotide(s) resulting in an altered open-reading frame of a

gene and usually to a truncated proteinutation, missense Mutation resulting in the incorporation of an incorrect amino acid into a proteinutation, nonsense Mutation resulting in a stop codon that causes truncation of a proteinutation, point Mutation of a single nucleic acid, capable of directly causing disease; often confused with SNPsutation, silent Mutation in coding sequence that causes no change in the amino acid sequence of a proteinucleotide substitution Change of a single nucleotide due to slight lack of fidelity in the DNA replication process; is one

evolutionary process producing additional diversity in the speciesligonucleotide A short (20–100 base pairs) nucleic acid molecule generally used as a primer or probe through its ability

to hybridize with complementary nucleic acid sequencesenetrance The likelihood that a person with a specific genotype will express a certain phenotypehenotype The observable physical, biochemical, or physiologic features resulting from the expression of one or

more genesharmacogenomics The study of genetic and environmental factors contributing to the success or failure of pharmaceutical

interventionleiotrophy Multiple, seemingly unrelated and variably expressed phenotypes stemming from the expression of a

single geneolymorphism The existence of 2 or more common alleles at a specific locusopulation bottleneck Decreased genetic variation among members of a population expanded from relatively few foundersositional cloning Identification of specific disease genes exclusively based on their chromosomal position within the

genomeroband The individual through which a pedigree is discovered and exploredecombination The process during meiosis in which regions between pairs of equivalent chromosomes are exchanged

through the process of crossing over; this process results in offspring that are biologically differentfrom their parents and thus provides genetic variation in a population

ecombination hot spot An area of the chromosome where recombination takes place more often than expected, resulting in lowLD and driving the creation of haplotype blocks

elative risk ratio The risk of an “exposed” population to develop a disease compared with an “unexposed” population;exposure not only includes environmental exposures but also familial, genotypic, or allelic exposure—this ratio is calculated by dividing the prevalence of disease in the exposed population by theprevalence in the unexposed population; the greater the number, the greater the risk of developing thedisease in the exposed population

elative risk ratio of a sibling(�s)

The risk of a person to develop a disease if his or her biological sibling is affected. The �s is calculatedby dividing the prevalence of disease among siblings with the prevalence of the disease in the generalpopulation

ibling (sib)-pair analysis Linkage analysis in which genetic markers are tested for linkage to a disease or phenotypic trait bymeasuring the extent to which affected sibling pairs share the marker haplotypes

NPs Single nucleotide polymorphisms—a specific position in the genome where alternate nucleotides can(and do) exist between 2 individuals or a population; a minimum minor allele frequency of 1% is ofteninvoked to further define an SNP

NP, coding An SNP located in a gene coding exonNP, nonsynonymous A coding SNP that changes the amino acid sequenceNP, synonymous A coding SNP that does not change the amino acid sequence

G

G

P

T

1748 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

Appendix 2. Selected Online Resources

enomic projects/agenciesNational Center for Biotechnology Information—home page

http://www.ncbi.nlm.nih.gov/Human Genome Organization—home page

http://www.gene.ucl.ac.uk/hugo/Human Gene Nomenclature Committee—home page

http://www.gene.ucl.ac.uk/nomenclature/SNP Consortium—home page

http://snp.cshl.org/International HapMap Project—home page

http://www.hapmap.org/NHGRI—home page

http://www.genome.gov/enomic databases/toolsOnline Mendelian Inheritance in Man—catalog of human genetic

disordershttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db�OMIM

Entrez Genome Project—comprehensive database of genomemapping projects

http://ncbi.nlm.nih.gov/entrez/query.fcgi?db�genomeprj/Basic Local Alignment Search Tool—nucleic acid and protein

alignment toolhttp://www.ncbi.nlm.nih.gov/BLAST/

dbSNP—comprehensive database of SNPshttp://www.ncbi.nlm.nih.gov/SNP/

dbEST—comprehensive database of expressed sequence tagshttp://www.ncbi.nlm.nih.gov/dbEST/index.html

dbSTS—database of sequence and mapping data for sequencetagged sites

http://www.ncbi.nlm.nih.gov/dbSTS/index.htmlHuman Genome Browser—tools for searching and viewing the

human genomehttp://www.genome.ucsc.edu/

CHIP Bioinformation Tools—tools for accessing SNPs and geneontology

http://snpper.chip.org/GeneCards—database of human genes

http://bioinfo.weizmann.ac.il/cards/index.shtml/Gene Ontology Web site—database and search tool for gene

function and annotationhttp://www.geneontology.org/

roteomics sitesSWISS-2D PAGE

http://us.expasy.org/ch2d/Expert Proteome Analysis System

http://us.expasy.org/UCSF Protein Prospector

http://prospector.ucsf.edu/Proteomics Standard Initiative of the Human Proteome

Organizationhttp://psidev.sf.net/

Human Proteome Organizationhttp://www.hupo.org/

echnologyAgilent Technologies

http://www.agilent.comIllumina

http://www.illumina.comPerlegen

http://www.perlegen.comAffymetrix

http://www.affymetrix.comSequenom

Appendix 2. (continued)

http://www.sequenom.comParAllele

http://www.parallelebio.comBiotage

http://www.pyrosequencing.com

References1. Varmus H. Getting ready for gene-based medicine. N Engl J Med

2002;347:1526–1527.2. Watson JD, Crick FH. Molecular structure of nucleic acids; a

structure for deoxyribose nucleic acid. Nature 1953;171:737–738.

3. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, BaldwinJ, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D,Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R,McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C,Morris W, Naylor J, Raymond C, Rosetti M, Santos R, SheridanA, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A,Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D,Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P,Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S,Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A,Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R,Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, HillierLW, McPherson JD, Marra MA, Mardis ER, Fulton LA, ChinwallaAT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD,Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, JohnsonDL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P,Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, OlsenA, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, MuznyDM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM,Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL,Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, ToyodaA, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissen-bach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T,Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L,Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A,Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, WangJ, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW,Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J,Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C,Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M,Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Rein-hardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H,Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA,Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB,Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T,Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C,Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W,Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ,Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaghtA, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP,Schuler G, Schultz J, Slater G, Smit AF, Stupka E, SzustakowskiJ, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R,Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, GuyerMS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A,Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H,Choi S, Chen YJ; International Human Genome SequencingConsortium. Initial sequencing and analysis of the human ge-nome. Nature 2001;409:860–921.

4. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG,Smith HO, Yandell M, Evans CA, Holt RA, Gocayne JD, Amanati-

des P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD,

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1749

Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD,Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, NadeauJ, McKusick VA, Zinder N, Levine AJ, Roberts RJ, Simon M,Slayman C, Hunkapiller M, Bolanos R, Delcher A, Dew I, FasuloD, Flanigan M, Florea L, Halpern A, Hannenhalli S, Kravitz S,Levy S, Mobarry C, Reinert K, Remington K, Abu-Threideh J,Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M,Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, DiFrancesco V, Dunn P, Eilbeck K, Evangelista C, Gabrielian AE,Gan W, Ge W, Gong F, Gu Z, Guan P, Heiman TJ, Higgins ME, JiRR, Ke Z, Ketchum KA, Lai Z, Lei Y, Li Z, Li J, Liang Y, Lin X, LuF, Merkulov GV, Milshina N, Moore HM, Naik AK, Narayan VA,Neelam B, Nusskern D, Rusch DB, Salzberg S, Shao W, Shue B,Sun J, Wang Z, Wang A, Wang X, Wang J, Wei M, Wides R, XiaoC, Yan C, Yao A, Ye J, Zhan M, Zhang W, Zhang H, Zhao Q, ZhengL, Zhong F, Zhong W, Zhu S, Zhao S, Gilbert D, Baumhueter S,Spier G, Carter C, Cravchik A, Woodage T, Ali F, An H, Awe A,Baldwin D, Baden H, Barnstead M, Barrow I, Beeson K, BusamD, Carver A, Center A, Cheng ML, Curry L, Danaher S, DavenportL, Desilets R, Dietz S, Dodson K, Doup L, Ferriera S, Garg N,Gluecksmann A, Hart B, Haynes J, Haynes C, Heiner C, HladunS, Hostin D, Houck J, Howland T, Ibegwam C, Johnson J, KalushF, Kline L, Koduru S, Love A, Mann F, May D, McCawley S,McIntosh T, McMullen I, Moy M, Moy L, Murphy B, Nelson K,Pfannkoch C, Pratts E, Puri V, Qureshi H, Reardon M, RodriguezR, Rogers YH, Romblad D, Ruhfel B, Scott R, Sitter C, Small-wood M, Stewart E, Strong R, Suh E, Thomas R, Tint NN, Tse S,Vech C, Wang G, Wetter J, Williams S, Williams M, Windsor S,Winn-Deen E, Wolfe K, Zaveri J, Zaveri K, Abril JF, Guigo R,Campbell MJ, Sjolander KV, Karlak B, Kejariwal A, Mi H, Laza-reva B, Hatton T, Narechania A, Diemer K, Muruganujan A, GuoN, Sato S, Bafna V, Istrail S, Lippert R, Schwartz R, Walenz B,Yooseph S, Allen D, Basu A, Baxendale J, Blick L, Caminha M,Carnes-Stine J, Caulk P, Chiang YH, Coyne M, Dahlke C, Mays A,Dombroski M, Donnelly M, Ely D, Esparham S, Fosler C, Gire H,Glanowski S, Glasser K, Glodek A, Gorokhov M, Graham K,Gropman B, Harris M, Heil J, Henderson S, Hoover J, Jennings D,Jordan C, Jordan J, Kasha J, Kagan L, Kraft C, Levitsky A, LewisM, Liu X, Lopez J, Ma D, Majoros W, McDaniel J, Murphy S,Newman M, Nguyen T, Nguyen N, Nodell M, Pan S, Peck J,Peterson M, Rowe W, Sanders R, Scott J, Simpson M, Smith T,Sprague A, Stockwell T, Turner R, Venter E, Wang M, Wen M, WuD, Wu M, Xia A, Zandieh A, Zhu X. The sequence of the humangenome. Science 2001;291:1304–1351.

5. Guttmacher AE, Collins FS. Welcome to the genomic era. N EnglJ Med 2003;349:996–998.

6. Collins FS. The human genome project and the future of medi-cine. Ann N Y Acad Sci 1999;882:42–55; discussion 56–65.

7. Guttmacher AE, Collins FS. Genomic medicine—a primer. N EnglJ Med 2002;347:1512–1520.

8. Posadas EM, Simpkins F, Liotta LA, MacDonald C, Kohn EC.Proteomic analysis for the early detection and rational treatmentof cancer—realistic hope? Ann Oncol 2005;16:16–22.

9. Kaprio J. Science, medicine, and the future: genetic epidemiol-ogy. BMJ 2000;320:1257–1259.

10. Altman RB. Bioinformatics in support of molecular medicine.Proc AMIA Symp 1998:53–61.

11. Kruglyak L, Nickerson DA. Variation is the spice of life. NatGenet 2001;27:234–236.

12. Collins FS, Guyer MS, Chakravarti A. Variations on a theme:cataloging human DNA sequence variation. Science 1997;278:1580–1581.

13. Lee C. Irresistible force meets immovable object: SNP mappingof complex diseases. Trends Genet 2002;18:67–69.

14. Tabor HK, Risch NJ, Myers RM. Opinion: candidate-gene ap-proaches for studying complex genetic traits: practical consid-

erations. Nat Rev Genet 2002;3:391–397.

15. Pritchard JK. Are rare variants responsible for susceptibility tocomplex diseases? Am J Hum Genet 2001;69:124–137.

16. Weiss KM, Terwilliger JD. How many diseases does it take tomap a gene with SNPs? Nat Genet 2000;26:151–157.

17. Weissenbach J, Gyapay G, Dib C, Vignal A, Morissette J, Millas-seau P, Vaysseix G, Lathrop M. A second-generation linkagemap of the human genome. Nature 1992;359:794–801.

18. Schlotterer C. The evolution of molecular markers—just a mat-ter of fashion? Nat Rev Genet 2004;5:63–69.

19. Hugot JP, Chamaillard M, Zouali H, Lesage S, Cezard JP, Be-laiche J, Almer S, Tysk C, O’Morain CA, Gassull M, Binder V,Finkel Y, Cortot A, Modigliani R, Laurent-Puig P, Gower-Rous-seau C, Macry J, Colombel JF, Sahbatou M, Thomas G. Associ-ation of NOD2 leucine-rich repeat variants with susceptibility toCrohn’s disease. Nature 2001;411:599–603.

20. Stumpf MPH. Haplotype diversity and the block structure oflinkage disequilibrium. Trends Genet 2002;18:226–228.

21. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumen-stiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-CorderoSN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, DalyMJ, Altshuler D. The structure of haplotype blocks in the humangenome. Science 2002;296:2225–2229.

22. Cardon LR, Abecasis GR. Using haplotype blocks to map humancomplex trait loci. Trends Genet 2003;19:135–140.

23. Lavorgna G, Dahary D, Lehner B, Sorek R, Sanderson CM,Casari G. In search of antisense. Trends Biochem Sci 2004;29:88–94.

24. Morey C, Avner P. Employment opportunities for non-codingRNAs. FEBS Lett 2004;567:27–34.

25. Heard E. Recent advances in X-chromosome inactivation. CurrOpin Cell Biol 2004;16:247–255.

26. Wulfkuhle J, Espina V, Liotta L, Petricoin E. Genomic and pro-teomic technologies for individualisation and improvement ofcancer treatment. Eur J Cancer 2004;40:2623–2632.

27. Petricoin EF, Fishman DA, Conrads TP, Veenstra TD, Liotta LA.Lessons from Kitty Hawk: from feasibility to routine clinical usefor the field of proteomic pattern diagnostics. Proteomics 2004;4:2357–2360.

28. Espina V, Mehta AI, Winters ME, Calvert V, Wulfkuhle J, PetricoinEF III, Liotta LA. Protein microarrays: molecular profiling technol-ogies for clinical specimens. Proteomics 2003;3:2091–2100.

29. Orchard S, Hermjakob H, Binz PA, Hoogland C, Taylor CF, Zhu W,Julian RK Jr, Apweiler R. Further steps towards data standardi-sation: the Proteomic Standards Initiative HUPO 3(rd) annualcongress, Beijing 25–27(th) October, 2004. Proteomics 2005;5:337–339.

30. Ruddle FH. The William Allan Memorial Award address: reversegenetics and beyond. Am J Hum Genet 1984;36:944–953.

31. Royer-Pokora B, Kunkel LM, Monaco AP, Goff SC, Newburger PE,Baehner RL, Cole FS, Curnutte JT, Orkin SH. Cloning the gene foran inherited human disorder—chronic granulomatous dis-ease—on the basis of its chromosomal location. Nature 1986;322:32–38.

32. Cargill M, Daley GQ. Mining for SNPs: putting the commonvariants—common disease hypothesis to the test. Pharmacog-enomics 2000;1:27–37.

33. Collins FS, Green ED, Guttmacher AE, Guyer MS. A vision for thefuture of genomics research. Nature 2003;422:835–847.

34. Ghosh S, Collins FS. The geneticist’s approach to complexdisease. Annu Rev Med 1996;47:333–353.

35. Romero R, Kuivaniemi H, Tromp G, Olson JM. The design,execution, and interpretation of genetic association studies todecipher complex diseases. Am J Obstet Gynecol 2002;187:1299–1312.

36. Devlin B, Roeder K. Genomic control for association studies.

Biometrics 1999;55:997–1004.

1750 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

37. Hirschhorn JN, Altshuler D. Once and again—issues surround-ing replication in genetic association studies (editorial). J ClinEndocrinol Metab 2002;87:4438–4441.

38. Hirschhorn JN, Daly MJ. Genome-wide association studies forcommon diseases and complex traits. Nat Rev Genet 2005;6:95–108.

39. Wang W, Barratt BJ, Clayton DG, Todd JA. Genome-wide asso-ciation studies: theoretical and practical concerns. Nat RevGenet 2005;6:109–118.

40. Schaid DJ. Likelihoods and TDT for the case-parents design.Genet Epidemiol 1999;16:250–260.

41. Groden J, Thliveris A, Samowitz W, Carlson M, Gelbert L, Albert-sen H, Joslyn G, Stevens J, Spirio L, Robertson M, et al. Iden-tification and characterization of the familial adenomatous pol-yposis coli gene. Cell 1991;66:589–600.

42. Kinzler KW, Vogelstein B. Lessons from hereditary colorectalcancer. Cell 1996;87:159–170.

43. Beroud C, Collod-Beroud G, Boileau C, Soussi T, Junien C. UMD(universal mutation database): a generic software to build andanalyze locus-specific databases. Hum Mutat 2000;15:86–94.

44. Feder JN, Gnirke A, Thomas W, Tsuchihashi Z, Ruddy DA,Basava A, Dormishian F, Domingo R Jr, Ellis MC, Fullan A,Hinton LM, Jones NL, Kimmel BE, Kronmal GS, Lauer P, Lee VK,Loeb DB, Mapa FA, McClelland E, Meyer NC, Mintier GA, MoellerN, Moore T, Morikang E, Wolff RK. A novel MHC class I-like geneis mutated in patients with hereditary haemochromatosis. NatGenet 1996;13:399–408.

45. Njajou OT, Vaessen N, Joosse M, Berghuis B, van Dongen JW,Breuning MH, Snijders PJ, Rutten WP, Sandkuijl LA, Oostra BA,van Duijn CM, Heutink P. A mutation in SLC11A3 is associatedwith autosomal dominant hemochromatosis. Nat Genet 2001;28:213–214.

46. Camaschella C, Roetto A, Cali A, De Gobbi M, Garozzo G, CarellaM, Majorano N, Totaro A, Gasparini P. The gene TFR2 is mutatedin a new type of haemochromatosis mapping to 7q22. NatGenet 2000;25:14–15.

47. Papanikolaou G, Samuels ME, Ludwig EH, MacDonald ML,Franchini PL, Dube MP, Andres L, MacFarlane J, Sakellaropou-los N, Politou M, Nemeth E, Thompson J, Risler JK, ZaborowskaC, Babakaiff R, Radomski CC, Pape TD, Davidas O, Christakis J,Brissot P, Lockitch G, Ganz T, Hayden MR, Goldberg YP. Muta-tions in HFE2 cause iron overload in chromosome 1q-linkedjuvenile hemochromatosis. Nat Genet 2004;36:77–82.

48. Yamashita C, Adams PC. Natural history of the C282Y homozy-gote for the hemochromatosis gene (HFE) with a normal serumferritin level. Clin Gastroenterol Hepatol 2003;1:388–391.

49. Andersen RV, Tybjaerg-Hansen A, Appleyard M, Birgens H, Nor-destgaard BG. Hemochromatosis mutations in the general pop-ulation: iron overload progression rate. Blood 2004;103:2914–2919.

50. Bacon BR. Hemochromatosis: diagnosis and management.Gastroenterology 2001;120:718–725.

51. Petrukhin K, Fischer SG, Pirastu M, Tanzi RE, Chernov I, DevotoM, Brzustowicz LM, Cayanis E, Vitale E, Russo JJ, et al. Map-ping, cloning and genetic characterization of the region contain-ing the Wilson disease gene. Nat Genet 1993;5:338–343.

52. Tanzi RE, Petrukhin K, Chernov I, Pellequer JL, Wasco W, RossB, Romano DM, Parano E, Pavone L, Brzustowicz LM, et al. TheWilson disease gene is a copper transporting ATPase with ho-mology to the Menkes disease gene. Nat Genet 1993;5:344–350.

53. Lutsenko S, Petris MJ. Function and regulation of the mamma-lian copper-transporting ATPases: insights from biochemicaland cell biological approaches. J Membr Biol 2003;191:1–12.

54. Tysk C, Lindberg E, Jarnerot G, Floderus-Myrhed B. Ulcerative

colitis and Crohn’s disease in an unselected population of

monozygotic and dizygotic twins. A study of heritability and theinfluence of smoking. Gut 1988;29:990–996.

55. Roth MP, Petersen GM, McElree C, Vadheim CM, Panish JF,Rotter JI. Familial empiric risk estimates of inflammatory boweldisease in Ashkenazi Jews. Gastroenterology 1989;96:1016–1020.

56. Hugot JP, Laurent-Puig P, Gower-Rousseau C, Olson JM, Lee JC,Beaugerie L, Naom I, Dupas JL, Van Gossum A, Orholm M,Bonaiti-Pellie C, Weissenbach J, Mathew CG, Lennard-Jones JE,Cortot A, Colombel JF, Thomas G. Mapping of a susceptibilitylocus for Crohn’s disease on chromosome 16. Nature 1996;379:821–823.

57. Zheng CQ, Hu GZ, Lin LJ, Gu GG. Progress in searching forsusceptibility gene for inflammatory bowel disease by positionalcloning. World J Gastroenterol 2003;9:1646–1656.

58. Ogura Y, Bonen DK, Inohara N, Nicolae DL, Chen FF, Ramos R,Britton H, Moran T, Karaliuska R, Duerr RH, Achkar JP, Brant SR,Bayless TM, Kirschner BS, Hanauer SB, Nunez G, Cho JH. Aframeshift mutation in NOD2 associated with susceptibility toCrohn’s disease. Nature 2001;411:603–606.

59. van der Linde K, Boor PP, Houwing-Duistermaat JJ, Kuipers EJ,Wilson JH, de Rooij FW. Card15 and Crohn’s disease: healthyhomozygous carriers of the 3020insC frameshift mutation. Am JGastroenterol 2003;98:613–627.

60. Ahmad T, Tamboli CP, Jewell D, Colombel JF. Clinical relevanceof advances in genetics and pharmacogenetics of IBD. Gastro-enterology 2004;126:1533–1549.

61. Satsangi J, Parkes M, Louis E, Hashimoto L, Kato N, Welsh K,Terwilliger JD, Lathrop GM, Bell JI, Jewell DP. Two stage ge-nome-wide search in inflammatory bowel disease provides evi-dence for susceptibility loci on chromosomes 3, 7 and 12. NatGenet 1996;14:199–202.

62. Satsangi J, Welsh KI, Bunce M, Julier C, Farrant JM, Bell JI,Jewell DP. Contribution of genes of the major histocompatibilitycomplex to susceptibility and disease phenotype in inflamma-tory bowel disease. Lancet 1996;347:1212–1217.

63. Ma Y, Ohmen JD, Li Z, Bentley LG, McElree C, Pressman S,Targan SR, Fischel-Ghodsian N, Rotter JI, Yang H. A genome-wide search identifies potential new susceptibility loci forCrohn’s disease. Inflamm Bowel Dis 1999;5:271–278.

64. Rioux JD, Daly MJ, Silverberg MS, Lindblad K, Steinhart H,Cohen Z, Delmonte T, Kocher K, Miller K, Guschwan S, Kulbo-kas EJ, O’Leary S, Winchester E, Dewar K, Green T, Stone V,Chow C, Cohen A, Langelier D, Lapointe G, Gaudet D, Faith J,Branco N, Bull SB, McLeod RS, Griffiths AM, Bitton A, GreenbergGR, Lander ES, Siminovitch KA, Hudson TJ. Genetic variation inthe 5q31 cytokine gene cluster confers susceptibility to Crohndisease. Nat Genet 2001;29:223–228.

65. Malaty HM, Engstrand L, Pedersen NL, Graham DY. Helicobac-ter pylori infection: genetic and environmental influences. Astudy of twins. Ann Intern Med 1994;120:982–986.

66. Thye T, Burchard GD, Nilius M, Muller-Myhsok B, Horstmann RD.Genome wide linkage analysis identifies polymorphism in thehuman interferon-� receptor affecting Helicobacter pylori infec-tion. Am J Hum Genet 2003;72:448–453.

67. Drovdlic CM, Goddard KA, Chak A, Brock W, Chessler L, King JF,Richter J, Falk GW, Johnston DK, Fisher JL, Grady WM, Leme-show S, Eng C. Demographic and phenotypic features of 70families segregating Barrett’s oesophagus and oesophagealadenocarcinoma. J Med Genet 2003;40:651–656.

68. Gelfand MD. Barrett esophagus in sexagenarian identical twins.J Clin Gastroenterol 1983;5:251–253.

69. Winters C Jr, Spurling TJ, Chobanian SJ, Curtis DJ, Esposito RL,Hacker JF III, Johnson DA, Cruess DF, Cotelingam JD, GurneyMS, et al. Barrett’s esophagus. A prevalent, occult complicationof gastroesophageal reflux disease. Gastroenterology 1987;92:

118–124.

1

1

November 2005 AMERICAN GASTROENTEROLOGICAL ASSOCIATION 1751

70. Bortolotti M. Natural history of gastro-oesophageal reflux dis-ease: the neglected factor. Scand J Gastroenterol 2003;38:1204–1208.

71. Romero Y, Cameron AJ, Locke GR III, Schaid DJ, Slezak JM,Branch CD, Melton LJ III. Familial aggregation of gastroesopha-geal reflux in patients with Barrett’s esophagus and esophagealadenocarcinoma. Gastroenterology 1997;113:1449–1456.

72. Cameron AJ, Lagergren J, Henriksson C, Nyren O, Locke GR III,Pedersen NL. Gastroesophageal reflux disease in monozygoticand dizygotic twins. Gastroenterology 2002;122:55–59.

73. Hu FZ, Preston RA, Post JC, White GJ, Kikuchi LW, Wang X, LealSM, Levenstien MA, Ott J, Self TW, Allen G, Stiffler RS, McGrawC, Pulsifer-Anderson EA, Ehrlich GD. Mapping of a gene forsevere pediatric gastroesophageal reflux to chromosome13q14. JAMA 2000;284:325–334.

74. Not T, Horvath K, Hill ID, Partanen J, Hammed A, Magazzu G,Fasano A. Celiac disease risk in the USA: high prevalence ofantiendomysium antibodies in healthy blood donors. Scand JGastroenterol 1998;33:494–498.

75. Greco L, Romino R, Coto I, Di Cosmo N, Percopo S, Maglio M,Paparo F, Gasperi V, Limongelli MG, Cotichini R, D’Agate C,Tinto N, Sacchetti L, Tosi R, Stazi MA. The first large populationbased twin study of coeliac disease. Gut 2002;50:624–628.

76. Maki M, Holm K, Lipsanen V, Hallstrom O, Viander M, Collin P,Savilahti E, Koskimies S. Serological markers and HLA genesamong healthy first-degree relatives of patients with coeliacdisease. Lancet 1991;338:1350–1353.

77. Spurkland A, Sollid LM, Ronningen KS, Bosnes V, Ek J, VartdalF, Thorsby E. Susceptibility to develop celiac disease is primarilyassociated with HLA-DQ alleles. Hum Immunol 1990;29:157–165.

78. Polvi A, Arranz E, Fernandez-Arquero M, Collin P, Maki M, SanzA, Calvo C, Maluenda C, Westman P, de la Concha EG, PartanenJ. HLA-DQ2-negative celiac disease in Finland and Spain. HumImmunol 1998;59:169–175.

79. Liu J, Juo SH, Holopainen P, Terwilliger J, Tong X, Grunn A, BritoM, Green P, Mustalahti K, Maki M, Gilliam TC, Partanen J.Genomewide linkage analysis of celiac disease in Finnish fam-ilies. Am J Hum Genet 2002;70:51–59.

80. Zhong F, McCombs CC, Olson JM, Elston RC, Stevens FM,McCarthy CF, Michalski JP. An autosomal screen for genes thatpredispose to celiac disease in the western counties of Ireland.Nat Genet 1996;14:329–333.

81. Greco L, Corazza G, Babron MC, Clot F, Fulchignoni-Lataud MC,Percopo S, Zavattari P, Bouguerra F, Dib C, Tosi R, Troncone R,Ventura A, Mantavoni W, Magazzu G, Gatti R, Lazzari R, GiuntaA, Perri F, Iacono G, Cardi E, de Virgiliis S, Cataldo F, De AngelisG, Musumeci S, Clerget-Darpoux F, et al. Genome search inceliac disease. Am J Hum Genet 1998;62:669–675.

82. Greco L, Babron MC, Corazza GR, Percopo S, Sica R, Clot F,Fulchignoni-Lataud MC, Zavattari P, Momigliano-Richiardi P,Casari G, Gasparini P, Tosi R, Mantovani V, De Virgiliis S, IaconoG, D’Alfonso A, Selinger-Leneman H, Lemainque A, Serre JL,Clerget-Darpoux F. Existence of a genetic risk factor on chromo-some 5q in Italian coeliac disease families. Ann Hum Genet2001;65:35–41.

83. King AL, Yiannakou JY, Brett PM, Curtis D, Morris MA, DearloveAM, Rhodes M, Rosen-Bronson S, Mathew C, Ellis HJ, CiclitiraPJ. A genome-wide family-based linkage study of coeliac dis-ease. Ann Hum Genet 2000;64:479–490.

84. Whorwell PJ, McCallum M, Creed FH, Roberts CT. Non-colonicfeatures of irritable bowel syndrome. Gut 1986;27:37–40.

85. Locke GR III, Zinsmeister AR, Talley NJ, Fett SL, Melton LJ III.Familial association in adults with functional gastrointestinal

disorders. Mayo Clin Proc 2000;75:907–912.

86. Morris-Yates A, Talley NJ, Boyce PM, Nandurkar S, Andrews G.Evidence of a genetic contribution to functional bowel disorder.Am J Gastroenterol 1998;93:1311–1317.

87. Levy RL, Jones KR, Whitehead WE, Feld SI, Talley NJ, Corey LA.Irritable bowel syndrome in twins: heredity and social learningboth contribute to etiology. Gastroenterology 2001;121:799–804.

88. Lesch KP, Bengel D, Heils A, Sabol SZ, Greenberg BD, Petri S,Benjamin J, Muller CR, Hamer DH, Murphy DL. Association ofanxiety-related traits with a polymorphism in the serotonin trans-porter gene regulatory region. Science 1996;274:1527–1531.

89. Prather CM, Camilleri M, Zinsmeister AR, McKinzie S, Thom-forde G. Tegaserod accelerates orocecal transit in patients withconstipation-predominant irritable bowel syndrome. Gastroen-terology 2000;118:463–468.

90. Viramontes BE, Camilleri M, McKinzie S, Pardi DS, Burton D,Thomforde GM. Gender-related differences in slowing colonictransit by a 5-HT3 antagonist in subjects with diarrhea-predom-inant irritable bowel syndrome. Am J Gastroenterol 2001;96:2671–2676.

91. Camilleri M, Atanasova E, Carlson PJ, Ahmad U, Kim HJ, Vi-ramontes BE, McKinzie S, Urrutia R. Serotonin-transporter poly-morphism pharmacogenetics in diarrhea-predominant irritablebowel syndrome. Gastroenterology 2002;123:425–432.

92. Pata C, Erdal ME, Derici E, Yazar A, Kanik A, Ulu O. Serotonintransporter gene polymorphism in irritable bowel syndrome.Am J Gastroenterol 2002;97:1780–1784.

93. Yeo A, Boyd P, Lumsden S, Saunders T, Handley A, Stubbins M,Knaggs A, Asquith S, Taylor I, Bahari B, Crocker N, Rallan R,Varsani S, Montgomery D, Alpers DH, Dukes GE, Purvis I, HicksGA. Association between a functional polymorphism in the se-rotonin transporter gene and diarrhoea predominant irritablebowel syndrome in women. Gut 2004;53:1452–1458.

94. Kim HJ, Camilleri M, Carlson PJ, Cremonini F, Ferber I, StephensD, McKinzie S, Zinsmeister AR, Urrutia R. Association of distinctalpha(2) adrenoceptor and serotonin transporter polymor-phisms with constipation and somatic symptoms in functionalgastrointestinal disorders. Gut 2004;53:829–837.

95. Tooke N, Pettersson M. CpG methylation in clinical studies:utility, methods, and quality assurance. IVDT 2004.

96. Berg LM, Sanders R, Alderborn A. Pyrosequencing technologyand the need for versatile solutions in molecular clinical re-search. Expert Rev Mol Diagn 2002;2:361–369.

97. Tang K, Fu DJ, Julien D, Braun A, Cantor CR, Koster H. Chip-based genotyping by mass spectrometry. Proc Natl Acad SciU S A 1999;96:10016–10020.

98. Hardenbol P, Baner J, Jain M, Nilsson M, Namsaraev EA, Karlin-Neumann GA, Fakhrai-Rad H, Ronaghi M, Willis TD, LandegrenU, Davis RW. Multiplexed genotyping with sequence-tagged mo-lecular inversion probes. Nat Biotechnol 2003;21:673–678.

99. Hardenbol P, Yu F, Belmont J, Mackenzie J, Bruckner C, Brund-age T, Boudreau A, Chow S, Eberle J, Erbilgin A, Falkowski M,Fitzgerald R, Ghose S, Iartchouk O, Jain M, Karlin-Neumann G,Lu X, Miao X, Moore B, Moorhead M, Namsaraev E, PasternakS, Prakash E, Tran K, Wang Z, Jones HB, Davis RW, Willis TD,Gibbs RA. Highly multiplexed molecular inversion probe geno-typing: over 10,000 targeted SNPs genotyped in a single tubeassay. Genome Res 2005;15:269–275.

00. Oliphant A, Barker DL, Stuelpnagel JR, Chee MS. Bead arraytechnology: enabling an accurate, cost-effective approach tohigh-throughput genotyping. Biotechniques 2002;32(Suppl):S56–S61.

01. Gunderson KL, Kruglyak S, Graige MS, Garcia F, Kermani BG,Zhao C, Che D, Dickinson T, Wickham E, Bierle J, Doucet D,Milewski M, Yang R, Siegmund C, Haas J, Zhou L, Oliphant A,Fan JB, Barnard S, Chee MS. Decoding randomly ordered DNA

arrays. Genome Res 2004;14:870–877.

1

1

1

1

1

1

1

1

1

1

1

AA

AM

L

1752 AMERICAN GASTROENTEROLOGICAL ASSOCIATION GASTROENTEROLOGY Vol. 129, No. 5

02. Reinders J, Lewandrowski U, Moebius J, Wagner Y, Sickmann A.Challenges in mass spectrometry-based proteomics. Proteom-ics 2004;4:3686–3703.

03. Winawer S, Fletcher R, Rex D, Bond J, Burt R, Ferrucci J, GaniatsT, Levin T, Woolf S, Johnson D, Kirk L, Litin S, Simmang C.Colorectal cancer screening and surveillance: clinical guidelinesand rationale—update based on new evidence. Gastroenterol-ogy 2003;124:544–560.

04. Bergquist A, Lindberg G, Saarinen S, Broome U. Increasedprevalence of primary sclerosing cholangitis among first-degreerelatives. J Hepatol 2005;42:252–256.

05. Food and Drug Administration. Medical devices: clinical chem-istry and clinical toxicology devices: drug metabolizing enzymegenotyping system. Fed Regist 2005;46:11865–11867.

06. Ross JS, Schenkein DP, Kashala O, Linette GP, Stec J, Sym-mans WF, Pusztai L, Hortobagyi GN. Pharmacogenomics. AdvAnat Pathol 2004;11:211–220.

07. Collins FS, McKusick VA. Implications of the Human GenomeProject for medical science. JAMA 2001;285:540–544.

08. Yamamoto T, Davis CG, Brown MS, Schneider WJ, Casey ML,Goldstein JL, Russell DW. The human LDL receptor: a cysteine-rich protein with multiple Alu sequences in its mRNA. Cell

1984;39:27–38. P

09. Pearson TA. The epidemiologic basis for population-wide cho-lesterol reduction in the primary prevention of coronary arterydisease. Am J Cardiol 2004;94:4F–8F.

10. Giardiello FM, Brensinger JD, Petersen GM, Luce MC, HylindLM, Bacon JA, Booker SV, Parker RD, Hamilton SR. The use andinterpretation of commercial APC gene testing for familial ad-enomatous polyposis. N Engl J Med 1997;336:823–827.

11. Guttmacher AE, Collins FS, Carmona RH. The family history—moreimportant than ever. N Engl J Med 2004;351:2333–2336.

12. Riegert-Johnson DL, Korf BR, Alford RL, Broder MI, Keats BJ,Ormond KE, Pyeritz RE, Watson MS. Outline of a medical geneticscurriculum for internal medicine residency training programs.Genet Med 2004;6:543–547.

Address requests for reprints to: Chair, Future Trends Committee,GA National Office, c/o Membership Department, 4930 Del Rayvenue, Bethesda, Maryland 20814. Fax: (301) 654-5920.This report was prepared by Dr. Lazaridis under the direction of the

GA Future Trends Committee. It was approved by the committee onay 15, 2005.Members of the AGA Future Trends Committee include Nicholas F.

aRusso (chair), Juan R. Malagelada, Walter J. McDonald, Pankaj J.

asricha, Suzanne Rose, and Michael Lee Weinstein.