Genes for common diseases

Mark J. Caulfield

Department of Clinical Pharmacology, St Bartholomew's and the Royal London School of Medicine, Charterhouse Square, London, UK

There is increasing interest in the genetic factors that may

influence the development of common diseases such as

hypertension and diabetes. In this series, the editors have

sought to bring readers of the 'Green' journal the state of

the art in these common complex disorders.

In terms of human genome research it is an exciting

time. There will shortly be a draft sequence of the entire

human genome available within the public domain.

During the refinement of this draft we will get a

clearer picture of the number of genes and their locations

within the human genome. In seeking to detect the genes

for common complex diseases it is important to remember

that the effects of these genes may only be modest and may

operate in concert with environmental factors. Thus the

effect of an individual variation within a gene may be small

at a population level. This has implications for the size of

the studies required to be adequately powered to detect

a true positive effect. For example, in the case of hypertension,

1500 families based on affected sibling pairs would

be required to detect five genes each contributing 6% to

blood pressure variation. Accordingly, when evaluating

studies published on complex traits, one of the key

questions to be answered in the mind of the reader will be

whether the study has been adequately powered to detect

a true effect.
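The link between effect size and study size can be made concrete with a rough power calculation. The sketch below is only an illustration: it uses a normal approximation for a two-sided comparison of allele frequencies between cases and controls, not the sibling-pair calculation behind the 1500-family figure above. The function name, the allele frequencies of 0.30 and 0.33 and the sample sizes are all assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_cases, p_controls, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    n_per_group is the number of observations (here, alleles) in each group.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    p_bar = (p_cases + p_controls) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(
        p_cases * (1 - p_cases) / n_per_group
        + p_controls * (1 - p_controls) / n_per_group
    )
    effect = abs(p_cases - p_controls)
    # Probability that the test statistic exceeds the critical value
    # (the negligible opposite tail is ignored).
    return norm.cdf((effect - z_alpha * se_null) / se_alt)


# Hypothetical allele frequencies: 0.33 in cases, 0.30 in controls.
for n in (500, 5000):
    print(n, round(power_two_proportions(0.33, 0.30, n), 2))
```

With a difference this small, a study of a few hundred alleles per group has little chance of detecting the effect, whereas several thousand per group gives reasonable power. That is the kind of check a reader may wish to apply to a published study.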

The progress that has been made towards understanding

the genetics of these common diseases will be outlined

in two articles. The first is by Kevin O'Shaughnessy on

the Genetics of Hypertension and the second by Mark

McCarthy & Stephan Menzel on Type II diabetes. In both

these articles the authors describe how the tools of the

genomic trade have been applied to understand the genetic

base of common disease.

The tools are those genetic variations or polymorphisms

which each of us has about every 500 bases within our

genome. These polymorphisms or variations can be used

to identify disease-causing variants and explain genetic

influence on a trait [1].

The most abundant form of polymorphism is the single

nucleotide, or base, change in the code. These single

nucleotide polymorphisms or SNPs (pronounced snips)

have become a source of great interest and a consortium

between the pharmaceutical industry and private investors

has been formed to identify 300 000 new SNPs by May

2001. This $45 000 000 US investment has already

identified 41 209 unique SNPs which can be found on

the Cold Spring Harbor website at http://snp.cshl.org/.

These SNPs will become increasingly important not only

in mapping disease genes but also in predicting drug

response and so are of considerable interest to the

pharmaceutical industry.

In addition to these simple polymorphisms, there are

more complex, highly polymorphic markers, which have

been widely employed to map disease genes in common

disorders. These highly polymorphic markers are based on

repetitive sequences in DNA known as simple sequence

repeats. These may be based on di-, tri- or

tetranucleotide repetitive segments of DNA. Such polymorphisms

are widely employed in sets spread evenly

throughout the genome to map regions of interest within

which genes for complex disorders may lurk. These simple

sequence repeats that comprise linkage marker sets have

been employed to date in genome-wide screening.
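To make the idea of a simple sequence repeat concrete, the following sketch scans a DNA string for runs of a di-, tri- or tetranucleotide unit. It is a toy illustration only: the function names, the threshold of four copies and the example sequence are assumptions, and real marker sets are built from curated, validated loci rather than from a regular-expression scan.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter unit (e.g. 'CACA')."""
    k = len(unit)
    return not any(
        k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(sequence, min_repeats=4):
    """Return (start, unit, copies) for di-, tri- and tetranucleotide repeats."""
    sequence = sequence.upper()
    hits = []
    for k in (2, 3, 4):
        # A k-base unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Invented sequence containing a (CA)8 and a (GATA)5 repeat.
example = "GGATTC" + "CA" * 8 + "TTGAC" + "GATA" * 5 + "CCG"
print(find_ssrs(example))
```

Such markers are highly polymorphic because the number of copies of the unit varies between individuals, so the copy count at a locus serves as the allele.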

A second type of highly polymorphic marker is a variable

number tandem repeat. This is based on repetitive

segments of DNA that involve tens or hundreds of

nucleotides. All of these genetic markers constitute the

tools of the trade and may be used to identify disease-causing

genetic variation or polymorphism.

Study designs for understanding complex traits

There are several different types of study that can be

applied to identify a gene for a common disease [2]. The

most widely applied have been population-based case-control

studies [2]. Here it is essential that the cases and

controls are drawn from the same population and that

there is no risk of ethnic heterogeneity or substratification

which might create a genetic artefact. Ethnicity is

particularly important as the representation of different

genetic polymorphisms is quite different between ethnic

groups. This may lead to spurious findings in the analysis

of common diseases [2]. When we study a genetic

variation such as a SNP, or group of SNPs, in a case-control

population association study, we are relying on a principle

known as linkage disequilibrium. This basically means that

the disease-causing variation will have been inherited on

the same piece of DNA over time along with the SNP

under study, such that the two are physically related and do

not become separated by recombination during meiosis.

Correspondence: Professor M. J. Caulfield, Department of Clinical Pharmacology,

St Bartholomew's and the Royal London School of Medicine, Charterhouse

Square, London EC1M 6BQ.

Received 19 October 2000, accepted 2 November 2000.

© 2001 Blackwell Science Ltd Br J Clin Pharmacol, 51, 1–3

The degree of linkage disequilibrium between a disease-causing

allele and the genetic variation under study is

important because it can affect the power of the study.
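Linkage disequilibrium between two sites is commonly summarised by the coefficient D and its scaled forms D' and r². The sketch below computes all three from counts of the four possible haplotypes at a pair of biallelic sites. The function name and the counts are invented for illustration, and the calculation assumes that haplotypes have been observed directly rather than inferred from genotypes.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from counts of the four haplotypes.

    D_prime keeps the sign of D; many reports quote its absolute value.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first site
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second site
    if not (0 < p_A < 1 and 0 < p_B < 1):
        raise ValueError("both sites must be polymorphic")

    d = p_AB - p_A * p_B
    # Largest magnitude D can take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r_squared


# Invented counts: allele A is always found together with allele B.
print(linkage_disequilibrium(30, 0, 0, 70))
```

An r² close to 1 means the typed SNP is an almost perfect proxy for the disease-causing variant. As r² falls, a larger sample is needed to detect the same underlying effect.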

When the genotypic data derived from such a study are

analysed, the categorical data or genotypes are usually

arranged in a contingency table and tested using a chi-squared

test with the appropriate degrees of freedom.
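As an illustration of the contingency-table analysis just described, here is a minimal sketch of a Pearson chi-squared test on genotype counts in cases and controls. The counts are invented, and the function handles only 1 or 2 degrees of freedom, for which the tail probability has a simple closed form. For anything more general a statistics library routine, for example `scipy.stats.chi2_contingency`, would be the usual choice.

```python
from math import erfc, exp, sqrt


def chi_squared_test(table):
    """Pearson chi-squared test on a contingency table (a list of rows).

    Returns (statistic, degrees_of_freedom, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    # Closed-form tail probabilities exist for 1 and 2 degrees of freedom,
    # which cover 2x2 allele tables and 2x3 genotype tables.
    if df == 1:
        p_value = erfc(sqrt(statistic / 2))
    elif df == 2:
        p_value = exp(-statistic / 2)
    else:
        raise NotImplementedError("use a statistics library for other df")
    return statistic, df, p_value


# Invented genotype counts: columns are AA, Aa, aa.
cases = [60, 100, 40]
controls = [40, 100, 60]
print(chi_squared_test([cases, controls]))
```

A table of two groups by three genotypes has (2 − 1) × (3 − 1) = 2 degrees of freedom. Collapsing genotypes into allele counts gives a 2 × 2 table with 1 degree of freedom.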

The second study design that is widely used is family-based

linkage analysis [1]. Unlike single gene disorders,

which often present relatively early in life, so-called

common complex traits will often present in the middle or

later years of life [1]. They can also exhibit variable age of

onset [1]. Therefore within a family it may be very difficult

to be absolutely certain that an individual who is

apparently unaffected now will not become affected in

the future by the disease of interest [2]. This has led to two

approaches in common diseases which are being applied

in both hypertension and diabetes research.

The first is to choose affected siblings who are

concordant for the trait of interest, for example two

diabetic siblings or two hypertensive

siblings [1]. In the analysis of such a study, we would

genotype a series of usually highly polymorphic markers of

the simple sequence repeat type and analyse these data

asking the simple question: do our hypertensive or diabetic

sibling pairs share versions of this genetic marker more

often than would be expected in the general population? If

so, this may be noteworthy and may identify

a region of linkage within the genome. Linkage may still

be preserved at up to 50 000 000 bases of distance, which is

not an inconsiderable length of genome [1].
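The question of whether affected sibling pairs share marker alleles more often than expected can be framed as a simple test. The sketch below is one common formulation, sometimes called the mean test, and is offered as an assumption rather than as the method used in the articles that follow. It assumes that the number of alleles shared identical by descent is known for each pair, and compares the observed proportion with the one half expected when marker and disease are unlinked. The function name and the counts are invented.

```python
from math import sqrt
from statistics import NormalDist


def sib_pair_mean_test(n_share0, n_share1, n_share2):
    """Test for excess allele sharing in affected sib pairs.

    Arguments are the numbers of pairs sharing 0, 1 or 2 alleles
    identical by descent at the marker. Returns (proportion, z, p_value).
    """
    n_pairs = n_share0 + n_share1 + n_share2
    proportion = (n_share1 + 2 * n_share2) / (2 * n_pairs)
    # Without linkage, sibs share 0, 1, 2 alleles with probability
    # 1/4, 1/2, 1/4, so the proportion shared has mean 1/2 and
    # variance 1/8 per pair.
    z = (proportion - 0.5) / sqrt(1 / (8 * n_pairs))
    p_value = 1 - NormalDist().cdf(z)  # one-sided: excess sharing only
    return proportion, z, p_value


# Invented data: 100 affected sib pairs sharing 0, 1 and 2 alleles.
print(sib_pair_mean_test(20, 50, 30))
```

In practice sharing often has to be inferred from parental genotypes or population allele frequencies, and a genome-wide screen tests many markers, so far stricter significance thresholds apply than for a single test.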

A second form of family-based analysis is to use

discordant sibling pair analysis [1]. This is where a severely

affected individual and an unaffected individual at the

opposite end of a quantitative trait such as blood pressure,

are compared to see whether or not there are genetic

differences. This extreme discordant sibling pair analysis

may be quite powerful in a common complex trait [2].

Technological advances

Finally, the technological advances that have occurred in

the last 10 years have made it possible to achieve the

high throughput required to screen the numbers of

polymorphisms we need to detect common disease-causing

genes. The first of these advances, inspired by

Kary Mullis, was the polymerase chain reaction, which

essentially photocopies DNA within a target sequence.

This has allowed increased throughput for genotyping

and also for sequencing to detect genetic variation. Many

of the current advances are only possible because of this

development.

Semi-automated fluorescence-based genotyping

The second development is in methods for high throughput

genotyping. The first of these was semi-automated

fluorescence-based genotyping. Here, using different

coloured fluorescent tags, 20 markers may be combined

in a single lane on a gel. Since it is perfectly possible to run

96 lanes on a gel, a considerable

volume of information can be derived from a single two

hour experiment [3]. The fluorescent tags are detected

after excitation by a laser scanning device at the base of the

gel as they migrate under electrophoresis. This fluorescence-based

technology has been further refined and is

available using capillary-based systems, which offer the

advantage of a higher throughput and also mean that there

is no risk of spillover between lanes in the gel.
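The throughput figures quoted above for gel-based genotyping (20 markers per lane, 96 lanes per gel, two hours per run) translate into a simple calculation of how long a screen would take. The sketch below uses those three numbers from the text. The size of the screen, 3000 individuals typed at 400 markers, is a hypothetical example, and the calculation assumes that every lane is fully loaded.

```python
from math import ceil

MARKERS_PER_LANE = 20   # from the text
LANES_PER_GEL = 96      # from the text
HOURS_PER_RUN = 2       # from the text


def gel_runs_needed(n_individuals, n_markers):
    """Return (runs, hours) needed to genotype every marker in everyone."""
    genotypes_per_run = MARKERS_PER_LANE * LANES_PER_GEL
    runs = ceil(n_individuals * n_markers / genotypes_per_run)
    return runs, runs * HOURS_PER_RUN


# Hypothetical screen: 3000 individuals typed at 400 markers.
print(gel_runs_needed(3000, 400))
```

Each run yields 20 × 96 = 1920 genotypes, so even a moderately sized genome-wide screen amounts to hundreds of gel runs on a single instrument. This is why higher-throughput capillary systems are attractive.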

Microarrays and DNA chips

There have also been advances in the methods available

to detect single nucleotide polymorphisms [4]. These

techniques may be facilitated by developments in microarray

technology where short lengths of DNA can be

applied to glass slides and combined with target DNA to

genotype the targets of interest [4]. In addition the same

process can be carried out on a DNA chip. Whether both

of these techniques prove equally valuable remains to be seen.

However, there is considerable interest in whether such

high throughput SNP screening may become available in

the future [4]. Such microarray and chip technology is

currently quite expensive and therefore not widely used as

yet in academia [4]. However, microarrays have already

clearly demonstrated their value in the area of functional

genomics where one might wish to examine the

expression profiles of different genes in a tissue sample [4].

A key question at the end of all of this is what makes

O'Shaughnessy, McCarthy and Menzel spend large

segments of their lives looking for genes with modest

effects on our common diseases. Their collective hope,

and mine too, is that by finding these genes we will be

better able to predict those at risk of these common

diseases and also, perhaps most importantly, to develop

new treatments and to target their use in a refined way.

Anyone who treats patients with common diseases like

high blood pressure will know that different patients may

respond differently to drugs. Essentially, the management

may constitute a fishing trip in which the patient may be

exposed to several agents that prove ineffective before an

effective alternative emerges. In the subsequent articles on

hypertension and diabetes the reviewers have sought to

give you the state of the art, but they have not shirked

from presenting the difficulties.

References

1 Lander ES, Schork NJ. Genetic dissection of complex traits.

Science 1994; 265: 2037–2048.

2 Risch N, Zhang H. Extreme discordant sib pairs for mapping

quantitative trait loci in humans. Science 1995; 268: 1584–1589.

3 Reed PW, Davies JL, Copeman JB, et al. Chromosome-specific

microsatellite sets for fluorescence-based, semi-automated

genome mapping. Nature Genet 1994; 7: 390–395.

4 The Chipping Forecast. Nature Genet 1999; 21: 1(Suppl).
