Genes for common diseases
Mark J. Caulfield
Department of Clinical Pharmacology, St Bartholomew's and the Royal London School of Medicine, Charterhouse Square, London, UK
There is increasing interest in the genetic factors that may
influence the development of common diseases such as
hypertension and diabetes. In this series, the editors have
sought to bring readers of the 'Green' journal the state of
the art in these common complex disorders.
In terms of human genome research it is an exciting
time. There will shortly be a draft sequence of the entire
human genome available within the public domain.
During the course of refinement of this we will get a
clearer picture of the number of genes and their locations
within the human genome. In seeking to detect the genes
for common complex diseases it is important to remember
that the effects of these genes may only be modest and may
operate in concert with environmental factors. Thus the
effect of an individual variation within a gene may be small
at a population level. This has implications for the size of
study needed to detect a true positive effect. For example,
in the case of hypertension, 1500 families based on affected
sibling pairs would be required to detect five genes each
contributing 6% to blood pressure variation. Accordingly,
when evaluating studies published on complex traits, a key
question for the reader is whether the study was adequately
powered to detect a true effect.
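To make the link between effect size and study size concrete, here is a minimal sketch using the standard normal-approximation formula for comparing a frequency between two groups. It illustrates a simple case-control allele frequency comparison only, and is not the affected sibling pair calculation behind the 1500 families figure above. The function name, the example frequencies and the default significance level and power are all hypothetical.

```python
from math import ceil
from statistics import NormalDist


def observations_per_group(p_control, p_case, alpha=0.05, power=0.80):
    """Approximate observations (e.g. chromosomes) needed in each group.

    Two-sided normal approximation for comparing two proportions:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if p_control == p_case:
        raise ValueError("the two frequencies must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_case * (1 - p_case)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_case - p_control) ** 2)


# Hypothetical allele frequencies, chosen only for illustration.
large_effect = observations_per_group(0.30, 0.40)
small_effect = observations_per_group(0.30, 0.35)
print(large_effect, small_effect)
```

Halving the frequency difference roughly quadruples the number of observations required, which is why modest genetic effects demand large studies.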
The progress that has been made towards understanding
the genetics of these common diseases will be outlined
in two articles. The first is by Kevin O'Shaughnessy on
the Genetics of Hypertension and the second by Mark
McCarthy & Stephan Menzel on Type II diabetes. In both
of these articles the authors describe how the tools of the
genomic trade have been applied to understand the genetic
basis of common disease.
The tools are the genetic variations, or polymorphisms,
that occur about once every 500 bases within each of our
genomes. These polymorphisms can be used
to identify disease-causing variants and explain genetic
influence on a trait [1].
The most abundant form of polymorphism is the single
nucleotide, or base, change in the code. These single
nucleotide polymorphisms or SNPs (pronounced snips)
have become a source of great interest, and a consortium
between the pharmaceutical industry and private investors
has been formed to identify 300 000 new SNPs by May
2001. This US $45 000 000 investment has already
identified 41 209 unique SNPs, which can be found on
the Cold Spring Harbor website at http://snp.cshl.org/.
These SNPs will become increasingly important not only
in mapping disease genes but also in predicting drug
response, and so are of considerable interest to the
pharmaceutical industry.
In addition to these simple polymorphisms, there are
more complex, highly polymorphic markers, which have
been widely employed to map disease genes in common
disorders. These highly polymorphic markers are based on
repetitive sequences in DNA known as simple sequence
repeats. These may be based on di-, tri- or tetranucleotide
repetitive segments of DNA. Such polymorphisms are
widely employed in sets spread evenly throughout the
genome to map regions of interest within which genes for
complex disorders may lurk. The simple sequence repeats
that comprise linkage marker sets are what have been
employed to date in genome wide screening.
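As an illustration of what a simple sequence repeat looks like in sequence data, the sketch below scans a DNA string for a 2 to 4 base unit repeated in tandem and reports the repeat count, which is the quantity that varies between individuals at such a marker. The function name, the `min_repeats` threshold and the example sequence are hypothetical, and real marker genotyping is done by fragment sizing rather than by string matching.

```python
import re


def find_simple_sequence_repeats(seq, min_repeats=4):
    """Return (start, unit, copies) for tandem runs of a 2-4 base unit."""
    # ([ACGT]{2,4}?) captures the shortest candidate unit; \1{n,} requires
    # at least n further back-to-back copies of that same unit.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical sequence containing a (CA)6 and a (GATA)4 repeat.
example = "GGT" + "CA" * 6 + "TTG" + "GATA" * 4 + "CC"
print(find_simple_sequence_repeats(example))
```

Two individuals carrying, say, 6 and 9 copies of the same unit at a locus would be distinguishable at that marker, which is what makes these repeats so informative for linkage mapping.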
A second type of highly polymorphic marker is the variable
number tandem repeat. This is based on repetitive
segments of DNA that involve tens or hundreds of
nucleotides. All of these genetic markers constitute the
tools of the trade and may be used to identify disease-causing
genetic variation or polymorphism.
Study designs for understanding complex traits
There are several different types of study that can be
applied to identify a gene for a common disease [2]. The
most widely applied have been population based case-control
studies [2]. Here it is essential that the cases and
controls are drawn from the same population and that
there is no ethnic heterogeneity or substratification
which might create a genetic artefact. Ethnicity is
particularly important because the frequencies of genetic
polymorphisms can differ considerably between ethnic
groups. This may lead to spurious findings in the analysis
of common diseases [2]. When we study a genetic
variation such as a SNP, or group of SNPs in a case control
population association study, we are relying on a principle
known as linkage disequilibrium. This basically means that
the disease causing variation will have been inherited on
Correspondence: Professor M. J. Caulfield, Department of Clinical Pharmacology,
St Bartholomew's and the Royal London School of Medicine, Charterhouse
Square, London EC1M 6BQ.
Received 19 October 2000, accepted 2 November 2000.
© 2001 Blackwell Science Ltd Br J Clin Pharmacol, 51, 1–3
the same piece of DNA over time along with the SNP
under study, such that the two are physically related and do
not become separated by recombination during meiosis.
The degree of linkage disequilibrium between a disease-causing
allele and the genetic variation under study is
important because it can affect the power of the study.
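The degree of linkage disequilibrium can be put into numbers. The following sketch computes three commonly used measures (D, the normalised |D'| and r squared) from haplotype and allele frequencies. The function name and the example frequencies are hypothetical, and the sketch assumes haplotype frequencies are already known, which in practice usually requires estimation from genotype data.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # departure from independent assortment
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical frequencies: the marker allele and the disease allele
# always travel together, so both |D'| and r^2 are close to 1.
print(linkage_disequilibrium(p_ab=0.2, p_a=0.2, p_b=0.2))
```

As a rough rule of thumb, the sample size needed to detect an effect through a marker rather than the causal variant itself grows in proportion to 1 / r squared, which is one way of seeing why weak linkage disequilibrium reduces power.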
When the genotypic data derived from such a study are
analysed, the categorical data, or genotypes, are usually
arranged in a contingency table and tested using a chi-squared
test with the appropriate degrees of freedom.
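The contingency table analysis just described can be sketched as follows. The code computes the Pearson chi-squared statistic and its degrees of freedom for a table of genotype counts; the counts are invented for illustration, the function name is hypothetical, and the sketch assumes every row and column total is non-zero.

```python
from math import exp


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Invented genotype counts (columns: AA, Aa, aa) for cases and controls.
genotype_counts = [
    [120, 230, 150],  # cases
    [160, 240, 100],  # controls
]
statistic, df = chi_squared(genotype_counts)
# With exactly 2 degrees of freedom the upper tail probability is exp(-x / 2);
# this shortcut does not apply to other degrees of freedom.
p_value = exp(-statistic / 2) if df == 2 else None
print(statistic, df, p_value)
```

A 2 x 3 table of cases and controls by genotype has (2 - 1) x (3 - 1) = 2 degrees of freedom, whereas collapsing to allele counts gives a 2 x 2 table with 1 degree of freedom.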
The second study design that is widely used is family
based linkage analysis [1]. Unlike single gene disorders,
which often present relatively early in life, so-called
common complex traits will often present in the middle or
later years of life [1]. They can also exhibit variable age of
onset [1]. Therefore within a family it may be very difficult
to be certain that an individual who is
apparently unaffected now will not become affected in
the future by the disease of interest [2]. This has led to two
approaches in common diseases, which are being applied
in both hypertension and diabetes research.
The first is to choose affected siblings who are
concordant for the trait of interest, for example two
diabetic siblings or two hypertensive
siblings [1]. In the analysis of such a study, we would
genotype a series of markers, usually highly polymorphic
ones of the simple sequence repeat type, and analyse these
data asking a simple question: do our hypertensive or diabetic
sibling pairs share versions of this genetic marker more
often than you would expect in the general population? If
so, this may identify
a region of linkage within the genome. Linkage may still
be preserved at up to 50 000 000 bases of distance, which is
a considerable length of genome [1].
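The sharing question asked of affected sibling pairs can be expressed as a simple test. The sketch below is one common formulation, the so-called mean test, and is offered as an assumption rather than as the method used in the studies discussed here. It assumes that the number of marker alleles each pair shares identical by descent (0, 1 or 2) has been determined unambiguously, and that siblings share half of their alleles on average by chance. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sib_pair_mean_test(n_share0, n_share1, n_share2):
    """One-sided mean test for excess allele sharing in affected sib pairs.

    Arguments are the numbers of pairs sharing 0, 1 or 2 alleles
    identical by descent at the marker.
    """
    n_pairs = n_share0 + n_share1 + n_share2
    # Observed mean proportion of alleles shared (0, 0.5 or 1 per pair).
    mean_sharing = (0.5 * n_share1 + n_share2) / n_pairs
    # By chance, sharing is 0, 0.5, 1 with probability 1/4, 1/2, 1/4,
    # giving mean 0.5 and variance 0.125 per pair.
    z = (mean_sharing - 0.5) / sqrt(0.125 / n_pairs)
    p_value = 1 - NormalDist().cdf(z)
    return mean_sharing, z, p_value


# Hypothetical counts for 100 affected sibling pairs.
print(sib_pair_mean_test(20, 50, 30))
```

Sharing noticeably above one half across many pairs is the signal that a marker may lie in a region of linkage.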
A second form of family based analysis is
discordant sibling pair analysis [1]. Here a severely
affected individual and an unaffected sibling at the
opposite end of a quantitative trait, such as blood pressure,
are compared to see whether there are genetic
differences. This extreme discordant sibling pair analysis
may be quite powerful in a common complex trait [2].
Technological advances
Finally, the technological advances of
the last 10 years have made it possible to achieve the
high throughput required to screen the numbers of
polymorphisms we need to detect common disease-causing
genes. The first of these advances, inspired by
Kary Mullis, was the polymerase chain reaction, which
essentially photocopies DNA within a target sequence.
This has allowed increased throughput for genotyping
and also for sequencing to detect genetic variation. Many
of the current advances are only possible because of this
development.
Semi-automated fluorescence based genotyping
The second development is in methods for high throughput
genotyping. The first of these was semi-automated
fluorescence based genotyping. Here, using different
coloured fluorescent tags, 20 markers may be combined
in a single lane on a gel. Since it is perfectly possible to run
96 lanes on a gel, a considerable
volume of information can be derived from a single two
hour experiment [3]. The fluorescent tags are detected
after excitation by a laser scanning device at the base of the
gel as they migrate under electrophoresis. This fluorescence
based technology has been further refined and is
available using capillary based systems, which offer the
advantage of a higher throughput and also mean that there
is no risk of spillover between lanes in the gel.
Microarrays and DNA chips
There have also been advances in the methods available
to detect single nucleotide polymorphisms [4]. These
techniques may be facilitated by developments in microarray
technology, where short lengths of DNA can be
applied to glass slides and combined with target DNA to
genotype the targets of interest [4]. In addition, the same
process can be carried out on a DNA chip. Whether both
of these techniques will prove equally valuable remains to be seen.
However, there is considerable interest in whether such
high throughput SNP screening may become available in
the future [4]. Such microarray and chip technology is
currently quite expensive and therefore not yet widely used
in academia [4]. However, microarrays have already
clearly demonstrated their value in the area of functional
genomics, where one might wish to examine the
expression profiles of different genes in a tissue sample [4].
A key question at the end of all of this is what makes
O'Shaughnessy, McCarthy and Menzel spend large
segments of their lives looking for genes with modest
effects on our common diseases. Their collective hope,
and mine too, is that by finding these genes we will be
better able to predict those at risk of these common
diseases and also, perhaps most importantly, to develop
new treatments and to target their use in a refined way.
Anyone who treats patients with common diseases like
high blood pressure will know that different patients may
respond differently to drugs. Essentially, the management
may constitute a fishing trip in which the patient may be
exposed to several agents that prove ineffective before an
effective alternative emerges. In the subsequent articles on
hypertension and diabetes the reviewers have sought to
give you the state of the art, but they have not shirked
from presenting the difficulties.
References
1 Lander ES, Schork NJ. Genetic dissection of complex traits.
Science 1994; 265: 2037–2048.
2 Risch N, Zhang H. Extreme discordant sib pairs for mapping
quantitative trait loci in humans. Science 1995; 268: 1584–1589.
3 Reed PW, Davies JL, Copeman JB, et al. Chromosome-specific
microsatellite sets for fluorescence-based, semi-automated
genome mapping. Nature Genet 1994; 7: 390–395.
4 The Chipping Forecast. Nature Genet 1999; 21: 1(Suppl).