International Medical University
Introduction to Bioinformatics
Drug Design Using Bioinformatics
Aniqah Zulfa Binti Abdul Latif
MB0710029885
Medical Biotechnology 1/10
Table of Contents
Introduction
Drug Discovery
Bioinformatics Approach to Drug Design
Sequence Annotation Databases
Structure Prediction
High-Throughput and Virtual Screening
1. High-Throughput Screening
2. Virtual Screening
a. Docking Screening
b. Similarity Screening
ADMET
Future Outlooks
References
INTRODUCTION
Bioinformatics is a field that combines computer science with natural sciences such as biology,
chemistry and medicine. It plays an important role in organizing, managing and interpreting
biological data. Genomics and proteomics form the backbone of the field. This essay examines how
bioinformatics has developed within the drug design process. [1]
Bioinformatics has grown to the point that it has transformed the customary approaches to drug
design and development, which increasingly favor computational methodologies. Methods such as
high-throughput screening, microarrays, two-dimensional (2D) gel experiments, large-scale mass
spectrometry and chemical library screens are recognized for their contribution to introducing many
potential and reliable drugs to the community. Besides improving the molecular and chemical
understanding of drug design and development, these methods have also been used to speed up the
overall process of drug discovery. [2]
Cited 1 - http://www.ittc.ku.edu/bioinfo_seminar/F07.html
Because the use of computation in scientific research has increased drastically, the major
challenge scientists face today is not collecting data but storing, retrieving, analyzing and
interpreting it. Most scientific data are collected in large-scale databases, which contain
experimental results, gene sequences, mutations and millions of nucleotide polymorphisms. For
example, GenBank contains 39,000,000 sequences totaling 43 billion bases and occupying 100 gigabytes
of disk space. More than 1,000 viral genomes, 200 bacterial genomes and more than a dozen eukaryotic
genomes have been sequenced. Finally, the PubMed database contains 15 million abstracts from more
than 4,600 journals, occupying more than 40 gigabytes of textual data. [3]
Scientists have been working side by side with computer scientists to manage this so-called
“data explosion”. This collaboration has led to the rise of two new areas of information science:
bioinformatics and cheminformatics. Cheminformatics focuses on chemistry, which is the backbone of
drug design. By combining both fields, scientists are able to predict the pharmaceutical importance
of a drug by retrieving and visualizing stored experimental data. [4]
This essay looks at the features of bioinformatics that are significant in pharmaceutical
research, specifically in drug design and development. Because that scope is so wide, we concentrate
on the bioinformatics tools that address important pharmaceutical factors, especially structure
prediction for drug targets. We also discuss the prediction and understanding of drug metabolism and
toxicity using bioinformatics resources and relevant software. [5]
Drug Discovery
For a drug to have high efficacy and potency, it should be as specific as possible and have as
few side effects as possible. Chemists should therefore identify the drug’s target before designing
the drug, and design it according to the specificity of that target and its action. Consider
proteases, enzymes that break down proteins. Proteases support many metabolic activities in the
body, but they also play an important role in human diseases. In AIDS, for example, the Human
Immunodeficiency Virus (HIV) relies on a protease to cleave precursor proteins into the components
needed to build new virus particles. In osteoporosis, osteoclast cells attached to the bone surface
produce proteases that make bones more fragile. Drugs for such conditions should therefore be
designed to act specifically as inhibitors of the relevant protease. The major challenge is to
achieve enough specificity while lowering the possible side effects of the drug. [6]
Earlier, most of the human genome was still unknown, so drug development was constrained to a
small percentage of possible drug targets. Thanks to bioinformatics, the task of selecting drug
targets has become much lighter as more and more genome sequences are identified and stored in gene
databases. [7]
In the drug design process, it is also important to understand the function of the proteins
associated with a particular drug. To achieve this, bioinformaticians perform computational analyses
that predict the three-dimensional (3D) structure of the proteins. Software tools can then be used
to generate the 3D structure with the desired epitope coordinates. [8]
Bioinformatics Approach to Drug Design
Bioinformatics can take numerous approaches to drug design in order to develop significant and
reliable drugs. The approaches include: [9]
Identification and characterization of genes
Identification and characterization of proteins
Molecular phylogeny
Determination of protein structure
Analysis and finding of promoters
Identification and analysis of splice sites
Analysis of genome and proteome
Identification of transcription factor binding sites
Simulation of biochemical
Analysis of DNA microarrays
Analysis and identification of motifs
Sequence Annotation Databases
In pharmaceutical research, it is important to understand and interpret the gene and protein
sequences of a particular organism in order to gain an overview of the possible protein drug
targets. For example, routine sequencing of bacteria, parasites and other pathogenic organisms helps
scientists identify the basis of their pathogenicity. Moreover, sequencing mammalian genomes has
helped in categorizing various drug-metabolizing enzymes, and this gene information is widely used
to study protein expression in pharmacology and toxicology experiments. From sequence annotation
data, we are able to predict the proteins on which a drug acts, the mechanism of the drug and its
metabolism. [10]
There are two main providers of sequence annotation data:
National Center for Biotechnology Information (NCBI)
European Bioinformatics Institute (EBI)
In general, NCBI offers data that is DNA-rich and EBI offers protein-rich information.
Some other sequence annotation databases are as follows:
GenBank
GenBank Stats
Ensembl
EntrezGene
UCSC-GoldenPath
RefSeq
SwissProt
UniProt
TrEMBL
GeneCards
Mouse genome database (MGD)
Rat genome database (RGD)
MAGPIE/BLUEJAY
SymAtlas
CypAlleles DB
Directory of P-450 containing systems
Cytochrome P-450 interaction table
Human membrane transporter database (HMTD)
Transporter page
Human ABC transporter database
Structure Prediction
One application of bioinformatics in drug design is to understand the relationship between a
protein’s amino acid sequence and its 3D structure. The structure of a protein gives an overview of
how the protein functions. The most vital step is therefore the identification and classification of
the protein, because its 2D and 3D structure needs to be visualized. It is through this approach
that protein structure is predicted. [11]
Drug design is facilitated by understanding the structure of the target protein. Prediction
starts from the gene and the amino acid sequence before moving on to the purified protein, which
results in a more accurate prediction of the protein structure. [11]
Thanks to bioinformatics, various databases now offer the 3D structures of proteins and other
macromolecules. Examples of such databases are the Molecular Modeling Database (MMDB) and the
Protein Data Bank (PDB). [12]
The methods by which protein structure is predicted fall into three standard categories:
Ab initio / de novo prediction
Homology modeling
Fold recognition (threading)
De novo prediction is used when few or no known structures resemble the protein of interest. It
is based on the chemistry and physics of protein structure. Secondly, homology modeling predicts a
structure by comparison with a homologous sequence of known structure, since homologous sequences
tend to produce similar structures. However, not every homologous sequence will produce the
structure that we need. Thirdly, threading, or fold recognition, is used when two proteins have
similar three-dimensional structures but distinct primary sequences. This method can therefore
establish a structural alignment that sequence comparison alone would miss. MAMMOTH is a program
used for structural alignment, while SCOP is a database that classifies proteins by structure. [12]
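As a rough illustration of how the choice between these methods might be made, the sketch below computes the percent identity between a query and a template sequence that have already been aligned, and suggests a method from it. The two sequences are invented, and the 30% cutoff is only an illustrative assumption, not a value taken from the cited sources.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def suggest_method(identity: float, threshold: float = 30.0) -> str:
    """Suggest a prediction route; the default cutoff is an assumption."""
    if identity >= threshold:
        return "homology modeling"
    return "fold recognition or de novo prediction"


# Hypothetical pre-aligned sequences ("-" marks a gap).
query = "MKT-AYIAKQR"
template = "MKTSAYLAKQR"

identity = percent_identity(query, template)
print(f"{identity:.1f}% identity -> {suggest_method(identity)}")
```

In practice the alignment itself would come from a dedicated alignment program; only the final comparison step is shown here.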
High-Throughput and Virtual Drug Screening
After identification, structure prediction and functional characterization, candidate drugs
need to be tested for their efficacy in vitro as well as in vivo. Several approaches can be used to
screen the drugs. They can be classified into high-throughput screening and virtual drug
screening. [13]
High-Throughput Screening
High-throughput screening is the traditional approach to identifying the activity of candidate
compounds. Large numbers of chemicals are tested systematically in vitro. The whole process is
automated, so that 100,000 molecules can be screened per day. Testing may be carried out on whole
organisms or in cell-based assays. [13]
Virtual Drug Screening
Virtual drug screening is a computationally expensive yet precise approach to testing a drug’s
activity. It draws on several independent databases that provide gene sequence and structure
information, and uses this information to predict the 3D structure of the proteins against which
compounds are screened. The precision of virtual screening depends on the accuracy and completeness
of the underlying data. [13]
Virtual screening includes several methods, two of which are:
Docking-Based Virtual Screening
Similarity-Based Virtual Screening
Docking-Based Virtual Screening
This method of virtual screening involves identifying and characterizing the binding sites of
the drug’s target proteins. The surface of a target protein can be visualized using modeling
programs such as DOCK and AutoDock. These programs screen compound databases such as ZINC to
identify potential ligands that can bind to the binding sites of the protein. In selecting ligands,
this approach also considers the conformation of the protein’s side chains and characterizes them as
conserved or non-conserved. Conserved side chains are found in the binding sites of many proteins
and are therefore non-specific. Non-conserved side chains, on the other hand, are expected to be
more specific. We therefore need to determine the degree of specificity of the ligands that target
the protein’s binding sites. [14]
The significance of this for drug design is that, if we treat the drugs we want to design as
the ligands, we can use this approach to assess how specific those drugs are for the binding sites
of the targeted protein. [14]
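The distinction between conserved and non-conserved side chains can be sketched in a few lines of code. The example below assumes that the binding-site residues of several related proteins have already been aligned position by position. The residue strings and the 75% cutoff are hypothetical, and this is not how DOCK or AutoDock work internally.

```python
def classify_positions(site_alignment: list[str], min_fraction: float = 0.75) -> list[str]:
    """Label each aligned binding-site position as conserved or non-conserved."""
    if not site_alignment:
        raise ValueError("at least one binding-site sequence is required")
    length = len(site_alignment[0])
    if any(len(seq) != length for seq in site_alignment):
        raise ValueError("all binding-site sequences must have equal length")
    labels = []
    for column in zip(*site_alignment):
        # Fraction of proteins that share the most frequent residue here.
        most_common = max(set(column), key=column.count)
        fraction = column.count(most_common) / len(column)
        labels.append("conserved" if fraction >= min_fraction else "non-conserved")
    return labels


# Hypothetical aligned binding-site residues from five related proteins.
sites = ["DTGA", "DTGS", "DSGT", "DTGA", "DTGV"]

for position, label in enumerate(classify_positions(sites), start=1):
    print(position, label)
```

A ligand that mainly contacts the positions labeled non-conserved would, following the reasoning above, be expected to be more specific for its target.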
Similarity-Based Virtual Screening
This method of virtual screening relies on small-molecule alignment: test ligands are compared
against databases of known ligands, and the most similar known ligands serve as references for the
test ligands. The similarity of an alignment is scored on the overlap of molecular groups. Examples
of programs that make use of this concept are GASP and FlexS. [15]
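To make the idea of scoring by overlapping molecular groups concrete, here is a minimal sketch that ranks known ligands by the Tanimoto coefficient (shared groups divided by total distinct groups). The ligand names and their functional-group sets are invented for illustration, and the Tanimoto coefficient is a stand-in chosen for simplicity; the programs named above align molecules in 3D and use their own scoring schemes.

```python
def tanimoto(features_a: set[str], features_b: set[str]) -> float:
    """Tanimoto coefficient: size of intersection divided by size of union."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)


# Hypothetical known ligands described by their functional groups.
known_ligands = {
    "ligand_A": {"carboxyl", "hydroxyl", "phenyl"},
    "ligand_B": {"amine", "phenyl"},
    "ligand_C": {"carboxyl", "hydroxyl", "phenyl", "amide"},
}
test_ligand = {"carboxyl", "hydroxyl", "phenyl", "methyl"}

# Rank the known ligands from most to least similar to the test ligand.
ranked = sorted(
    known_ligands,
    key=lambda name: tanimoto(test_ligand, known_ligands[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(tanimoto(test_ligand, known_ligands[name]), 2))
```

The top-ranked known ligand would then act as the reference for the test ligand.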
In addition to ligand alignment, the structure of the ligand’s binding site can also be used to
recognize possible drug targets. This approach makes use of both protein structure databases and
ligand binding affinity databases. Binding Database is one example of a ligand binding affinity
database. Using these two resources, we can examine proteins that have comparable functions. Because
proteins with similar functions may also possess similar binding areas, we can predict the
interaction between the ligands and the targets of our drugs of interest. Relibase is a tool used to
analyze ligand binding. It reports significant data, including the conformation of binding pockets,
the interactions of water molecules and the degree of specificity of the ligands. [15]
On top of that, databases such as the Comprehensive Medicinal Chemistry Database and the
MACCS-II Drug Data Report are specially designed screening libraries that report on the performance
of drug molecules in vivo. Such databases also include the chemical properties of the drug
molecules, for instance hydrogen bonding, log P, molecular weight and the possible attachment of
certain functional groups. [15]
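Properties of this kind are often used to filter a screening library before any further testing. The sketch below applies Lipinski's rule of five, in the common formulation that allows at most one violation. The essay's sources do not name this rule, so it is used here only as a familiar example, and the compound records are hypothetical rather than entries from the databases above.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    molecular_weight: float  # in daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def passes_rule_of_five(compound: Compound, max_violations: int = 1) -> bool:
    """Return True if the compound breaks at most max_violations criteria."""
    violations = sum([
        compound.molecular_weight > 500,
        compound.log_p > 5,
        compound.h_bond_donors > 5,
        compound.h_bond_acceptors > 10,
    ])
    return violations <= max_violations


# Hypothetical screening library with invented property values.
library = [
    Compound("compound_1", 350.4, 2.1, 2, 5),
    Compound("compound_2", 720.9, 6.3, 4, 9),
    Compound("compound_3", 510.0, 3.0, 1, 6),
]

selected = [c.name for c in library if passes_rule_of_five(c)]
print(selected)
```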
ADMET
Bioinformatics plays a very important role in describing the ADMET (absorption, distribution,
metabolism, excretion and toxicity) of drugs during drug design and development. Many clinical
trials fail to describe the ADMET of a drug in sufficient detail. This is because the ADMET of a
drug is an extremely complicated picture: scientists need to understand what happens to the compound
from its entry into the digestive tract until it reaches its target. Many chemical reactions take
place in between, and every detail of them is crucial in predicting the ADMET of the compound. [16]
What bioinformatics does in this case is predict the ADMET from collected data, for instance
the size of the compound, its lipophilicity and the presence of particular functional groups. From
this information, a QSAR (Quantitative Structure-Activity Relationship) model can be built. A QSAR
model attempts to relate the structure and properties of a compound quantitatively to a well-defined
process, in this case a biological process. [16]
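In its simplest form, a QSAR model is a regression that relates a measured property of a compound to its activity. The sketch below fits a straight line between lipophilicity (log P) and an activity value using ordinary least squares. The data points are invented, and real QSAR programs use many descriptors and more elaborate statistical models.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training set: log P of four compounds and their measured activity.
log_p = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 2.9, 4.2, 4.8]

slope, intercept = fit_line(log_p, activity)


def predict(value: float) -> float:
    """Predict activity for a new compound from its log P."""
    return slope * value + intercept


print(f"activity = {slope:.2f} * logP + {intercept:.2f}")
print(f"predicted activity at logP 2.5: {predict(2.5):.2f}")
```

Once fitted, the same line can be used to estimate the activity of a compound that has not yet been tested, which is the purpose of a QSAR model.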
Various QSAR programs have been created specifically to predict the ADMET of a compound. One
example is ADMET Predictor from Simulations Plus. [16]
1 - ADMET Predictive Software
Because predicting the behavior of such a complex system is difficult, these ADMET prediction
tools achieve only 60 to 70% accuracy. Certain toxicity models, on the other hand, give more
reliable results, because each is designed for only one specific type of toxicity.
Future Outlook of Bioinformatics in Drug Designing
In the future, it is reasonable to expect that the data collected will not be limited to the
molecular basis of organisms; physiological and even epidemiological information may also be
collected and interpreted. Such information could give a more accurate picture of a specific disease
across populations and racial or ethnic groups. Combined with high-throughput in vitro ADMET data,
it could be used to predict the likelihood of adverse effects, toxicity and pharmacokinetic behavior
across a population.
Given the rapid development of bioinformatics, a new and innovative era in the medical and health
sciences can be expected.
References
1 Special issue: Biological databases, Nucleic Acids Res., 29, 1 (2001).
2 K. Rutherford, J. Parkhill, J. Crook, T. Horsnell, et al., Bioinformatics,
16, 944 (2000).
3 A.A. Schaffer, Y.I. Wolf, C.P. Ponting, E.V. Koonin, et al.,
Bioinformatics, 15, 1000 (1999).
4 M.J. Callow, S. Dudoit, E.L. Gong, T.P. Speed, et al., Genome Res., 10,
2022 (2000).
5 http://www.genomicglossaries.com/content/chapterinfosourcestext.asp
6 http://www.scfbio-iitd.res.in/tutorial/drugdiscovery.htm
7 http://www.b-eye-network.com/view/852
8 http://www.pharmainfo.net/reviews/computer-aided-drug-design-and-bioinformatics-current-tool-designing
9 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1609333/
10 http://www.vls3d.com/courses_talk/Villoutreix_intro_drug_design.pdf
11 http://www.mrc-lmb.cam.ac.uk/genomes/madanm/pdfs/medinfo.pdf
12 Rodriguez R., Chinea G., Lopez N., Pons T., and Vriend G. 1998. Homology
modeling, model and software evaluation: Three related resources.
Bioinformatics 14:523-528
13 http://biospectrumindia.ciol.com/content/careers/10306091.asp
14 Ortiz, A. R., Gomez-Puertas, P., Leo-Macias, A., et al. (2006)
Computational approaches to model ligand selectivity in drug design.
Curr. Top. Med. Chem. 6(1), 41-55.
15 http://www.slideshare.net/bknanjwade/applications-of-bioinformatics-in-drug-discovery-and-process-presentation
16 http://phobos.ramapo.edu/~pbagga/binf/binf_future.htm