DNA-Protein Interactions Techniques


DNA-protein interactions: methods for detection and analysis

Bipasha Dey · Sameer Thukral · Shruti Krishnan · Mainak Chakrobarty · Sahil Gupta · Chanchal Manghani · Vibha Rani

Received: 24 September 2011 / Accepted: 16 February 2012 / Published online: 8 March 2012

© Springer Science+Business Media, LLC. 2012

Abstract DNA-binding proteins control various cellular processes such as recombination, replication and transcription. This review summarizes some of the most commonly used techniques for determining DNA-protein interactions. In vitro techniques such as footprinting assays, electrophoretic mobility shift assay, southwestern blotting, yeast one-hybrid assay, phage display and proximity ligation assay are discussed. The highly versatile in vivo techniques such as chromatin immunoprecipitation and its variants, DNA adenine methyltransferase identification, as well as 3C and ChIP-loop assay, are also summarized. In addition, some in silico tools are reviewed to provide a computational basis for determining DNA-protein interactions. Biophysical techniques such as fluorescence resonance energy transfer (FRET), FRET-FLIM, circular dichroism, atomic force microscopy, nuclear magnetic resonance and surface plasmon resonance are also highlighted.

Keywords DNA-protein interactions · Footprinting · Electrophoretic mobility shift assay · Southwestern blotting · Phage display · Yeast one-hybrid assay · Chromatin immunoprecipitation assay · Biophysical techniques

Introduction

The association of DNA with proteins is a phenomenon of utmost importance. Almost all aspects of cellular function, such as transcriptional regulation, chromosome maintenance, replication and DNA repair, depend on the interaction of proteins with DNA. Activation of genes by DNA-binding proteins is a fundamental regulatory mechanism in which chromatin-modifying and transcription complexes initiate RNA synthesis [1]. Such DNA-binding proteins have diverse roles and may function as structural proteins making up the nucleosome, as enzymes that modulate chromatin structure to control gene expression, as transcription factors and as cofactors. Transcription factors (TFs) are among the most widely studied DNA-binding proteins. The association of TFs with DNA is considered critical in developmental processes and in the response to environmental stresses. In humans, TF dysfunction can also contribute to the progression of various diseases [2].

In view of the important role played by DNA-protein interactions, various techniques have evolved over the years to elucidate them. Each technique, with its own advantages and drawbacks, serves a very specific purpose. In brief, the techniques address either of the two partners in the interaction: the protein (molecular weight, identity, domains, etc.) or the DNA (general sequence, specific sequence, alternative sequences, etc.).

This review summarizes some of the most important in vitro, in vivo, in silico and biophysical techniques for studying DNA-protein interactions, owing to the pivotal role played by DNA-associating proteins in various cellular processes. It is intended to help researchers understand and evaluate the various DNA-protein interaction techniques and use them appropriately in their research.

All the authors have contributed equally.

B. Dey · S. Thukral · S. Krishnan · M. Chakrobarty · S. Gupta · C. Manghani · V. Rani (✉)

Department of Biotechnology, Jaypee Institute of Information

Technology, A-10 Sector-62, Noida 201307,

Uttar Pradesh, India

e-mail: vibha.rani@jiit.ac.in


Mol Cell Biochem (2012) 365:279-299

DOI 10.1007/s11010-012-1269-z



In vitro techniques to study DNA-protein interactions

There are several techniques for determining DNA-protein interactions experimentally in vitro. Some of the well-known in vitro techniques are the footprinting assay, southwestern assay, electrophoretic mobility shift assay, yeast one-hybrid assay, phage display and proximity ligation assay.

Footprinting assay

Footprinting assays are based on the principle that protein-bound DNA is protected from degradation. The technique is used to decipher the specific sequence to which a DNA-binding protein or other molecule binds. The procedure employs chemical or enzymatic digestion of naked and protein-bound DNA oligomers. The two reactions are then compared by gel electrophoresis. The segment of DNA bound by the protein appears as an empty stretch, the footprint, in the protein-bound reaction when compared to the continuous ladder of fragments produced by digestion of naked DNA (Fig. 1a).

Footprinting has been a valuable tool for elucidating the sequence specificity and dissociation constants of a variety of ligands that bind DNA. The agent used to cleave DNA is called the probe. The smaller the probe, the higher the resolution it provides, but the greater its chance of cleaving DNA beneath the bound protein. Enzymatic probes include DNase I and MNase [3]; chemical probes include methidiumpropyl-EDTA-Fe(II) (MPE) [4, 5], copper phenanthroline [6], uranyl photocleavage [7, 8], hydroxyl radicals [9-13] and iron complexes [14]. Comparisons between the different probes used for footprinting provide useful information on their relative merits and demerits [15-18].

DNase I footprinting is the most commonly used footprinting assay. DNase I is a double-strand-specific endonuclease that binds in the minor groove and breaks phosphodiester bonds. The technique was developed by Galas and Schmitz [19] to visualize the binding of the lac repressor protein to the lac operator sequence. It employs a synthetic or natural DNA fragment radiolabeled at a single end. The fragment is incubated with either a crude or a purified protein sample under appropriate binding conditions, allowing the protein to bind to its specific DNA sequence. The protein-bound fragment and the control (i.e. the naked fragment) are then treated with DNase I in an appropriate buffer, at varying concentrations and for varying time periods. Both samples are then run on a denaturing polyacrylamide gel, processed and imaged [20].

Fig. 1 In vitro techniques to study DNA-protein interactions

There are several key points for this technique [21]. First, the experimental conditions are adjusted so that DNase I only partially digests the fragment, ideally introducing a single nick per fragment. This creates a range of fragments that differ from one another by a single nucleotide, hence providing high resolution for the protein-binding sequence. If DNase I cleaved without any sequence-dependent specificity, all bands would be of similar intensity; in practice, DNase I has partial sequence specificity, so some sites are hypersensitive and show a more intense band. Second, end labeling of the DNA serves a specific purpose. In a single reaction, DNase I cuts both strands, leading to a mixture of +ve and -ve strand fragments that are then separated on the denaturing gel. The purpose of radiolabeling the DNA is to identify which of the two anti-parallel strands the observed footprint belongs to. Thus, in a 5' labelled reaction, only the sequence information of the labelled 5'-3' strand appears on the final exposed film. Consequently, it is common to digest +ve and -ve strand labeled fragments in separate tubes and then run them alongside each other. Lastly, there are a variety of methods to analyze the final footprinting image. These range from visual inspection to constructing a differential cleavage plot on the basis of densitometric analysis. Techniques are also available for quantitative analysis of binding affinity [22].
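The differential cleavage plot mentioned above can be sketched in a few lines. The example below is a minimal illustration, assuming that band intensities have already been extracted by densitometry and that each lane is normalized to its own total signal. The function names, the pseudocount and the threshold of -1.0 are hypothetical choices made for this sketch, not values taken from the cited protocols.

```python
import math

def differential_cleavage(control, bound, pseudocount=1e-9):
    """Return ln(f_bound) - ln(f_control) for each band.

    f is the band intensity divided by the total intensity of its lane.
    Negative values suggest protection, positive values enhanced cleavage.
    """
    if len(control) != len(bound):
        raise ValueError("lanes must contain the same number of bands")
    total_control = sum(control)
    total_bound = sum(bound)
    plot = []
    for c, b in zip(control, bound):
        f_control = c / total_control + pseudocount  # pseudocount avoids log(0)
        f_bound = b / total_bound + pseudocount
        plot.append(math.log(f_bound / f_control))
    return plot

def protected_bands(plot, threshold=-1.0):
    """Indices of bands whose differential cleavage falls below the threshold."""
    return [i for i, value in enumerate(plot) if value < threshold]

# Hypothetical densitometry readings (arbitrary units), one value per band.
naked_lane = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
protein_lane = [100, 100, 100, 10, 8, 12, 100, 100, 100, 100]

plot = differential_cleavage(naked_lane, protein_lane)
print(protected_bands(plot))  # prints [3, 4, 5]
```

Normalizing each lane to its total corrects for unequal loading, at the cost of slightly inflating the values of unprotected bands when a large footprint is present.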

The purpose of the denaturing gel is to ensure that fragments appear on the gel only as single-stranded DNA. Often the naked DNA is chemically sequenced and run on the same gel as a marker for finding the exact sequence of the footprint. However, some precautions must be taken. First, it is important not to add too much DNA relative to the protein sample. Excess DNA leaves a large amount of DNA unbound by protein and thus susceptible to DNase I attack. The resulting fragments appear at the position of the footprint, making it harder to detect. Second, incubating crude samples with the fragment requires competitor DNA to prevent non-specific DNA-binding proteins from producing a footprint. A limitation of this technique is that it does not reveal the identity of the protein [20]. Because of the large molecular weight of DNase I, its attack is easily sterically hindered by the bound protein. Other probes for nicking DNA, such as free radicals, are not hindered so easily and may therefore nick a few bases beneath the bound protein itself.

Apart from proteins, footprinting is also used to elucidate the binding of small molecules such as drugs to DNA [22]. Modifications of the technique use automation and capillary electrophoresis along with fluorescent labeling [23, 24]. DNase I digestion followed by sequencing, called DNase-seq, is often used for genome-wide studies [25]. Protocols using automated infrared sequencers, which allow long-range and highly sensitive DNase I footprinting, have been developed [26]. Using streptavidin-bound oligonucleotides for protein binding and subsequent DNase I digestion also makes the technique more convenient [27]. The technique can also be used to fish out a protein of interest from a crude sample, and to measure quantitatively the dissociation constant of a protein-DNA interaction [28-30]. Though initially an in vitro method, the technique has been adapted for in vivo footprinting, which involves permeabilization of cells followed by DNase I-mediated cleavage and ligation-mediated PCR [31, 32]. Drug-RNA footprinting methods have also been developed [33]. It is now known that gold (Au)-DNA conjugates change their surface plasmon resonance (SPR) wavelength depending on the length of the attached DNA oligo. Comparing the SPR wavelength of a control protein-bound DNA-Au conjugate with that of the same conjugate digested with DNase I or any other probe indicates how many nucleotides from the end the protein is bound. This recent advance provides a label-free, quantitative, real-time measurement of nuclease activity and of the footprint of a bound protein without running a gel [34].

Electrophoretic mobility shift assay (EMSA)

EMSA is a relatively simple in vitro technique to study DNA-protein interactions. Its utility lies in its use to deduce the binding parameters and relative affinities of a protein for one or more DNA sites, or to compare the affinities of different proteins for the same sites [35]. It is based on the principle that DNA-protein complexes are larger and migrate more slowly than unbound free probe when subjected to non-denaturing polyacrylamide or agarose gel electrophoresis. Since the migration of DNA is shifted or retarded when it is bound to protein, the assay is also referred to as a gel shift or gel retardation assay. The DNA sequence is supplied externally and incubated with a crude cell protein lysate in a binding reaction, and the mixture is then separated on a gel. The DNA probes may be radiolabeled, or dyes that specifically stain DNA and protein may be used to visualize the DNA-protein interaction. In general, poly(dI-dC) is added to suppress non-specific binding. A supershift assay can be performed to confirm a specific DNA-protein interaction by using an antibody specific to the protein of interest. When the antibody is incubated with the DNA-protein sample before gel separation, the DNA-protein-antibody complex is visualized as a supershifted band. Competition assays may also be performed using unlabeled specific and non-specific oligo duplexes (Fig. 1b).

EMSA can be used qualitatively to identify sequence-specific DNA-binding proteins in crude lysates and, in conjunction with mutagenesis, to identify the important binding sequences within the upstream regulatory region of a given gene. EMSA can also be used quantitatively to measure thermodynamic and kinetic parameters. The technique offers several advantages. The most significant is its ability to resolve complexes of different stoichiometry or conformation. Another major advantage is that the source of the DNA-binding protein may be a crude nuclear or whole cell extract, an in vitro transcription-translation product or a purified preparation. In addition, the relatively low ionic strength of the electrophoresis buffer helps to stabilize transient interactions, permitting even labile complexes to be resolved and analyzed by this method [36-39].
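As an illustration of the quantitative use of EMSA described above, the sketch below converts free and shifted band intensities into a fraction bound and fits an apparent dissociation constant. It is a minimal example under stated assumptions: a simple 1:1 binding model, theta = [P] / (Kd + [P]), with protein in large excess over the probe so that free protein is approximately equal to total protein. The function names, the log-spaced grid search and the example readings are hypothetical and are not taken from the cited references.

```python
def fraction_bound(free_intensity, shifted_intensity):
    """Fraction of probe in the shifted (protein-bound) band of one lane."""
    total = free_intensity + shifted_intensity
    if total <= 0:
        raise ValueError("lane has no signal")
    return shifted_intensity / total

def fit_kd(protein_conc, theta, kd_min=1e-12, kd_max=1e-3, steps=2000):
    """Least-squares Kd for theta = P / (Kd + P), found on a log-spaced grid."""
    best_kd, best_sse = kd_min, float("inf")
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        sse = sum((p / (kd + p) - t) ** 2 for p, t in zip(protein_conc, theta))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd

# Hypothetical titration: protein concentration (M) and band intensities
# (free, shifted) in arbitrary densitometry units.
protein_molar = [1e-9, 2e-9, 5e-9, 1e-8, 5e-8]
lanes = [(85, 15), (72, 28), (50, 50), (34, 66), (9, 91)]

theta = [fraction_bound(free, shifted) for free, shifted in lanes]
print(f"apparent Kd ~ {fit_kd(protein_molar, theta):.2e} M")
```

A grid search is used here only to keep the example free of external dependencies; a non-linear least-squares routine would normally be preferred for real data.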

An additional variation of conventional EMSA is the capillary electrophoretic mobility shift assay (CEMSA), which allows rapid separation and quantitation of DNA-protein interactions in uncoated capillaries with no gel matrix, using high-sensitivity laser-induced fluorescence detection of fluorescein-labeled DNA. Capillary electrophoresis (CE) separates analytes on the basis of their mass-to-charge ratio and elutes them in the order free protein, protein/DNA complex and, lastly, free DNA [40]. A rapid and quantitative procedure has also been developed that permits accurate assessment of specific DNA-protein interactions at signal levels more than 100-fold below the minimum necessary for EMSA, by using a laser-induced fluorescence detection system [41].

IDEMSA is another modification of EMSA that combines immunodepletion with the traditional EMSA and supershift assays. Here, nuclear or cytoplasmic extracts are depleted of a specific protein by incubation with the relevant antibody and protein A-sepharose. The depleted extracts are then analyzed for the presence of the protein by EMSA and supershift assay. The advantage of this technique is that the combined results of immunodepletion and supershift help to determine the protein composition of a particular protein-DNA complex and to assign the dimer to a specific complex [42].

Southwestern blotting

This technique combines the principles of southern and western blotting and is primarily used to determine the molecular weight of the protein in a protein-DNA complex. A supershift assay, an extension of the EMSA experiment, provides more information on the nature and hence the molecular weight of the protein, but often no antibody is known for the bound protein. Thus, in cases where no prior knowledge of the DNA-binding protein is available, southwestern blotting provides at least some minimal information, such as its molecular weight.

The experimental procedure is a modified western blot that uses labeled oligonucleotides instead of antibodies as probes. In brief, the crude or purified cytoplasmic, nuclear or whole cell extract containing the protein of interest is resolved by SDS-PAGE, followed by electrophoretic transfer of the proteins from the gel to a membrane under conditions favouring renaturation of the proteins. The membrane-bound proteins are then incubated with oligonucleotides to which the protein of interest putatively binds. The membrane is developed and photographed, and only the band corresponding to the bound oligo appears in the final picture (Fig. 1c). Aligning this band with the corresponding position on the SDS-PAGE gel identifies the protein bound to the oligo and provides information about its molecular weight [43-46]. The SDS-PAGE step provides the molecular weight information, while the blotting step allows the protein to bind to the sequence. The label is required to mark the position of the bound protein-DNA complex [47].
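The molecular weight readout described above is usually obtained by comparing the position of the labelled band with marker proteins run on the same gel. The sketch below illustrates one common way to do this, assuming an approximately linear relation between log10(molecular weight) and relative mobility over the resolving range of the gel. The marker values, function names and band position are hypothetical and serve only as an illustration.

```python
import math

def fit_calibration(markers):
    """Fit log10(MW) = slope * Rf + intercept by ordinary least squares.

    markers: list of (relative_mobility, molecular_weight_kDa) pairs.
    """
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(relative_mobility, slope, intercept):
    """Molecular weight (kDa) predicted for a band at the given mobility."""
    return 10 ** (slope * relative_mobility + intercept)

# Hypothetical marker lane: (Rf, kDa).
marker_lane = [(0.15, 116.0), (0.30, 66.0), (0.48, 45.0), (0.62, 31.0), (0.80, 21.5)]
slope, intercept = fit_calibration(marker_lane)

# Hypothetical position of the oligo-labelled band on the blot.
print(f"estimated MW ~ {estimate_mw(0.40, slope, intercept):.1f} kDa")
```

The estimate is only as good as the alignment between blot and gel, and it degrades outside the range spanned by the markers.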

Using 2-D gel electrophoresis instead of SDS-PAGE, with on-blot digestion of the DNA-bound protein followed by LC-MS/MS analysis, provides better information about the molecular weight of the protein [48, 49]. Non-radioactive methods for southwestern blotting make the procedure less cumbersome [50, 51]. Moreover, using differently labelled oligos on the same blot provides information on the binding affinity of various mutants of the oligo. The same blot can be probed with different probes by using alkaline phosphatase to strip the signal of the bound probe [52]. A further modification uses the southwestern blot itself as a substrate for nuclease footprinting or for other types of footprinting, such as chemical nuclease and methylation protection, thus identifying the exact DNA sequence where the protein binds [53]. To differentiate specific from non-specific binding on the blot, a rapid dimethylsulphate (DMS) protection assay has been developed; it distinguishes the two on the basis that specific binding makes the complex impervious to DMS [54]. Though southwestern blotting is primarily a technique for determining the molecular weight of a protein that binds a known DNA sequence, it can also be used to find the DNA sequence that a particular protein binds to [55]. Caution is advised when cDNA expression libraries are screened by southwestern methodologies [56]. Southwestern histochemistry is another important modification, allowing in situ identification and localization of DNA-binding proteins. It uses oligonucleotides instead of antibodies to probe a specific protein in a histological sample. An alternative to blotting is to incubate the labelled oligonucleotide with the crude or purified cytoplasmic, nuclear or whole cell extract, cross-link with UV light and then resolve the extract by gel electrophoresis [57-59].

Apart from these modifications, a further option would be to couple chromatographic separation of proteins with SDS-PAGE of each fraction. This would provide better information on the characteristics, purification properties and molecular weight of the protein. The technique is currently restricted to blotting because oligonucleotides cannot be made to penetrate an SDS-PAGE gel directly and bind to their cognate proteins. Hence, if protein-resolving, oligonucleotide-permeable gels are developed in future, the blotting step could be avoided and hybridization could be carried out in the gel itself.

A disadvantage of this technique is that DNA-binding proteins composed of multiple subunits may dissociate during the SDS-PAGE step and hence evade detection. Even monomeric proteins may not renature properly on the blot to recognize their binding sequence. Proteins that require co-factors for DNA binding are difficult to detect on the blot unless those specific co-factors are added [60].

Yeast one-hybrid assay (Y1H)

The Y1H, a modification of the yeast two-hybrid assay, is a sensitive technique for identifying and analyzing proteins that bind to a specific DNA fragment of interest. In 1993, Wang and Reed [61] first used the Y1H to clone the gene encoding the olfactory neuron-specific transcription factor OLF-1.

Like the yeast two-hybrid assay, this assay exploits the finding that most eukaryotic transcription factors have two physically separable domains, the activation domain (AD) and the DNA-binding domain (DB/DBD). Separating these domains from each other yields a functionally inactive transcription factor that cannot recruit RNA polymerase to its corresponding promoter to start transcription [62].

In the yeast two-hybrid assay, which is used to study protein-protein interactions, a protein X is translationally fused to AD, while another protein Y is translationally fused to DB, and both are expressed in the same yeast cell. The DB-Y fusion is often referred to as the bait and the AD-X fusion as the prey. If X and Y interact within the yeast cell, AD and DB are brought into close physical proximity, reconstituting a functionally active transcription factor and allowing expression of a downstream reporter gene. Thus, yeast cells expressing the reporter gene show that proteins X and Y interact with each other.

In the one-hybrid system, on the other hand, the bait is replaced by a DNA sequence of interest, and the interaction of a protein X with the bait sequence is assayed. If X interacts with the bait DNA sequence, the AD-X fusion is brought close to the promoter, allowing AD to recruit RNA polymerase and drive expression of the downstream reporter gene. While AD recruits RNA polymerase, X plays the role of DB, since no DB fusion protein is present. Because this assay contains only one hybrid encoded on a vector, it is called the Y1H. In other words, the one-hybrid assay can be used to trap any protein (X) that has a binding domain specific for a target DNA sequence (Fig. 1d).

The one-hybrid assay offers maximal sensitivity because detection of the DNA-protein interaction occurs while the proteins are in their native configurations. In addition, the gene encoding the DNA-binding protein of interest is immediately available after a library screening [63].

The procedure first involves constructing a vector that carries the bait sequence upstream of a reporter gene promoter. Transforming yeast cells with this bait-reporter construct generates the yeast reporter strain used for the assay. The bait sequence and reporter gene may remain on the vector or be integrated into the chromosome. Integration of the construct into the yeast genome is preferred and is ensured by high-frequency homologous recombination sites flanking the bait-reporter region on the vector. The transformants are screened by marker selection and are then transformed again with a vector encoding a DNA-binding protein fused to the Gal4p AD. The library of AD fusions is screened for DNA-binding proteins specific to the bait sequence, which are reported by expression of the reporter genes [63]. The bait sequence can be an artificial site made of several tandem repeats of the sequence, a partial site, or a fully functional site in situ [64].

The most commonly used reporter gene is HIS3, which allows yeast cells showing a positive interaction to grow in a medium lacking histidine. Background growth due to basal or leaky expression of HIS3 is eliminated by including a competitive inhibitor, 3-amino-1,2,4-triazole (3AT), in the medium. A higher level of HIS3 expression is then required for the yeast cell to survive in the medium. Such higher-level expression is possible only in cells showing a positive interaction, so false positives arising from leaky expression are eliminated [64]. LacZ is another reporter gene; it encodes β-galactosidase and can be scored in colorimetric β-galactosidase assays.
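The selection logic behind HIS3 and 3AT can be expressed as a simple threshold rule. The sketch below is a deliberately simplified toy model, not a description of yeast physiology: it assumes that HIS3 expression and 3AT inhibition can be expressed on a common arbitrary scale, and that growth occurs only when expression exceeds the amount titrated out by the inhibitor. All names and numbers are hypothetical.

```python
def grows_on_selective_medium(his3_expression, three_at_mM, inhibition_per_mM=1.0):
    """Toy rule: the cell grows without histidine only if HIS3 expression
    (arbitrary units) exceeds the activity neutralized by 3AT."""
    return his3_expression > three_at_mM * inhibition_per_mM

def minimal_3at(basal_expression, tested_concentrations_mM, inhibition_per_mM=1.0):
    """Lowest tested 3AT concentration at which the bait-only reporter strain
    (leaky expression, no AD fusion bound) no longer grows."""
    for conc in sorted(tested_concentrations_mM):
        if not grows_on_selective_medium(basal_expression, conc, inhibition_per_mM):
            return conc
    return None  # background was not suppressed at any tested concentration

# Hypothetical values: leaky expression of 8 units, induced expression of 50 units.
chosen = minimal_3at(basal_expression=8, tested_concentrations_mM=[0, 5, 10, 20])
print(chosen)                                 # prints 10
print(grows_on_selective_medium(50, chosen))  # prints True
print(grows_on_selective_medium(8, chosen))   # prints False
```

In practice the 3AT concentration is titrated empirically for each bait-reporter strain, which is the step that `minimal_3at` imitates.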

There are several modifications of the yeast one-hybrid system. First, the one-hybrid system can be used to look for interaction-defective proteins by using a reporter gene that codes for a toxic product. A positive interaction then leads to expression of the toxin and cell death, while a lack of interaction confers cell viability. This method is referred to as the reverse one-hybrid assay. When used to screen an AD fusion library of random mutations in a DNA-binding protein, it can identify mutations that disrupt a DNA-protein interaction [62]. It can also be used to assess the therapeutic potential of drugs or other small molecules that disrupt DNA-protein interactions involved in the onset of a disease. Like the reverse two-hybrid assay, the reverse one-hybrid assay can be titrated to cover a range of DNA-protein affinities [65]. Second, Y1H can be modified to screen the various binding sites in a bait sequence that binds a known DNA-binding protein. Third, Y1H can also be used to screen for specific epitopes on a known DNA-binding protein.

Phage display for DNA-binding proteins

Phage display refers to the method of expressing a peptide or protein domain on a bacteriophage capsid by genetically fusing its coding sequence to that of a coat protein encoded by the phage. A wide variety of proteins can be expressed in this way, yielding a pool of variants referred to as a phage-display library. The proteins of interest can then be selected from the library by affinity purification using an appropriate ligand. The clones with the highest affinity for the target ligand can be enriched by sequential rounds of selection and amplified by passage through a bacterial host. The identity of the selected clones can be obtained by sequencing the phage genome, which gives complete sequence information about the protein of interest [66, 67]. Phage display is also applied to map DNA-protein interactions because it can screen a large number of protein variants simultaneously while providing their complete sequence information [68-73].

In this modified version of phage display, phages express a DNA-binding protein domain fused to a coat protein. Affinity purification of a pool of clones expressing various DNA-binding protein domains is carried out using dsDNA oligos (carrying the binding sequence specific to a protein) bound to a solid matrix. Enrichment and amplification are done as in conventional phage-display experiments and, depending on the type of library screened, identify the domains of the DNA-binding protein that physically interact with the DNA (Fig. 1e).

The choice of library to be screened is dictated by the aim of the experiment. The most common phage-display libraries are random peptide libraries (RPL), which are obtained by randomizing the DNA sequence at a selected region of the gene encoding the DNA-binding protein. These can be used to check which residues in the region are involved in the interaction, or to select rare clones with enhanced function or clones in which the displayed domain has acquired a new function as a result of mutation. Other libraries, such as cDNA or genomic libraries, are used to identify DNA-binding proteins [67]. DNA oligos are prepared by annealing complementary oligonucleotides, at least one of which is biotinylated so that the duplex can be bound to streptavidin-coated matrices.

The procedure first entails constructing an appropriate phage-display library of DNA-binding proteins according to a previously described protocol [66]. DNA oligos bound to an appropriate matrix are then incubated with the phages. Unbound phages are removed by several rounds of washing. The bound phages are then eluted and amplified by passage through a bacterial host. These affinity-purified and amplified phages then undergo another round of affinity purification and amplification. Several such rounds lead to enrichment of the phage clones expressing the DNA-binding protein domain with maximum affinity for the DNA of interest.
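Why a few rounds of selection suffice can be seen from a small calculation. The sketch below is an idealized model, assuming that each round recovers a fixed fraction of binding phages and a much smaller fixed fraction of non-binding phages, and that amplification in the bacterial host preserves the ratio between them. The recovery values and the function name are hypothetical.

```python
def simulate_panning(initial_fraction, recovery_binder, recovery_background, rounds):
    """Fraction of specific binders in the phage pool after each panning round.

    Returns a list whose first element is the starting fraction.
    """
    fractions = [initial_fraction]
    fraction = initial_fraction
    for _ in range(rounds):
        eluted_binders = fraction * recovery_binder
        eluted_background = (1.0 - fraction) * recovery_background
        # Amplification is assumed to scale both populations equally.
        fraction = eluted_binders / (eluted_binders + eluted_background)
        fractions.append(fraction)
    return fractions

# Hypothetical library: one binder per million clones; 10 % of binders and
# 0.01 % of non-binders survive the washes in each round.
for round_number, fraction in enumerate(simulate_panning(1e-6, 0.1, 1e-4, 3)):
    print(f"round {round_number}: binder fraction = {fraction:.6f}")
```

With these assumed recoveries the odds of picking a binder improve 1000-fold per round, so a clone present at one in a million dominates the pool after three rounds. Real selections deviate from this because of growth biases during amplification and non-specific matrix binders.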

Once the clones are selected and enriched, their binding is assayed by phage ELISA as a final confirmation. Streptavidin-coated microtiter plates are first coated with the biotinylated DNA oligos. The enriched and amplified phages are then allowed to bind in the wells. Unbound phages are washed off and an enzyme-conjugated anti-phage antibody is added. After unbound antibody is washed off, the colour-developing solution containing the substrate is added and the reaction is stopped after a specified time. The intensity of the colour developed is measured at 450 nm using a plate reader spectrophotometer. A higher intensity indicates a stronger interaction between the DNA oligo and the proteins displayed on those phage clones.
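The plate-reader readout can be turned into a list of positive clones with a simple cutoff. The sketch below assumes that some wells were run as negative controls (for example, wells without DNA oligo) and uses the mean of those wells plus three standard deviations as the cutoff. This cutoff rule, the function name and the absorbance values are assumptions made for illustration, not part of the protocol described above.

```python
from statistics import mean, stdev

def positive_clones(a450_by_clone, negative_wells, n_sd=3.0):
    """Return clones whose absorbance at 450 nm exceeds mean + n_sd * SD
    of the negative-control wells."""
    cutoff = mean(negative_wells) + n_sd * stdev(negative_wells)
    return {clone: a450 for clone, a450 in a450_by_clone.items() if a450 > cutoff}

# Hypothetical absorbance readings at 450 nm.
negative_controls = [0.08, 0.10, 0.12]
readings = {"clone_A": 1.20, "clone_B": 0.15, "clone_C": 0.40}

for clone, a450 in positive_clones(readings, negative_controls).items():
    print(clone, a450)
```

With these readings the cutoff is about 0.16, so clone_A and clone_C are reported and clone_B is not.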

Proximity ligation assay (PLA)

PLA is used for ultrasensitive protein analysis and for measuring DNA-protein interactions. The technique allows direct detection of proteins or DNA-protein interactions by creating DNA representations of the detected proteins, which can then be amplified. For this purpose, oligonucleotides are attached to specific protein-binding reagents (mono- or polyclonal antibodies). One of the proximity probes is a partly double-stranded oligonucleotide with a single-stranded


DNA-protein complexes are captured onto a HaloLink Resin. This is followed by the standard reversal of cross-links, DNA purification and PCR amplification of the enriched DNA [78-80]. In general, there are many alternatives for detecting immunoprecipitated chromatin, such as polymerase chain reaction (PCR), quantitative PCR (qPCR), labelling and hybridization to genome-wide or tiling DNA microarrays (ChIP-on-chip), molecular cloning and sequencing, or direct high-throughput sequencing (ChIP-seq) [81-87]. There are several variations of the ChIP assay.
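When qPCR is used as the readout, enrichment is commonly reported as a percentage of the input chromatin. The sketch below shows that calculation as it is widely performed, under the assumptions that amplification efficiency is 100 % (one doubling per cycle) and that a known fraction of the chromatin was set aside as input. The function names and Ct values are hypothetical, and the formula is a general convention rather than something taken from the references cited above.

```python
import math

def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """ChIP signal as a percentage of input chromatin.

    The input Ct is first adjusted to what it would be if 100 % of the
    chromatin had been used, by subtracting log2(1 / input_fraction).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

def fold_enrichment(ct_ip, ct_control_ip):
    """Signal of the specific antibody relative to a control (e.g. IgG) IP."""
    return 2.0 ** (ct_control_ip - ct_ip)

# Hypothetical Ct values for one primer pair, with 1 % of chromatin as input.
print(f"{percent_input(ct_ip=22.0, ct_input=25.0):.1f} % of input")
print(f"{fold_enrichment(ct_ip=22.0, ct_control_ip=27.0):.0f}-fold over IgG")
```

If primer efficiency differs appreciably from 100 %, the base of 2 should be replaced by the measured amplification factor.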

X-ChIP

This method freezes all DNA-associated proteins in place by cross-linking with formaldehyde. Formaldehyde reacts with primary amines on amino acids and on the bases of DNA or RNA molecules, forming a covalent cross-link between specific proteins and the DNA on which they are situated. The DNA-protein complexes are then isolated by cell lysis, and the crude cell extracts are sonicated to shear the DNA to a smaller size. The protein-DNA complex is immunoprecipitated and the DNA-protein cross-links are reversed by heating. The proteins are then removed by treatment with proteinase K. The DNA portion of the complex is purified and identified by PCR using specific primers. Cross-linking with formaldehyde minimizes nucleosome rearrangements and is an efficient way to analyze proteins that are weakly associated with DNA. Formaldehyde also has certain limitations as a cross-linking agent: its short cross-linking arm is not suitable for examining proteins that associate with DNA indirectly, such as those found in larger complexes. A variety of long-range bifunctional cross-linkers may therefore have to be used in combination with formaldehyde to detect such interactions [88]. The yield and resolution of chromatin may be lower after sonication, and epitopes are sometimes disrupted [89, 90].

Native-ChIP (N-ChIP)

N-ChIP is suited to DNA-protein interactions in which the proteins are tightly associated with chromatin in their native state, such as histones, which have a high affinity for DNA. These interactions therefore do not require cross-linking with formaldehyde. Native chromatin is digested into smaller fragments with micrococcal nuclease (MNase), and the fragments are then immunoprecipitated using an antibody specific to the protein of interest. Enzymatic digestion is mild and does not destroy the antibody epitope, so it yields higher immunoprecipitation efficiencies [91]. It also provides high resolution, as it is possible to produce mononucleosomes of about 175 base pairs. However, digestion by MNase is uneven, as the enzyme favours certain regions of the genome sequence. To avoid over-represented or overlooked data, X-ChIP should be carried out as a comparative control [92]. Nucleosomes may also rearrange during digestion, and this has to be taken into consideration when performing N-ChIP.

Fast ChIP

As the name suggests, fast ChIP is a modification of the ChIP technique for
large cell numbers that reduces the time required for a conventional ChIP
assay and eliminates multiple tube transfers, thereby preventing loss of
output. Conventional ChIP assays require a high starting cell number because
of the low recovery rate of cross-linked DNA from total cellular DNA.
Multiple washes during the procedure may also cause loss of specific
interactions. Therefore, a technique that reduces the time and the chance of
sample loss is favoured. In this modification of the ChIP assay, most steps
are unchanged. However, the cross-links are reversed during a 10 min
incubation at 100 °C in an ultrasonic bath, in the presence of Chelex-100, a
resin that aids in the extraction of DNA. After incubation, the tubes are
spun and the DNA-containing supernatant can be used directly in PCR [93]. The
limitation of fast ChIP is that it is suitable only for large cell samples.

Carrier ChIP

Carrier ChIP is based on immunoprecipitation from very few cells (of the
order of 100 cells) and is suited for examining histone modifications
associated with developmentally regulated genes. Immunoprecipitation of such
a small amount of chromatin is facilitated by adding carrier chromatin from
Drosophila, or from any other species that is evolutionarily distant from the
species being investigated, to provide efficient precipitation of the target
chromatin [94]. Here, native chromatin is partially digested using MNase and
immunoprecipitated using antibodies to modified histones. The low amount of
chromatin is detected by radioactive PCR and phosphorimaging. This technique,
however, requires primers designed with high specificity to prevent spurious
amplification of carrier DNA instead of the target chromatin.

Matrix ChIP

Matrix ChIP is a microplate-based ChIP assay in which all the steps are done
in microplate wells without sample transfers [95]. In this method, antibodies
immobilized with protein A/G are coated onto each well of a 96-well plate
before further processing. This allows 96 ChIP assays for histones and
various DNA-bound proteins, including transiently bound protein kinases, in a
single run. It also keeps the antibodies in the correct orientation, which
enhances their binding capacity [96].

ChIP-Chip

As the name suggests, ChIP-Chip is a technique that combines chromatin
immunoprecipitation with microarray technology. The immunoprecipitated DNA
fragments are labelled with a fluorescent dye such as Cy5 or Alexa 647 and
combined with genomic DNA labelled with Cy3, which serves as the reference
DNA. This probe mixture is then applied to a microarray chip, ideally
covering the whole genome, and allowed to hybridize. The results of the
experiment indicate the regions of the DNA enriched by immunoprecipitation.
Hence, the ChIP-Chip data are obtained as a one-dimensional series of signals
with peaks identifying the regions bound by the protein of interest [97].
Also, since the exact location of each arrayed element is known, a
genome-wide map of DNA-protein interactions can be constructed.
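
The one-dimensional signal described above can be illustrated with a minimal sketch. It assumes that each probe yields one Cy5 (immunoprecipitated) and one Cy3 (reference) intensity and that enrichment is expressed as a log2 ratio; the function names, the pseudocount, the threshold and the intensity values are hypothetical choices for illustration and are not taken from any published ChIP-Chip pipeline.

```python
import math

def log2_ratios(ip_signal, ref_signal, pseudocount=1.0):
    """Per-probe log2(IP / reference); the pseudocount avoids division by zero."""
    return [math.log2((ip + pseudocount) / (ref + pseudocount))
            for ip, ref in zip(ip_signal, ref_signal)]

def call_peaks(ratios, threshold=1.0, min_probes=2):
    """Return (start, end) probe index ranges (end exclusive) in which at
    least min_probes consecutive probes reach the threshold."""
    peaks, start = [], None
    for i, ratio in enumerate(ratios):
        if ratio >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_probes:
                peaks.append((start, i))
            start = None
    if start is not None and len(ratios) - start >= min_probes:
        peaks.append((start, len(ratios)))
    return peaks

# Hypothetical intensities for eight probes ordered along a chromosome
cy5_ip = [100, 120, 110, 480, 520, 450, 105, 95]
cy3_ref = [100, 100, 100, 100, 100, 100, 100, 100]
ratios = log2_ratios(cy5_ip, cy3_ref)
peaks = call_peaks(ratios)
print(peaks)
```

Real analyses normalize the two channels and model probe-specific behaviour before peak calling; the fixed threshold here only shows the shape of the computation.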

Various computational and mathematical models are available for the analysis
of regions bound by the proteins [98]. CisGenome is one such software
package; it covers most needs of ChIP data analysis, including visualization,
data normalization, peak detection, false discovery rate computation,
gene-peak association, and sequence and motif analysis. Many statistical
approaches have also been used for the analysis of ChIP data, including
hidden Markov models, Welch's t statistic and model-based analysis of
tiling-arrays (MAT), to identify regions enriched by a transcription factor
[99].
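
As an illustration of one of the statistics named above, the following sketch computes Welch's t statistic for a single probe from replicate log ratios of an immunoprecipitated sample and a control. The replicate values are invented, and the way real tools window, smooth or correct such statistics is not shown.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na   # squared standard error of mean A
    vb = variance(sample_b) / nb   # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical replicate log2 ratios for one probe
ip = [2.1, 2.4, 2.2, 2.5]
control = [0.1, -0.2, 0.3, 0.0]
t_stat, dof = welch_t(ip, control)
print(round(t_stat, 2), round(dof, 2))
```

A large positive statistic suggests enrichment at that probe; converting it to a p value requires the t distribution with the reported degrees of freedom.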

The ChIP-Chip technique offers several advantages over traditional ChIP
assays. First, it allows a large number of genomic regions to be probed in a
single experiment, eliminating bias and saving time. Second, commercially
available platforms can be used to study the localization of protein binding,
dismissing the need to run expensive large-scale quantitative PCR assays.
Third, it allows parallel analysis of different genes, which can be grouped
into classes for statistical comparison [100].

Since an ideal microarray covering all the human chromosomes is not possible,
this technique may be beneficial if combined with other high-throughput
technologies.

DIP-Chip

DIP-Chip is a modification of ChIP-Chip that overcomes limitations such as
interference from protein-protein interactions and competitive binding in
vivo. DIP-Chip is essentially an in vitro technique with results comparable
to in vivo assays. The procedure involves the interaction of purified,
mechanically sheared genomic DNA with the purified protein of interest. The
DNA-protein complexes are then affinity-purified using appropriate resins.
The affinity-purified genomic fragments and the whole-genome fragments are
then amplified and fluorescently labelled separately with different dyes so
that their relative abundance in the genome can be assessed. The samples are
analyzed by comparative hybridization to a DNA microarray that covers the
entire genome of the organism [101].

ChIP sequencing

ChIP sequencing combines chromatin immunoprecipitation and DNA sequencing to
identify the binding sites of protein factors that are co-precipitated with
DNA fragments during ChIP [102]. To construct a ChIP-seq library, the ends of
the enriched DNA fragments obtained by immunoprecipitation using a
conventional ChIP protocol are blunted and phosphorylated using T4 kinase. An
adenine is then added using Taq polymerase, and an adapter is ligated to both
ends of the fragment [103]. The library is amplified by PCR, and DNA
fragments of length 100-300 bp are selected and sequenced. Finally, the short
sequenced fragments, called tags, are analyzed computationally with alignment
tools, using a particular genome as the reference, to identify the enriched
sites [104].

This technique has several advantages over ChIP-Chip, including lower cost,
less starting material and higher peak resolution. However, it also has a
number of issues that need to be addressed. First, the ChIP-seq tags
represent the ends of the enriched fragments and not the binding sites of the
protein factor, and estimating the site-to-tag distance is complicated.
Second, control samples are not sequenced deeply enough to check for regional
biases along the genome arising from chromatin structure and copy number
variations [102]. Third, the lack of advanced and user-friendly data analysis
tools makes the analysis of peaks difficult.
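
The first issue, that tags mark fragment ends rather than the binding site, can be made concrete with a small sketch. It assumes that each tag is recorded as a 5' position with a strand and that shifting every tag by half the average fragment length towards the fragment interior approximates the site; the shift value and tag coordinates are hypothetical, and estimating the shift from real data is the hard part that the text refers to.

```python
from collections import Counter

def shifted_pileup(tags, shift):
    """Count tags per position after moving each 5' end towards the
    presumed fragment centre: plus-strand tags move right, minus-strand
    tags move left."""
    counts = Counter()
    for position, strand in tags:
        centre = position + shift if strand == "+" else position - shift
        counts[centre] += 1
    return counts

# Hypothetical tags flanking a binding site near position 1000
tags = [(900, "+"), (900, "+"), (905, "+"),
        (1100, "-"), (1100, "-"), (1095, "-")]
pileup = shifted_pileup(tags, shift=100)
summit, depth = max(pileup.items(), key=lambda item: item[1])
print(summit, depth)
```

Without the shift, the plus-strand and minus-strand tags would form two separate pile-ups on either side of the site.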

Nevertheless, ChIP-seq has proved to be a powerful tool in the study of
histone modifications, nucleosome positioning and the mapping of binding
sites of various DNA-binding proteins. Moreover, this strategy allows alleles
to be distinguished on the basis of differences in SNPs, which would not have
been possible using ChIP-Chip [103].

ChIP display

ChIP-Chip has been described as a potential method for the identification of
novel transcription factor binding sites in the genome. But it suffers from
severe limitations, including co-precipitation of non-specific DNA fragments,
which may sometimes even overwhelm the specific ones and result in strong
background noise. To overcome this problem, a technique called ChIP display
has been devised. It is based on the principle of concentrating the target
fragments via restriction digestion and then scattering the precipitated
non-specific DNA fragments by partitioning the digested fragments into
different families. The partitioning is based on the identity of the
nucleotides at the ends of these fragments [105]. Since all the target
fragments remain in the same family, the signal is not eroded and is
separated from the non-specific fragments of different families.

ChIP display is a prospective tool for the reduction of non-specific DNA
precipitation. However, it suffers from some practical limitations. First,
since non-specifically precipitated DNA fragments can unexpectedly bind the
protein in vitro (but not in vivo), the utility of this approach is
debatable. Second, ChIP display is not well suited for a comprehensive
analysis of target sequences for proteins with a large number of genomic
targets, such as GATA proteins, histone deacetylases or polycomb proteins, or
for the mapping of histone modifications [105]. It is better suited for
transcription factors with a more limited number of targets.

Other ChIP variations

Certain other ChIP assay setups are classified by the buffers used, which
affect the purpose and efficiency of the immunoprecipitation; examples are
Quick and Quantitative ChIP (Q2 ChIP) and MicroChIP. Q2 ChIP incorporates a
histone deacetylase inhibitor during cross-linking, which helps to eliminate
non-specific background, and also uses different elution buffers and a
shorter protocol. MicroChIP is a miniaturized ChIP protocol for 10,000 cells
that is applicable to genome-wide studies [106, 107].

DNA adenine methyltransferase identification

(DAMID)

DAMID is a methylation-based tagging technique that has emerged as a powerful
tool to study chromatin interactions in vivo. It has been used successfully
to generate genome-wide maps of several DNA-binding factors, including GAGA
factors, the Max family of transcription regulators, coregulators and various
other chromatin proteins [108].

In this technique, the protein of interest is fused with a bacterial DNA
adenine methylase (DAM), a single 32 kDa polypeptide that methylates adenine
at the N6 position in the sequence GATC [109]. This methylation causes few
changes in the DNA topology and provides a unique tagging system to mark the
binding sites of specific protein factors. The fusion protein is expressed in
mammalian cells in low quantities by using a weak promoter [108]. Binding of
the fusion protein to the target site results in methylation of adenine
nucleotides within DAM recognition sequences in the close vicinity of the
protein target site. These methylated sequences are then cleaved by the DpnI
enzyme to recover fragments containing regions nearby or within the gene
along with the target site itself. The fragments obtained may then be
analyzed by quantitative PCR or subjected to microarray studies. To correct
for the effects of chromatin accessibility on the level of methylation, a
control experiment is run in parallel, which measures the methylation levels
in the probed sequences after the expression of DAM alone [108].
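
Two steps of this procedure lend themselves to a short sketch: locating the GATC recognition sequences and comparing the methylation signal of the fusion protein with that of the DAM-only control. The log2 ratio used here is a common way of expressing such a comparison, but the function names, the pseudocount and the signal values are hypothetical.

```python
import math

def gatc_sites(sequence):
    """0-based start positions of every GATC motif (case-insensitive)."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]

def damid_log2_ratio(fusion_signal, dam_only_signal, pseudocount=1.0):
    """log2 of the fusion-protein signal over the DAM-only control signal."""
    return math.log2((fusion_signal + pseudocount) /
                     (dam_only_signal + pseudocount))

sites = gatc_sites("ttGATCaagatcGGGATCC")
ratio = damid_log2_ratio(fusion_signal=31, dam_only_signal=7)
print(sites, ratio)
```

A positive ratio indicates that a fragment is methylated more strongly by the fusion protein than expected from chromatin accessibility alone.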

DAMID has significant advantages over the conventional ChIP technique. First,
it does not use any cross-linking agents to fix the chromatin and also
eliminates the use of a protein-specific antibody. Hence, it provides a
simpler platform to study the binding properties of co-factors and other
proteins that bind indirectly to the DNA [110]. Also, there is less chance of
misidentifying target sequences due to accidental cross-linking, as can
happen in ChIP. Second, it provides an easier way to study the effects of
mutations on the targeting specificity of the protein of interest, which is
difficult using conventional ChIP assays [108].

The limitations of DAMID are that it requires DAM to be fused to the protein
without inducing any changes in the protein's function. Also, this technique
is unsuitable for the detection of post-translational modifications, whereas
ChIP successfully detects histone modifications. It is a time-consuming
technique, as it involves expression of the DAM-fusion protein for several
hours [108].

Chromosome conformation capture (3C)

and ChIP-loop assay

One of the key regulators of gene expression is the spatial organization of
the eukaryotic genome. 3C is a technique used to detect the frequency of
interaction between two genomic loci in the nuclear space. It is a powerful
tool to study the link between nuclear organization and transcription
regulation. The technique begins by fixing the cells with formaldehyde, which
cross-links interacting segments of the genome via contacts between their
DNA-bound proteins. The resulting network of protein-DNA complexes is
subjected to restriction digestion followed by ligation at low DNA
concentration, such that ligation between cross-linked DNA fragments is
favoured. After reversal of the cross-links, the fragments are subjected to
quantitative PCR to measure the cross-linking frequency of the two specific
restriction fragments (Fig. 2c) [111].
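
The final quantitative PCR step can be sketched as follows, under strong simplifying assumptions: that the ligation product and a loading-control amplicon are both measured as threshold cycles (Ct), and that amplification efficiency is the same for both (a value of 2.0 means perfect doubling per cycle). The function name and Ct values are hypothetical, and published 3C protocols additionally normalize against a control template containing all ligation products in equal amounts.

```python
def relative_crosslinking_frequency(ct_ligation, ct_loading_control,
                                    efficiency=2.0):
    """Abundance of a ligation product relative to a loading-control
    amplicon, assuming equal amplification efficiency for both."""
    return efficiency ** (ct_loading_control - ct_ligation)

# Hypothetical Ct values for two restriction-fragment pairs
near_pair = relative_crosslinking_frequency(ct_ligation=26.0,
                                            ct_loading_control=24.0)
far_pair = relative_crosslinking_frequency(ct_ligation=28.0,
                                           ct_loading_control=24.0)
print(near_pair, far_pair)
```

Comparing such values across fragment pairs, rather than reading any single value in isolation, is what indicates a preferential interaction.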

Although ChIP and 3C operate through the same basic principle of
cross-linking protein-DNA interactions, the two techniques differ in the
information they provide: whereas ChIP provides information about the
DNA-binding activity of a protein, 3C is used to study the interaction
between two different genomic sites looped by a protein factor.

To establish a link between 3C and ChIP, a technique called the ChIP-loop
assay has been developed [112]. It allows the study of proteins mediating the
interaction between two genomic loci by combining the two techniques.
Initially, the cells are fixed with formaldehyde and the cross-linked
chromatin is purified from free proteins by urea gradient centrifugation.
This is followed by restriction digestion of the purified cross-linked
chromatin and precipitation with protein A/G beads and specific antibodies.
The precipitated chromatin is then allowed to ligate and is further analyzed
by quantitative PCR as in standard 3C experiments [111]. Hence, the ChIP-loop
assay helps in studying the proteins that are involved in the organization of
DNA loops that mediate genomic interactions. This technique provides better
insight into interactions than 3C or ChIP used alone. However, the major
concern is that, when the DNA is concentrated before ligation, loops may form
between bead-associated DNA fragments. Hence, the results obtained may not
accurately identify the loops of DNA fragments formed in the nuclear space.
This also makes the quantification of ligation products very complicated
[113]. Nevertheless, the potential of the assay for identifying proteins
participating in long-range interactions cannot be denied.

In silico tools for identification of DNA-protein interactions

Computational, in silico approaches are an important aspect of the study of
DNA-protein interactions. Diverse computational tools for predicting
DNA-protein interactions are freely available. Most of these are aimed at
predicting transcription factor-based gene regulation.

TRANSFAC

TRANSFAC is a comprehensive knowledgebase containing gene regulation data
from a wide variety of eukaryotic organisms, ranging from yeast to humans. It
mainly comprises data on transcription factors, their experimentally proven
binding sites and the genes they regulate, and is an extremely versatile tool
for transcription factor (TF) binding predictions. It has a broad compilation
of binding sites and allows the derivation of positional weight matrices,
which can be used with the available tools to search DNA sequences. Entries
are grouped under different tables of the TRANSFAC database. One of its
features is the assignment of a quality value describing the confidence with
which an observed DNA-binding activity could be assigned to a specific
factor. Nucleotide weight matrices are derived from a collection of binding
sites for a factor, and these matrices are used by the tool Match™ to find
potential binding sites in uncharacterized sequences. Several web programs
that utilize the TRANSFAC database are also available, such as AliBaba2,
which predicts TF binding sites in an unknown DNA sequence by utilizing the
binding sites collected in TRANSFAC. P-Match is another tool for identifying
transcription factor binding sites in DNA sequences. It combines pattern
matching and weight matrix approaches to provide a high accuracy of
recognition.
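
How a weight matrix is derived from a collection of binding sites and then used to search a sequence can be shown with a minimal sketch. This is a generic log-odds position weight matrix with a uniform background and a pseudocount; it is not the scoring scheme of Match or P-Match, and the binding sites and query sequence are invented.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix from equal-length binding sites
    (one dict of base -> score per column)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for col in range(len(sites[0])):
        column = [site[col] for site in sites]
        pwm.append({base: math.log2(((column.count(base) + pseudocount)
                                     / total) / background)
                    for base in "ACGT"})
    return pwm

def scan(sequence, pwm):
    """Score every window of the sequence; returns (position, score) pairs.
    Only the unambiguous bases A, C, G and T are handled."""
    width = len(pwm)
    return [(i, sum(pwm[j][sequence[i + j]] for j in range(width)))
            for i in range(len(sequence) - width + 1)]

# Hypothetical aligned binding sites for one factor
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
hits = scan("CCTGACTCAGG", pwm)
best_pos, best_score = max(hits, key=lambda hit: hit[1])
print(best_pos, round(best_score, 2))
```

In practice a score cut-off, often chosen to balance false positives against false negatives, decides which windows are reported as potential sites.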

TRANSFAC is maintained as a relational database, from which public releases
are made available via the web, making it easily accessible. Several
web-based tools are linked to TRANSFAC and utilize its database to perform
unique computational functions [114, 115].

Identification of DNA-binding proteins (iDBPs) server

The iDBPs server was developed for the identification of DNA-binding proteins
with known three-dimensional structure. In the first stage of classification,
the functional region of the protein is predicted using the PatchFinder
algorithm, which searches for clusters, or patches, of evolutionarily
conserved residues on the protein surface. The maximum-likelihood (ML)
patches found by PatchFinder often delineate the functional regions in
proteins and, specifically, the core of the DNA-binding regions within
DNA-binding proteins [116]. The results sent to the user include the
prediction score of the protein and the expected sensitivity and precision at
this score cut-off.

DNA site prediction from a list of adjacent residues

(DISPLAR)

DISPLAR is a neural network method that predicts the residues of a protein
that interact with DNA, given the structure of a protein known to bind DNA.
The inputs to the neural network include position-specific sequence profiles
and the solvent accessibilities of each residue and its spatial neighbours.
The neural network is trained on known structures of protein-DNA complexes.
DISPLAR shows a prediction accuracy of over 80% and coverage of over 60% of
actual DNA-contacting residues [117].
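
The two performance figures can be illustrated with a short sketch. It assumes that accuracy means the fraction of predicted residues that are true DNA contacts and that coverage means the fraction of true contacts that are predicted; this reading of the two terms is an assumption, and the residue numbers are invented.

```python
def accuracy_and_coverage(predicted, actual):
    """Accuracy: share of predicted residues that truly contact DNA.
    Coverage: share of true DNA-contacting residues that were predicted."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    accuracy = true_positives / len(predicted) if predicted else 0.0
    coverage = true_positives / len(actual) if actual else 0.0
    return accuracy, coverage

# Hypothetical residue numbers
predicted = [10, 11, 12, 15, 40]
actual = [10, 11, 12, 13, 14, 15]
acc, cov = accuracy_and_coverage(predicted, actual)
print(acc, round(cov, 3))
```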

FlyFactorSurvey

FlyFactorSurvey is a database of DNA-binding specificities for Drosophila
TFs. It provides community access to over 400 recognition motifs and position
weight matrices for over 200 TFs, including many unpublished motifs. The
primary source of recognition motifs within FlyFactorSurvey is TF binding
site selections performed using the bacterial one-hybrid system. Search tools
and flat file downloads are provided to retrieve binding site information (as
sequences, matrices and sequence logos) for individual TFs, groups of TFs or
all TFs with characterized binding specificity. Linked analysis tools allow
users to identify motifs within the database that share similarity to a query
matrix or to view the distribution of occurrences of an individual motif
throughout the Drosophila genome [118].

YEAst search for transcriptional regulators

and consensus tracking (YEASTRACT)

The YEASTRACT information system allows the identification of potential
transcription regulators. It is a database that contains over 12,346
regulatory associations between transcription factors and target genes in
Saccharomyces cerevisiae [119]. It also characterizes sets of genes with a
common expression profile obtained from microarray data and searches for
occurrences of candidate TF binding sites.

Multi-genome analysis of positions and patterns

of elements of regulation (MAPPER)

MAPPER is a search method for identifying TF binding sites that is based on
hidden Markov models obtained from alignments of known sites. The TF binding
site models are built from the sites provided by TRANSFAC and other databases
and are then used to scan the sequences of the human, mouse, fly, worm and
yeast genomes to identify sites. It has better specificity and sensitivity
than other similar computational models. A sequence is uploaded as a query
and scanned with models built from multiple sequence alignments of the
binding sites of each transcription factor [120].

Zinc finger binding site database (ZIFIBI)

ZIFIBI is a tool that helps to identify C2H2 zinc finger transcription factor
binding sites in the cis-regulatory regions of target genes. It makes use of
the available data to predict the interactions between the nucleotides and
the amino acids of the zinc finger domain of the protein. The most probable
state path is calculated using a hidden Markov model [121].
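
The idea of a most probable state path can be illustrated with the Viterbi algorithm on a toy two-state model. The states ("B" for background, "S" for site), all probabilities and the sequence are invented for illustration; this is not the model used by ZIFIBI.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable state path, computed in log space.
    State names must be single characters so the path can be a string."""
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    backpointers = []
    for obs in observations[1:]:
        row, pointer = {}, {}
        for s in states:
            prev = max(states,
                       key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            row[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                      + math.log(emit_p[s][obs]))
            pointer[s] = prev
        scores.append(row)
        backpointers.append(pointer)
    state = max(states, key=lambda s: scores[-1][s])
    path = [state]
    for pointer in reversed(backpointers):   # trace back from the end
        state = pointer[state]
        path.append(state)
    return "".join(reversed(path))

states = ("B", "S")
start_p = {"B": 0.9, "S": 0.1}
trans_p = {"B": {"B": 0.9, "S": 0.1}, "S": {"B": 0.2, "S": 0.8}}
emit_p = {"B": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
          "S": {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05}}
path = viterbi("ATGGGGAT", states, start_p, trans_p, emit_p)
print(path)
```

The G-rich stretch is assigned to the site state because its emission probabilities outweigh the cost of entering and leaving that state.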

Bioprospector

Bioprospector identifies regulatory sequence motifs in the cis-regulatory
regions of target genes by examining sequences from genes in the same
expression pattern group. It is implemented as a C program and uses a Gibbs
sampling strategy. The significance of each motif is estimated using a Monte
Carlo method. It has been successful in identifying binding motifs for
Saccharomyces cerevisiae Ras-related protein 1 (RAP1), Bacillus subtilis RNA
polymerase and Escherichia coli cyclic AMP receptor protein (CRP) [122].

BindN

BindN is a web-based tool that predicts DNA- and RNA-binding sites with the
help of support vector machines (SVMs). The SVM models are built using three
sequence features: side chain pKa value, hydrophobicity index and molecular
mass of an amino acid. Thus, it helps to identify the functions of binding
proteins from primary sequence data [123].

BindN+

BindN+ uses protein sequence features different from those of BindN to
identify the binding sites in sequences. It also relies on SVMs. The protein
sequence features used in this case are the biochemical properties of the
amino acids and evolutionary information in the form of the position-specific
scoring matrix. The new descriptors used in BindN+ have shown better
performance, sensitivity and specificity than the previous version [124].

DP-bind

DP-bind predicts the binding sites of a protein by analyzing its amino acid
sequence. It uses three machine-learning models for predicting the sites:
support vector machines, kernel logistic regression and penalized logistic
regression. Prediction can be done using the input sequence alone or the
profile of evolutionary conservation of the input sequence. The outputs of
all three models are combined to provide a consensus result with high
confidence [125].
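
One simple way of combining three per-residue predictions is a majority vote, sketched below. Whether DP-bind combines its models in exactly this way is not stated here, so the rule should be read as an illustrative assumption; the label vectors (1 for binding, 0 for non-binding) are invented.

```python
def majority_consensus(*label_lists):
    """Per-residue majority vote over several equal-length 0/1 label lists."""
    n_models = len(label_lists)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*label_lists)]

# Hypothetical per-residue labels from three models
svm = [1, 0, 1, 1, 0]
klr = [1, 0, 0, 1, 0]
plr = [0, 0, 1, 1, 1]
consensus = majority_consensus(svm, klr, plr)
print(consensus)
```

A stricter consensus would require all three models to agree, trading coverage for confidence.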

PreDs

PreDs is a web-based server that predicts DNA-binding sites on protein
molecular surfaces. The molecular surfaces are generated from atomic
coordinates supplied in .pdb format. The prediction is based on an evaluation
of the electrostatic potential and the local and global curvature of the
protein surface [126].

ProNIT

ProNIT is a thermodynamic database that uses quantitative binding data rather
than just structural data. It contains several kinds of information for
analyzing protein-nucleic acid recognition, such as thermodynamic parameters,
experimental conditions and structural information on both the protein and
the DNA. It provides various options for sorting the output. The
thermodynamic parameters covered are the dissociation constant, association
constant, Gibbs free energy change, enthalpy change and heat capacity change.
A relational database system combines all of this information to provide
flexible searching facilities [127].
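
Three of the listed parameters are linked by standard thermodynamic relations: the association constant is the reciprocal of the dissociation constant, and the standard Gibbs free energy change of binding is RT ln Kd (with Kd relative to a 1 M standard state). The sketch below applies these relations; the function names and the example Kd are hypothetical and are not taken from ProNIT.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)

def association_constant(kd_molar):
    """Ka (1/M) as the reciprocal of Kd (M)."""
    return 1.0 / kd_molar

def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard Gibbs free energy change of binding in kJ/mol,
    computed as RT ln(Kd) with a 1 M standard state."""
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0

kd = 1e-9  # hypothetical nanomolar protein-DNA complex
ka = association_constant(kd)
dg = binding_free_energy_kj(kd)
print(f"Ka = {ka:.3g} per M, dG = {dg:.1f} kJ/mol")
```

Tighter binding (smaller Kd) gives a more negative free energy change.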

Database for polyanion binding proteins (DB-PABP)

Polyanion binding proteins are diverse proteins that interact with
polyanions, which are entities carrying multiple negative charges. The
polyanions identified for such interactions include actin, tubulin, DNA,
heparin and heparan sulphate. DB-PABP is a comprehensive, searchable and
manually curated database of these proteins. It has been implemented as a
MySQL relational database. The search is based on four criteria: protein
names, polyanion names, source species and the methods used to discover the
interactions [128].

DNAProt

DNAProt identifies DNA-binding proteins from the protein sequence. It has
considerably good accuracy in distinguishing DNA-binding proteins from
non-DNA-binding proteins by characteristically recognizing specific DNA
chains. The random forest method is used to identify the DNA-binding proteins
[129].

Biophysical techniques as a potential tool for DNA-protein interaction studies

Fluorescence-based techniques

Fluorescence is a form of luminescence caused by emission of electromagnetic
radiation [130]. The simultaneous absorption of two photons by an electron
(two-photon absorption) excites a molecule from the ground state to a
higher-energy (high frequency, low stability) state, leading to the emission
of radiation [131]. This principle and its modifications are used to devise
different fluorescence detection techniques, e.g. fluorescence spectroscopy,
fluorescence intensity, fluorescence depolarization, fluorescence resonance
energy transfer and fluorescence correlation spectroscopy. In
fluorescence-intensity distribution analysis, the fluorescence intensity of a
sample with a heterogeneous brightness profile is monitored through its
spatial brightness distribution and by calculating theoretical photon count
number distributions [132].

Capillary electrophoresis with laser-induced

fluorescence

Capillary electrophoresis coupled with laser-induced fluorescence
polarization is a hybrid approach to ultrasensitive immunoassays [133].
Fluorescence polarization provides additional information for the
identification of affinity complexes. Protein-DNA interactions can be studied
on the basis of capillary electrophoretic (CE) separation of bound from free
fluorescent probe followed by detection with laser-induced fluorescence
polarization (LIFP) [134]. Changes in electrophoretic mobility and
fluorescence anisotropy upon complex formation can be monitored to determine
binding affinity and stoichiometry [135]. There are two types of assay:
homogeneous and heterogeneous. In the homogeneous assay, the free and bound
tracers remain together in the mixture, and the fluorescence polarization of
the mixture is a quantitative measure of the antibody-bound tracer. The
heterogeneous assay involves a baseline separation of the free and bound
tracer using CE with a phosphate running buffer. Results from both assays
suggest that CE-LIFP approaches have wider application than immunoassays
based on either CE-LIF or fluorescence polarization alone [136].
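
How a binding affinity might be derived from a heterogeneous assay can be sketched under simplifying assumptions: that the peak areas of bound and free probe are proportional to their concentrations, that binding is 1:1, and that the free protein concentration is known. The function names and values are hypothetical.

```python
def fraction_bound(bound_peak_area, free_peak_area):
    """Fraction of probe in the complex, from the two separated peaks."""
    return bound_peak_area / (bound_peak_area + free_peak_area)

def dissociation_constant(free_protein_nm, theta):
    """Kd (nM) for 1:1 binding, from theta = [P] / (Kd + [P])."""
    return free_protein_nm * (1.0 - theta) / theta

theta = fraction_bound(bound_peak_area=30.0, free_peak_area=70.0)
kd_nm = dissociation_constant(free_protein_nm=50.0, theta=theta)
print(theta, round(kd_nm, 1))
```

A full titration fitted over many protein concentrations gives a more reliable estimate than a single point such as this one.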

Narrow-bore capillaries provide high-speed, high-resolution separations and
ultrasensitive detection in a minimal sample detection volume. Improved
detection limits, enhanced identification capacity and the potential for
miniaturization add to its advantages. However, the free and bound tracer may
have similar electrophoretic mobilities and thus cannot always be separated,
which makes the technique inefficient for their identification and
quantitation [137].

Time-resolved fluorescence depolarization

Time-resolved fluorescence depolarization (anisotropy) is a technique in
which a short pulse of vertically polarized light is directed at the sample,
and the absorbed light promotes the molecule to an excited singlet state
[138]. After vibrational relaxation, fluorescence is emitted at lower energy;
if the molecule rotates during the interval between absorption and emission,
the polarization decreases with time at a rate that reflects the rotational
diffusion of the molecule [139].
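
The quantities involved can be made explicit with a short sketch. It uses the standard definition of anisotropy from the parallel and perpendicular emission intensities and a single-exponential decay governed by one rotational correlation time; real samples often need multi-exponential models. The G factor default, the intensities and the correlation times are hypothetical.

```python
import math

def steady_state_anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """r = (I_par - G * I_perp) / (I_par + 2 * G * I_perp)."""
    numerator = i_parallel - g_factor * i_perpendicular
    denominator = i_parallel + 2.0 * g_factor * i_perpendicular
    return numerator / denominator

def anisotropy_decay(t_ns, r0, correlation_time_ns):
    """Single-exponential decay r(t) = r0 * exp(-t / theta)."""
    return r0 * math.exp(-t_ns / correlation_time_ns)

r = steady_state_anisotropy(300.0, 150.0)
# A protein-bound DNA probe tumbles more slowly than free DNA, so its
# anisotropy decays more slowly (hypothetical correlation times).
r_free = anisotropy_decay(5.0, r0=0.4, correlation_time_ns=2.0)
r_bound = anisotropy_decay(5.0, r0=0.4, correlation_time_ns=20.0)
print(r, round(r_free, 3), round(r_bound, 3))
```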

Time-resolved fluorescence spectroscopy can be used to analyze the
interaction between proteins and DNA. Fluorescence polarization anisotropy
decay can be used as a spectroscopic handle to examine the interaction
between several site-specific DNA-binding proteins and their target DNA
fragments. Solution conditions such as temperature, pH, ionic strength and
the presence of effector molecules can be varied and their effect on the
interaction studied [140].

A variety of DNA sequences can be tested, both for preliminary experiments
and for evaluating base sequence-dependent effects. The assay is reversible,
which allows manipulation of solution conditions so that the effects of
environment or effector molecules on complex formation can be assessed
directly. Also, the rotational correlation time reports directly on molecular
size and shape.

Double labelled native gel electrophoresis

and fluorescence-based imaging

The radiolabelled DNA gel mobility shift assay is modified to use a DNA probe
end-labelled with a Texas Red fluorophore and a DNA-binding protein tagged
with green fluorescent protein, so that the DNA-protein complex can be
monitored precisely by native gel electrophoresis [141]. Application of this
method to DNA-binding proteins has demonstrated that it is sensitive, permits
direct visualization of both the DNA probe and the DNA-binding protein, and
enables quantitative analysis of the DNA-protein complex and thereby an
estimation of the stoichiometry of protein-DNA binding [142].

Protein array method combining a near-infrared

fluorescence detection

The protein array methodology is used to study DNA-protein and
protein-protein interactions using probes labelled with near-infrared
fluorescent dyes (such as IRDye800) with excitation characteristics near 700
or 800 nm, which detect signals from proteins immobilized on a nitrocellulose
membrane with high sensitivity [143]. To study protein-DNA binding, the
membranes are incubated in a DNA-binding buffer containing poly-dGdC and
poly-dAdT or sonicated salmon DNA at room temperature for 30 min. Then, an
infrared dye-labelled DNA probe is added to the pre-binding solution and
incubated with slow rotation at room temperature or at 60 °C. The membranes
are washed with PBS containing 0.1% Tween and then scanned for fluorescent
signals with an infrared imaging system. To study protein-protein binding,
the membranes are incubated in PBS with BSA at room temperature and then with
the corresponding Cy5.5-labelled protein in PBS containing 1% BSA and 0.1%
Tween 20 at room temperature for 1 h [144].

The longer-wavelength fluorophores used in the protein array method provide a
high signal-to-noise ratio by decreasing the background from membrane
surfaces, thereby increasing the sensitivity of detection.

Fluorescence resonance energy transfer (FRET)

techniques

FRET is a non-radiative process whereby an excited donor fluorophore
transfers energy to a ground-state acceptor as a result of a coupling of
their transition dipoles. With dye-labelled nucleic acids and proteins and
sensitive optical detection, FRET provides structural and kinetic information
on protein-DNA interactions. FRET relies on site-specific labelling with a
donor and an acceptor dye, with one FRET dye on each interacting partner
(intermolecular FRET) or both on the same biomolecule (intramolecular FRET)
(Fig. 3a, b). Direct optical excitation of the donor dye results in fast
energy transfer to the FRET acceptor, which emits fluorescence at a longer
wavelength [145, 146].
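
The structural information that FRET provides comes from the steep distance dependence of the transfer efficiency, E = 1 / (1 + (r / R0)^6), where R0 is the Förster radius of the dye pair. The sketch below evaluates this standard relation and its inverse; the Förster radius of 5 nm is a hypothetical example value.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Transfer efficiency E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

def distance_from_efficiency(efficiency, forster_radius_nm):
    """Donor-acceptor distance r = R0 * (1 / E - 1)^(1/6)."""
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

r0 = 5.0  # hypothetical Forster radius in nm
for distance in (2.5, 5.0, 10.0):
    print(distance, round(fret_efficiency(distance, r0), 4))
```

The efficiency falls from nearly 1 to nearly 0 within a factor of four in distance around R0, which is why FRET is sensitive to protein-induced changes in DNA conformation.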

Intramolecular FRET assays, in which both dyes are located on the same biomolecule, are extensively used to monitor protein-induced conformational changes in the DNA substrate and to determine the global structure and assembly dynamics of a variety of nucleoprotein complexes.

A key advantage of the FRET technique is its continuous character: a reaction such as DNA cleavage can be monitored in real time from its initial steps, with no need for extensive sample handling [146].

FRET–FLIM in situ imaging for protein–DNA interactions in the cell nucleus

This approach uses FRET to image the in situ interaction between a GFP-fusion protein and DNA in the cell nucleus [147]. Fluorescence lifetime imaging microscopy (FLIM) serves as a reliable tool to detect protein in contact with DNA. A FRET-based method for visualizing DNA–protein interactions in situ requires a DNA-binding fluorescent dye that is a suitable FRET acceptor when GFP is the donor. The members of the Sytox fluorescent dye family have a high affinity for nucleic acids and are available with a broad range of excitation and emission spectra. Upon binding to DNA or RNA, they show a several hundred-fold enhancement of fluorescence intensity [148].

Fluorescence lifetime measurements can be performed by wide-field frequency-domain FLIM with an argon-ion laser as the excitation source. Images at different phases are recorded using the image intensifier, and the lifetime of the emitted fluorescence is then calculated from the phase and the modulation depth of the resulting set of images [149].
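The calculation of phase- and modulation-based lifetimes can be sketched with the textbook frequency-domain relations tau_phase = tan(phi) / omega and tau_mod = sqrt(1/M^2 - 1) / omega, where omega is the angular modulation frequency. These formulas assume sinusoidal modulation and are added here as an illustration; they are not taken from the protocol in [149]. The function name and the 80 MHz modulation frequency are hypothetical.

```python
import math


def lifetimes_from_phase_and_modulation(phase_rad: float,
                                        modulation: float,
                                        mod_freq_hz: float) -> tuple[float, float]:
    """Return (tau_phase, tau_mod) in seconds from a frequency-domain measurement."""
    if not 0.0 <= phase_rad < math.pi / 2:
        raise ValueError("phase shift must lie in [0, pi/2)")
    if not 0.0 < modulation <= 1.0:
        raise ValueError("relative modulation depth must lie in (0, 1]")
    if mod_freq_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    omega = 2.0 * math.pi * mod_freq_hz
    tau_phase = math.tan(phase_rad) / omega
    tau_mod = math.sqrt(1.0 / modulation ** 2 - 1.0) / omega
    return tau_phase, tau_mod


if __name__ == "__main__":
    # Synthetic single-exponential emitter (hypothetical numbers).
    freq_hz = 80e6
    true_tau = 2.5e-9
    x = 2.0 * math.pi * freq_hz * true_tau
    phase = math.atan(x)                     # expected phase shift
    modulation = 1.0 / math.sqrt(1.0 + x * x)  # expected demodulation
    tau_p, tau_m = lifetimes_from_phase_and_modulation(phase, modulation, freq_hz)
    print(f"tau_phase = {tau_p * 1e9:.3f} ns, tau_mod = {tau_m * 1e9:.3f} ns")
```

For a single-exponential decay the two estimates coincide; a difference between them is commonly read as a sign of a multi-component decay.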

FRET–FLIM in situ imaging for protein–DNA interactions in the cell nucleus is a reliable and quantitative method to measure FRET. It is a donor-selective FRET


method, which is not influenced by acceptor dye molecules

that are not involved in FRET.
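In a donor-selective measurement of this kind, the FRET efficiency is commonly estimated from the donor lifetime alone as E = 1 - tau_DA / tau_D, where tau_D is the donor lifetime without acceptor and tau_DA the lifetime with acceptor present. This relation is standard background rather than something stated above, and the function name and lifetime values in the sketch below are hypothetical.

```python
def fret_efficiency_from_lifetimes(tau_donor_ns: float,
                                   tau_donor_acceptor_ns: float) -> float:
    """FRET efficiency E = 1 - tau_DA / tau_D from donor lifetimes."""
    if tau_donor_ns <= 0 or tau_donor_acceptor_ns < 0:
        raise ValueError("lifetimes must be positive")
    if tau_donor_acceptor_ns > tau_donor_ns:
        raise ValueError("donor lifetime should not increase when FRET occurs")
    return 1.0 - tau_donor_acceptor_ns / tau_donor_ns


if __name__ == "__main__":
    # Hypothetical lifetimes: donor alone vs. donor in the presence of acceptor.
    print(f"E = {fret_efficiency_from_lifetimes(2.5, 2.0):.2f}")
```

Only the donor decay enters the calculation, which is why acceptor molecules that do not take part in FRET leave the result unaffected.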

Nuclear magnetic resonance

NMR is used to investigate the interactions of DNA with proteins. It provides dynamic and structural information on changes in conformation and molecular flexibility, and enables the formulation of mechanistic models of DNA–protein interactions [150]. Sample preparation requires isotope labelling, for which various strategies may be employed: either the protein is 15N- or 13C-labelled while the DNA is unlabelled, or vice versa. Care must be taken to avoid sample precipitation, because strong electrostatic interactions are involved within the complex [150].

The sample can be analyzed by chemical shift mapping, in which heteronuclear single quantum coherence (HSQC) spectra of the labelled molecule are recorded separately for the bound and free states. Chemical shifts are sensitive to changes in the chemical environment of the protein. Interaction of DNA with a protein alters this chemical environment, thereby causing shifts in the spectra relative to the unbound molecule [151]. Cross-saturation experiments can also be used to analyze DNA–protein and protein–protein interactions and the binding surfaces of ligands on the protein [152]. Another NMR technique is the solvent accessibility test, which allows quantitative analysis of the amide proton exchange rates of the free and the bound protein. However, intermolecular restraints in NMR spectroscopy, such as those from the nuclear Overhauser effect, residual dipolar couplings and paramagnetic relaxation enhancement, may limit the precision and accuracy of the technique, and various modifications have been made to overcome these limitations [150].
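Chemical shift mapping is often quantified per residue by combining the 1H and 15N shift changes between the free and bound HSQC spectra into a single perturbation value. The sketch below uses one widely used form, sqrt(dH^2 + (alpha * dN)^2); the weighting factor alpha = 0.14, the threshold, the residue labels and the peak positions are all assumptions made for illustration and are not taken from [151].

```python
import math


def combined_shift_perturbation(d_h_ppm: float, d_n_ppm: float,
                                n_scale: float = 0.14) -> float:
    """Combine 1H and 15N shift changes; n_scale is an assumed 15N weighting."""
    return math.sqrt(d_h_ppm ** 2 + (n_scale * d_n_ppm) ** 2)


def perturbed_residues(free: dict, bound: dict, threshold_ppm: float,
                       n_scale: float = 0.14) -> dict:
    """Return {residue: perturbation} for residues at or above the threshold.

    free and bound map a residue label to its (1H ppm, 15N ppm) peak position.
    Residues missing from the bound spectrum are skipped.
    """
    hits = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue
        h_bound, n_bound = bound[residue]
        csp = combined_shift_perturbation(h_bound - h_free, n_bound - n_free, n_scale)
        if csp >= threshold_ppm:
            hits[residue] = csp
    return hits


if __name__ == "__main__":
    # Hypothetical peak lists for two residues.
    free_peaks = {"K12": (8.20, 120.0), "G45": (8.50, 108.0)}
    bound_peaks = {"K12": (8.35, 121.0), "G45": (8.505, 108.02)}
    for residue, csp in perturbed_residues(free_peaks, bound_peaks, 0.05).items():
        print(f"{residue}: {csp:.3f} ppm")
```

Residues whose perturbation exceeds the chosen threshold are candidates for the DNA-binding surface, although shifts can also arise from indirect conformational changes.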

Circular dichroism

Circular dichroism (CD) is a quantitative technique that helps to identify DNA–protein and protein–protein interactions. It provides additional information about prosthetic groups, bound ligands and attached co-factors, and helps to identify conformational changes in protein molecules. Particular interactions give characteristic signatures based on the asymmetry induced by the secondary structure of proteins, thereby

Fig. 3 Biophysical techniques to study DNA–protein interactions


identifying the structure of bound protein and the possible

interactions involved [153,154].

There are many variations of CD, such as stopped-flow CD and CD using synchrotron radiation [155]. In the case of nucleic acids, induced CD measures the asymmetry among the bases. The bases themselves are planar, but some CD is induced by the sugars present in the backbone of the DNA. CD is a powerful technique for analyzing structural changes with respect to factors such as temperature, ionic strength and pH. It also helps in judging the extent of interaction between helices by analyzing the melting of peptides [156].

Circular dichroism is often considered preferable to techniques such as nuclear magnetic resonance (NMR) because it is faster and more economical, requires only a small amount of sample, and allows most of the sample to be recovered for further analysis.

The limitations of CD are that it gives relatively low-resolution structural detail and little information about the quaternary structure of the protein [157].

Atomic force microscopy (AFM)

AFM is another powerful tool for imaging DNA–protein complexes at the single-molecule level [158]. It allows the mechanisms of DNA–protein complex formation to be characterized at high resolution under different conditions. It quantitatively determines protein position along DNA molecules, DNA flexibility and curvature, and conformational changes after protein binding.

AFM is operated in tapping mode, which eliminates permanent shearing forces and causes less damage to the sample surface, even with stiffer probes [159]. Components of the sample that differ in adhesive and mechanical properties show a phase contrast and therefore allow compositional analysis.

The potential of this technique for high-sensitivity, high-throughput operation in fluid, and for force detection, is a major consideration for its continued integration into mainstream cellular and molecular analyses [160]. It uses very small quantities (10⁻⁹ to 10⁻¹⁵) of DNA and proteins.

The technique has limitations when applied to structural and functional studies of biomolecules, owing to the resolution-limiting motion of DNA molecules. To overcome this, the DNA must be tethered to the substrate surface. Because of its flatness, mica is the most commonly used substrate for DNA imaging [161, 162]. Large DNA molecules also remain difficult to image by AFM because of their tendency to aggregate. Lysetska et al. [163] describe a modified method that aligns long DNA fibres in a single direction on unmodified mica to facilitate AFM studies.

Surface plasmon resonance (SPR)

SPR is a label-free optical technology and an emerging alternative to conventional in vitro techniques for studying DNA–protein interactions. It uses an evanescent wave to detect changes in refractive index close to the sensor chip surface; these cause a shift in the plasmon resonance angle, which is detected by an imaging system.

The general principle behind SPR is total internal reflection, which occurs when polarized light travelling through a medium of higher refractive index meets the interface with a medium of lower refractive index at an angle above the critical angle. The electromagnetic field component then penetrates a short distance into the medium of lower refractive index as an evanescent wave, whose amplitude decays exponentially. If the interface is coated with a thin layer of gold, a beam projected at a given angle causes resonance coupling between the photons and the surface plasmons of the gold as their frequencies match. The binding of DNA to protein changes the refractive index within the range of the evanescent wave. Hence, biomolecular interactions can be measured in real time by following the refractive index changes that correspond to mass changes [164]. Many advances have been made in applying this technique to DNA–protein interactions. For example, a multistep chemical modification procedure has been proposed to create DNA arrays on gold surfaces specifically tailored for the study of protein–DNA interactions [165].

To study DNA–protein interactions, DNA is immobilized on the chip surface, followed by a constant flow of buffer over the surface (Fig. 3c). The protein analyte is allowed to bind to the immobilized DNA, and the resulting change in the position of the reflected light minimum is recorded in resonance units (RUs) to generate a sensorgram. A sensorgram is divided into four phases: association phase, steady-state or equilibrium phase, dissociation phase and regeneration phase (Fig. 3d) [166].
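The shape of the association, steady-state and dissociation phases can be illustrated with a simple 1:1 (Langmuir) binding model, in which the response rises as R(t) = Req (1 - exp(-(ka C + kd) t)) during analyte injection and decays as exp(-kd t) once the injection stops. The model, the function names and all rate constants below are assumptions chosen for illustration and are not taken from [166]; the regeneration phase is not modelled.

```python
import math


def equilibrium_response(ka: float, kd: float, conc: float, r_max: float) -> float:
    """Steady-state response (RU) for 1:1 binding at analyte concentration conc."""
    return r_max * ka * conc / (ka * conc + kd)


def simulate_sensorgram(ka: float, kd: float, conc: float, r_max: float,
                        t_assoc: float, t_total: float, dt: float = 1.0):
    """Return (times, responses) for an idealized 1:1 binding sensorgram.

    ka is in M^-1 s^-1, kd in s^-1, conc in M, r_max in RU, times in seconds.
    """
    if min(ka, kd, conc, r_max, dt) <= 0 or not 0 < t_assoc < t_total:
        raise ValueError("parameters must be positive and 0 < t_assoc < t_total")
    k_obs = ka * conc + kd
    r_eq = equilibrium_response(ka, kd, conc, r_max)
    r_at_switch = r_eq * (1.0 - math.exp(-k_obs * t_assoc))
    times, responses = [], []
    for i in range(int(round(t_total / dt)) + 1):
        t = i * dt
        if t <= t_assoc:
            # Association phase: analyte flows over the immobilized DNA.
            r = r_eq * (1.0 - math.exp(-k_obs * t))
        else:
            # Dissociation phase: buffer only, bound protein washes off.
            r = r_at_switch * math.exp(-kd * (t - t_assoc))
        times.append(t)
        responses.append(r)
    return times, responses


if __name__ == "__main__":
    # Hypothetical constants: KD = kd / ka = 10 nM, analyte at 100 nM.
    times, responses = simulate_sensorgram(1e5, 1e-3, 1e-7, 100.0, 600.0, 1200.0, dt=100.0)
    for t, r in zip(times, responses):
        print(f"{t:6.0f} s  {r:6.2f} RU")
```

Fitting measured sensorgrams to such a model is one way in which association and dissociation rate constants, and hence the equilibrium dissociation constant kd / ka, can be estimated.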

SPR offers a variety of advantages over other techniques. First, the interaction can be monitored very accurately in real time. Since the change in refractive index corresponds to a change in mass, this method can also yield data on the stoichiometry of complexes in addition to binding kinetics [164]. Second, multiple interaction partners can be analyzed simultaneously. Third, it is a label-free technology, and the optical radiation does not harm the biomolecules.

Microcalorimetry

Microcalorimetry is a non-invasive technique with considerable potential for studying biomolecules and their interactions. It is calorimetry performed on small


samples and relies on the same basic principle: measurement of the heat energy changes that occur during a physical or chemical process.

For studying protein–DNA interactions, the two most commonly used microcalorimetric techniques are differential scanning calorimetry (DSC) and isothermal titration calorimetry (ITC). DSC measures the heat capacity profile of proteins as a function of temperature, following processes such as protein unfolding and changes in thermal stability upon complex formation, by measuring the differential heat energy changes between sample and reference cells [167]. A pair of matched calorimetric cells (sample and reference), enclosed in an adiabatic chamber and fitted with a sensitive thermocouple, is used. Electronic or computer-controlled feedback circuits measure the differential temperature lag between the cells. ITC is used to study binding more directly, measuring not only the magnitude of the binding affinity but also the magnitudes of the two thermodynamic terms that define it: the enthalpy and entropy changes [168]. In a typical experiment, a solution of one biomolecule is titrated into a solution of its binding partner, and the heat released upon their interaction is monitored over time. The temperature dependence of the enthalpy of binding can be used to calculate the binding heat capacity [167].
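The relationship between the quantities obtained from ITC can be sketched with the standard thermodynamic identities ΔG = -RT ln Ka and ΔG = ΔH - TΔS, and the binding heat capacity can be estimated as the slope of ΔH against temperature. These identities are general background rather than a procedure described in [167, 168]; the function names and all numerical values below are hypothetical, and Ka is assumed to be expressed in M⁻¹ relative to a 1 M standard state.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def binding_thermodynamics(k_a: float, delta_h: float, temp_k: float = 298.15):
    """Return (delta_G in J/mol, delta_S in J/mol/K) from K_a and delta_H (J/mol)."""
    if k_a <= 0 or temp_k <= 0:
        raise ValueError("k_a and temp_k must be positive")
    delta_g = -GAS_CONSTANT * temp_k * math.log(k_a)
    delta_s = (delta_h - delta_g) / temp_k
    return delta_g, delta_s


def binding_heat_capacity(temps_k, delta_h_values) -> float:
    """Least-squares slope of delta_H versus temperature (J/mol/K)."""
    n = len(temps_k)
    if n < 2 or n != len(delta_h_values):
        raise ValueError("need at least two matching (T, delta_H) points")
    mean_t = sum(temps_k) / n
    mean_h = sum(delta_h_values) / n
    s_tt = sum((t - mean_t) ** 2 for t in temps_k)
    if s_tt == 0:
        raise ValueError("temperatures must not all be identical")
    s_th = sum((t - mean_t) * (h - mean_h) for t, h in zip(temps_k, delta_h_values))
    return s_th / s_tt


if __name__ == "__main__":
    # Hypothetical ITC result: K_a = 1e9 M^-1, delta_H = -40 kJ/mol at 25 degrees C.
    dg, ds = binding_thermodynamics(1e9, -40_000.0)
    print(f"delta_G = {dg / 1000:.1f} kJ/mol, -T*delta_S = {-298.15 * ds / 1000:.1f} kJ/mol")
    # Hypothetical enthalpies measured at three temperatures.
    dcp = binding_heat_capacity([288.15, 298.15, 308.15], [-35_000.0, -40_000.0, -45_000.0])
    print(f"delta_Cp = {dcp:.0f} J/mol/K")
```

Separating ΔG into its enthalpic and entropic parts in this way is what allows ITC to report more than the binding affinity alone.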

Since microcalorimetry is not constrained by the size and shape of the molecule and does not require any chemical modification or solid support, it has become an invaluable resource in laboratories [169]. Its high sensitivity, its ability to determine true binding affinities by measuring heat changes, and its capacity to measure nanomolar to picomolar binding constants (10⁹ to 10¹² M⁻¹) using the competitive binding technique also make it a promising technique in molecular biology.

Although ITC is particularly suitable to follow the

energetics of an association reaction between biomole-

cules, the combination of ITC and DSC provides a more

comprehensive description of the thermodynamics of an

associating system [170].

Conclusion

DNA–protein interactions are an integral component of biological systems, and their study is important for almost all biological processes. Several techniques are available to determine these interactions, and understanding them is imperative. At the in vitro level, molecular biology-based techniques such as footprinting assays, EMSA, southwestern blotting, Y1H, phage display and proximity ligation assay (PLA) screen DNA–protein interactions reliably. The highly dynamic in vivo tools of chromatin immunoprecipitation and its variants, DNA adenine methyl transferase identification (DamID) and the ChIP-loop assay are robust techniques to characterize several DNA–protein interactions in cells. In silico approaches have also evolved drastically over the years to supplement the information available to researchers. Various recent biophysical techniques, including fluorescence-based techniques, CD, NMR, AFM, SPR and microcalorimetry, have great potential for the detection of protein-based interactions.

Every technique is unique and serves a distinct purpose. Even so, the current state of methods leaves quite a lot to be desired. An ideal method would require minimal cell numbers, detect rare interactions with high specificity and sensitivity, be easily modified to quantify interactions, and provide complete information on either the protein or the DNA by itself. Nevertheless, the techniques listed above will help researchers to assess the dynamics of DNA–protein interactions in cellular development and disease progression.

Acknowledgments This study was supported by the research grant awarded to Dr. Vibha Rani by the Department of Science and Technology, Government of India (SR/FT/LS-006/2009: Sept 4, 2009). We acknowledge Jaypee Institute of Information Technology, Deemed to be University, for providing the infrastructural support.

References

1. Bulyk ML, Gentalen E, Lockhart DJ, Church GM (1999) Quantifying DNA–protein interactions by double-stranded DNA arrays. Nat Biotechnol 17:573–577

2. Bulyk ML (2006) DNA microarray technologies for measuring protein–DNA interactions. Curr Opin Biotechnol 17:422–430

3. Fox KR, Waring MJ (1987) The use of micrococcal nuclease as a probe for drug-binding sites on DNA. Biochim Biophys Acta 909:145–155

4. Dyke MWV, Dervan PB (1982) Footprinting with MPE.Fe(II) complementary-strand analyses of distamycin-binding and actinomycin-binding sites on heterogeneous DNA. Cold Spring Harb Symp Quant Biol 47:347–353

5. Dyke MWV, Dervan PB (1983) Methidiumpropyl-EDTA-Fe(II) and DNase I footprinting report different small molecule binding site sizes on DNA. Nucleic Acids Res 11:5555–5567

6. Spassky A, Sigman DS (1985) Nuclease activity of 1,10-phenanthroline-copper ion. Conformational analysis and footprinting of the lac operon. Biochemistry 24:8050–8056

7. Nielsen PE, Hiort C, Sonnichsen SH, Buchardt O, Dahl O, Norden B (1992) DNA binding and photocleavage by uranyl(VI) (UO₂²⁺) salts. J Am Chem Soc 114:4967–4975

8. Nielsen PE (1992) Uranyl photofootprinting of triple helical DNA. Nucleic Acids Res 20:2735–2739

9. Churchill MEA, Hayes JJ, Tullius TD (1990) Detection of drug binding to DNA by hydroxyl radical footprinting. Relationship of distamycin binding sites to DNA structure and positioned nucleosomes on 5S RNA genes of Xenopus. Biochemistry 29:6043–6050

10. Cons BMG, Fox KR (1989) High resolution hydroxy radical footprinting of the binding of mithramycin and related antibiotics to DNA. Nucleic Acids Res 17:5447–5460


11. Jain SS, Tullius TD (2008) Footprinting protein–DNA complexes using the hydroxyl radical. Nat Protocols 3:1092–1100

12. Shafer GE, Price MA, Tullius TD (1989) Use of the hydroxyl radical and gel electrophoresis to study DNA structure. Electrophoresis 10:397–404

13. Price MA, Tullius TD (1992) Using hydroxyl radical to probe DNA structure. In: David MJ, Lilley JED (eds) DNA structures part b: chemical and electrophoretic analysis of DNA, 11th edn. Academic Press, San Diego, pp 194–219

14. Routier S, Vezin H, Lamour E, Bernier JL, Catteau JP, Bailly C (1999) DNA cleavage by hydroxy-salicylidene-ethylendiamine-iron complexes. Nucleic Acids Res 27:4160–4166

15. Nielsen PE (1990) Chemical and photochemical probing of DNA complexes. J Mol Recognit 3:1–25

16. Bailly C, Waring MJ (1995) Comparison of different footprinting methodologies for detecting binding sites for a small ligand on DNA. J Biomol Struct Dyn 12:869–898

17. Drew HR (1984) Structural specificities of five commonly used DNA nucleases. J Mol Biol 176:535–557

18. Fox KR, Waring MJ (2001) High-resolution footprinting studies of drug-DNA complexes using chemical and enzymatic probes. In: Chaires JB (ed) Drug-nucleic acid interactions. Academic Press, San Diego, pp 412–430

19. Galas DJ, Schmitz A (1978) DNAse footprinting: a simple method for the detection of protein-DNA binding specificity. Nucleic Acids Res 5:3157–3170

20. Leblanc B, Moss T (2000) DNAse I footprinting. In: Rapley R (ed) The nucleic acid protocols handbook, 8th edn. Humana Press, Totowa, NJ, pp 729–735

21. Fox KR (2010) DNAse I footprinting. In: Fox KR (ed) Drug–DNA interaction protocols: methods in molecular biology. Humana Press, Totowa, NJ, pp 153–172

22. Bailly C, Kluza J, Martin C, Ellis T, Waring MJ (2005) DNase I

footprinting of small molecule binding sites on DNA. In: Walker

JM, Herdewijn
