
In search of partners: linking extracellular proteases to substrates


Without proteolysis, the cell surface and extracellular matrix would be a rather static environment. Proteases have the unique ability to hydrolyse peptide bonds, irreversibly modifying the function of a substrate protein. By the specific processing (rather than the complete degradation) of substrates, proteases modify signalling circuits and cell function. Proteins can be activated, inactivated or can undergo other changes in their function upon proteolysis. For example, it has long been recognized that processing is crucial to the selectivity of the clotting cascade1,2 and to trypsinogen activation3. More recently, the proteolytic shedding of ectodomains has emerged as a crucial regulator of growth factors and cytokines such as transforming growth factor-α (TGFα) or tumour necrosis factor-α (TNFα)4–6, and by processing chemokines, proteases orchestrate leukocyte trafficking and inflammatory responses7,8. Equally important is that proteolysis can expose cryptic epitopes, such as in laminin following matrix metalloproteinase (MMP) cleavage to enhance cell migration9. Furthermore, proteolysis can release cryptic neoproteins such as angiostatin10 and endostatin11, which are released from plasminogen and collagen type XVIII, respectively, by MMP cleavage. By precisely controlling cell function, proteases are essential components of signalling pathways, and therefore many are themselves classifiable as signalling molecules12. Nearly 2% of mammalian genes encode proteases (FIG. 1) and proteases represent between 5 and 10% of drug targets13,14, reflecting their vital roles in vivo. Therefore, there is considerable interest in understanding the role of proteases in development, in healthy adults and in pathology, with the aim of developing new strategies to block dysregulated proteolysis or to enhance its beneficial functions to treat disease.

To understand the biological roles of individual proteases, it is essential to elucidate their substrate repertoire, or substrate degradome15. Frequently, however, more than one protease can process a given substrate in vitro. This raises questions about which proteases are relevant in vivo: “just because it can, does not mean it does”11. For example, several proteases can process TNFα4,5,16–19, a pro-inflammatory cytokine with a crucial role in rheumatoid arthritis, and the chemokine (CXC motif) ligand-12 (CXCL12, or stromal cell-derived factor-1α (SDF1α)), which has a neurotoxic role in dementia20. Yet, in both cases, only one or a few of the candidate proteases seem pertinent to the disease, highlighting the importance of correctly identifying the relevant in vivo protease(s) for validation as a drug target. Once a relevant protease is identified, new questions emerge, such as: what are its other substrates and functions, and could these limit the protease’s potential as a drug target or even relegate it as an anti-target13?

The goal of this review is to discuss the advantages and limitations of different strategies that can be used to link extracellular proteases to their substrates. The results obtained from biochemical and proteomics, or degradomics, studies can be extended and verified in cell cultures and in whole animals, and vice versa. We focus mainly on metalloproteinases, although the basic concepts are generally applicable to other extracellular protease–substrate pairs. In particular, we discuss how these approaches can help us to address the two most commonly encountered challenges in protease research: identifying the enzyme responsible for cleaving a particular substrate in vivo, and searching for the substrates and biological functions of a protease.

Biochemical approaches

Establishing the biochemical properties of a protease is a crucial first step in its analysis, as this provides essential tools (BOX 1, TABLE 1) for subsequent in vivo characterization and generates its molecular fingerprint (BOX 2).

*The UBC Centre for Blood Research, CBCRA Program in Breast Cancer Metastasis, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, University of British Columbia, 4.401 Life Sciences Center, Vancouver, British Columbia, V6T 1Z3, Canada.
‡Arthritis and Tissue Degeneration Program, Caspary Research Building, Room 426, Hospital for Special Surgery, 535 East 70th Street, New York, New York 10021, USA.
Correspondence to C.P.B. e-mail: [email protected]
doi:10.1038/nrm2120
Published online 14 February 2007

Extracellular matrix
A complex extracellular network of structural proteins, including collagens, glycoproteins and proteoglycans, that supports cell adhesion and migration, and that transmits information through interactions with cell receptors.

Processing
Proteolysis that is distinct from degradation in that it represents a highly specific and efficient, yet limited, activity. Nonetheless, cleaving a protein at only one or two sites can result in a specific change of protein function.

Christopher M. Overall* and Carl P. Blobel‡

Abstract | Proteases function as molecular switches in signalling circuits at the cell surface and in the extracellular milieu. In light of the many proteases that are encoded by the genome, and the even larger number of bioactive substrates, it is crucial to identify which proteases cleave a particular substrate and which substrates individual proteases cleave. Elucidating the substrate degradomes of proteases will help us to understand the function of proteases in development and disease and to validate proteases as drug targets.

R E V I E W S

NATURE REVIEWS | MOLECULAR CELL BIOLOGY VOLUME 8 | MARCH 2007 | 245

© 2007 Nature Publishing Group

Figure 1 labels | Number of proteases by catalytic class and by compartment.

| | Human | Murine |
|---|---|---|
| Threonine | 28 | 26 |
| Aspartate | 21 | 27 |
| Cysteine | 148 | 162 |
| Serine | 175 | 224 |
| Metallo | 194 | 205 |
| Extracellular | 273 | 341 |
| Intracellular | 277 | 287 |
| Intramembrane | 16 | 16 |
| Total proteases | 566 | 644 |

Ectodomain
The extracellular portion of a plasma-membrane protein. In secretory vesicles, the topologically equivalent compartment is the lumen.

Chemokines
A large group of cytokines that elicit chemotactic responses from leukocytes and some other cells that express specific G-protein-coupled chemokine receptors.

Matrix metalloproteinases (MMPs)
A family of 23 metzincin proteases in humans capable of degrading extracellular matrix proteins and of processing many bioactive molecules.

Neoprotein
A protein with a new function generated from a protease-cleavage product that is functionally different from the parent protein.

Degradome
The complete set of proteases that are expressed at a specific time by a cell, tissue or an organism. The degradome of a protease is its substrate repertoire.

A purified active protease that is devoid of contaminating activities is necessary for determining its consensus cleavage site, for assay development and for the in vitro determination of its enzyme kinetics and inhibitor profile. Proteases have often been purified as a substrate-cleaving activity from a biological sample1–5. Once the protease has been sequenced and identified, recombinant protease is required to confirm that the substrate-cleavage activity was not due to a contaminant protease in the sample and to screen for other substrates. However, contaminating proteases can also occur in recombinant systems, so purifying a catalytically inactive mutant enzyme in parallel can rule out the co-purification of contaminating activities.

Cleavage-site identification. For biochemical characterization and molecular fingerprinting of a protease, it is essential to identify a peptide sequence that the protease cleaves21. In the case of orphan enzymes, for which no substrates are known, identifying any substrate peptide or protein can be a hurdle. Purified orphan proteases are usually first incubated with proteins that are readily cleaved by many proteases, including the insulin B-chain, myelin basic protein, gelatin, casein and α2-macroglobulin22–24. Libraries of cleavage-site-containing peptides designed from candidate substrates, such as those that are shed from the cell surface, can also be screened. Candidate protein or peptide substrates can be analysed for cleavage using SDS–PAGE, zymography, mass spectrometry (MS) or high-performance liquid chromatography22,24. Determining the substrate turnover rate (kcat/Km) of potential substrates can indicate the likelihood of a protease cleaving a substrate in vivo5,7,12. Substrates that are identified by biochemical assays in vitro can therefore point towards potential in vivo substrates. However, an in vitro substrate is not always cleaved in vivo, so each candidate must be verified in cells and animal models.
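As a rough illustration of how kcat/Km values might be used to rank candidate substrates, the following Python sketch computes the ratio for a few entries and sorts them. The substrate names and kinetic constants are hypothetical placeholders chosen only to show the calculation; they are not measurements from the studies cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten constants for one protease-substrate pair."""
    substrate: str
    kcat_per_s: float  # turnover number, s^-1
    km_molar: float    # Michaelis constant, M

    @property
    def efficiency(self) -> float:
        """Return kcat/Km in M^-1 s^-1."""
        return self.kcat_per_s / self.km_molar


def rank_by_efficiency(pairs):
    """Sort candidate substrates from highest to lowest kcat/Km."""
    return sorted(pairs, key=lambda p: p.efficiency, reverse=True)


# Hypothetical values, chosen only to illustrate the calculation.
candidates = [
    Kinetics("peptide_A", kcat_per_s=2.0, km_molar=1.0e-4),
    Kinetics("peptide_B", kcat_per_s=0.5, km_molar=5.0e-6),
    Kinetics("peptide_C", kcat_per_s=10.0, km_molar=2.0e-3),
]

ranked = rank_by_efficiency(candidates)
for pair in ranked:
    print(f"{pair.substrate}: kcat/Km = {pair.efficiency:.3g} M^-1 s^-1")
```

In this made-up set, the substrate with the lowest kcat ranks first because its Km is much lower, which is why the ratio rather than either constant alone is compared. A high ranking in vitro remains only a pointer towards a possible in vivo substrate.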

Many approaches can be used to identify consensus cleavage sites, such as filamentous phage display, bacterial, retroviral or yeast α-halo libraries, random peptide libraries21,25–28 or Ala-scanning mutagenesis of a cleavage site29. These techniques screen for short substrate-recognition and cleavage-site consensus sequences, in part to limit the number of possible sequence combinations30. Positional libraries, in which some residues are fixed but adjacent sites are randomized, allow the screening of longer sequences. Such soluble peptide libraries31 have been improved by fixing them to glass slides32 and in arrays33,34. Alternatively, random peptide sequences can be combined with chemical groups that target the active site to form synthetic protease inhibitors. These can be incorporated with detection moieties such as biotin or fluorescent groups, so forming activity-based probes that can report specificity profiles while simultaneously generating selective inhibitors35. A caveat is that most peptide-library approaches only probe the non-prime residues, and only a few techniques allow the evaluation of the prime residues of a cleavage site36,37. Although phage display provides information on sequences that can be cleaved, it does not directly identify the cleavage site. Other limitations arise with some proteases such as ADAM proteases and MMPs, which do not have stringent consensus cleavage sites38,39. Other proteases such as ADAMTS4 (ADAM with thrombospondin motif-4) and collagenolytic MMPs require an exosite-recognition sequence40 to cleave aggrecan41 and native triple helical collagen40,42, respectively.
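The combinatorial reason for screening short sequences, and for fixing some positions in a positional library, can be sketched in Python as follows. The template `PLGXXA` and the helper names are hypothetical and serve only to show how fixing residues shrinks the sequence space; they do not describe any particular published library.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def library_size(length: int) -> int:
    """Number of distinct peptides in a fully randomized library."""
    return len(AMINO_ACIDS) ** length


def positional_library(template: str, wildcard: str = "X"):
    """Yield every peptide matching a template of fixed and randomized positions."""
    choices = [AMINO_ACIDS if ch == wildcard else ch for ch in template]
    for combo in product(*choices):
        yield "".join(combo)


# A fully randomized library grows as 20**n with peptide length n.
for n in (4, 6, 8):
    print(n, library_size(n))

# Hypothetical 6-mer template: four fixed residues, two randomized positions.
peptides = list(positional_library("PLGXXA"))
print(len(peptides), peptides[0], peptides[-1])
```

With two randomized positions the template expands to 400 peptides, compared with 64 million for a fully randomized hexamer, which is what makes longer sequences tractable in a positional format.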

Figure 1 | Human and murine protease degradomes. a | The repertoire or degradome of human and murine proteases, their class distribution (threonine, aspartate, cysteine, serine or metalloproteinase) and their partitioning (to the extracellular, intramembrane and cytoplasmic compartments). The 16 intramembrane proteases are found in the membranes on the cell surface, endoplasmic reticulum and mitochondria and are responsible for regulated intramembrane proteolysis.
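The counts shown in Figure 1 can be cross-checked with a short Python sketch. The numbers below are transcribed from the figure labels; assigning each count to human or murine is an assumption based on which combination adds up to the stated totals of 566 and 644, and the dictionary layout and function names are illustrative only.

```python
# Counts transcribed from the Figure 1 labels (proteases per catalytic class
# and per compartment); the species assignment follows the stated totals.
FIGURE_1_COUNTS = {
    "human": {
        "classes": {"threonine": 28, "aspartate": 21, "cysteine": 148,
                    "serine": 175, "metallo": 194},
        "compartments": {"extracellular": 273, "intracellular": 277,
                         "intramembrane": 16},
        "total": 566,
    },
    "murine": {
        "classes": {"threonine": 26, "aspartate": 27, "cysteine": 162,
                    "serine": 224, "metallo": 205},
        "compartments": {"extracellular": 341, "intracellular": 287,
                         "intramembrane": 16},
        "total": 644,
    },
}


def is_consistent(entry: dict) -> bool:
    """Check that both partitions add up to the reported total."""
    return (sum(entry["classes"].values()) == entry["total"]
            and sum(entry["compartments"].values()) == entry["total"])


def extracellular_fraction(entry: dict) -> float:
    """Fraction of the degradome with an extracellular catalytic domain."""
    return entry["compartments"]["extracellular"] / entry["total"]


for species, entry in FIGURE_1_COUNTS.items():
    print(species, is_consistent(entry), f"{extracellular_fraction(entry):.1%}")
```

Both partitions sum to the reported totals under this assignment, and roughly half of each degradome falls in the extracellular compartment.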

Box 1 | Toolbox for biochemical characterization of proteases

Characteristic biochemical properties of proteases include their inhibitor profile (TABLE 1), optimum pH, cleavage-site preference, cleavage kinetics and exosites. For multidomain proteases, the catalytic domain alone can often be used for assay development and inhibitor screening, although the full-length form is preferable, as it preserves substrate-binding exosite interactions40 and might facilitate protein folding and pro-domain removal from the zymogen form of the protease113. Truncated domains can be used to identify exosites as well as regions that stabilize inhibitor binding39,40. For cell-surface proteases, deleting the membrane-anchor is necessary to obtain a soluble protease for assays22,24. Creating a catalytically inactive mutant protease (for example HEXXH to HAXXH in metalloproteinases39,114) is an essential control to validate substrate cleavage by the active protease in a cellular context where other co-purifying proteases, cofactors and substrates can confound data interpretation22,24. Pure protease can be used to generate antibodies for ELISA (enzyme-linked-immunosorbent assay) and immunodepletion techniques, as well as for histochemistry and western blots, and these antibodies might also block the protease’s function115. Anti-neoepitope antibodies are a particularly powerful means to specifically recognize the new amino or carboxyl groups of cleavage products in complex biological samples58,116, particularly for low-abundance bioactive molecules7. Similarly, anti-neoepitope antibodies to the N terminus of an activated protease that is secreted as a zymogen are a sophisticated tool for linking levels of active enzyme with substrate processing in complex samples13. After determining a consensus cleavage site, substrate peptides can be synthesized with quenched fluorophores, which become dequenched following cleavage, providing a fluorometric readout for catalytic activity. These in turn can be used for biochemical characterization of the protease.


Table 1 | Some class-specific chemical inhibitors of proteases

| Inhibitor | Concentration | General comments |
|---|---|---|
| Metalloproteinase | | |
| EDTA* | 1 mM | Releases cells from substratum in culture, and so can confound cell-based assays |
| 1,10-phenanthroline* | 1–10 mM | Toxic to cells |
| Hydroxamate inhibitors (such as GM6001, BB94 and TAPI-2) | | Well tolerated by cells |
| Phosphoramidon | 100 μg per ml | |
| Bestatin | 40 μM | Inhibits aminopeptidase metalloproteinases |
| d-cysteine | 1 mM | |
| d-penicillamine | 1 mM | (2S)-2-amino-3-methyl-3-sulfanyl-butanoic acid is a new class of inhibitor, termed a catalytic inhibitor, that releases the Zn2+ ion from the enzyme generating the apoenzyme. Other metalloproteinase inhibitors form a stable ternary complex with the enzyme and the active-site zinc, thereby inactivating the enzyme |
| Serine | | |
| PMSF* | 1 mM | |
| DFP* | 0.1 mM | Di-isopropylfluorophosphate (DFP) is potent but highly toxic |
| AEBSF* | 1 mM | |
| Aprotinin | 1 μM | |
| DCI | 5–100 μM | 3,4-dichloroisocoumarin (DCI) is a class-specific reversible inhibitor |
| TLCK | 50 μg per ml | Tosyl lysyl chloromethyl ketone (TLCK) inhibits trypsin-like proteases and some cysteine proteases |
| TPCK | 100 μg per ml | Tosyl phenylalanyl chloromethyl ketone (TPCK) inhibits chymotrypsin, many proteases with a P1 phenylalanine specificity and some cysteine proteases (such as papain, bromelain and ficin) |
| APMSF | 50 μM | (4-amidino-phenyl)-methane-sulfonyl fluoride (APMSF) inhibits trypsin-like serine proteases |
| Benzamidine | 1 mM | Inhibitor of trypsin-like serine proteases |
| Chymostatin | 10–100 μM | Inhibitor of chymotrypsin-like serine proteases |
| Elastatinal | 10–100 μM | Specific for elastases with no inhibition of trypsin or chymotrypsin |
| Antipain | 1–100 μM | Inhibits papain, trypsin and plasmin |
| Leupeptin | 10 μM | Inhibits trypsin-like serine and some cysteine proteases |
| 6-aminocaproic acid | 1 mM | |
| Threonine | | |
| Epoxomicin* | 1 μM | |
| Lactacystin* | 2–10 μM | |
| Bortezomib (Velcade) | | Used for treatment of multiple myeloma |
| Cysteine | | |
| E64* | 10 μM | l-trans-epoxysuccinyl-leucylamide-(4-guanido)-butane or N-(N-(l-trans-carboxyoxiran-2-carbonyl)-l-leucyl)-agmatine will also inhibit trypsin |
| Organomercurials (APMA and PCMB)* | 1 mM | Aminophenylmercuric acetate (APMA) and parachloromercuric benzoate (PCMB) inhibit cysteine proteases but also activate MMPs owing to reactivity with the propeptide cysteine, which coordinates with the catalytic zinc ion in the latent enzyme. Disrupting this interaction allows for the autoactivation of the MMPs to then occur in vitro |
| N-ethylmaleimide* | 1 mM | |
| Chymostatin | 10–100 μM | |
| Leupeptin | 10 μM | Also inhibits some trypsin-like serine proteases |
| TLCK | 50 μg per ml | TLCK inhibits trypsin-like proteases and some cysteine proteases |
| TPCK | 100 μg per ml | TPCK inhibits chymotrypsin, many proteases with a P1 phenylalanine specificity and some cysteine proteases (such as papain, bromelain and ficin) |
| Aspartate | | |
| Pepstatin A* | 1–5 μM | |

* Marks inhibitors that can block every protease in their respective class. AEBSF, 4-(2-aminoethyl) benzene sulfonylfluoride hydrochloride; MMP, matrix metalloproteinase; PMSF, phenylmethane sulfonylfluoride.


Anti-target
A molecule with essential roles in normal cell and tissue function, or life; the down modulation of an anti-target results in clinically unacceptable side effects, initiation of disease or deleteriously alters disease progression.

Extracellular protease
An enzyme that has a catalytic domain in the extracellular compartment or in the lumen of a secretory compartment, both of which are topologically equivalent.

Proteomics
Investigations and techniques for elucidating the proteome.

Degradomics
All genomics, proteomics and systems biology investigations and techniques regarding the genetic, structural and functional identification and characterization of the proteases, inactive homologues, protease substrates and protease inhibitors that are present in an organism.

The consensus cleavage site for a particular protease can be used to search databases for candidate substrates, but only a few natural substrates have been discovered using bioinformatics15,37,43. The number of potential sites in databases far exceeds the number of those that are actually cleaved, mainly because only a fraction of the candidate sites are accessible in the folded protein. In collagen, for example, despite the presence of numerous Gly-Leu and Gly-Ile bonds that could be hydrolysed by MMP1, only those at positions 775-776 on the three α-chains are cleaved in the native triple helical form40,42. Database searches will also not identify substrates that are cleaved at sites that are not kinetically optimal44. The slow kinetics of such cleavage events can afford control of biologically important processes. For example, the cleavage of native collagen by MMPs is slow, but this might be useful in maintaining tissue integrity during homeostatic remodelling. Exosite contributions that drive catalysis of poorly cleaved bonds7,40,45 as well as the colocalization and co-expression of a protease and substrate in vivo are also not considered by current database searches. So, predicting substrates from consensus sites is not yet a reliable approach, but programmes such as PoPS (see Prediction of Protease Specificity in Further information) are improving coverage and hit rates43.
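A minimal Python sketch shows why a sequence-only search over-predicts cleavage sites. It scans for the Gly-Leu and Gly-Ile bonds mentioned above using a regular expression; the fragment is a hypothetical collagen-like sequence rather than a database entry, and the function name is illustrative. This is a plain motif scan and does not reproduce how PoPS or any other prediction programme works.

```python
import re

# Scissile bonds named in the text for MMP1: Gly-Leu and Gly-Ile.
# The lookahead lets every glycine be tested without consuming the next residue.
SCISSILE_BOND = re.compile(r"G(?=[LI])")


def candidate_sites(sequence: str):
    """Return 1-based (P1, P1') positions of every Gly-Leu or Gly-Ile bond."""
    return [(m.start() + 1, m.start() + 2) for m in SCISSILE_BOND.finditer(sequence)]


# Hypothetical collagen-like fragment, not a database sequence.
fragment = "GPQGIAGQRGLVGLPGERGIT"
sites = candidate_sites(fragment)
print(f"{len(sites)} candidate bonds: {sites}")
```

Even this short fragment yields several matching bonds. Deciding which of them is actually cleaved requires information that the sequence alone does not carry, such as accessibility in the folded protein, exosite interactions and co-expression of protease and substrate.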

Exosite scanning and inactive-catalytic-domain capture. An improvement over serial biochemical techniques for substrate discovery is the use of unbiased genetic screens that rely on yeast two-hybrid analyses or immobilized inactive protease domains to capture interacting proteins that might be substrates. For multidomain proteases, exosite interactions can turn substrates with low kcat/Km ratios in vitro into good substrates in vivo, as has been shown for the MMP-mediated cleavage of chemokines45 and collagen40,42, and for the PAPP-A (pregnancy-associated plasma protein-A) cleavage of insulin-like growth-factor-binding protein-4 (IGFBP4) and IGFBP5 (REF. 46). Such interactions can be discovered in an unbiased manner by exosite-scanning approaches45, in which isolated exosite domains function as bait to identify novel interactor proteins that, following their biochemical validation, might be confirmed as substrates. Although first used in yeast two-hybrid screens7,45,47, high-throughput exosite scanning now involves solid-phase immobilization of the protease and tandem MS (MS/MS) identification of the proteins that are captured from complex samples15. Inactive-catalytic-domain capture (ICDC) uses a catalytically inactive mutant-protease domain as a bait to trap proteins that bind to the active site15. Using this method, the WNT1-inducible signalling-pathway protein-2 (WISP2) was identified as a novel substrate of membrane-type matrix metalloproteinase-1 (MT1-MMP) in a yeast two-hybrid screen48. However, not all protease families are amenable to ICDC owing to high Km values for substrates and short or featureless active-site clefts.

Cell-biological approaches

Cell-based assays are not only essential to validate the results of biochemical, proteomics and animal studies, but they are also powerful primary tools for identifying protease–substrate pairs in their native forms and in the physiological environment of intact cells. For example, in vitro, fibronectin is a substrate for MT1-MMP19 and TNFα can be cleaved by ADAM9 (REF. 17), but neither fibronectin nor TNFα is a substrate of these proteases in the cellular context. Post-translational regulation and subcellular localization49 of the protease and of the substrate can also constrain their possible interactions in cells. We will first discuss how cell-based assays can provide insights into the inhibitors and activators of a protease, and then how loss- and gain-of-function experiments can be used to link enzymes and substrates and to characterize the properties of individual enzymes in cells.

Box 2 | Protease fingerprinting

A key challenge in protease research is defining selective tools to monitor the activity of individual proteases. It is not uncommon, for example, that a single type of inhibitor, such as small interfering (si)RNA, is used to draw conclusions about the activity of an enzyme, despite the potential for off-target effects. Knocking out a single gene can also affect expression of other genes, which can sometimes mask or compensate for the knockout effect, but sometimes also potentiate it. So, to facilitate identifying physiologically relevant enzyme–substrate pairs, it is invaluable to establish the molecular fingerprint of each protease. The fingerprint is defined primarily as the response to inhibitors and activators of proteolysis, and combined with other information can provide strong clues to the identity of the protease. Inhibitors include small molecules (hydroxamates for metalloproteinases13,14,117), blocking antibodies115, natural inhibitors110 (including tissue inhibitors of metalloproteinases (TIMP1–4)118), pro-domains119, exosite domains7,119,120 and aptamers121. The inhibitor profile of an enzyme in cell-based assays and biochemical assays can differ, so both must be determined separately. Importantly, the selectivity of any inhibitor in cells must be determined in cell-based assays that provide a highly specific readout for an enzyme’s activity, such as the processing of a substrate that is abolished in cells from mice that lack this protease6,54. Some chemical inhibitors are cytotoxic, so reduced substrate cleavage that is actually caused by cell death might be misinterpreted as inhibition. Additional fingerprinting tools in cells are siRNA122, stable transfections of short hairpin RNA interference (shRNAi) constructs against a protease or protease inhibitor13, dominant-negative constructs15,104, and determining the response of a protease to cellular activators such as phorbol esters and Ca2+ ionophores55. Last, expression analysis by transcript analysis and proteomics in mice or in human disease helps determine whether proteases and substrates are co-expressed in the relevant cells or tissues.

Zymography
A method for determining protease substrates. Substrates are separated by non-denaturing gel electrophoresis and are incubated with a protease. Negative staining of the stained substrate gel reveals enzymatic activity because the protease has degraded the substrate in the gel.

Activity-based probe
Mechanism-based inhibitor that has been modified by incorporating detection moieties, such as fluorophores, biotin and radioactive elements, to specifically target and visualize individual proteases or a family of proteases in complex samples.

Non-prime residue
A residue in the substrate that is N-terminal to the proteolytic cleavage site is called a non-prime (P) residue, and in some proteases forms part of the recognition motif for substrate cleavage.

Prime residue
A residue in the substrate that is C-terminal of the proteolytic cleavage site is called a prime (P′) residue and in some proteases forms part of the recognition motif for substrate cleavage.

ADAM proteases
A disintegrin and metalloprotease (ADAM) proteases are multifunctional membrane proteins with crucial roles in ectodomain shedding of other membrane proteins, such as the ligands of the epidermal-growth-factor receptor.

Tissue inhibitors of metalloproteinases (TIMPs)
A family of four specific inhibitors of matrix metalloproteinases and some ADAM proteases that are expressed by most mammalian cells.

Aptamers
Aptamers are chemically synthesized (usually short) strands of oligonucleotides (DNA or RNA) that can adopt highly specific three-dimensional conformations. Aptamers are designed to have appropriate binding affinities and specificities towards certain target molecules.

Inhibitor and activation profile. Informative criteria for linking a protease to a substrate in cellular systems are the protease’s response to different activators and inhibitors of proteolysis, which we refer to as its fingerprint (BOX 2 outlines what constitutes an enzyme’s fingerprint in different systems, including biochemical assays and animal models). For illustrative purposes, if we assume that one protease is responsible for an activity in cells, then this activity will have the same fingerprint as the enzyme. If the targeted deletion of this enzyme leads to compensation by another enzyme, as defined by the increased expression or post-translational activation of another enzyme, the fingerprint of the activity will have changed in the knockout compared with the wild-type cells, and will now correspond to that of the compensating enzyme (BOXES 2,3). In the case of functional redundancy, there might be two or more enzymes that are each fully capable of carrying out the same task. In this case, the fingerprint of the activity is the combined fingerprint of the redundant enzymes in proportion to their relative contributions to the activity. If all redundant enzymes are blocked and there is no compensation, then the activity should be absent. However, if the fingerprint remains unchanged in a knockout cell line, then a different candidate enzyme was more relevant in the first place.

It is important to note that interclass and interfamily protease cascades exist in the protease web13 (BOX 3), so completely different classes of inhibitor might block the processing of a substrate. For example, substrates of the furin-activated ADAM proteases22,50 and MMPs39,51 might show reduced substrate cleavage in the presence of inhibitors of the serine hydrolase furin as well as in the presence of inhibitors of metalloproteinases. Similarly, the chymase activation of MMP9 (REF. 52) and the chymase generation of angiotensin II53 are interclass cascades, which result in a complex inhibitor fingerprint profile.

Through the careful use of an activity fingerprint, experiments can be devised that link the activity fingerprint to that of a protease and can therefore assist in linking proteases and substrates in vivo (BOXES 2,3). However, to determine an enzyme’s fingerprint, a cell-based assay providing a specific readout of an enzyme’s catalytic activity is required. Such an assay could test for substrate processing in wild-type cells that is absent in knockout cells6,54 (loss of function) or for the overexpression of an enzyme (gain of function), in which case overexpressing an inactive form of the enzyme can serve as a good control17,55.
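The fingerprint comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a published method: the agent names, the boolean encoding of each response, the mismatch-count score, and the candidates `protease_A` and `protease_B` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical encoding: True means the agent blocks (inhibitor) or
# stimulates (activator) the activity. All names and values are illustrative.
Fingerprint = Dict[str, bool]


def fingerprint_distance(observed: Fingerprint, candidate: Fingerprint) -> int:
    """Count the agents, among those tested for both, on which responses differ."""
    shared = observed.keys() & candidate.keys()
    return sum(observed[agent] != candidate[agent] for agent in shared)


def rank_candidates(observed: Fingerprint,
                    candidates: Dict[str, Fingerprint]) -> List[str]:
    """Return candidate names sorted from best to worst match."""
    return sorted(candidates,
                  key=lambda name: fingerprint_distance(observed, candidates[name]))


# Made-up fingerprint of an unidentified activity measured in a cell-based assay.
observed_activity = {"hydroxamate": True, "TIMP1": False,
                     "TIMP3": True, "phorbol_ester": True}

# Made-up reference fingerprints of two candidate proteases.
candidate_fingerprints = {
    "protease_A": {"hydroxamate": True, "TIMP1": False,
                   "TIMP3": True, "phorbol_ester": True},
    "protease_B": {"hydroxamate": True, "TIMP1": True,
                   "TIMP3": True, "phorbol_ester": False},
}

ranking = rank_candidates(observed_activity, candidate_fingerprints)
```

A real comparison would probably need quantitative responses rather than booleans, and separate reference profiles for cell-based and biochemical assays, because the two can differ.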

Loss-of-function studies. Loss-of-function studies in cells from protease-knockout mice are considered the gold standard for linking proteases to substrates. Generally, comparing primary cells from knockout and wild-type mice is preferable to comparing immortalized cell lines, because the protease degradome can be substantially altered by immortalization, independently of the deletion of a protease54,56. An important control is rescue by the re-introduction of the wild-type enzyme. If the level of an endogenous substrate protein is not high enough to detect cleavage, then exogenous substrate can be added, as has been done when analysing the processing of chemokines by MMPs20. For membrane-protein substrates, the overexpression of the wild-type or of an alkaline-phosphatase-tagged form of the substrate protein has been used to facilitate the detection of processing6,54,57. If possible, results should be confirmed by analysing the proteolysis of the endogenous substrate in other primary cells, or of an untagged substrate if the initial experiments used a tagged substrate. Last, anti-neoepitope antibodies are excellent for monitoring substrate processing in vivo7,12,58. If knockout mice are not available, or when working with human cells, small interfering RNA represents a powerful method to carry out loss-of-function studies59.

Gain-of-function cell-based assays. Gain-of-function studies can be a good alternative to loss-of-function assays to establish candidate substrates, and also can provide the first readout of the catalytic activity of an orphan enzyme in cells, an essential first step in defining its fingerprint17,19,55,60. However, an important caveat of overexpression experiments is that non-preferred or even biologically irrelevant substrates might be cleaved owing to artificially high enzyme-to-substrate ratios or because the enzyme and substrate are not normally co-expressed13. Nevertheless, if an overexpressed enzyme increases the processing of a substrate, then the enzyme might contribute to the substrate’s processing if both are co-expressed in vivo or if the enzyme is dysregulated in disease.

If no activity is observed for an overexpressed enzyme, then the enzyme might not be in its active form or might lack an essential cofactor. Expression in different cell lines might therefore be required. In cells in which there is already a dominant activity that cleaves a substrate, the overexpressed enzyme might only make a minor contribution to overall substrate processing, and it can help to carry out overexpression experiments in knockout cells that lack the main processing enzyme17,55. The role of dominant and minor enzymatic activities could be reversed in other cells, for example, if the minor enzyme is overexpressed in a disease. Moreover, activators such as phorbol esters, Ca2+ ionophores55,61, stimulation of G-protein-coupled receptors62 or activation of the mitogen-activated-protein-kinase pathway63 are a few examples of factors that might be necessary to detect the activity of a highly regulated enzyme. As different stimuli can activate distinct enzymes for the same substrate, it is important to confirm which enzyme is responding to any given stimulus55,61. The main value of gain-of-function experiments therefore lies in their ability to serve as a hypothesis-building tool by identifying potential substrates, which must be verified in vivo.

Box 3 | Protease web: a systems biology perspective of proteolysis

An important concept that has only recently emerged from degradomics analyses of protease transfectants60 is the highly interdependent nature of proteolytic systems in vivo, which form the protease web13. Dynamic interconnections between proteolytic pathways and proteases of different families can form information conduits that determine the functional state of the protease web and, therefore, the proteolytic potential of a cell or tissue at a particular time. So, proteolytic activity towards a particular substrate in a cell or tissue represents the summed net activities of all proteases present that directly cleave the substrate or indirectly modify the activity of the responsible protease(s). How then do we define the relative importance of one of a number of proteases or their inhibitors that, at any one time, are responsible for substrate cleavage? Similarly, for substrates that can be cleaved by multiple proteases, which of these are essential and which can be compensated for? If there is a mechanism for compensation, it must be hard wired such that a cell can detect the lack of processing of a substrate, and then use another way to effect the processing. How long might it take for such a compensatory pathway to be activated? This has implications for whether or not it can even be detected in the first place, as rapid compensation would be difficult to detect. For example, fast-acting and selective inhibitors might demonstrate an essential role of a protease before compensation by other proteases begins, whereas compensatory mechanisms might be triggered in knockout cells. In this case, an activity in wild-type cells would have a fingerprint that is distinct from the compensatory activity in knockout cells. Answering such questions will be facilitated by further systematic development of well defined and selective fingerprinting tools and greater use of systems biology approaches to validate protease–substrate pairs.
There is also a need to understand the interconnected nature of proteolytic systems and the changes in the protease web that are induced by developmental and pathological stimuli, as well as by experimental manipulations such as small interfering RNA, genetic knockouts and transgenic overexpression.

Exosite. A substrate-binding site that lies outside the active-site cleft of a protease. It is usually located on substrate-binding modules or domains and can function to accelerate the rate of substrate cleavage.

Anti-neoepitope antibody. An antibody that specifically recognizes the free amino or carboxyl groups of the amino-acid residues from a cleaved scissile bond that forms the new N and C termini of the cleaved product.

Proteome. The expressed set of proteins that are encoded by the genome and that are expressed by a particular cell or tissue.

Peptide mapping. By proteomically comparing the abundance ratios of multiple peptides from a substrate with their location in the protein sequence, the domain that is proteolytically released can be predicted, as can the general location of the cleavage site.

Degradomics approaches

As biological substrates often differ from the theoretical substrates that have been inferred from in vitro experiments, one of the best methods of substrate discovery is to identify protease-cleaved products of natural substrates in complex cellular milieus19. In this case, the substrates are in their native forms and locations, and can interact with other proteins that might modulate their susceptibility to proteolysis. This complexity can be globally analysed using proteomics. Various proteomics approaches have been adapted for degradomics with new techniques being developed to expand the protease–substrate degradome and to discover substrates for orphan proteases15,21.

Gel-based proteomics identification of substrates. By comparing the intensities of protein spots from protease-treated and control samples on two-dimensional (2D) PAGE gels, candidate substrates can be identified from reduced amounts of intact substrate and new cleavage fragments, as has been done for MT1-MMP64 and caspase-3 in vitro65. As reliable relative quantification of spots between gel sets is crucial, fluorescent 2D difference gel electrophoresis (DIGE) is a good approach to control gel standardization internally and has been applied to identify substrates of parasite proteases, granzyme A and B, and of ADAM10 and ADAM17 (REFS 66–68) (FIG. 2a). Protease-treated and control proteomes are labelled with either Cy3 or Cy5, pooled and electrophoresed together. After merging the Cy3/Cy5 images, substrate and cleavage products can be identified as individually coloured spots. Mass spectrometry or Edman sequencing then identifies proteins in spots. However, DIGE shares the limitations of 2D PAGE in that very large or small substrates and cleavage products will be missed, as will those that are highly hydrophobic, acidic or basic. Sensitivity remains an issue for low-abundance molecules and it is a difficult process to automate. However, the conceptual simplicity and minimal requirements for expensive infrastructure are attractive for the identification of moderately to highly abundant substrates.

Liquid-chromatography proteomics approaches. In bottom–up proteomics approaches, MS/MS sequencing of tryptic peptides can be used to identify the individual proteins in a trypsinized proteome. Applying multidimensional liquid-chromatography separation or MuDPIT (multidimensional protein-identification technology)69 to trypsinized samples improves the resolution of the peptide landscape before MS analysis and therefore enhances proteome coverage. Although liquid chromatography is non-quantitative and cannot distinguish peptides of proteins from different sources in mixed samples, it has been used to identify biotinylated membrane proteins shed following phorbol ester stimulation70. Undersampling, the technical inability to identify every peptide in a complex sample, means that two samples cannot be reliably compared to identify the loss of a protein substrate or the generation of proteolytic products. To address this issue, isotope tagging of samples can be used for relative quantification of protein amounts in different samples, as described below.

Isotope mass tags. Guo et al.71 first showed that deuterium0 or deuterium5 N-ethyl-iodoacetamide-labelled Cys residues in proteins shed constitutively versus those shed following induction could be identified proteomically. In isotope-coded affinity tag (ICAT) labelling, all of the Cys residues in proteins are covalently labelled (FIG. 2b) with a biotinylated isotopically heavy or light tag, using [C13]9 or deuterium tags. This can be used to identify the peptides as having originated from the proteolysed or the control sample19,60. Protein domains and peptides that are shed from the cell or the pericellular matrix by proteolysis accumulate in the conditioned medium and so exhibit a mass-tag ratio of protease-transfected to inactive-mutant control that is greater than 1.0, whereas degraded proteins have a ratio of less than 1.0. Reduced cleavage of membrane proteins in the presence of hydroxamate-type metalloproteinase inhibitors was reflected by increases in their ICAT ratios in drug-treated versus non-treated samples60. Many new MT1-MMP substrates as well as indirect effects on signalling and chemotactic factors have been proteomically identified and then biochemically validated using ICAT19. In cells treated with the hydroxamate Ilomastat, shedding of cell-membrane proteins was reduced, as detected by decreased levels in conditioned medium and increased levels in the plasma membrane compared with in untreated control cells.
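The ratio rule stated above (greater than 1.0 for shed domains, less than 1.0 for degraded proteins in conditioned medium) can be written as a short Python sketch. The function names, the peak-area inputs and the tolerance band around 1.0 are assumptions introduced for illustration; the text gives no noise threshold.

```python
def icat_ratio(protease_area: float, control_area: float) -> float:
    """Peak-area ratio of protease-transfected sample over inactive-mutant control."""
    if control_area <= 0:
        # A missing partner peak cannot be expressed as a ratio.
        raise ValueError("control peak area must be positive")
    return protease_area / control_area


def interpret_ratio(ratio: float, tolerance: float = 0.2) -> str:
    """Interpret a conditioned-medium ICAT ratio.

    The tolerance is an assumed noise margin, not a value from the text.
    """
    if ratio > 1.0 + tolerance:
        return "candidate shed protein"
    if ratio < 1.0 - tolerance:
        return "candidate degraded protein"
    return "no clear change"


# Illustrative peak areas only.
shed_call = interpret_ratio(icat_ratio(300.0, 100.0))
degraded_call = interpret_ratio(icat_ratio(40.0, 100.0))
unchanged_call = interpret_ratio(icat_ratio(105.0, 100.0))
```

In practice the cut-offs would have to be set from replicate measurements, and any hit remains a candidate until it is validated biochemically.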

[Figure 2 schematic. a | DIGE: proteome samples with and without protease, labelled with Cy3 or Cy5 and resolved on a 2D SDS–PAGE gel; candidate intact substrate and candidate cleavage fragment. b | ICAT: cell-culture conditioned-medium proteomes labelled with ICAT [C12]9 or ICAT [C13]9; pool; trypsin digestion; biotin–avidin pullout of ICAT-labelled peptides; MuDPIT; quantification in MS mode; identification in MS/MS mode; candidate soluble substrate and candidate sheddase substrate. c | iTRAQ: condition 1 with and without protease, condition 2 with and without protease; trypsin digestion; iTRAQ 114, 115, 116 and 117; MuDPIT; MS/MS quantification and identification. d | N-terminope analysis: cell-culture conditioned medium or cell lysates plus protease; block primary amines or guanidination of Lys; trypsinize; simplification of peptides by N-terminal pullout and analysis or removal of internal tryptic peptides; MuDPIT; identification in MS/MS mode.]

Figure 2 | Degradomics discovery of protease substrates. a | Difference gel electrophoresis (DIGE). Samples are first incubated with or without proteases, and then labelled with the fluorophores Cy3 (red) or Cy5 (green). The proteins and peptides can be separated by two-dimensional (2D) SDS–PAGE and the fluorescence pattern is analysed. Fluorescence from co-migrating Cy3- and Cy5-labelled proteins merges to yellow. Intact substrate protein spots are reduced in the Cy5-labelled protease-treated sample and therefore appear red, whereas cleavage products appear green. Spots are sequenced by tandem mass spectrometry (MS/MS). b | Isotope-coded affinity tag (ICAT) labelling. Proteins from protease-treated and control samples are labelled on Cys residues with ICAT tags, pooled and digested with trypsin. The ICAT-tagged peptides are then affinity purified by the biotinylated tag. These are quantified by analysis of peak pairs in the MS spectrum and identified by MS/MS. ICAT ratios of extracellular protease-to-control peak areas that are less than 1 indicate protein degradation, ratios that are greater than 1 indicate shedding. c | iTRAQ. Protease-treated or control samples are trypsinized and all amino groups are then iTRAQ labelled. Following MS/MS, the iTRAQ labels fragment, generating ion peaks at mass-to-charge (m/z) ratios of 114.1, 115.1, 116.1 and 117.1, which identify the sample origin. Peak-height analysis enables relative quantification. d | N-terminope analysis. Several recent techniques have been designed to identify the N-terminope, the N-terminal amino-acid sequence that is generated after cleavage of the peptide bond by the protease. Owing to similar chemistry of the primary amine groups of the N terminus of a protein and of the Lys side chains, several strategies can be adopted. In one, the Lys residues are guanidinylated and then the amino group of the N terminus can be pulled out and sequenced after tryptic digestion. In another variant, the amines are chemically blocked and then the sample is trypsinized. Except for the N termini of the proteins and the protease-cleavage products, which are blocked, the tryptic peptides now have a free amino group. This can be used to react with a highly hydrophobic reagent such as TNBS to allow for separation from the acetylated N termini by combined fractional diagonal chromatography (COFRADIC), or to remove these peptides from the mixture with a derivatized resin, magnetic beads or polymer. The remaining peptides represent the N-terminome of the sample and include the protease-generated N-terminopes. MuDPIT, multidimensional protein identification technology; TNBS, 2,4,6-trinitrobenzenesulfonic acid.

Terminopes. The N and C termini of a protein are chemically distinguished from the remainder of the intact molecule. Terminopes are generated following proteolytic cleavage. They might also be immunologically recognized as an antibody epitope, called a neoepitope.

Singletons. In proteomics analyses of two peptide samples that are labelled with different isotopes, a singleton is a single ion peak that is detected in the mass spectrometry spectrum without its comparative isotopic counterpart owing to the absence of that parent protein or peptide in one sample. This might occur because of reduced expression or following the cleavage of an intact protein or in the generation of a unique N- or C-terminal peptide following cleavage.

In another approach, isobaric mass tags can be used to label all primary amines in a trypsinized proteome (FIG. 2c). iTRAQ tags are chemically identical, but fragment differently following MS/MS collision to generate distinct signature spectral peaks. The signature peaks from each of the four iTRAQ tags differ by a mass-to-charge ratio of 1, so it is possible to distinguish the relative contribution from each of up to four samples to a total peptide mixture when analysed by MS/MS sequencing. Comparison of the signature-peak areas provides relative quantifications. As the primary amines of all tryptic peptides are labelled, many more peptides per protein are identified than by ICAT, which enables peptide mapping to narrow down the cleavage site and predict the domain shed. This represents a significant improvement in isotope tagging for substrate discovery. Many new and known substrates of human MMP2 have been identified using iTRAQ72 by expressing the protein as an active enzyme at lower than physiological amounts in murine Mmp2–/– cells. These labelling approaches are best suited for the study of extracellular proteases for which the accumulation of cleavage fragments can be detected in the conditioned medium. However, degradation of cytosolic proteins should also be quantifiable by isotope tagging, as should domains released from organelles or plasma membranes to the cytosol, such as through regulated intramembrane proteolysis73.
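Relative quantification from iTRAQ signature peaks can be illustrated with a small Python sketch. It assumes the four reporter m/z values given in the Figure 2 legend (114.1, 115.1, 116.1 and 117.1); the matching tolerance, the function names and the example spectrum are hypothetical.

```python
# Reporter-ion m/z values as listed in the Figure 2c legend.
REPORTER_MZ = (114.1, 115.1, 116.1, 117.1)


def reporter_intensities(spectrum, tol=0.05):
    """Sum the intensity of peaks lying within tol of each reporter m/z.

    spectrum is an iterable of (mz, intensity) pairs; tol is an assumed
    mass tolerance, not a value from the text.
    """
    totals = {mz: 0.0 for mz in REPORTER_MZ}
    for mz, intensity in spectrum:
        for reporter in REPORTER_MZ:
            if abs(mz - reporter) <= tol:
                totals[reporter] += intensity
    return totals


def relative_contributions(totals):
    """Convert reporter intensities to each sample's fraction of the total."""
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no reporter ions found in spectrum")
    return {mz: value / total for mz, value in totals.items()}


# Made-up MS/MS peak list: four reporter ions plus one unrelated fragment ion.
example_spectrum = [(114.1, 100.0), (115.1, 300.0), (116.1, 100.0),
                    (117.1, 300.0), (250.2, 999.0)]
contributions = relative_contributions(reporter_intensities(example_spectrum))
```

Which experimental condition is assigned to which tag is a design choice of the experiment; a peptide enriched in the protease-treated channels relative to the control channels would then be a candidate cleavage product.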

Neo-N-terminal isotope labelling of substrates. N- or C-terminal truncation of one or a few residues of a protein might be missed because of insufficient gel resolution on DIGE. Moreover, the certainty of identification of a protease-truncated peptide in a complex mixture is reduced in the usual database searches. By modifying conventional procedures, the primary amine of the neo-N termini of protease-cleaved substrates or the terminopes can be identified by proteomics and distinguished from amines present in Lys side chains. For example, chemical blocking of Lys by guanidination can allow specific N-terminal labelling and pullouts to be performed to identify the N-terminome74. Alternatively, the N-terminal amines and the primary amines of Lys residues can be acetylated. Trypsin treatment will now skip the blocked Lys residues to produce longer peptides, which facilitates the sequence identification of a cleaved substrate. The internal tryptic peptides can be specifically removed in various ways to simplify the peptide mixture. Combined fractional diagonal chromatography separates these based on hydrophobicity differences75, and N-hydroxy-sulfosuccinimide (NHS)-biotinylation76 specifically removes the internal tryptic peptides leaving the N-terminome. If labelled, the protease-cleaved neo-N termini are singletons. Peptide positional information is used to identify protease-cleavage sites (FIG. 2d). However, since proteins can be identified by only one peptide, these techniques must be regarded as a screen only, requiring secondary validation.
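How peptide positional information points to a cleavage site can be shown with a short Python sketch. The protein sequence and peptides are invented, and the function name and returned fields are assumptions; the default of treating only Arg as a tryptic site reflects the case described above in which Lys residues are blocked.

```python
def locate_cleavage(protein: str, neo_n_peptide: str, trypsin_sites: str = "R"):
    """Map a neo-N-terminal peptide onto its parent sequence.

    Returns None if the peptide starts at the native N terminus, otherwise a
    dict with the 1-based position of the P1 residue, the P1 and P1' residues,
    and a flag for peptides that could simply be internal tryptic peptides.
    Only the first occurrence of the peptide is considered.
    """
    start = protein.find(neo_n_peptide)
    if start == -1:
        raise ValueError("peptide not found in protein sequence")
    if start == 0:
        return None
    p1 = protein[start - 1]
    return {
        "p1_position": start,          # 0-based index start equals 1-based P1 position
        "p1": p1,
        "p1_prime": protein[start],
        "possibly_tryptic": p1 in trypsin_sites,
    }


# Invented sequence and peptides, for illustration only.
protein_seq = "MKTAYIAKQRQISFVKSHFSRQ"
site = locate_cleavage(protein_seq, "YIAKQR")
tryptic_like = locate_cleavage(protein_seq, "QISFVK")
native_start = locate_cleavage(protein_seq, "MKTAY")
```

Because a single peptide supports each call, a result like `site` is best read as a screening hit that still needs secondary validation.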

Peptidomics. Because degradomics is extremely difficult in vivo, peptidomics™77,78, in which naturally present protease-generated peptides can be identified in complex biological samples such as serum79 and brain80, might emerge as an alternative approach. Typically, highly abundant proteins including albumin and antibodies need to be removed, and then the naturally present low-molecular-mass peptides are sequenced by MS/MS to identify their origins. Although it is difficult to distinguish the direct and indirect effects of proteolysis, if successful, in vivo degradomics has tremendous potential. For example, peptidomics can link proteases to substrates in mice by comparing wild-type with transgenic mice in which the protease is deleted or overexpressed. An essential positive control should be the detection of peptides of known substrates of the protease. Recently, neuropeptides generated by prohormone convertase-2 have been identified by comparing convertase-2 knockout with wild-type mice80. It might also be possible to detect protease dysregulation in patients’ serum samples, which might serve both as a diagnostic marker for disease progression and as a surrogate marker to monitor the effect of protease inhibitors in vivo.
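The wild-type versus knockout comparison and its positive control can be sketched with Python sets. The peptide strings are placeholders rather than real sequences, and the function names are assumptions.

```python
def protease_dependent_peptides(wild_type: set, knockout: set) -> set:
    """Peptides detected in wild-type samples but not in protease-knockout samples."""
    return wild_type - knockout


def positive_control_ok(candidates: set, known_substrate_peptides: set) -> bool:
    """True if at least one peptide of a known substrate is among the candidates."""
    return bool(candidates & known_substrate_peptides)


# Placeholder peptide identifiers, for illustration only.
wild_type_peptides = {"PEPTIDE_A", "PEPTIDE_B", "PEPTIDE_C"}
knockout_peptides = {"PEPTIDE_A"}
known_peptides = {"PEPTIDE_B"}

candidates = protease_dependent_peptides(wild_type_peptides, knockout_peptides)
control_passed = positive_control_ok(candidates, known_peptides)
```

A peptide missing from the knockout list might reflect undersampling or an indirect effect rather than loss of cleavage, so a presence/absence comparison of this kind only nominates candidates.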

Animal models

Mouse models are ideal for assessing the physiological and pathological relevance of enzyme–substrate pairs that have been established using the techniques described above. In addition, they frequently reveal new horizons through unexpected insights into the functions of an enzyme. Moreover, crossing protease-knockout mice with mouse tumour models driven by the transgenic expression of oncogenes has produced considerable insights into the roles of proteases in cancer as targets and anti-targets13,81,82. Here we will outline how the analysis of protease expression in mice, coupled with loss-of-function (knockout mice) and gain-of-function (transgenic mice) studies, helps us to understand the roles and substrates of enzymes at the level of an intact organism (FIG. 3).

Expression analysis. Knowledge about the expression pattern of a protease can help formulate hypotheses about its potential functions83 and can establish co-expression with candidate substrates84,85. Expression patterns can be determined by DNA or oligonucleotide microarrays48,84,86, whereas in situ mRNA hybridization and immunohistochemistry provide precise positional information84,85 and can reveal local areas of high protease expression that might be missed by gene-chip analysis87. Tissues from knockout mice provide an excellent control of specificity to rule out false-positive patterns88. Activity-based probes selectively bind to active proteases and contain detection moieties that can be localized by in vivo and ex vivo imaging83, denaturing gels and MS35,89. Such tools help determine the expression pattern of an active enzyme, which is not evident from mRNA or protein-expression analysis (a highly expressed enzyme might not be active, for example, if it is stored as a zymogen that requires activation). The expression profiles obtained can be correlated with substrate cleavage and with the co-expression of an enzyme and a substrate, which further narrows down the list of candidates, with the caveat that enzymes with low expression levels might have high specific activities.

Validating enzyme–substrate pairs. If the primary goal of working with mouse models is to confirm the role of an enzyme in processing a substrate that is implicated in disease, the candidate enzymes are usually first tested in biochemical and cell-based assays before a knockout mouse is analysed. If the correct candidate enzyme is deleted, the substrate should no longer be cleaved in the knockout mouse and the predicted developmental defect or effect in the disease model should be observed90,91. However, knockout studies are not always straightforward and a unique challenge in protease research is that a lack of substrate processing can have various consequences, including increased, decreased or modified activity of the substrate92,93. Moreover, every protease usually has multiple substrates, which might include other proteases or inhibitors in the protease web and so has functions that are the sum of the various processing defects. Beyond that, proteases with different protein modules probably also have functions that are independent of their catalytic activity. Last, if altered processing of signalling networks modulates the expression of other proteases, this can complicate data interpretation13,94.

[Figure 3 panel labels. a | protease, membrane protein, plasma membrane, soluble ectodomain; b | protease knockout; c | protease-resistant-substrate knock-in; d | substrate knockout; e | catalytic-site-inactivating knock-in.]

Driver lines. A system, such as the Cre–Lox system, for creating conditional gene deletions.

Knock-in and conditional knockout mice. Addressing the issues of protease research outlined above can require generating several targeted mutations in mice (FIG. 3). If it is unclear how processing affects the function of a substrate, knock-in mice with an inactivated substrate-cleavage site can provide an answer (FIG. 3c)92,93. In knock-in mice, the substrate’s physiological expression pattern is preserved, a decisive advantage over transgenic mice overexpressing an uncleavable substrate. Cell-based assays can confirm that the knock-in mutation does block processing and does not inadvertently move cleavage to a less preferred site or introduce sites for a different enzyme95. If the phenotype of a substrate-cleavage-site knock-in mouse resembles that of the protease knockout92,96, this can indicate a functional connection between the enzyme and substrate, although a similar phenotype does not guarantee that the mechanisms are linked. In addition, a similar phenotype could indicate that the substrate is a functionally dominant substrate of this enzyme. In such a scenario, the enzyme might be a good target for modulating the function of this dominant substrate in disease, even if it has other as-yet-functionally-silent or less relevant substrates.

If a protease-deficient mouse has a different or more severe pathological phenotype than the substrate-cleavage-site knock-in mouse, this points towards other functions and substrates for the protease. Protease knock-in mice with an inactivating point mutation in the catalytic site of the protease that does not affect its folding should help to distinguish between activities that require catalytic activity and those that depend on ancillary domains (FIG. 3e). A caveat is that, depending on in vivo enzyme-to-substrate ratios, inactive-catalytic-domain substrate capture (see above) might also affect the proteome. If knockout mice die too early to allow a proper analysis of substrate processing and its consequences, conditional knockout mice, which allow selective inactivation of an enzyme via cell- or tissue-specific driver lines, can be produced or organ transplants can be made to wild-type littermates97.

Analysing orphan enzymes. Specific developmental defects or adult pathological phenotypes of knockout mice can point towards candidate substrates, which without processing might be responsible for the observed defect6, so deorphaning some proteases. A knowledge of the enzyme’s expression pattern can focus attention on certain tissues or mouse models of disease82–85,88,98, which can be especially useful if knockout mice do not have evident developmental phenotypes. Moreover, the analysis of knockout mice can be guided by results from proteomics, biochemical and cell-biological analyses, if the results point towards a role for a protease in cleaving certain substrates that might affect development or disease if they are not processed99–101. Further information can be obtained from transgenic gain-of-function experiments, which help to address the direct consequences of an enzyme’s dysregulation102. However, aberrant expression patterns or excessively high enzyme-to-substrate ratios might lead to phenotypes that only resemble human disease without replicating the normal mechanistic basis, including effects that are due to the cleavage of non-physiological substrates13. Last, other genetic model systems, such as Caenorhabditis elegans103 and Drosophila melanogaster104, can place proteases in the context of important conserved signalling pathways, thereby guiding further studies in mouse models56.

Figure 3 | Mouse models in protease research. Mouse models allow researchers to study the role of proteases in the context of an intact organism through loss- and gain-of-function experiments. In the case of loss-of-function experiments, the first step is usually to determine a candidate enzyme–substrate pair (a), and then to test the consequences of the targeted deletion of the enzyme in mice (b). Knock-in mutations that inactivate a substrate’s cleavage site help assess the consequences of the lack of processing of a substrate (c) compared to the deletion of the substrate (d). Potential contribution of ancillary domains to the function of an enzyme can be uncovered by comparing the consequences of a catalytic-site-inactivating knock-in mutation (e) to those of the targeted deletion of the enzyme (b).


[Figure 4 panel text. a | Substrate. Major goal: identify the enzyme(s) that is relevant for cleaving a substrate with a known role in development or disease. b | Protease. Major goal: identify substrates for which cleavage and function are modulated by a given enzyme. Biochemistry: determine cleavage sites, enzyme kinetics and the inhibitor profile. Cell biology: perform loss- and gain-of-function experiments to evaluate catalytic activity in an intact cell. Proteomics and degradomics: hypothesis-generating tool using biochemistry, cell biology and mouse models. Mouse model: establish the in vivo function and expression pattern of protease and substrates in animal models of development and disease.]

Summary and perspectives

Each of the individual approaches discussed in this review can provide a wealth of information about proteases, their substrates and their biological functions. By combining these different approaches, several specific limitations of individual techniques can be overcome, such that the combined insights can far exceed the sum of the parts (FIG. 4). This is especially evident when considering what each approach contributes to refining the fingerprint of a protease. Biochemical studies identify inhibitors, cleavage-site preference, kinetics and substrates by serial screening. Cell-based assays help to evaluate an enzyme’s function in a more physiological context, and are therefore essential for validating the results of biochemical and proteomics techniques. Last, at the level of the whole organism, the co-expression of an enzyme and a substrate in relevant tissues as well as the dysregulation in disease coupled with gain- and loss-of-function studies are powerful criteria to link enzymes to substrates. Overall, these combined approaches have made it feasible to identify the enzymes that are responsible for processing certain substrates in vivo (FIG. 4a). A few examples of linking substrates to relevant proteases by combining various aspects of these approaches are the identification of the protease(s) that process TNFα4,6,18, syndecan-1 (REF. 105), the TNF family member TRANCE/OPGL84,100,101, aggrecan90,91, amyloid precursor protein99,106–108 and the low affinity immunoglobulin E receptor CD23 (REF. 109).

Once a relevant protease for a substrate is identified, it is important to assess this enzyme’s other potential functions and substrates. This requires essentially the same considerations as studying orphan proteases, and benefits from input from all four of the approaches that have been discussed here (FIG. 4b). Importantly, unbiased insights, such as from degradomics analyses, are crucial for identifying novel substrates and biological functions. These, in turn, can lead to new hypotheses about the unexpected roles of a protease in development or disease. Whether the initial interest in proteases is to identify an enzyme that is responsible for cleaving a pathologically relevant substrate, to identify functions and substrates for an orphan enzyme or to establish new functions and substrates for a known enzyme, it is well worth considering the interconnected nature of all approaches discussed here. Ultimately, it will be important to do similar analyses in human tissues, given that the murine degradome contains more than 80 more proteases than the human one (FIG. 1), many of which are in inflammatory cells (REF. 110), and so animal models might not always accurately replicate human diseases.

Figure 4 | Linking enzymes and substrates: the big picture. Combining information from different approaches for protease characterization facilitates understanding the function of novel and known enzymes. The two major starting points in protease research are: (a) identifying the protease(s) that is relevant for processing a substrate involved in human disease (such as tumour necrosis factor-α or aggrecan), and (b) characterizing an orphan or known enzyme. Starting with a substrate in search of a protease (a), purification of an activity from cells or tissues is often the first step. Fingerprinting the activity with inhibitors or activators and loss- and gain-of-function experiments can narrow down the candidates. Mouse models can confirm enzyme–substrate pairs that are functionally relevant in development and disease models. Starting with an enzyme (b), the goal is usually to identify new substrates and functions. Mouse models provide important clues to an enzyme’s functions and substrates, and cell-based and biochemical assays support hypothesis-driven identification of substrates. However, proteomics and degradomics are needed for unbiased substrate identification. The relevance of newly identified enzyme–substrate pairs can then be tested as outlined above, starting at entry point a. Reiterations of this cycle, aided by improvements of degradomics and the development of selective activators and inhibitors of proteases, should yield more comprehensive insights into the relevant substrates of individual enzymes.


Last, it seems probable that a wealth of mechanistic information, and potentially also unexpected new drug targets, might emerge from studying how proteolysis affects individual substrates. Even though genetic screens and knockout mice can uncover dominant functions of proteases in major pathways, it is usually much more difficult to understand the consequences of processing of individual substrates. Yet there are several examples of point mutations in substrate-cleavage sites that profoundly affect the substrate’s function, such as mutations in the TNF-receptor that lead to its accumulation, resulting in TNF-receptor-associated periodic febrile syndrome (TRAPS) (REF. 111), or mutations in amyloid precursor protein that affect production of the amyloidogenic amyloid-β peptide, leading to Alzheimer’s disease (REF. 112).

Although blocking a validated protease drug target makes sense in cases of dominant roles in disease pathways, the pleiotropic and multifunctional nature of the 566 known human proteases (REF. 110) and many more substrates provides a strong incentive to think about means to specifically activate (REF. 109) or inactivate the processing of individual substrates only, such as by exosite inhibitors (REF. 45), thereby avoiding potential side effects of blocking proteases (REF. 13). Clearly, the fascinating functions of these signalling scissors in development and disease will provide fertile ground for basic and biomedical research for years to come. We hope that the guideposts outlined here will be helpful in considering how best to identify substrates and functions of proteases, and to separate the mere suspects from the actual perpetrators.

1. Macfarlane, R. G. An enzyme cascade in the blood clotting mechanism, and its function as a biochemical amplifier. Nature 202, 498–499 (1964).

2. Davie, E. W. & Ratnoff, O. D. Waterfall sequence for intrinsic blood clotting. Science 145, 1310–1312 (1964).

3. Davie, E. W. & Neurath, H. Identification of a peptide released during autocatalytic activation of trypsinogen. J. Biol. Chem. 212, 515–529 (1955).

4. Black, R. et al. A metalloprotease disintegrin that releases tumour-necrosis factor-α from cells. Nature 385, 729–733 (1997).

5. Moss, M. L. et al. Cloning of a disintegrin metalloproteinase that processes precursor tumour-necrosis factor-α. Nature 385, 733–736 (1997).

6. Peschon, J. J. et al. An essential role for ectodomain shedding in mammalian development. Science 282, 1281–1284 (1998). An excellent example of linking an enzyme, ADAM17, to several substrates, including TGFα, through the analysis of Adam17-knockout mice and cell-based assays in cells lacking ADAM17, derived from these mice.

7. McQuibban, G. A. et al. Inflammation dampened by gelatinase A cleavage of monocyte chemoattractant protein-3. Science 289, 1202–1206 (2000).

8. Parks, W. C., Wilson, C. L. & López-Boado, Y. S. Matrix metalloproteinases as modulators of inflammation and innate immunity. Nature Rev. Immunol. 4, 617–629 (2004).

9. Hintermann, E. & Quaranta, V. Epithelial cell motility on laminin-5: regulation by matrix assembly, proteolysis, integrins and erbB receptors. Matrix Biol. 23, 75–85 (2004).

10. O’Reilly, M. S., Holmgren, L., Chen, C. & Folkman, J. Angiostatin induces and sustains dormancy of human primary tumors in mice. Nature Med. 2, 689–692 (1996).

11. Bergers, G., Javaherian, K., Lo, K. M., Folkman, J. & Hanahan, D. Effects of angiogenesis inhibitors on multistage carcinogenesis in mice. Science 284, 808–812 (1999).

12. Overall, C. M. Dilating the degradome: matrix metalloproteinase-2 cuts to the heart of the matter. Biochem. J. 383, e5–e7 (2004).

13. Overall, C. M. & Kleifeld, O. Validating MMPs as drug targets and anti-targets for cancer therapy. Nature Rev. Cancer 6, 227–239 (2006).

14. Turk, B. Targeting proteases: successes, failures and future prospects. Nature Rev. Drug Discov. 5, 785–799 (2006).

15. López-Otin, C. & Overall, C. M. Protease degradomics: a new challenge for proteomics. Nature Rev. Mol. Cell Biol. 3, 509–519 (2002).

16. Rosendahl, M. S. et al. Identification and characterization of a pro-tumor necrosis factor-α-processing enzyme from the ADAM family of zinc metalloproteases. J. Biol. Chem. 272, 24588–24593 (1997).

17. Zheng, Y., Saftig, P., Hartmann, D. & Blobel, C. Evaluation of the contribution of different ADAMs to TNFα shedding and of the function of the TNFα ectodomain in ensuring selective stimulated shedding by the TNFα convertase (TACE/ADAM17). J. Biol. Chem. 279, 42898–42906 (2004).

18. Haro, H. et al. Matrix metalloproteinase-7-dependent release of tumor necrosis factor-α in a model of herniated disc resorption. J. Clin. Invest. 105, 143–150 (2000).

19. Tam, E. M., Morrison, C. J., Wu, Y. I., Stack, M. S. & Overall, C. M. Membrane protease proteomics: isotope-coded affinity tag MS identification of undescribed MT1-matrix metalloproteinase substrates. Proc. Natl Acad. Sci. USA 101, 6917–6922 (2004). A key paper describing the use of isotope mass tags and liquid-chromatography-based mass-spectrometric identification of cleaved products of native substrates in the cellular context.

20. Zhang, K. et al. Metalloproteinase cleavage of the chemokine SDF-1α induces neuronal apoptosis in HIV encephalitis. Nature Neurosci. 6, 1064–1071 (2003).

21. Schilling, O. & Overall, C. M. Proteomic discovery of protease substrates. Curr. Opin. Chem. Biol. 11, 1–10 (2007).

22. Roghani, M. et al. Metalloprotease-disintegrin MDC9: intracellular maturation and catalytic activity. J. Biol. Chem. 274, 3531–3540 (1999).

23. Loechel, F., Gilpin, B. J., Engvall, E., Albrechtsen, R. & Wewer, U. M. Human ADAM 12 (meltrin α) is an active metalloprotease. J. Biol. Chem. 273, 16993–16997 (1998).

24. Zou, J. et al. Catalytic activity of human ADAM33. J. Biol. Chem. 279, 9818–9830 (2004).

25. Smith, M. M., Shi, L. & Navre, M. Rapid identification of highly active and selective substrates for stromelysin and matrilysin using bacteriophage peptide display libraries. J. Biol. Chem. 270, 6440–6449 (1995).

26. Matthews, D. J. & Wells, J. A. Substrate phage: selection of protease substrates by monovalent phage display. Science 260, 1113–1117 (1993). A classic paper on the development of phage display to screen for protease-cleavage sites.

27. Rosse, G. et al. Rapid identification of substrates for novel proteases using a combinatorial peptide library. J. Comb. Chem. 2, 461–466 (2000).

28. Turk, B. E. & Cantley, L. C. Using peptide libraries to identify optimal cleavage motifs for proteolytic enzymes. Methods 32, 398–405 (2004).

29. Cunningham, B. C., Henner, D. J. & Wells, J. A. Engineering human prolactin to bind to the human growth hormone receptor. Science 247, 1461–1465 (1990).

30. Zhu, L. et al. The role of dipeptidyl peptidase IV in the cleavage of glucagon family peptides: in vivo metabolism of pituitary adenylate cyclase activating polypeptide-(1–38). J. Biol. Chem. 278, 22418–22423 (2003).

31. Thornberry, N. A. et al. A combinatorial approach defines specificities of members of the caspase family and granzyme B. Functional relationships established for key mediators of apoptosis. J. Biol. Chem. 272, 17907–17911 (1997). Describes the development of a now widely used technique for characterizing the active sites of proteases and for defining cleavage-site specificities.

32. Salisbury, C. M., Maly, D. J. & Ellman, J. A. Peptide microarrays for the determination of protease substrate specificity. J. Am. Chem. Soc. 124, 14868–14870 (2002).

33. Gao, X. et al. High density peptide microarrays. In situ synthesis and applications. Mol. Divers. 8, 177–187 (2004).

34. Marnett, A. B., Nomura, A. M., Shimba, N., Ortiz de Montellano, P. R. & Craik, C. S. Communication between the active sites and dimer interface of a herpesvirus protease revealed by a transition-state inhibitor. Proc. Natl Acad. Sci. USA 101, 6870–6875 (2004).

35. Greenbaum, D. C. et al. Small molecule affinity fingerprinting. A tool for enzyme family subclassification, target identification, and inhibitor design. Chem. Biol. 9, 1085–1094 (2002). A comprehensive analysis of the development and use of activity-based probes that launched recent in vivo imaging studies, inhibitor design and active-site characterization.

36. Barrios, A. M. & Craik, C. S. Scanning the prime-site substrate specificity of proteolytic enzymes: a novel assay based on ligand-enhanced lanthanide ion fluorescence. Bioorg. Med. Chem. Lett. 12, 3619–3623 (2002).

37. Turk, B. E., Huang, L. L., Piro, E. T. & Cantley, L. C. Determination of protease cleavage site motifs using mixture-based oriented peptide libraries. Nature Biotechnol. 19, 661–667 (2001).

38. Becherer, J. D. & Blobel, C. P. Biochemical properties and functions of membrane-anchored metalloprotease-disintegrin proteins (ADAMs). Curr. Top. Dev. Biol. 54, 101–123 (2003).

39. Gomis-Ruth, F. X. Structural aspects of the metzincin clan of metalloendopeptidases. Mol. Biotechnol. 24, 157–202 (2003).

40. Overall, C. M. Molecular determinants of metalloproteinase substrate specificity: matrix metalloproteinase substrate binding domains, modules, and exosites. Mol. Biotechnol. 22, 51–86 (2002).

41. Tortorella, M. et al. The thrombospondin motif of aggrecanase-1 (ADAMTS-4) is critical for aggrecan substrate recognition and cleavage. J. Biol. Chem. 275, 25791–25797 (2000).

42. Fields, G. B. A model for interstitial collagen catabolism by mammalian collagenases. J. Theor. Biol. 153, 585–602 (1991).

43. Boyd, S. E., Pike, R. N., Rudy, G. B., Whisstock, J. C. & Garcia de la Banda, M. PoPS: a computational tool for modeling and predicting protease specificity. J. Bioinform. Comput. Biol. 3, 551–585 (2005).

44. Berman, J. et al. Rapid optimization of enzyme substrates using defined substrate mixtures. J. Biol. Chem. 267, 1434–1437 (1992).

45. Overall, C. M., McQuibban, G. A. & Clark-Lewis, I. Discovery of chemokine substrates for matrix metalloproteinases by exosite scanning: a new tool for degradomics. Biol. Chem. 383, 1059–1066 (2002).

46. Boldt, H. B. et al. The Lin12-notch repeats of pregnancy-associated plasma protein-A bind calcium and determine its proteolytic specificity. J. Biol. Chem. 279, 38525–38531 (2004).

47. Torres-Collado, A. X., Kisiel, W., Iruela-Arispe, M. L. & Rodriguez-Manzaneque, J. C. ADAMTS1 interacts with, cleaves, and modifies the extracellular location of the matrix inhibitor tissue factor pathway inhibitor-2. J. Biol. Chem. 281, 17827–17837 (2006).


48. Overall, C. M. et al. Protease degradomics: mass spectrometry discovery of protease substrates and the CLIP-CHIP, a dedicated DNA microarray of all human proteases and inhibitors. Biol. Chem. 385, 493–504 (2004).

49. Wild-Bode, C., Fellerer, K., Kugler, J., Haass, C. & Capell, A. A basolateral sorting signal directs ADAM10 to adherens junctions and is required for its function in cell migration. J. Biol. Chem. 281, 23824–23829 (2006).

50. Loechel, F., Overgaard, M. T., Oxvig, C., Albrechtsen, R. & Wewer, U. M. Regulation of human ADAM 12 protease by the prodomain. Evidence for a functional cysteine switch. J. Biol. Chem. 274, 13427–13433 (1999).

51. Pei, D. & Weiss, S. J. Furin-dependent intracellular activation of the human stromelysin-3 zymogen. Nature 375, 244–247 (1995).

52. Tchougounova, E. et al. A key role for mast cell chymase in the activation of pro-matrix metalloprotease-9 and pro-matrix metalloprotease-2. J. Biol. Chem. 280, 9291–9296 (2005).

53. Lundequist, A., Tchougounova, E., Abrink, M. & Pejler, G. Cooperation between mast cell carboxypeptidase A and the chymase mouse mast cell protease 4 in the formation and degradation of angiotensin II. J. Biol. Chem. 279, 32339–32344 (2004).

54. Sahin, U. et al. Distinct roles for ADAM10 and ADAM17 in ectodomain shedding of six EGFR-ligands. J. Cell Biol. 164, 769–779 (2004). Mouse embryonic fibroblasts from different Adam-knockout mice were used in cell-based assays to identify which ADAM is required for processing of individual epidermal-growth-factor-receptor ligands.

55. Horiuchi, K. et al. Substrate selectivity and regulation of EGF-receptor ligand sheddases by phorbol esters and calcium influx. Mol. Biol. Cell 18, 176–188 (2007).

56. Hartmann, D. et al. The disintegrin/metalloprotease ADAM 10 is essential for Notch signalling but not for α-secretase activity in fibroblasts. Hum. Mol. Genet. 11, 2615–2624 (2002).

57. Hinkle, C. L. et al. Selective roles for tumor necrosis factor α-converting enzyme/ADAM17 in the shedding of the epidermal growth factor receptor ligand family: the juxtamembrane stalk determines cleavage efficiency. J. Biol. Chem. 279, 24179–24188 (2004).

58. Billinghurst, R. C. et al. Enhanced cleavage of type II collagen by collagenases in osteoarthritic articular cartilage. J. Clin. Invest. 99, 1534–1545 (1997).

59. Gschwind, A., Hart, S., Fischer, O. M. & Ullrich, A. TACE cleavage of proamphiregulin regulates GPCR-induced proliferation and motility of cancer cells. EMBO J. 22, 2411–2421 (2003).

60. Butler, G. S. & Overall, C. M. Proteomic validation of protease drug targets: pharmacoproteomics of matrix metalloproteinase inhibitor drugs using isotope-coded affinity tag labelling and tandem mass spectrometry. Curr. Pharm. Des. 13, 263–270 (2007).

61. Nagano, O. et al. Cell-matrix interaction via CD44 is independently regulated by different metalloproteinases activated in response to extracellular Ca2+ influx and PKC activation. J. Cell Biol. 165, 893–902 (2004).

62. Prenzel, N. et al. EGF receptor transactivation by G-protein-coupled receptors requires metalloproteinase cleavage of proHB-EGF. Nature 402, 884–888 (1999).

63. Fan, H. & Derynck, R. Ectodomain shedding of TGF-α and other transmembrane proteins is induced by receptor tyrosine kinase activation and MAP kinase signaling cascades. EMBO J. 18, 6962–6972 (1999).

64. Hwang, I. K., Park, S. M., Kim, S. Y. & Lee, S. T. A proteomic approach to identify substrates of matrix metalloproteinase-14 in human plasma. Biochim. Biophys. Acta 1702, 79–87 (2004).

65. Lee, A. Y. et al. Identification of caspase-3 degradome by two-dimensional gel electrophoresis and matrix-assisted laser desorption/ionization–time of flight analysis. Proteomics 4, 3429–3436 (2004).

66. Zhou, X. W., Blackman, M. J., Howell, S. A. & Carruthers, V. B. Proteomic analysis of cleavage events reveals a dynamic two-step mechanism for proteolysis of a key parasite adhesive complex. Mol. Cell. Proteomics 3, 565–576 (2004).

67. Bredemeyer, A. J. et al. A proteomic approach for the discovery of protease substrates. Proc. Natl Acad. Sci. USA 101, 11785–11790 (2004). Describes the successful application of DIGE to identify substrates for granzyme A and B.

68. Bech-Serra, J. J. et al. Proteomic identification of desmoglein 2 and activated leukocyte cell adhesion molecule as substrates of ADAM17 and ADAM10 by difference gel electrophoresis. Mol. Cell. Biol. 26, 5086–5095 (2006). DIGE is used to identify substrates for ADAM10 and ADAM17, and the role of these ADAMs in cleaving the newly identified substrates (desmoglein-2 and activated leukocyte cell-adhesion molecule (ALCAM)) was corroborated with cells from ADAM10- or ADAM17-deficient mice.

69. Washburn, M. P., Wolters, D. & Yates, J. R. 3rd. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nature Biotechnol. 19, 242–247 (2001).

70. Ahram, M., Adkins, J. N., Auberry, D. L., Wunschel, D. S. & Springer, D. L. A proteomic approach to characterize protein shedding. Proteomics 5, 123–131 (2005).

71. Guo, L. et al. A proteomic approach for the identification of cell-surface proteins shed by metalloproteases. Mol. Cell. Proteomics 1, 30–36 (2002).

72. Dean, R. A. & Overall, C. M. Proteomic discovery of metalloproteinase substrates in the cellular context by iTRAQ™ labeling reveals a diverse MMP-2 substrate degradome. Mol. Cell. Proteomics (in the press).

73. Xia, W. & Wolfe, M. S. Intramembrane proteolysis by presenilin and presenilin-like proteases. J. Cell Sci. 116, 2839–2844 (2003).

74. Ji, C., Guo, N. & Li, L. Differential dimethyl labeling of N-termini of peptides after guanidination for proteome analysis. J. Proteome Res. 4, 2099–2108 (2005). Clever use of chemical guanidination of Lys side chains to mask these from the free N terminus of proteins, potentially including cleaved substrates, which can be adopted for degradomics analysis of substrate cleavage.

75. Van Damme, P. et al. Caspase-specific and nonspecific in vivo protein processing during Fas-induced apoptosis. Nature Methods 2, 771–777 (2005). The first comprehensive and working method for N-terminome analysis of proteolysis.

76. McDonald, L., Robertson, D. H., Hurst, J. L. & Beynon, R. J. Positional proteomics: selective recovery and analysis of N-terminal proteolytic peptides. Nature Methods 2, 955–957 (2005).

77. Schulz-Knappe, P. et al. Peptidomics: the comprehensive analysis of peptides in complex biological mixtures. Comb. Chem. High Throughput Screen. 4, 207–217 (2001).

78. Adermann, K., John, H., Standker, L. & Forssmann, W. G. Exploiting natural peptide diversity: novel research tools and drug leads. Curr. Opin. Biotechnol. 15, 599–606 (2004).

79. Villanueva, J. et al. Serum peptide profiling by magnetic particle-assisted, automated sample processing and MALDI-TOF mass spectrometry. Anal. Chem. 76, 1560–1570 (2004).

80. Pan, H. et al. The role of prohormone convertase-2 in hypothalamic neuropeptide processing: a quantitative neuropeptidomic study. J. Neurochem. 98, 1763–1777 (2006). Peptidomic analysis of protease-knockout and wild-type mice focusing on neuropeptide products that result from prohormone convertase-2 activity.

81. Egeblad, M. & Werb, Z. New functions for the matrix metalloproteinases in cancer progression. Nature Rev. Cancer 2, 161–174 (2002).

82. Gocheva, V. et al. Distinct roles for cysteine cathepsin genes in multistage tumorigenesis. Genes Dev. 20, 543–556 (2006).

83. Joyce, J. A. et al. Cathepsin cysteine proteases are effectors of invasive growth and angiogenesis during multistage tumorigenesis. Cancer Cell 5, 443–453 (2004).

84. Lynch, C. C. et al. MMP-7 promotes prostate cancer-induced osteolysis via the solubilization of RANKL. Cancer Cell 7, 485–496 (2005). Microarray analysis of a rodent prostate cancer model, which metastasizes to bone, showed upregulation of MMP7 in osteoclasts at the tumour–bone interface, and demonstrated a requirement for MMP7 in receptor activator of NF-κB ligand (RANKL).

85. Willem, M. et al. Control of peripheral nerve myelination by the β-secretase BACE1. Science 314, 664–666 (2006). Expression analysis revealed high amounts of BACE in peripheral nerves during myelination, and BACE-deficient mice were found to display hypomyelination and defects in processing of neuregulin, an ErbB-ligand that is crucial for myelination.

86. Acuff, H. B. et al. Analysis of host- and tumor-derived proteinases using a custom dual species microarray reveals a protective role for stromal matrix metalloproteinase-12 in non-small cell lung cancer. Cancer Res. 66, 7968–7975 (2006).

87. Peduto, L. et al. ADAM12 is highly expressed in carcinoma-associated stroma and is required for mouse prostate tumor progression. Oncogene 25, 5462–5466 (2006).

88. Horiuchi, K. et al. Potential role for ADAM15 in pathological neovascularization in mice. Mol. Cell. Biol. 23, 5614–5624 (2003).

89. Saghatelian, A., Jessani, N., Joseph, A., Humphrey, M. & Cravatt, B. F. Activity-based probes for the proteomic profiling of metalloproteases. Proc. Natl Acad. Sci. USA 101, 10000–10005 (2004).

90. Stanton, H. et al. ADAMTS5 is the major aggrecanase in mouse cartilage in vivo and in vitro. Nature 434, 648–652 (2005).

91. Glasson, S. S. et al. Deletion of active ADAMTS5 prevents cartilage degradation in a murine model of osteoarthritis. Nature 434, 644–648 (2005). In references 90 and 91, Adamts4- and Adamts5-knockout mice were used to test which of these is the relevant aggrecanase in vitro and in a mouse model for osteoarthritis. ADAMTS5 emerged as the principal enzyme.

92. Yamazaki, S. et al. Mice with defects in HB-EGF ectodomain shedding show severe developmental abnormalities. J. Cell Biol. 163, 469–475 (2003).

93. Ruuls, S. R. et al. Membrane-bound TNF supports secondary lymphoid organ structure but is subservient to secreted TNF in driving autoimmune inflammation. Immunity 15, 533–543 (2001).

94. Ge, G. & Greenspan, D. BMP1 controls TGFβ1 activation via cleavage of latent TGFβ-binding protein. J. Cell Biol. 175, 111–120 (2006).

95. Alfalah, M. et al. A point mutation in the juxtamembrane stalk of human angiotensin I-converting enzyme invokes the action of a distinct secretase. J. Biol. Chem. 276, 21105–21109 (2001).

96. Jackson, L. F. et al. Defective valvulogenesis in HB-EGF and TACE-null mice is associated with aberrant BMP signaling. EMBO J. 22, 2704–2716 (2003).

97. Sternlicht, M. D. et al. Mammary ductal morphogenesis requires paracrine activation of stromal EGFR via ADAM17-dependent shedding of epithelial amphiregulin. Development 132, 3923–3933 (2005).

98. de Visser, K. E., Korets, L. V. & Coussens, L. M. De novo carcinogenesis promoted by chronic inflammation is B lymphocyte dependent. Cancer Cell 7, 411–423 (2005).

99. Lammich, S. et al. Constitutive and regulated α-secretase cleavage of Alzheimer's amyloid precursor protein by a disintegrin metalloprotease. Proc. Natl Acad. Sci. USA 96, 3922–3927 (1999).

100. Schlöndorff, J. S., Lum, L. & Blobel, C. P. Biochemical and pharmacological criteria define two shedding activities for TRANCE/OPGL that are distinct from the TNFα convertase (TACE). J. Biol. Chem. 276, 14665–14674 (2001).

101. Hikita, A. et al. Negative regulation of osteoclastogenesis by ectodomain shedding of receptor activator of NF-κB ligand. J. Biol. Chem. 281, 36846–36855 (2006).

102. Sternlicht, M. D. et al. The stromal proteinase MMP3/stromelysin-1 promotes mammary carcinogenesis. Cell 98, 137–146 (1999).

103. Wen, C., Metzstein, M. M. & Greenwald, I. SUP-17, a Caenorhabditis elegans ADAM protein related to Drosophila KUZBANIAN, and its role in LIN-12/NOTCH signaling. Development 124, 4759–4767 (1997).

104. Pan, D. & Rubin, J. KUZBANIAN controls proteolytic processing of NOTCH and mediates lateral inhibition during Drosophila and vertebrate neurogenesis. Cell 90, 271–280 (1997).

105. Li, Q., Park, P. W., Wilson, C. L. & Parks, W. C. Matrilysin shedding of syndecan-1 regulates chemokine mobilization and transepithelial efflux of neutrophils in acute lung injury. Cell 111, 635–646 (2002). An elegant analysis of the role of matrilysin (also known as MMP7) in shedding syndecan-1 with an attached CXC chemokine, and the requirement of this process in neutrophil efflux to sites of lung injury in mice.


106. Vassar, R. et al. β-secretase cleavage of Alzheimer's amyloid precursor protein by the transmembrane aspartic protease BACE. Science 286, 735–741 (1999).

107. Roberds, S. L. et al. BACE knockout mice are healthy despite lacking the primary β-secretase activity in brain: implications for Alzheimer's disease therapeutics. Hum. Mol. Genet. 10, 1317–1324 (2001).

108. Buxbaum, J. D. et al. Evidence that tumor necrosis factor α converting enzyme is involved in regulated α-secretase cleavage of the Alzheimer amyloid protein precursor. J. Biol. Chem. 273, 27765–27767 (1998).

109. Weskamp, G. et al. ADAM10 is a principal ‘sheddase’ of the low-affinity immunoglobulin E receptor CD23. Nature Immunol. 7, 1393–1398 (2006). Gain- and loss-of-function studies with mouse cells, in vivo shedding studies in mice and use of a selective pharmacological inhibitor on mouse and human B cells identified ADAM10 as the major sheddase for CD23, a target for the treatment of allergic disease and rheumatoid arthritis.

110. Puente, X. S., Sanchez, L. M., Overall, C. M. & Lopez-Otin, C. Human and mouse proteases: a comparative genomic approach. Nature Rev. Genet. 4, 544–558 (2003). An excellent review on the identification and classification of all proteases in the human genome as well as diseases of proteolysis.

111. McDermott, M. F. et al. Germline mutations in the extracellular domains of the 55 kDa TNF receptor, TNFR1, define a family of dominantly inherited autoinflammatory syndromes. Cell 97, 133–144 (1999).

112. Selkoe, D. J. The cell biology of β-amyloid precursor protein and presenilin in Alzheimer's disease. Trends Cell Biol. 8, 447–453 (1998).

113. Milla, M. E. et al. Specific sequence elements are required for the expression of functional tumor necrosis factor-α-converting enzyme (TACE). J. Biol. Chem. 274, 30563–30570 (1999).

114. Grams, F. et al. X-ray structures of human neutrophil collagenase complexed with peptide hydroxamate and peptide thiol inhibitors. Implications for substrate binding and rational drug design. Eur. J. Biochem. 228, 830–841 (1995).

115. Zhao, Y. G., Wei, P. & Sang, Q. X. Inhibitory antibodies against endopeptidase activity of human adamalysin 19. Biochem. Biophys. Res. Commun. 289, 288–294 (2001).

116. Mort, J. S. & Roughley, P. J. Production of antibodies against degradative neoepitopes in aggrecan. Methods Mol. Med. 100, 237–250 (2004).

117. Parkin, E. T. et al. Structure-activity relationship of hydroxamate-based inhibitors on the secretases that cleave the amyloid precursor protein, angiotensin converting enzyme, CD23, and pro-tumor necrosis factor-α. Biochemistry 41, 4972–4981 (2002).

118. Murphy, G. et al. Role of TIMPs (tissue inhibitors of metalloproteinases) in pericellular proteolysis: the specificity is in the detail. Biochem. Soc. Symp., 65–80 (2003).

119. Gonzales, P. E. et al. Inhibition of the TNFα converting enzyme (TACE) by its Pro domain. J. Biol. Chem. 279, 31638–31645 (2004).

120. McQuibban, G. A. et al. Matrix metalloproteinase processing of monocyte chemoattractant proteins generates CC chemokine receptor antagonists with anti-inflammatory properties in vivo. Blood 100, 1160–1167 (2002).

121. Blank, M. & Blind, M. Aptamers as tools for target validation. Curr. Opin. Chem. Biol. 9, 336–342 (2005).

122. Akashi, H., Matsumoto, S. & Taira, K. Gene discovery by ribozyme and siRNA libraries. Nature Rev. Mol. Cell Biol. 6, 413–422 (2005).

Acknowledgements

C.M.O. is supported by a Canada Research Chair in Metalloproteinase Proteomics and Systems Biology, with research grants from the Canadian Institutes of Health Research, the National Cancer Institute of Canada (with funds raised by the Canadian Cancer Association), and the Canadian Breast Cancer Research Alliance special program grant on metastasis, as well as with a Centre Grant from the Michael Smith Research Foundation. C.P.B. is funded by grants from the National Institutes of Health, from the National Institute of General Medical Sciences and from the Eye Institute, and by a sponsored Research Agreement from Novartis, Basel, Switzerland.

Competing interests statement

The authors declare no competing financial interests.

DATABASES

The following terms in this article are linked online to:

UniProtKB: http://ca.expasy.org/sprot

ADAM9 | ADAM10 | ADAM17 | MMP2 | MT1-MMP | SDF1α | TGFα | TNFα | WISP2

FURTHER INFORMATION

Carl P. Blobel’s homepage: http://www.hss.edu/research-staff_blobel-carl.asp

Christopher M. Overall’s homepage: http://www.clip.ubc.ca

CancerDegradome: http://bioweb2.bio.uea.ac.uk/cancerdegradome/welcome.html

Human, Mouse and Rat Degradomes: http://www.uniovi.es/degradome

International Proteolysis Society: http://www.protease.org

MEROPS: the Peptidase Database: http://merops.sanger.ac.uk

Nomenclature Committee of the International Union of Biochemistry and Molecular Biology — Peptidase Nomenclature: http://www.chem.qmul.ac.uk/iubmb/enzyme/EC34

Prediction of Protease Specificity (PoPS): http://pops.csse.monash.edu.au

