Click here to load reader

Retrovirology BioMed Central - · PDF fileBioMed Central Page 1 of 14 (page number not for citation purposes) Retrovirology Review Open Access BioAfrica's HIV-1 Proteomics Resource:

  • View

  • Download

Embed Size (px)

Text of Retrovirology BioMed Central - · PDF fileBioMed Central Page 1 of 14 (page number not for...

  • BioMed CentralRetrovirology


    Open AcceReviewBioAfrica's HIV-1 Proteomics Resource: Combining protein data with bioinformatics toolsRyan S Doherty*1, Tulio De Oliveira1, Chris Seebregts2, Sivapragashini Danaviah1, Michelle Gordon1 and Sharon Cassol1,3

    Address: 1Molecular Virology and Bioinformatics Unit, Africa Centre for Health and Population Studies, Doris Duke Medical Research Institute, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa, 2Biomedical Informatics Research Division, South African Medical Research Council, Cape Town, South Africa and 3Department of Medical Virology, University of Pretoria, Pretoria, South Africa

    Email: Ryan S Doherty* - [email protected]; Tulio De Oliveira - [email protected]; Chris Seebregts - [email protected]; Sivapragashini Danaviah - [email protected]; Michelle Gordon - [email protected]; Sharon Cassol - [email protected]

    * Corresponding author

    AbstractMost Internet online resources for investigating HIV biology contain either bioinformatics tools,protein information or sequence data. The objective of this study was to develop a comprehensiveonline proteomics resource that integrates bioinformatics with the latest information on HIV-1protein structure, gene expression, post-transcriptional/post-translational modification, functionalactivity, and protein-macromolecule interactions. The BioAfrica HIV-1 Proteomics Resource is a website that contains detailed information aboutthe HIV-1 proteome and protease cleavage sites, as well as data-mining tools that can be used tomanipulate and query protein sequence data, a BLAST tool for initiating structural analyses of HIV-1 proteins, and a proteomics tools directory. The Proteome section contains extensive data oneach of 19 HIV-1 proteins, including their functional properties, a sample analysis of HIV-1HXB2,structural models and links to other online resources. The HIV-1 Protease Cleavage Sites sectionprovides information on the position, subtype variation and genetic evolution of Gag, Gag-Pol andNef cleavage sites. The HIV-1 Protein Data-mining Tool includes a set of 27 group M (subtypes Athrough K) reference sequences that can be used to assess the influence of genetic variation onimmunological and functional domains of the protein. The BLAST Structure Tool identifies proteinswith similar, experimentally determined topologies, and the Tools Directory provides acategorized list of websites and relevant software programs. This combined database and softwarerepository is designed to facilitate the capture, retrieval and analysis of HIV-1 protein data, and toconvert it into clinically useful information relating to the pathogenesis, transmission andtherapeutic response of different HIV-1 variants. The HIV-1 Proteomics Resource is readilyaccessible through the BioAfrica website at:

    BackgroundAlthough the HIV-1 genome contains only 9 genes, it iscapable of generating more than 19 gene products. These

    products can be divided into three major categories: struc-tural and enzymatic (Gag, Pol, Env); immediate-early reg-ulatory (Tat, Rev and Nef), and late regulatory (Vif, Vpu,

    Published: 09 March 2005

    Retrovirology 2005, 2:18 doi:10.1186/1742-4690-2-18

    Received: 30 September 2004Accepted: 09 March 2005

    This article is available from:

    2005 Doherty et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

    Page 1 of 14(page number not for citation purposes)
  • Retrovirology 2005, 2:18

    Vpr) proteins. Tat, Rev and Nef are synthesized from smallmultiply-spliced mRNAs; Env, Vif, Vpu and Vpr are gener-ated from singly-spliced mRNAs, the Gag and Gag-Polprecursor polyproteins are synthesized from full-lengthmRNA. The matrix (p17), capsid (p24) and nucleocapsid(p7) proteins are produced by protease cleavage of Gagand Gag-Pol, a fusion protein derived by ribosomalframe-shifting. Cleavage of Nef generates two differentprotein isoforms; one myristylated, the other non-myri-stylated. The viral enzymes (protease, reverse tran-scriptase, RNase H and integrase) are formed by proteasecleavage of Gag-Pol. Alternative splicing, together with co-translational and post-translational modification, leads toadditional protein variability [1].

    Phylogenetic analysis, on its own, provides little informa-tion about the conformational, immunological and func-tional properties of HIV-1 proteins, but instead, focuseson the evolution and historical significance of sequencevariants. To understand the clinical significance of geneticvariation, sequence analysis needs to be combined withmethods that assess change in the structural and biologi-cal properties of HIV-1 proteins. At present, informationand tools for the systematic analysis of HIV-1 proteins arelimited, and are scattered across a wide-range of onlineresources [2,3]. To facilitate studies of the biological con-sequences of genetic variation, we have developed an inte-grated, user-friendly proteomics resource that integratescommon approaches to HIV-1 protein analysis (Figure 1).We are currently using this resource to better understandthe structure-function relationships underlying the emer-gence of antiretroviral drug resistance, and to examine theprocess of immune escape from cytotoxic T-lymphocytes(CTLs).

    We have categorized the Proteomics Resource into the fol-lowing main subject headings (Figure 2 &3):

    1. HIV Proteome Information about structure andsequence, as well as references and tutorials, for each ofthe HIV-1 proteins (Figure 4);

    2. HIV-1 Cleavage Sites Information about the positionand sequence of HIV-1 Gag, Pol and Nef cleavage sites(Figure 5);

    3. HIV Protein Data Mining Tool Application for detect-ing the characteristics of HIV-1 M group isolate (subtypeA to K) proteins using information available in publicdatabases and tools (Figure 6);

    4. HIV Structure BLAST Similarity search for analyzingHIV protein sequences with corresponding structural data(Figure 7);

    5. Proteomics Online Tools Directory of data resourcesand tools available for both protein sequence and proteinstructure analyses of HIV (Figure 8 &9).

    The proteome linkIn the HIV-1 Proteome section, each of the 19 HIV-1 pro-teins has a webpage that is divided into six parts: "generaloverview", "genomic location", "domains/folds/motifs","protein-macromolecule interactions", "primary and sec-ondary database entries", and "references and recom-mended readings" (Figure 4). The overview includes adescription of the protein, a list of known isoforms, a rep-resentative tertiary structure animated image (GIF format)of the protein and its co-ordinates (PDB format), a link tochime tutorials, if available, and information about cleav-age sites, localization, and functional activity. Thegenomic location section provides information on thelocation of the sequence relative to the referencesequence, HIV-1HXB2 [4], sequence data (fasta format),and information about the length, molecular weight andtheoretical isoelectric point (pI) of the protein. Thedomains/folds/motifs section contains informationabout functional domains and predicted motifs (glyco-sylation, myristoylation, amidation, phosphorylation andcell attachment sites) of HIV-1HXB2 [4], and provides struc-tural predictions (secondary structure, transmembraneregions, low complexity regions, and coiled-coil regions).The section on protein-macromolecule interactionsincludes information on protein complexes, protein-pro-tein/DNA/RNA interactions, signal-transduction path-ways, and potential interactions with other pathogens.The section on primary and secondary databases containsa list of database entries that are needed to retrieve infor-mation on protein structure, nucleotide/amino acidsequence data, protein sequence annotation, proteinswith similar sequence and structure (such as Los AlamosNational Laboratories HIV Sequence Database and theRCSB Protein Data Bank), as well as information on post-translational modification and protein-protein interac-tions. A list of key reviews and publications, used in thedevelopment of the BioAfrica HIV-1 Proteomics Resource,is provided in the references and recommended readingssection. As an example, the proteome webpage for Tat,describes how this protein up-regulates HIV-1 geneexpression by interacting with the long-terminal repeat(LTR) of HIV-1, promoting the elongation phase of viraltranscription, allowing full-length HIV-1 mRNA tran-scripts to be produced [5,6] (Figure 10). The webpage alsogives information on the structural organization of tatgene. The mRNA is derived from spliced exons encoded intwo different open reading frames. In HIV-1HXB2, thesereading frames are separated by a distance of 2334 nucle-otides. Some HIV-1 isolates, including HIV-1HXB2, containan artifact of laboratory strains consisting of a prematurestop codon at position 8424 of exon 2. The presence of

    Page 2 of 14(page number not for citation purposes)

Search related