References
1. Squires RB, et al. Influenza Research Database: an integrated bioinformatics resource for influenza research and surveillance. Influenza Other Respir Viruses. 2012, 6(6):404-16.
2. Noronha JM, et al. Influenza Sequence Feature Variant Type (Flu-SFVT) analysis: evidence for a role of NS1 in influenza host range restriction. J Virol. 2012, 86(10):5857-66.
3. CDC. http://www.cdc.gov/flu/pdf/avianflu/h5n1-inventory.pdf
4. Burke DF, et al. A recommended numbering scheme for influenza A HA subtypes. PLoS One. 2014, 9(11):e112302.
Overview
The Influenza Research Database (IRD, www.fludb.org), funded by the National Institute of Allergy and Infectious Diseases, serves as a single publicly accessible repository of integrated datasets and analysis tools for influenza virus research¹.
IRD Integrates Data from External Sources and Generates Novel Data from Internal Computational Pipelines
IRD Provides Analysis and Visualization Tools
• BLAST Sequence Similarity Search
• Multiple Sequence Alignment
• Phylogenetics in Super-Computing Environment
• Sequence Variation (SNP) Analysis
• Metadata-driven Comparative Genomics Analysis
• Sequence Feature Variant Type (SFVT) Analysis
• 3D Protein Structure Visualization
• Host Factor Enrichment Analysis
• Short Peptide Search
• PCR Primer Design
• Genome Annotation including SF Annotation
• HPAI H5N1 & Swine H1 Clade Classifications
• HA Subtype Numbering Conversion
• Batch sequence submission to GenBank
IRD Provides Personal Workbench for Data Storage & Sharing
Figure 1. Search sequences based on swine H1 clade(s) from the Swine H1 Clade Sequence Search page.
H1 & H5 Clade Classifications
Acknowledgements
We would like to thank the primary data providers for the data that were used throughout this study. We also recognize the scientific and technical personnel responsible for supporting and developing IRD, which has been wholly supported by the NIH/NIAID (No. HHSN272201400028C). Conflict of interest: None declared.
Sequence Feature Phenotypic Variant Type
Figure 2. The Strain Details page shows that the HA sequence of the A/Jiangsu/4/2007 strain carries the 110N PVT substitution, which the cited publication showed to increase binding to the alpha 2-6 receptor.
Figure 3. HA Subtype Numbering Conversion Result page showing the coordinates of a user-provided HA protein sequence converted to the coordinates of other HA subtypes.
HA Subtype Numbering Conversion
Custom Metadata Capturing
Conclusion
• Provides comprehensive enriched influenza virus sequence annotations
• Supports custom sequence annotation, analysis, and visualization
Novel Sequence Annotation and Analysis Tools in the Influenza Research Database (IRD)
Yun Zhang¹, Alexandra J. Lee¹, Catherine Macken², Tavis Anderson³, Amy Vincent³, David Burke⁴, Brian Aevermann¹, Douglas S. Greer¹, Lucy Stewart¹, Brian Reardon¹, Sherry He⁵, Lei Tong⁵, Sanjeev Kumar⁵, Zhiping Gu⁵, Christopher N. Larson⁶, Guangyu Sun⁶, Sam Zaremba⁵, Edward B. Klem⁵, Richard H. Scheuermann¹,⁷
¹J. Craig Venter Institute, La Jolla, CA, USA; ²University of Auckland, Auckland, New Zealand; ³U.S. Department of Agriculture, Ames, IA, USA; ⁴University of Cambridge, UK; ⁵Northrop Grumman Health Solutions, Rockville, MD, USA; ⁶Vecna Technologies, Greenbelt, MD, USA; ⁷Department of Pathology, University of California, San Diego, CA, USA
www.fludb.org
User-uploaded sequences
Figure 4. A phylogenetic tree constructed from a combination of user-provided (downloaded from GISAID) and IRD sequences. Tree leaves are color-coded by subtype. User-provided sequences are highlighted in green.
• New utility for capturing metadata associated with user-provided sequences
• Analyze and visualize user-provided sequence data and metadata along with IRD data using any IRD tools
Influenza Strain Details for A/duck/Vietnam/LBM568/2014(H5N1)
Strain Information
Strain Name: A/duck/Vietnam/LBM568/2014
Organism Name: Influenza A Virus
Subtype: H5N1
Host: IRD: Mallard/Avian; GenBank: Anas platyrhynchos var. domestica
2009 Pandemic H1N1-like (SOP): Negative
Isolation Country: Viet Nam
Collection Date: 01/08/2014
GenBank Submission Date: 08/07/2014
NCBI Taxon ID: 1518578
Complete Genome Set: Yes
Sequence Derived Phenotype Marker
Sequence Information
Segment | Subtype | Gene Product Name | GenBank Source Sequence Accession | Complete Sequence | Segment Length | IRD Submission | pH1N1-like
1 | H5N1 | PB2 Polymerase (basic) protein 2 | AB972688 | Complete | 2308 | N/A | No
2 | H5N1 | PB1 Polymerase (basic) protein 1, PB1-F2 | AB972689 | Complete | 2309 | N/A | No
3 | H5N1 | PA Polymerase (acidic) protein, PA-X protein (+61) | AB972690 | Complete | 2200 | N/A | No
4 | H5N1 | HA Hemagglutinin | AB972691 | Complete | 1737 | N/A | No
5 | H5N1 | NP Nucleoprotein | AB972692 | Complete | 1550 | N/A | No
6 | H5N1 | NA Neuraminidase | AB972693 | Complete | 1362 | N/A | No
7 | H5N1 | M1 Matrix protein 1, M2 Matrix protein 2 | AB972694 | Complete | 992 | N/A | No
8 | H5N1 | NS1 Nonstructural protein 1, NS2 Nonstructural protein 2 | AB972695 | Complete | 840 | N/A | No
alpha2-6 conferred increased binding to alpha2-6 without loss of binding to alpha2-3 by comparing HA activities using enzymatically modified chicken RBCs.
HA | Influenza A_H5_determinant-of-virulence_171(3)_171N, 172A, 239N_Decreased-virulence | 171N, 172A, 239N | No | Introduction of Ser171Asn, Thr172Ala, Ser239Asn substitutions in the A/Vietnam/1203/2004 backbone conferred increased affinity for alpha2-6 SAL using solid phase assay. The mutant virus showed 100-fold reduction in the lethality of WT. | PubMed:19116267
HA | Influenza A_H5_species-adaptation_171(3)_171N, 172A, 239N_Increased-binding-to-alpha2-6 | 171N, 172A, 239N | No | Introduction of Ser171Asn, Thr172Ala, Ser239Asn substitutions in the A/Vietnam/1203/2004 backbone conferred increased affinity for alpha2-6 SAL using solid phase assay. The mutant virus showed 100-fold reduction in the lethality of WT. | PubMed:19116267
HA | Influenza A_H5_species-adaptation_172(1)_172A_Increased-binding-to-alpha2-6 | 172A | Yes | Introduction of Thr172Ala naturally occurring substitution in the A/Vietnam/1203/2004 backbone conferred increased binding to alpha2-6 without loss of binding to alpha2-3 by comparing HA activities using enzymatically modified chicken RBCs. | PubMed:20427525
HA | Influenza A_H5_species | 172A, 238L | No | Introduction of Thr172Ala, Gln238Leu naturally | PubMed:20427525
• Curated Sequence Features² including phenotype markers in the CDC H5 Genetic Changes Inventory³
• Computed Variant Types of each Sequence Feature
• Annotated all IRD sequences with the presence/absence of Phenotypic Variant Types (PVT)
• PVT annotation tool for user-provided sequences
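The presence/absence annotation in the bullets above can be pictured as a residue check against a variant definition. The Python sketch below is an illustration only, not IRD's implementation. The function names are hypothetical, the variant string format follows entries such as "171N, 172A, 239N" on the Strain Details page, and positions are assumed to be 1-based indices into the protein sequence supplied (the actual tool works in the numbering of each curated Sequence Feature).

```python
import re
from typing import Dict


def parse_variant(spec: str) -> Dict[int, str]:
    """Parse a spec such as '171N, 172A, 239N' into {171: 'N', 172: 'A', 239: 'N'}."""
    residues: Dict[int, str] = {}
    for token in spec.split(","):
        match = re.fullmatch(r"\s*(\d+)([A-Z])\s*", token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        residues[int(match.group(1))] = match.group(2)
    return residues


def carries_variant(protein: str, spec: str) -> bool:
    """Return True only if every listed position (1-based, assumed) holds the listed residue."""
    for position, residue in parse_variant(spec).items():
        if position > len(protein) or protein[position - 1] != residue:
            return False
    return True


# Toy sequence for illustration; not a real HA protein.
toy_protein = "MEKIVLLF"
has_variant = carries_variant(toy_protein, "3K, 5V")
```

A multi-residue variant counts as present only when all of its residues match, which is consistent with the Strain Details table, where the same strain is marked "Yes" for 172A alone and "No" for the 171N, 172A, 239N combination.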
• Based on the HA subtype numbering scheme by Burke and Smith (2014)⁴
• Automatically converts the coordinates of any HA protein sequence to the coordinates of any other subtype, in order to map functional domains or phenotype markers across subtypes.
• Integrated with IRD analysis tools including Sequence Variation Analysis and metadata-driven Comparative Analysis Tool for Sequences (meta-CATS)
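The coordinate conversion described above amounts to reading positions across an alignment. The following Python sketch assumes the two sequences have already been aligned (Burke and Smith derive their scheme from a curated cross-subtype alignment; this example does not reproduce it). The function name and the toy sequences are hypothetical.

```python
from typing import Dict, Optional


def position_map(aligned_src: str, aligned_dst: str, gap: str = "-") -> Dict[int, Optional[int]]:
    """Map 1-based positions of the source sequence to positions of the target.

    Both inputs are rows of the same alignment. A source residue aligned
    to a gap in the target has no equivalent and maps to None.
    """
    if len(aligned_src) != len(aligned_dst):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    src_pos = 0
    dst_pos = 0
    for src_char, dst_char in zip(aligned_src, aligned_dst):
        if dst_char != gap:
            dst_pos += 1
        if src_char != gap:
            src_pos += 1
            mapping[src_pos] = dst_pos if dst_char != gap else None
    return mapping


# Toy alignment rows for illustration; not real HA sequences.
mapping = position_map("MK-TIIAL", "MKANLL-L")
```

With such a map, a phenotype marker reported at one position in the source numbering can be looked up in the target numbering, or flagged as having no counterpart when the lookup returns None.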
• Swine H1 classification algorithm based on the USDA/OFFLU swine H1 classification scheme
• H5 classification algorithm based on the CDC/WHO HPAI H5N1 classification scheme
• Annotated all IRD sequences with H1/H5 clade assignments
• H1 & H5 clade classification tools for user-provided sequences
Data Aggregated by IRD (Source)
• Strains (GenBank): 106,760
• Segment Sequences (GenBank): 442,929
• Proteins (GenBank and UniProt): 705,701
• 3D Protein Structures (PDB): 662
• Experimentally Determined Epitopes (IEDB): 6,304
Data Directly Submitted to IRD (Source)
• Surveillance Records (NIAID CEIRS): 629,403
• Serology Data Records (NIAID CEIRS): 35,584
• Human Samples with Clinical Metadata (NIAID GSCID): 736,576
• Host Factor Experiments (NIAID Systems Biology): 57
• Host Factor Data (ViPR Driving Biological Projects): coming soon
Data Derived/Annotated by IRD
• Sequence Features: 3,482
• Proteins with Predicted Epitopes: 616,961
• Proteins with Pfam Domains: 662,132
• Proteins with Other Domains/Motifs: 442,032
• Proteins with GO IDs: 508,568
• Segments with Pre-computed Alignments: 425,864
• Strains with Predicted pH1N1 Classification: 44,918
• Strains with Predicted H5 Clade Classification: 7,028
• Antiviral Drugs: 70
Release Date: May 12, 2016
This project is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN272201400028C and is a collaboration between Northrop Grumman Health IT, J. Craig Venter Institute, and Vecna Technologies.
[Figure 1 screenshot: search form with the fields Data to Return (Segment / Nucleotide, Protein, Strain), Select Clade(s), Select Segments, Complete Sequences (Include Partial Sequences, Complete Segments Only, Complete Genomes Only), Date Range, Host, Geographic Grouping, Country, and Advanced Options. Results matching the criteria: 904.]
Swine H1 Clade Sequence Search
An IRD algorithm classifies the clade of the HA of H1 viruses, from any host and for any NA subtype, with reference to the USDA classification of US swine H1 viruses. This algorithm, which is based on phylogenetic analysis, is an adaptation of that used for classifying HA(H5) sequences; it was developed by IRD team member Catherine Macken, in conjunction with Tavis Anderson and other swine influenza experts at the USDA. It has been verified as highly accurate (> 99%) for sequences of at least 300 nucleotides of HA1. See SOP for more details. HAs not belonging to any of the recognized US swine H1 clades are given the classification "Other." Non-segment 4 sequences from a virus with a US swine H1 classification are given the same assignment as the HA.
Representative tree of swine HA(H1) sequences showing named US swine clades
Description of clades with names that include "like"
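The two bookkeeping rules in the description above (unrecognized HAs become "Other", and the remaining segments inherit the HA's assignment) can be sketched in a few lines of Python. This is a hedged illustration only: the phylogenetic classification step itself is not shown, and the function names, strain names, and clade label used here are hypothetical placeholders.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def label_ha(clade: Optional[str], recognized: Set[str]) -> str:
    """Return the clade if it is a recognized one, otherwise 'Other'."""
    return clade if clade in recognized else "Other"


def propagate(
    records: Iterable[Tuple[str, int]],
    ha_labels: Dict[str, str],
) -> Dict[Tuple[str, int], Optional[str]]:
    """Give every (strain, segment) record the label of that strain's HA (segment 4).

    Strains without a classified HA are left as None; how IRD handles
    that case is not stated in the description, so this is an assumption.
    """
    return {(strain, segment): ha_labels.get(strain) for strain, segment in records}


# Hypothetical strains and clade label, for illustration only.
recognized_clades = {"cladeA"}
ha_labels = {"strain1": label_ha("cladeA", recognized_clades)}
assignments = propagate([("strain1", 4), ("strain1", 6), ("strain2", 1)], ha_labels)
```

Keeping the HA label as the single source of truth per strain means that a search by clade can return any segment of a virus, which matches the Select Segments option on the search page.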