European Nucleotide Archive
Data Search and Retrieval
Nicole Silvester
www.ebi.ac.uk/ena
Objectives
• Understand the different types of data available from ENA
• Know how to find and download them
Data available from ENA
• Metadata (XML or tab-separated text)
• Flat files
• Sequences (FASTA)
• Raw reads (FASTQ or submitted format)
• Analysis (submitted format)
Searching for data
• By accession
• By taxon
• By search conditions:
• Text search
• Advanced search
• By sequence:
• Sequence search
Accession entry point
• http://ebi.ac.uk/ena/data/view/<ACCESSION>
• http://www.ebi.ac.uk/ena/data/view/ERS027401
• Retrieve XML format
• http://www.ebi.ac.uk/ena/data/view/ERS027401&display=XML
• Retrieve Flat file
• http://www.ebi.ac.uk/ena/data/view/AF059042&display=TEXT
• Retrieve FASTA sequence
• http://www.ebi.ac.uk/ena/data/view/AF059042&display=FASTA
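The accession URLs above follow one pattern, which can be sketched in Python as below. This is a minimal illustration that only builds the URL strings shown on the slide; the function name `accession_url` and the `DISPLAY_FORMATS` table are hypothetical, and the sketch makes no claim that these addresses are still served in this form.

```python
# Base address taken from the slide examples.
BASE_URL = "http://www.ebi.ac.uk/ena/data/view/"

# The three display values that appear on the slide.
# The lowercase keys are a convenience invented for this sketch.
DISPLAY_FORMATS = {"xml": "XML", "text": "TEXT", "fasta": "FASTA"}


def accession_url(accession, display=None):
    """Build a view URL for an accession, optionally with a display format."""
    url = BASE_URL + accession
    if display is not None:
        key = display.lower()
        if key not in DISPLAY_FORMATS:
            raise ValueError(f"unsupported display format: {display!r}")
        # The slide appends the option with '&', so the sketch does the same.
        url += "&display=" + DISPLAY_FORMATS[key]
    return url


if __name__ == "__main__":
    print(accession_url("ERS027401"))
    print(accession_url("ERS027401", "xml"))
    print(accession_url("AF059042", "text"))
    print(accession_url("AF059042", "fasta"))
```

Restricting `display` to the three values from the slide keeps the sketch from implying that other formats exist.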
Taxon entry point
• NCBI tax ID
• http://www.ebi.ac.uk/ena/data/view/Taxon:6643
• Scientific name
• http://www.ebi.ac.uk/ena/data/view/Taxon:Octopus
• Retrieve FASTA sequences
• http://www.ebi.ac.uk/ena/data/view/Taxon:6643&subtree=true&portal=sequence_release&offset=1&length=1000&display=fasta&download=fasta
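The taxon URLs can be assembled the same way. The sketch below reproduces the parameter order of the FASTA example on the slide; the function names `taxon_url` and `taxon_fasta_url` are hypothetical, and the reading of `offset` as the first record to return and `length` as the number of records is an assumption based on the parameter names only.

```python
# Base address taken from the slide examples.
BASE_URL = "http://www.ebi.ac.uk/ena/data/view/"


def taxon_url(taxon):
    """Build a taxon view URL from an NCBI tax ID or a scientific name."""
    return f"{BASE_URL}Taxon:{taxon}"


def taxon_fasta_url(taxon, offset=1, length=1000, subtree=True):
    """Build the FASTA download URL for a taxon, as shown on the slide.

    Assumption: offset is the first record and length the record count.
    """
    params = []
    if subtree:
        # Only 'subtree=true' appears on the slide, so nothing is
        # emitted when subtree is False.
        params.append(("subtree", "true"))
    params += [
        ("portal", "sequence_release"),
        ("offset", str(offset)),
        ("length", str(length)),
        ("display", "fasta"),
        ("download", "fasta"),
    ]
    # The slide joins every option with '&', including the first one.
    return taxon_url(taxon) + "".join(f"&{key}={value}" for key, value in params)


if __name__ == "__main__":
    print(taxon_url(6643))
    print(taxon_url("Octopus"))
    print(taxon_fasta_url(6643))
```

If the assumption about `offset` and `length` holds, a later batch would be requested by raising `offset` while keeping `length` fixed, for example `taxon_fasta_url(6643, offset=1001)`.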
Text Search vs Advanced Search
• Text search:
• “full text” search
• terms searched against data files
• searches across all ENA data domains
• Advanced search:
• “field-based” search
• search against specific metadata fields
• searches within chosen ENA data domain
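The difference between the two search styles can be illustrated with a toy in-memory example. Everything below is hypothetical: the records, the field names, and both functions are invented for illustration and do not reflect ENA's actual index, query syntax, or data.

```python
# Hypothetical records; "domain" stands in for an ENA data domain.
RECORDS = [
    {"domain": "sequence", "id": "rec1", "organism": "Homo sapiens",
     "mol_type": "mRNA", "description": "human insulin mRNA"},
    {"domain": "sequence", "id": "rec2", "organism": "Homo sapiens",
     "mol_type": "genomic DNA", "description": "chromosome fragment"},
    {"domain": "sequence", "id": "rec3", "organism": "Mus musculus",
     "mol_type": "mRNA", "description": "homolog of a human gene"},
    {"domain": "read", "id": "rec4", "organism": "Homo sapiens",
     "mol_type": "", "description": "human gut sample reads"},
]


def text_search(records, term):
    """Full-text style: match the term anywhere, in any domain."""
    term = term.lower()
    return [r["id"] for r in records
            if any(term in str(value).lower() for value in r.values())]


def advanced_search(records, domain, **fields):
    """Field-based style: exact matches on named fields within one domain."""
    return [r["id"] for r in records
            if r["domain"] == domain
            and all(r.get(name) == value for name, value in fields.items())]


if __name__ == "__main__":
    print(text_search(RECORDS, "human"))
    print(advanced_search(RECORDS, "sequence", organism="Homo sapiens"))
    print(advanced_search(RECORDS, "sequence", organism="Homo sapiens",
                          mol_type="mRNA"))
```

In this toy data the text search for "human" returns a mouse record that merely mentions the word and misses a human record that does not, while the field-based search returns exactly the human records in the chosen domain.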
Text Search vs Advanced Search
• want: human sequences
• want: human mRNA sequences
Sequence Search
• http://www.ebi.ac.uk/ena/search/
Resources
• http://www.ebi.ac.uk/ena/about/browser
• http://www.ebi.ac.uk/ena/about/marker-portal-web-interface
• http://www.ebi.ac.uk/ena/about/taxon-portal-web-interface
• http://www.ebi.ac.uk/ena/about/data_download
• http://www.ebi.ac.uk/ena/about/sequence_download
• http://www.ebi.ac.uk/ena/about/read_download
• http://www.ebi.ac.uk/ena/about/sequence_search