Training materials

Ensembl materials are protected by a CC BY license: http://creativecommons.org/licenses/by/4.0/
If you wish to re-use these, please credit Ensembl for their creation.
If you use Ensembl for your work, please cite our papers: http://www.ensembl.org/info/about/publications.html

Ensembl Browser Workshop


Page 1: Ensembl Browser Workshop


Page 2: Ensembl Browser Workshop

Denise Carvalho-Silva European Molecular Biology Laboratory

European Bioinformatics Institute

Browsing Genes, Variation and Regulation data with Ensembl

UCD - Dublin

Page 3: Ensembl Browser Workshop

Today 09:30-17:00

•  Introduction to Ensembl

•  Browser walkthrough

10:45-11:00 coffee/tea

•  Browser exercises

•  BioMart (Talk + Exercises)

13:00-14:00 lunch break

•  Genetic variation (Talk + Exercises)

15:30-15:45 coffee/tea

•  Gene Regulation and/or Custom data (Talk + Exercises)

•  Wrap up, photo opportunity & feedback survey

Page 4: Ensembl Browser Workshop

http://www.ebi.ac.uk/~denise/workshops/2016/dublin

Materials

Page 5: Ensembl Browser Workshop

Course Objectives

What is Ensembl?

What type of data can you get in Ensembl?

How to navigate the Ensembl browser website?

How to connect with Ensembl

Page 6: Ensembl Browser Workshop


Page 7: Ensembl Browser Workshop

Introduction

Why do we need/have genome browsers?

Page 8: Ensembl Browser Workshop

Genome sequencing

1977: 1st genome to be sequenced (5 kb)
2000: draft human sequence (3 Gb)

Large amounts of raw DNA sequence data

Page 9: Ensembl Browser Workshop

Raw DNA sequence data

Page 10: Ensembl Browser Workshop

Annotation: making sense

Page 11: Ensembl Browser Workshop

Annotation of vertebrate genomes

www.ensembl.org

pre.ensembl.org

>80 genomes*, D. melanogaster, C. elegans, S. cerevisiae

*Release 84, March 2016

Page 12: Ensembl Browser Workshop

1 human genome → 3 assemblies

www.ensembl.org grch37.ensembl.org

e54.ensembl.org

Page 13: Ensembl Browser Workshop

EBI is an Outstation of the European Molecular Biology Laboratory.

Ensembl Features

Gene models
Comparative Genomics
Variation
Regulation
Custom data display
Programmatic access
Toolkit

Page 14: Ensembl Browser Workshop


Page 15: Ensembl Browser Workshop

Ensembl automatic annotation

Page 16: Ensembl Browser Workshop

Gene models in Ensembl

Goal: Generate set of well-supported genes

Automatic Manual

Page 17: Ensembl Browser Workshop

Automatic annotation*
•  many species
•  genome-wide at once
•  ~ 4 months

Manual annotation*
•  fewer species
•  gene by gene
•  many years

Automatic and coding (20_)
Manual and coding (00_)
Manual and non-coding (00_)
Automatic + Manual (“gold”)

* based on experimental, biological evidence (INSDC, UniProtKB…)

Page 18: Ensembl Browser Workshop

Ensembl genes & transcripts

•  merged annotation

•  higher confidence and quality

•  comprehensive: alternatively spliced transcripts

UTR Exon Intron

5’ UTR 3’ UTR

Gold (identical annotation) = Automatic + Manual

Page 19: Ensembl Browser Workshop

Alternative splicing

rich and comprehensive annotation

Page 20: Ensembl Browser Workshop

Which transcript to use?

http://www.ensembl.org/Help/Glossary?id=493
http://www.ensembl.org/Help/Glossary?id=492

APPRIS

TSLs

Page 21: Ensembl Browser Workshop

CCDS project

•  annotate a consensus coding DNA sequence set
•  EBI, WTSI, UCSC and NCBI

Genome Res. 19:1316-23 (2009)

http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi

CCDS transcript

Page 22: Ensembl Browser Workshop

Disclaimer: which transcript to use

No single method will tell us which transcript to use
Decision on a case-by-case basis

•  All transcripts OR one/two well supported ones?

List of transcripts: we offer choices based on
•  CCDS (Ensembl, HAVANA, NCBI, UCSC)
•  Golden transcripts (identical Ensembl and HAVANA)
•  Cross reference entries (e.g. UniProtKB, RefSeq)
•  APPRIS
•  TSLs

Page 23: Ensembl Browser Workshop

Annotation based on RNASeq

http://www.ensembl.org/info/genome/genebuild/rnaseq_annotation.html

Page 24: Ensembl Browser Workshop

ncRNA gene annotation

http://www.ensembl.org/info/genome/genebuild/ncrna.html

Page 25: Ensembl Browser Workshop

Ensembl stable identifiers

•  ENSG########### Ensembl Gene ID
•  ENST########### Ensembl Transcript ID
•  ENSP########### Ensembl Peptide ID
•  ENSE########### Ensembl Exon ID

•  For non-human species a species code is inserted after ENS, e.g. ENSMUSG: MUS (Mus musculus) for mouse
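
The ID layout above can be sketched in code. The snippet below is a minimal illustration, not an official Ensembl parser. It assumes that the numeric part is always 11 digits (the ########### on the slide) and that a species code is three or more uppercase letters, which is inferred from the single MUS example. The function name and the sample IDs are made up for illustration.

```python
import re

# Feature-type letters listed on the slide.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "peptide", "E": "exon"}

# Assumption: optional species code of 3+ uppercase letters (e.g. MUS),
# then one feature-type letter, then exactly 11 digits.
STABLE_ID = re.compile(r"^ENS([A-Z]{3,})?([GTPE])(\d{11})$")


def parse_stable_id(stable_id):
    """Split a stable ID into species code, feature type and number."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        raise ValueError(f"not a recognised stable ID: {stable_id!r}")
    species_code, type_letter, number = match.groups()
    return {
        "species_code": species_code,  # None means human (no code)
        "feature_type": FEATURE_TYPES[type_letter],
        "number": int(number),
    }


# Placeholder IDs built from the pattern, not real database entries.
human_gene = f"ENSG{1:011d}"
mouse_gene = f"ENSMUSG{1:011d}"
print(parse_stable_id(human_gene))
print(parse_stable_id(mouse_gene))
```

A check like this can catch truncated or mistyped IDs before they are pasted into a search box or a BioMart filter.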

Page 26: Ensembl Browser Workshop
Page 27: Ensembl Browser Workshop

Ensembl Browser

Live demo: Walking through the website

pages 11-31

Page 28: Ensembl Browser Workshop

The ESPN gene products are active in the inner ear, where they appear to play an essential role in normal hearing and balance.

Let’s explore ESPN

Before we start: background

Page 29: Ensembl Browser Workshop

A) What is the location and strand of the human ESPN gene?

B) How can I view protein alignments and variants mapped to this location?

C) Can I move data tracks up and down, share and delete tracks?

Human ESPN: location

Page 30: Ensembl Browser Workshop

A) How can I find the genomic sequence of this gene? What is the ID of its first exon?

B) Can I display the genomic coordinates and variants on this sequence?

C) Can I find information on the expression of this gene in different tissues?

Human ESPN: gene

Page 31: Ensembl Browser Workshop

A) How many exons does the longest ESPN transcript have? Are there any completely untranslated exons?

B) Can I find its cDNA sequence?

C) What are the UniProt and RefSeq entries cross referenced to this transcript?

Human ESPN: transcript

Page 32: Ensembl Browser Workshop

Ensembl Browser

Exercises pages 32-35

Answers

www.ebi.ac.uk/~denise/workshops/2016/dublin/answers

Feel free to explore your favourite gene/region too!

Page 33: Ensembl Browser Workshop


BioMart

Page 34: Ensembl Browser Workshop

Outline

•  Definitions
•  The principle: 4 steps
•  Tutorial: simple query in human
•  Find Ensembl BioMart and BioMart elsewhere
•  Sophisticated platforms: mart services, APIs, etc…
•  Exercises

Page 35: Ensembl Browser Workshop

Would you like to…

•  … convert protein IDs into gene IDs or names?

•  … get a list of all genes mapped to a region deleted in a patient cohort?

•  … export sequences for a bunch of genes or variants?

If you answered yes, keep listening!

Page 36: Ensembl Browser Workshop

What is BioMart?

•  Free service for easy retrieval of Ensembl data
•  Data export tool with little/no programming required
•  Complex queries with a few mouse clicks
•  Output formats (.xls, .csv, fasta, tsv, html)

Page 37: Ensembl Browser Workshop

The four-step principle

DATA FILTERS ATTRIBUTES RESULTS

DATA: Database, Dataset
FILTERS: IDs, Regions, Domains, Expression
ATTRIBUTES: Homologs, Sequences, Features, Structures
RESULTS: Tables, Fasta

Page 38: Ensembl Browser Workshop

Choosing the data

Database and dataset

Page 39: Ensembl Browser Workshop

Selecting the filters

Limit your data set (information that you know)

Click “Count” to see if BioMart is reading the input data

Page 40: Ensembl Browser Workshop

Picking the attributes

Determine output columns (information you want to know)

Page 41: Ensembl Browser Workshop

The different attributes

Page 42: Ensembl Browser Workshop

Getting the results

Tables/sequences

Click “Unique results only”

For the full table: click View “ALL” rows or “Go”

Page 43: Ensembl Browser Workshop

Selected IBD genes

IL23R, PTPN22, CUL2, C1orf106, IL18RAP

Page 44: Ensembl Browser Workshop

Tutorial: BioMart

For the IL23R, PTPN22, CUL2, C1orf106 and IL18RAP genes, use BioMart to retrieve a table (.xls) containing:
•  Associated gene name, ENSG and ENST IDs
•  Chromosome name, gene start and end
•  GO term name and InterPro description

Page 45: Ensembl Browser Workshop

The four-step principle

DATA FILTERS ATTRIBUTES RESULTS

DATA: Human, Gene
FILTERS: IL23R, CUL2, PTPN22, C1orf106, IL18RAP
ATTRIBUTES: Gene name, ENSG/ENST ID, Chr, start, end, GO term name, InterPro description
RESULTS: .xls table
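
The same four steps can also be written down as a BioMart XML query, the form accepted by the MartService interface. The sketch below only builds the query text and makes no network call. The dataset, filter and attribute internal names used here (hsapiens_gene_ensembl, hgnc_symbol, name_1006 and so on) are assumptions based on common Ensembl BioMart usage and should be checked against the current BioMart interface. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, formatter="TSV"):
    """Return a BioMart XML query string: dataset, filters, attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,        # RESULTS: output format
        "header": "1",
        "uniqueRows": "1",             # same idea as "Unique results only"
        "datasetConfigVersion": "0.6",
    })
    # DATA: which dataset to query
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # FILTERS: what you already know (comma-separated values per filter)
    for name, values in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": ",".join(values)})
    # ATTRIBUTES: the output columns you want
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


# Assumed internal names; verify them in the BioMart web interface.
QUERY_XML = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filters={"hgnc_symbol": ["IL23R", "PTPN22", "CUL2", "C1orf106", "IL18RAP"]},
    attributes=[
        "external_gene_name", "ensembl_gene_id", "ensembl_transcript_id",
        "chromosome_name", "start_position", "end_position",
        "name_1006", "interpro_description",
    ],
)
print(QUERY_XML)
```

Writing the query out this way makes it easy to rerun the same export later, or to swap the gene list without clicking through the interface again.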

Page 46: Ensembl Browser Workshop

Ensembl BioMart

Live demo

Page 47: Ensembl Browser Workshop

Find BioMart

www.ensembl.org/biomart

Page 48: Ensembl Browser Workshop

Ensembl BioMarts

Page 49: Ensembl Browser Workshop

http://tinyurl.com/biomart-video

BioMart video

Page 50: Ensembl Browser Workshop

More sophisticated platforms

•  BioMart queries: MartService www.biomart.org/martservice.html
•  APIs: Perl, Java, Web Services
•  Third-party software: galaxyproject.org, bioconductor.org, taverna.org.uk

Page 51: Ensembl Browser Workshop
Page 52: Ensembl Browser Workshop

Ensembl BioMart

Thomas Maurel

Amonida Zadissa

Page 53: Ensembl Browser Workshop

BioMart

Step-by-step example pages 36-40

Exercises

pages 41-43

Answers

www.ebi.ac.uk/~denise/workshops/2016/dublin/answers

Feel free to explore BioMart in other contexts too!

Page 54: Ensembl Browser Workshop


Genetic Variation

Page 55: Ensembl Browser Workshop

Outline

•  Classes of variation, species and sources

•  Browsing variation data: some entry points
   •  Location tab
   •  Gene tab
   •  Variation tab

•  Phenotypic data and population genetics

•  How to annotate your own variants

•  Exercises

Page 56: Ensembl Browser Workshop

Genetic variation

1) Large scale: structural (> 50 base pairs): duplication, deletion, inversion, translocation, loss

2) Short scale: SNPs (or SNVs), indels

G A C T G A C T A T C G G G G T T T C C C A A A
G A A T G A C T T T C G G - G - T T C C - A A A
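
The two aligned sequences on this slide can be compared column by column to pick out the short-scale variants. This is a small sketch that treats the top row as the reference, which the slide does not state explicitly. Column numbers are positions in the alignment, not genomic coordinates, and the function name is illustrative.

```python
# The two rows from the slide, split into one symbol per alignment column.
REFERENCE = "G A C T G A C T A T C G G G G T T T C C C A A A".split()
SAMPLE = "G A A T G A C T T T C G G - G - T T C C - A A A".split()


def classify_columns(reference, sample):
    """List every column where the two aligned rows differ."""
    if len(reference) != len(sample):
        raise ValueError("aligned rows must have the same length")
    differences = []
    for column, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if ref_base == alt_base:
            continue
        if alt_base == "-":
            kind = "deletion"      # base present in reference, gap in sample
        elif ref_base == "-":
            kind = "insertion"     # gap in reference, base in sample
        else:
            kind = "SNP"           # single base substituted
        differences.append((column, ref_base, alt_base, kind))
    return differences


for difference in classify_columns(REFERENCE, SAMPLE):
    print(difference)
```

Structural variants (> 50 base pairs) are out of scope for a per-column comparison like this one.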

Page 57: Ensembl Browser Workshop

Species with variation data

Understand the types of genetic variation data and how to view them in the context of our genomes

Page 58: Ensembl Browser Workshop

Sources of variation data

•  Import alleles and frequencies

•  Annotate variants

http://www.ensembl.org/info/docs/variation/sources_documentation.html

Page 59: Ensembl Browser Workshop

Location tab: across a region

SVs SNPs

Ensembl genes

Page 60: Ensembl Browser Workshop

Gene tab: gene-centric SNPs

SVs

Page 61: Ensembl Browser Workshop

Variation tab: variant centric

summary data

SNP or SV

Page 62: Ensembl Browser Workshop

Variants on the karyotype

Page 63: Ensembl Browser Workshop

Phenotype data in Ensembl species and sources

Page 64: Ensembl Browser Workshop

Population data for variants

http://hapmap.ncbi.nlm.nih.gov/

http://www.1000genomes.org

Page 65: Ensembl Browser Workshop

pie charts: 1KG super populations

Human Population Genetics

Page 66: Ensembl Browser Workshop

Coffee intake is a worldwide phenomenon, with Finland at the top and the UK in 44th place. Is caffeine consumption in our genes?

A)  What are the chromosome locations of variants associated with this phenotype?

B)  Which variant has got the most significant association?

C)  What is the ancestral allele of this variant? Is it conserved in eutherian mammals?

D)  What is the most frequent allele in GBR?

E)  Can you download this variant and 200 nt upstream and downstream flanking sequence in RTF (Rich Text Format)?

Live demo

Page 67: Ensembl Browser Workshop

before we find out

•  What is it?

•  What does it do?

•  Where can I find it?

Page 68: Ensembl Browser Workshop

I’ve got a list of genetic variants from my resequencing project of a cohort study of breast cancer in London. The positions are all on chromosome 9, GRCh37 assembly:

131090740 A/- (positive strand)

131084628 C/A (positive strand)

131085358 C/G (positive strand)

131085196 G/A (positive strand)

1)  Do any of these cause a change at the amino acid level?

2)  Are these predicted to be deleterious according to PolyPhen?

3)  Can I get the flanking sequence (200 nt both up and downstream) for the known variants in this set?

A use case: cancer patients
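
For question 3, the flanking window is simple coordinate arithmetic. The sketch below turns each position from the list above into a chromosome:start-end region string, the notation used for genomic locations in the browser. It only computes coordinates and fetches no sequence; the function name is hypothetical.

```python
# Positions from the slide (chromosome 9, GRCh37).
POSITIONS = [131090740, 131084628, 131085358, 131085196]


def flanking_region(chromosome, position, flank=200):
    """Return 'chromosome:start-end' covering `flank` nt on each side."""
    start = max(1, position - flank)   # never go below the first base
    end = position + flank
    return f"{chromosome}:{start}-{end}"


for position in POSITIONS:
    print(position, flanking_region("9", position))
```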

Page 69: Ensembl Browser Workshop

My resequencing experiments: cancer patients X healthy controls

Positions in the genome vary between the two groups

Chromosome Start End Alleles
9 131090740 131090740 A/-
9 131084628 131084628 C/A
9 131085358 131085358 C/G
9 131085196 131085196 G/A

Page 70: Ensembl Browser Workshop

Can I annotate these variants?

Yes, you can!

•  Variant Effect Predictor (VEP)
•  Annotate variants (SNPs, CNVs, indels)
•  Available for GRCh38 and GRCh37 (hg19)

PMID: 20562413

Perl script, Web interface, REST API

XML

Page 71: Ensembl Browser Workshop

CODING Synonymous

INTRONIC 5’ UTR

ATG AAAAAAA Regulatory

Splice sites

CODING Missense

3’ UTR 5’ Upstream 3’ downstream

Mapping variants on transcripts

Identify transcripts that overlap variants and predict the consequence of these on Ensembl (or RefSeq) transcripts using VEP

Page 72: Ensembl Browser Workshop

Consequence terms for variants

http://www.ensembl.org/info/genome/variation/predicted_data.html#consequence_type_table

* defined by the Sequence Ontology (SO) project (http://www.sequenceontology.org/)

Page 73: Ensembl Browser Workshop

SIFT sift.jcvi.org/

Consequence: missense GAG >GGG Glu > Gly

PolyPhen-2 genetics.bwh.harvard.edu/pph2/ Condel

dbNSFP

Page 74: Ensembl Browser Workshop

Ensembl tools http://www.ensembl.org/tools.html

http://www.ensembl.org/vep

Page 75: Ensembl Browser Workshop

Inputting data into VEP

Chromosome Start End Alleles Strand
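
The five columns above can be produced from a variant list with a few lines of code. This sketch formats the four chromosome 9 variants from the use case as whitespace-separated lines, one per variant, with "+" for the positive strand. The exact strand notation accepted by the tool should be checked against the VEP input documentation; the helper name is made up.

```python
# (chromosome, start, end, alleles, strand) for the use-case variants.
VARIANTS = [
    ("9", 131090740, 131090740, "A/-", "+"),
    ("9", 131084628, 131084628, "C/A", "+"),
    ("9", 131085358, 131085358, "C/G", "+"),
    ("9", 131085196, 131085196, "G/A", "+"),
]


def to_input_line(chromosome, start, end, alleles, strand):
    """Format one variant as: Chromosome Start End Alleles Strand."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return f"{chromosome} {start} {end} {alleles} {strand}"


INPUT_LINES = [to_input_line(*variant) for variant in VARIANTS]
print("\n".join(INPUT_LINES))
```

The printed lines can be pasted into the input box of the web interface or saved to a file for upload.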

Page 76: Ensembl Browser Workshop

Output options in VEP

GAG > GGG Glu > Gly

GAG > GAA Glu > Glu
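
The two codon changes above illustrate how a coding consequence follows from the genetic code. The sketch below translates the reference and alternative codons with the standard codon table and labels the change with a Sequence Ontology term. It is a simplified illustration, not the VEP algorithm: it ignores transcript context, start codons and splice sites.

```python
# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def coding_consequence(ref_codon, alt_codon):
    """Return a Sequence Ontology term for a single codon substitution."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous_variant"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense_variant"


for ref, alt in [("GAG", "GGG"), ("GAG", "GAA")]:
    print(ref, ">", alt, CODON_TABLE[ref], ">", CODON_TABLE[alt],
          coding_consequence(ref, alt))
```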

Page 77: Ensembl Browser Workshop

Ticket system in VEP

Ticket identifier, Job name

Job status: Queued, Running, Done, Failed

Save to your account (log in), Edit and resubmit your job, Delete job

Page 78: Ensembl Browser Workshop

Viewing results

SO consequence terms*

*http://www.sequenceontology.org/index.html

Page 79: Ensembl Browser Workshop

ensembl.org/info/docs/tools/vep/online/results.html#summary

Viewing results

Table
•  Before / after filtering
•  novel / existing variants

Pie charts (consequence terms)
•  total observed (more than one per variant)
•  Separate chart: coding consequences

Page 80: Ensembl Browser Workshop

Results table

•  Navigate results (one row per variant/transcript overlap)
•  Show/hide columns in results table (more columns: scroll right)
•  Create and edit filters
•  Download results
•  Send results to BioMart

ensembl.org/info/docs/tools/vep/online/results.html#table

Page 81: Ensembl Browser Workshop

Filtering results

Filters consist of three components:
•  Field, e.g. Consequence, biotype
•  Operator, e.g. is, matches (partial string matches)
•  Value: the value to compare against; some fields have autocomplete values

Multiple filters allowed with logical relationship (AND, OR)
Active filters can be edited too!

ensembl.org/info/docs/tools/vep/online/results.html#filter
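
The field / operator / value idea can be mimicked on a downloaded results table. The sketch below is an illustration of the concept only, not the code behind the web interface. It supports the two operators named on the slide and combines filters with AND or OR. The example rows are invented placeholders, not real VEP output.

```python
# Operators named on the slide: exact match and partial string match.
OPERATORS = {
    "is": lambda cell, value: cell == value,
    "matches": lambda cell, value: value in cell,
}


def apply_filters(rows, filters, logic="AND"):
    """Keep rows that satisfy the (field, operator, value) filters."""
    if logic not in ("AND", "OR"):
        raise ValueError("logic must be 'AND' or 'OR'")
    combine = all if logic == "AND" else any
    return [
        row for row in rows
        if combine(
            OPERATORS[operator](str(row.get(field, "")), value)
            for field, operator, value in filters
        )
    ]


# Invented rows for illustration only.
ROWS = [
    {"Consequence": "missense_variant", "Biotype": "protein_coding"},
    {"Consequence": "synonymous_variant", "Biotype": "protein_coding"},
    {"Consequence": "intron_variant", "Biotype": "processed_transcript"},
]

missense_coding = apply_filters(
    ROWS,
    [("Consequence", "is", "missense_variant"), ("Biotype", "matches", "coding")],
)
print(missense_coding)
```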

Page 82: Ensembl Browser Workshop


Page 83: Ensembl Browser Workshop

Ensembl VEP

Live demo

Page 84: Ensembl Browser Workshop

VEP video

http://tinyurl.com/vep-video

Page 85: Ensembl Browser Workshop

Things to bear in mind

1)  No distinction is made between polymorphisms and mutations. Exceptions: HGMD and COSMIC, which contain only mutations;

2)  C/T → the first allele is the one in the reference genome, not necessarily the major or the ancestral allele;

3)  Ensembl reports all alleles on the forward strand (different from dbSNP).
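
Point 3 matters when alleles come from a source that reports them on the reverse strand. A minimal sketch of the conversion, assuming alleles are written as a slash-separated string such as C/T and that "-" marks a deleted allele: each allele is reverse-complemented to give its forward-strand form. The function name is illustrative.

```python
# Complement for DNA bases; any other character (such as "-") is left alone.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_forward_strand(alleles, strand):
    """Return the allele string as it reads on the forward strand."""
    if strand in ("+", 1):
        return alleles                      # already forward
    if strand not in ("-", -1):
        raise ValueError(f"unknown strand: {strand!r}")
    # Reverse-complement every allele separately.
    return "/".join(
        allele.translate(COMPLEMENT)[::-1] for allele in alleles.split("/")
    )


print(to_forward_strand("C/T", "-"))
print(to_forward_strand("C/T", "+"))
```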

Page 86: Ensembl Browser Workshop

Ensembl Variation API

Variation Schema Description

http://useast.ensembl.org/info/docs/api/variation/index.html

Page 87: Ensembl Browser Workshop

Variation Team

Fiona Cunningham

Will McLaren

Laurent Gil

Sarah Hunt

Anja Thormann

Page 88: Ensembl Browser Workshop
Page 89: Ensembl Browser Workshop

Ensembl Genetic Variation

Exercises pages 46-50

Answers

www.ebi.ac.uk/~denise/workshops/2016/dublin/answers

Feel free to explore your favourite variant/phenotype too!

Page 90: Ensembl Browser Workshop


Gene Regulation

Page 91: Ensembl Browser Workshop

Outline

•  Definition and models
•  Epigenetics and Epigenomics
•  Ensembl Regulation: goal, data sources, species
•  Viewing / accessing regulation data in Ensembl
•  Track hubs: ENCODE, Blueprint
•  Exercises

Page 92: Ensembl Browser Workshop

Regulation of gene expression

•  Change in the production of mRNA/proteins (increase or decrease)
•  From the transcription to post-translational levels
•  Models of regulation of gene transcription
   •  Basic
   •  Expanded
   •  Complete ??

Page 93: Ensembl Browser Workshop

Transcription regulation

Transcription Factor Binding Sites Promoter Gene

mRNA

Transcription Factors Activation

Repression RNA polymerase complex

2 nm

basic model

•  TF binding (promoters, enhancers) → transcription

Page 94: Ensembl Browser Workshop

Nucleosomes

Histones

Histone marks

CpG methylation

11 nm

Transcription regulation expanded model

•  Epigenetic marks may affect the binding of TFs

Page 95: Ensembl Browser Workshop

Histone modifications dynamically regulating genes

Jill S. Butler and Sharon Y. R. Dent, Blood 2013;121:3076-3084

Page 96: Ensembl Browser Workshop

Packed Chromatin

30 nm

Open Chromatin

Distal enhancer

Complete (???) model

Transcription regulation

Page 97: Ensembl Browser Workshop

Epigenetics/Epigenomics

Epigenetics* The study of inherited changes in phenotype without changes in genotype

Epigenomics Epigenetics on a genome-wide scale

http://integratedhealthcare.eu/

*One of the routes to regulate gene transcription

Page 98: Ensembl Browser Workshop

Measuring gene expression

Northern/Western blot Microarrays

SAGE

Adapted from Darryl Leja, Ian Dunham

NGS techniques

DNase-seq ChIP-seq RNASeq

RT-qPCR

Page 99: Ensembl Browser Workshop

ChIP-sequencing

crosslink and shear

TF1 TF2 TF3

TF1 TF3 TF2

Antibodies and IP

unlink, purify and DNA sequencing

Y Y Y

TF1 TF3 TF2

ACGTC CGCTT GAACA

map back to the genome

DNA and proteins

Page 100: Ensembl Browser Workshop

Ensembl Regulation

Goal: Annotate the genome with features that may play a role in the transcriptional regulation of genes

Multiple data sources: collection and summary

http://www.ensembl.org/info/docs/funcgen/regulation_sources.html
http://www.ensembl.org/Homo_sapiens/Experiment/

Page 101: Ensembl Browser Workshop

Data source: ENCODE

“Encyclopedia of DNA Elements”
Trying to assign function to as many regions as possible
Transcription and regulatory information
4,626 datasets, 2,498 cell types → functional elements
PMID: 22955616, PMID: 17571346

http://www.nature.com/encode/#/threads

Page 102: Ensembl Browser Workshop

Data source: Roadmap

NIH consortium: public resource of normal epigenomes
DNA methylation, histone marks, open chromatin, small RNA

http://www.roadmapepigenomics.org/data
http://www.roadmapepigenomics.org/publications

Page 103: Ensembl Browser Workshop

Data source: Blueprint

•  EU consortium: generate 100 reference epigenomes
•  Blood cells: healthy individuals and malignant leukaemic counterparts
•  1046 experiments {ChIP, RNA, Bisulfite, DNase}-Seq
•  425 cell types and seven cell lines
•  http://www.blueprint-epigenome.eu/

Page 104: Ensembl Browser Workshop

Dataset for the Ensembl build

raw data → Ensembl Regulation pipeline → Ensembl annotation

Page 105: Ensembl Browser Workshop

Regulation data: view

MultiCell: all cell lines combined
Displayed by default

Page 106: Ensembl Browser Workshop

Regulatory features: view

Configure this page → Regulation → Regulatory features

For single and individual cell lines, e.g. GM12878, HUVEC

Page 107: Ensembl Browser Workshop

ChIP-seq signal for TF

signal

Regulatory Features: motifs

Ensembl regulatory feature

Position Weight Matrix for TF (JASPAR database)

Page 108: Ensembl Browser Workshop

Viewing the raw NGS data

DNaseI and TFBS

Histone marks and polymerases

Configure this page → Regulation → Open chromatin &…

Configure this page → Regulation → Histones &…

Page 109: Ensembl Browser Workshop

How to choose raw data: matrix

Supporting evidence: 1) Open chromatin & TFBS 2) Histones & polymerases

http://tinyurl.com/matrix-ensembl

Page 110: Ensembl Browser Workshop

Segmentation data in Ensembl

Categories of combined segments:
CTCF enriched
Predicted Weak Enhancer/Cis-reg element
Predicted Transcribed Region
Predicted Enhancer
Predicted Promoter Flank
Predicted Repressed/Low Activity
Predicted Promoter with TSS

Configure this page → Regulation → Regulatory features

Page 111: Ensembl Browser Workshop

Experimental confirmation

•  CTCF: good recall, reproducible across multiple cell lines, tight boundaries.
•  TSS: 88.9% of FANTOM 5 strict TSSs were covered.
•  Enhancers:
   •  92.4% of 882 VISTA enhancers were detected.
   •  80.3% of 40279 robust FANTOM 5 enhancers were found.

Page 112: Ensembl Browser Workshop

Methylation data in Ensembl

CpG DNA methylation (RRBS, WGBS, MeDIP)

ENCODE and PMID: 18577705

Configure this page → Regulation → DNA Methylation

Page 113: Ensembl Browser Workshop

STRADA controls the tumor suppressor activities of LKB1 (https://www.wikigenes.org/)

A.  What are the Ensembl regulatory features annotated in this gene?

B.  Are there any features in the 5’ region of STRADA?

C.  Do the regulatory features for K562, CD8+ cells (ENCODE) and erythroblast (Blueprint) differ at this region?

D.  What is the stable ID of the most 5’ regulatory feature?

Tutorial

Page 114: Ensembl Browser Workshop

Browsing Regulation data

tinyurl.com/ensembl-regulation

Page 115: Ensembl Browser Workshop

Things to bear in mind

1)  The annotation of regulatory elements in Ensembl highlights where the biochemical data (ChIP-seq, etc.) map to the human and mouse genomes;

2)  Features can be near genes but might not affect their transcription/expression;

3)  Disclaimer: Ensembl cannot tell you how your favourite gene is regulated.

Page 116: Ensembl Browser Workshop

In addition to the big names

CpG islands, TSS, miRNA target predictions (TarBase)

Configure this page → Regulation → Other regulatory regions

Configure this page → Sequence and assembly → Simple features

Page 117: Ensembl Browser Workshop

Track hubs in Ensembl

Page 118: Ensembl Browser Workshop

ENCODE data hub in Ensembl www.ensembl.org/info/encode.html

>2,800 data tracks

Page 119: Ensembl Browser Workshop

Ensembl Regulation in BioMart

Human, mouse and fruit fly

FILTERS

ATTRIBUTES

Page 120: Ensembl Browser Workshop

Ensembl Regulation API

http://useast.ensembl.org/info/docs/api/funcgen/index.html

Funcgen Schema Description

Page 121: Ensembl Browser Workshop

Regulation Team

Thomas Juettemann

Myrto Kostadima

Ilias Lavidas

Michael Nuhn

Page 122: Ensembl Browser Workshop
Page 123: Ensembl Browser Workshop

Ensembl Regulation

Exercises Pages 52-54

Answers

www.ebi.ac.uk/~denise/workshops/2016/dublin/answers

Feel free to explore your favourite gene/genomic region!

Page 124: Ensembl Browser Workshop


Custom data display

Page 125: Ensembl Browser Workshop

Outline

•  Overview

•  Supported file formats

•  Add your own data

•  Where to view your own data

•  Tutorials and exercises

Page 126: Ensembl Browser Workshop

Overview

•  Genome browsers have pre-defined sets of data

•  Need to display personal data

•  Compare one’s own data to publicly available data

•  Requisite: own data organised to specific rules

http://www.ensembl.org/info/website/upload/index.html

Page 127: Ensembl Browser Workshop

Supported files in Ensembl

Sequence alignments
•  BAM (compact representation), CRAM (compressed version)

Variation data
•  VCF: Variant Call Format

Flexible definition of data lines
•  BED (Browser Extensible Data) e.g. chr, start, end

Feature information
•  Gene Finding Format (GFF), General Transfer Format (GTF)

Continuous-valued data (probability scores)
•  Wig, BigWig

http://www.ensembl.org/info/website/upload/index.html#formats

Page 128: Ensembl Browser Workshop

Add custom data

•  Data upload: small files (< 5MB; file name or URL)

•  Attach your data: larger files (> 5MB; URL)

Things to bear in mind
•  Saved in a temp location (file system)
•  Saved in a db if logged in
•  Standard security: http, https or ftp

Page 129: Ensembl Browser Workshop

How can I add my data

Page 130: Ensembl Browser Workshop

Where to view my data

Page 131: Ensembl Browser Workshop

Structural variants in the region 350-50 kb upstream of SOX9 cause severe dysplasia and other phenotypes. Many enhancers (e.g. E250, located at -250 kb) activate the SOX9 promoter, whereas E70 seems to be active in somatic tissues. CGH and other experiments have revealed the following deletions:

17 69872078 69886644 patient1
17 70040357 70049956 patient2
17 70111957 70116270 patient3

A)  Are any of these deletions known to be polymorphic in 1KG?

B)  Would these deletions affect E250 and E70?

C)  Do they map to regions of promoter activity or CpG islands?

Tutorial
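
The three deletions in this tutorial already follow the chr, start, end layout of a BED file, so they can be written out with a short script. The sketch below adds a track line and tab-separates the columns. The coordinates are copied as given on the slide; note that BED starts are 0-based by convention, and the slide does not say which convention these numbers use. The track name is an arbitrary placeholder.

```python
# (chromosome, start, end, label) as listed in the tutorial.
DELETIONS = [
    ("17", 69872078, 69886644, "patient1"),
    ("17", 70040357, 70049956, "patient2"),
    ("17", 70111957, 70116270, "patient3"),
]


def to_bed(deletions, track_name="patient_deletions"):
    """Return BED text: a track line, then chr, start, end, name per row."""
    lines = [f'track name="{track_name}"']
    for chromosome, start, end, name in deletions:
        if end <= start:
            raise ValueError(f"{name}: end must be greater than start")
        lines.append(f"{chromosome}\t{start}\t{end}\t{name}")
    return "\n".join(lines) + "\n"


BED_TEXT = to_bed(DELETIONS)
print(BED_TEXT, end="")
```

The resulting text is well under 5MB, so it falls in the size range handled by direct upload rather than URL attachment.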

Page 132: Ensembl Browser Workshop

Custom data display video

http://tinyurl.com/ensembl-upload

Page 133: Ensembl Browser Workshop
Page 134: Ensembl Browser Workshop

Custom data display

Exercises Pages 55-57

Answers

www.ebi.ac.uk/~denise/workshops/2016/dublin/answers

Feel free to attach your own data

Page 135: Ensembl Browser Workshop

Wrap up

Ensembl is the place!

Genes, genomes, variants, regulatory features, tools and more

Page 136: Ensembl Browser Workshop

Ensembl Retreat June 2015

Page 137: Ensembl Browser Workshop

Latest publication

Page 138: Ensembl Browser Workshop

Acknowledgements The Entire Ensembl Team

Funding

Co-funded by the European Union

Page 139: Ensembl Browser Workshop

Your take home message

Page 140: Ensembl Browser Workshop

Feedback survey

http://tinyurl.com/dublin-020616

Page 141: Ensembl Browser Workshop

Connect with Ensembl


[email protected]

https://www.youtube.com/user/EnsemblHelpdesk
