
TGAC Browser bosc 2014


DESCRIPTION

TGAC Browser presentation at BOSC 2014. http://tgac-browser.tgac.ac.uk mail: [email protected]


Page 1: TGAC Browser bosc 2014

The Genome Analysis Centre

TGAC Browser

TGAC Browser: visualisation solutions for big data in the genomic era

Anil S. Thanki
Scientific Programmer – Sequencing Informatics
[email protected]
@anilthanki and @tgacbrowser

July 11, 2014, BOSC
#Poster 6

Page 2: TGAC Browser bosc 2014

Genome Browsers

In bioinformatics, a genome browser is a graphical interface for display of information from a biological database for genomic data.

[Figure labels: Genomic region; Genomic feature]

Page 3: TGAC Browser bosc 2014

TGAC Browser

[Screenshot labels: Controls; Tracks / Settings; Save Session / Share; Tracks; Position; Chromosome Map with Marker]

TGAC Browser was developed from scratch at TGAC and works on top of the Ensembl Core database.

Page 4: TGAC Browser bosc 2014

TGAC Browser server-client

• Heavy database queries and parsing run on the server
• Information is transferred in text format
• The client system is used to generate and render images

• The server implementation gives performance and easy access to data
• Web browsers give flexibility for sharing data
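The split described on this slide can be sketched as follows. This is a minimal illustration, not the TGAC Browser code: the function names, the feature fields (`start`, `end`) and the pixel sizes are assumptions made for the example, and Python stands in for the Java server and JavaScript client.

```python
import json

def features_to_json(features):
    """Server side (hypothetical helper): serialise feature records to compact JSON text."""
    return json.dumps(features, separators=(",", ":"))

def render_svg(payload, region_start, region_end, width_px=800):
    """Client side (hypothetical helper): turn the JSON text into SVG <rect> elements."""
    scale = width_px / (region_end - region_start)  # pixels per base
    rects = []
    for f in json.loads(payload):
        x = (f["start"] - region_start) * scale
        # Keep very short features visible by drawing them at least 1 px wide.
        w = max(1.0, (f["end"] - f["start"]) * scale)
        rects.append(f'<rect x="{x:.1f}" y="0" width="{w:.1f}" height="10"/>')
    return f'<svg width="{width_px}" height="10">' + "".join(rects) + "</svg>"

# Only text crosses the wire; the image is built where it is displayed.
payload = features_to_json([{"start": 100, "end": 300}, {"start": 500, "end": 501}])
svg = render_svg(payload, 0, 1000)
```

Sending text rather than rendered images keeps the payload small and lets the client redraw on zoom or resize without another image request.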

Page 5: TGAC Browser bosc 2014

TGAC Browser server-client

[Architecture diagram labels: Server; Client; Ensembl Core database; Java SQL DAO; Java classes; Cache; Session files; User files; Ajax / JSON; Java-genomics-IO; JavaScript; CSS; JSP; SVG; HTML]

Page 6: TGAC Browser bosc 2014

Supported Data

Genomics data:
• Ensembl core
• SAM/BAM
• Wig/BigWig
• GFF
• VCF

Page 7: TGAC Browser bosc 2014

Ensembl Data

• Genomic features from the Ensembl database
  • Genes, SNPs, repeats, assembly, alignments, markers, etc.

Page 8: TGAC Browser bosc 2014

BAM/SAM

Visualising reads directly from a SAM/BAM file
• Coloured paired-end reads
  • Blue: first in pair
  • Brown: second in pair
  • Orange: unpaired
• Skipping deletions
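The colour scheme above maps naturally onto the SAM FLAG field. The sketch below uses the standard SAM flag bits (0x1 paired, 0x40 first in pair, 0x80 second in pair); the function name is hypothetical, and treating "unpaired" as "the paired bit is not set" is an assumption about what the slide means.

```python
# Standard SAM FLAG bits.
PAIRED = 0x1
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80

def read_colour(flag):
    """Pick a display colour for a read from its SAM FLAG value."""
    if flag & PAIRED:
        if flag & FIRST_IN_PAIR:
            return "blue"
        if flag & SECOND_IN_PAIR:
            return "brown"
    # Assumption: anything that is not clearly first or second in a pair is drawn as unpaired.
    return "orange"

colours = [read_colour(f) for f in (99, 147, 0, 16)]
```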

Page 9: TGAC Browser bosc 2014

Wig/BigWig

Visualising expression data directly from a wig/bigwig file
• Coloured and oriented peaks
  • Upward red peaks are positive
  • Downward blue peaks are negative
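One way to express the peak orientation is as rectangle geometry around a baseline. This is a sketch under assumptions: the function name, the baseline position and the pixels-per-unit scale are invented for illustration, and the y axis is taken to grow downwards as in SVG.

```python
def peak_rect(x, value, baseline=50.0, px_per_unit=10.0, bar_width=2.0):
    """Return rectangle geometry and fill colour for one signal value."""
    height = abs(value) * px_per_unit
    if value >= 0:
        # Positive values rise above the baseline, so the rectangle's top edge moves up.
        return {"x": x, "y": baseline - height, "width": bar_width,
                "height": height, "fill": "red"}
    # Negative values hang below the baseline, so the rectangle starts at the baseline.
    return {"x": x, "y": baseline, "width": bar_width,
            "height": height, "fill": "blue"}

up = peak_rect(0, 2.5)
down = peak_rect(2, -1.0)
```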

Page 10: TGAC Browser bosc 2014

VCF and GFF

Visualising variant data directly from a VCF file
• Coloured based on base pairs
• Visualises insertions, deletions and mutations

Visualising gene data directly from a GFF file
• Exons, introns, CDS
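The insertion / deletion / mutation distinction can be derived from the REF and ALT columns of a VCF record. The sketch below is a simplification and not the browser's own logic: the function names are hypothetical, equal-length alleles are all labelled "mutation", and symbolic alleles such as `<DEL>` are not handled.

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT pair by comparing allele lengths."""
    if len(ref) == len(alt):
        return "mutation"
    return "insertion" if len(alt) > len(ref) else "deletion"

def parse_vcf_line(line):
    """Split a VCF data line (CHROM, POS, ID, REF, ALT, ...) into one record per ALT allele."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return [(chrom, int(pos), ref, a, classify_variant(ref, a))
            for a in alt.split(",")]

records = parse_vcf_line("chr1\t100\t.\tA\tG,AT\t50\tPASS\t.")
```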

Page 11: TGAC Browser bosc 2014

Non-Ensembl Data

{
  "colour": "blue",
  "source": "file",
  "filepath": "/storage/browser/test.bw",
  "track-group": "LIB1777"
}

• Adding non-Ensembl data to TGAC Browser
  • analysis_description table of the Ensembl Core schema
  • web_data column for file information
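A reader of the web_data column might parse the stored JSON and decide how to load the file from its extension. This is a sketch only: the function name, the added `format` key and the extension table are assumptions for illustration, and the JSON string is a well-formed version of the slide's example.

```python
import json
import os

# Hypothetical mapping from file extension to a loader name.
FORMAT_BY_EXTENSION = {
    ".bw": "bigwig", ".bigwig": "bigwig", ".wig": "wig",
    ".bam": "bam", ".sam": "sam", ".vcf": "vcf", ".gff": "gff",
}

def parse_web_data(web_data):
    """Parse a web_data JSON string and tag file-backed tracks with a format."""
    config = json.loads(web_data)
    if config.get("source") == "file":
        ext = os.path.splitext(config["filepath"])[1].lower()
        config["format"] = FORMAT_BY_EXTENSION.get(ext, "unknown")
    return config

web_data = ('{"colour": "blue", "source": "file", '
            '"filepath": "/storage/browser/test.bw", "track-group": "LIB1777"}')
track = parse_web_data(web_data)
```

Note that strict JSON parsers reject unquoted values and trailing commas, so the stored string has to be valid JSON for this to work.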

Page 12: TGAC Browser bosc 2014

Visual Types

• Graphical presentation of large data, e.g. SNP or alignment density
• Heat map presentation of large data
• Visual type selected based on the number of features
  • Bar charts
  • Heat map
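The idea of switching to a summary view when a region holds too many features can be sketched like this. The threshold, the function names and the binning scheme are assumptions; the resulting counts could feed either a bar chart or a heat map.

```python
def choose_visual(feature_count, max_individual=1000):
    """Draw individual features for small sets, a density summary for large ones."""
    return "features" if feature_count <= max_individual else "density"

def density_bins(starts, region_start, region_end, n_bins=10):
    """Count feature start positions per equal-width bin across the region."""
    bins = [0] * n_bins
    span = region_end - region_start
    for s in starts:
        if region_start <= s < region_end:
            # Integer arithmetic keeps the bin index within 0..n_bins-1.
            bins[(s - region_start) * n_bins // span] += 1
    return bins

counts = density_bins([0, 5, 10, 99], 0, 100, n_bins=10)
```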

Page 13: TGAC Browser bosc 2014

Search

• Search with a keyword among data

[Screenshot labels: Result on Chromosomes; Result as list]

Page 14: TGAC Browser bosc 2014

BLAST Integration

• BLAST results for sequence search
• BLAST Manager
  • BLAST history logs
  • Run multiple BLAST searches simultaneously and toggle between results

[Screenshot label: Link to TGAC Browser]

Page 15: TGAC Browser bosc 2014

BLAST Integration

• BLAST run on a track or selected region
• BLAST results shown as a track
• Coloured based on score, with indel information

Page 16: TGAC Browser bosc 2014

BLAST Integration

Page 17: TGAC Browser bosc 2014

 Popup  

Page 18: TGAC Browser bosc 2014

Popup

[Screenshot labels: Track Type; Position; Track Description]

Page 19: TGAC Browser bosc 2014

Popup

[Screenshot labels: Peptide Sequence; Fasta Sequence; BLAST analysis; Zoom to track; Flag; Open more options; Track Type; Position; Track Description]

Page 20: TGAC Browser bosc 2014

Popup

[Screenshot labels: Show Attributes; Make top; Jump to link; Revert edited Track; Delete Track; Rename Track; Peptide Sequence; Fasta Sequence; BLAST analysis; Zoom to track; Flag; Open more options; Track Type; Position; Track Description]

Page 21: TGAC Browser bosc 2014

Future Work

• Manual annotation
• Aequatus Browser integration
• Upload user's data
• New visuals
  • Manhattan plots
  • Expression data on genes
• Integrate HMMER and BLAT analysis
• Data download from TGAC Browser
• REST API to load data from Ensembl
• Write a paper

Page 22: TGAC Browser bosc 2014

Manual Annotation

[Workflow diagram]
• Users annotate features within TGAC Browser
• Can edit/add/remove tracks
• Export edited information
• Send it to a curator
• Update database

[Diagram labels: Read Only; Read and Write]

Page 23: TGAC Browser bosc 2014

Manual Annotation

• Edit various genomic track credentials
• Add new tracks

Page 24: TGAC Browser bosc 2014

Aequatus Browser

• Integration of Aequatus Browser within TGAC Browser
  • Under development at TGAC
• Visualising complex similarity relationships among species
• Beta: http://tgac-browser.tgac.ac.uk/plants_compara
• Slides: http://tinyurl.com/aequatous-browser

Page 25: TGAC Browser bosc 2014

TGAC Browser

• Oct 2011: Started
• Aug 2012: Presented at NGS DeepSeq
• May 2013: v0.1.1
• July 2013: Presented at ISMB
• Aug 2013: v0.1.2
• Sept 2013: Presented at UK Genome Science
• Dec 2013: v0.2.0
• March 2014: Presented at VizBi
• May 2014: v0.2.1
• July 2014: Presenting at BOSC 2014
• Soon: v0.2.2

Instances:
• TGAC Browser Demo
• Chinese Hamster
• Wheat Yellow Rust
• Chalara Fraxinus
• Brassica
• Homo Sapiens
• Vietnamese Rice
• Lactobacillus salivarius
• IWGSC Wheat
• Hordeum
• Oryza Sativa
• Brachypodium

Documentation: https://documentation.tgac.ac.uk/display/TB/TGAC+Browser

Page 26: TGAC Browser bosc 2014

Acknowledgements

The Genome Analysis Centre

Robert Davey, Sarah Ayling, Mario Caccamo, Gemy Kaithakottil, Xingdong Bian, Jon Wright, Daniel Mapleson, Martin Ayling

Page 27: TGAC Browser bosc 2014

Acknowledgements

The Genome Analysis Centre

Jinhong Li, Mariella Ferrante, Remo Sanges, Paul Linehan, Burkhard Steuernagel

http://tgac-browser.tgac.ac.uk
[email protected]
@anilthanki and @tgacbrowser