Transcript
Page 1: Internet2 Support for Biomedical Research

Internet2 Support for Biomedical Research

AAMC 2013 Information Technology in Academic Medicine Conference
Vancouver CA, June 5-7, 2013
Michael Sullivan, M.D.
Associate Director, Health Sciences, Internet2

Page 2: Internet2 Support for Biomedical Research

Overview

Internet2 Research Support
• Community and Network
• Data-intensive Science
• International Collaboration
• Innovation Platform

Big Data Challenges
• Transport
• Security
• Storage and Compute

2 – 6/7/13, © 2012 Internet2

Page 3: Internet2 Support for Biomedical Research

Internet2 Community
• 220 Universities
• 60 Corporations
• 70 Government agencies
• 38 Regional and state networks
• 65 International R&E networks

Page 4: Internet2 Support for Biomedical Research

Advanced 100G Production and Research Network

Page 5: Internet2 Support for Biomedical Research

Data Tsunami

Physics: Large Hadron Collider
Image by: CERN

Life Sciences: Magnetic Resonance Imager (MRI)

Page 6: Internet2 Support for Biomedical Research

Visualizing Big Data

Physics: LHC – Lead Ion Collision
Source: CERN (ALICE detector)

Life Sciences: MRI – Monkey Brain
Source: Van Wedeen, M.D., Martinos Center and Dept. of Radiology, Massachusetts General Hospital and Harvard University Medical School

Page 7: Internet2 Support for Biomedical Research

Sequencing: Smaller, Faster, Cheaper

Illumina HiSeq 2500/1500
Source: http://www.illumina.com/systems/hiseq_systems/hiseq_2500_1500.ilmn

Handheld USB Sequencer
Image: Oxford Nanopore Technologies

Page 8: Internet2 Support for Biomedical Research

Democratization of Sequencing
2,386 Genome Sequencers Worldwide – 30 May 2013

Source: Map of High-throughput Sequencers

Page 9: Internet2 Support for Biomedical Research

North American Genome Sequencers
998 Sequencers in NA – 30 May 2013

Source: Map of High-throughput Sequencers

Page 10: Internet2 Support for Biomedical Research

Sequencing in Vancouver
13 Sequencers at the Genome Science Center

Source: Map of High-throughput Sequencers

Page 11: Internet2 Support for Biomedical Research

Canarie Weathermap

Page 12: Internet2 Support for Biomedical Research

US-based International Exchange Points

US-based Exchange Points
• StarLight, Chicago IL
• MAN LAN, New York NY
• NGIX-East, College Park MD
• AtlanticWave (distributed)
• AMPATH, Miami FL
• PacificWave-S, Los Angeles CA
• PacificWave-N, Seattle WA

Page 13: Internet2 Support for Biomedical Research

GEANT International

Page 14: Internet2 Support for Biomedical Research

APAN

Page 15: Internet2 Support for Biomedical Research

Synchronized Genomic Repositories: NCBI, EBI, DDBJ

Page 16: Internet2 Support for Biomedical Research

US – China 10 Gbps Link

Time to move Sample.fa (24GB):
• FedEx: 2 days
• Internet + FTP: 26 hours
• China-US 10G Link: 30 seconds

Dr. Dawei Lin, Dr. Lin Fang
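
The three times quoted on this slide can be turned into effective throughput with a little arithmetic. The following is a minimal sketch, assuming that the 24GB file size means decimal gigabytes and that each quoted time covers the whole transfer; the function name and the dictionary are illustrative and do not come from the slides.

```python
def effective_rate_gbps(size_gb: float, seconds: float) -> float:
    """Average throughput in gigabits per second (decimal GB assumed)."""
    return size_gb * 8 / seconds


SIZE_GB = 24  # Sample.fa, as stated on the slide

# Transfer times quoted on the slide, converted to seconds.
times_s = {
    "FedEx": 2 * 24 * 3600,
    "Internet + FTP": 26 * 3600,
    "China-US 10G link": 30,
}

for name, secs in times_s.items():
    mbps = effective_rate_gbps(SIZE_GB, secs) * 1000
    print(f"{name}: {mbps:.2f} Mbps average")

# Best case on a 10 Gbps link with no protocol overhead.
print(f"Line-rate time at 10 Gbps: {SIZE_GB * 8 / 10:.1f} s")
```

Under these assumptions the 30-second transfer averages 6.4 Gbps, while 26 hours over FTP works out to about 2 Mbps, so the gap between the two network paths is more than three orders of magnitude.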

Page 17: Internet2 Support for Biomedical Research

Innovation Platform

[Architecture diagram; labels recovered from the figure:]
• 100 GigE Layer 2 Connection
• SDN Control Server
• Performance Node
• Switches, data stores for data-intensive science
• Traditional L3 Campus Border Security
• High-Performance Layer 2/3 Switch/Router
• Traditional Campus Border Router
• Campus Enterprise Network
• Science DMZ
• Software Defined Networking
• GENI Experiments
• Dark Fiber
• Optical System
• GENI
• Dynamic Layer 2
• IP Network Layer 3
• Static Layer 2
• R&E IP
• TR-CPS
• Innovation Services
• Traditional Services
• Software Defined Networking Substrate
• Traditional Switch Substrate
• Your Research Internet2 innovation backbone delivered as 100G L1
• Traditional regional and commodity providers

For more information, see fasterdata.es.net

www.internet2.edu

Page 18: Internet2 Support for Biomedical Research

Innovation Platform Pilot Sites

Page 19: Internet2 Support for Biomedical Research

Meeting the Big Data Challenges

Transport
• Science DMZ
• PerfSONAR Toolkit
• MaDDash Testing Mesh
• File Transfer Tools

Security
• Science DMZ Hardening
• Federated IdM: InCommon and NSTIC

Storage and Compute
• Storage and Compute

Page 20: Internet2 Support for Biomedical Research

Challenge #1: Transport

Science DMZ

http://fasterdata.es.net/science-dmz/science-dmz-security/

Page 21: Internet2 Support for Biomedical Research

Performance Monitoring

Page 22: Internet2 Support for Biomedical Research

MaDDash XSEDE Testing Mesh

Page 23: Internet2 Support for Biomedical Research

File Transfer Tools

Unix LAN Tools
• scp, sftp, rsync – poor choices for WAN (RTT > 25ms)
• scp with HPN patch – better but still has limitations

TCP-based Open Source
• Globus Online – http://www.globusonline.org
  – Uses GridFTP with TCP optimizations
  – Friendly GUI, Fire and Forget, Galaxy integration

UDP-based Commercial
• Aspera: http://www.asperasoft.com/
• Annai Systems: http://www.annaisystems.com

Page 24: Internet2 Support for Biomedical Research

Tool Speeds

Berkeley, CA ↔ Argonne, IL    RTT=53
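
Why round-trip time matters so much for TCP-based tools can be shown with the window/RTT bound: a single stream can deliver at most one window of data per round trip. The sketch below is illustrative only. It assumes the RTT of 53 on this slide is in milliseconds, uses a hypothetical 10 Gbps path capacity, and picks a 64 KiB window (the largest TCP window without window scaling) as an example value rather than a measurement of any particular tool.

```python
def bdp_bytes(link_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return link_gbps * 1e9 / 8 * rtt_ms / 1e3


def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Ceiling on one TCP stream: at most one window delivered per round trip."""
    return window_bytes * 8 / (rtt_ms / 1e3) / 1e6


RTT_MS = 53         # from the slide; milliseconds is an assumption
LINK_GBPS = 10      # hypothetical path capacity, not stated on the slide
WINDOW = 64 * 1024  # illustrative 64 KiB window, not a measured value

print(f"BDP at {LINK_GBPS} Gbps: {bdp_bytes(LINK_GBPS, RTT_MS) / 1e6:.2f} MB")
print(f"64 KiB window ceiling: {window_limited_mbps(WINDOW, RTT_MS):.1f} Mbps")
```

With these numbers roughly 66 MB must be in flight to fill the path, while a 64 KiB window caps a single stream at about 10 Mbps. That gap is the reason tools with small fixed buffers do poorly on long paths and why tuned or parallel-stream tools do better.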

Page 25: Internet2 Support for Biomedical Research

Challenge #2: Security

Hardening the Science DMZ
• ESnet Big Data design pattern
• Internet2 Innovation Platform
• NSF CC-NIE grants
• University of Florida
  – HIPAA alignment
  – Efficient encryption
  – Comprehensive logging
  – Robust authentication

Source: www.securearc.com

Page 26: Internet2 Support for Biomedical Research

Federated Identity Management

[Chart: Number of Participants by year, 2004 to 2012 (June); vertical axis from 0 to 450]

Page 27: Internet2 Support for Biomedical Research

NSTIC – National Strategy for Trusted Identities in Cyberspace

• White House initiative administered by NIST
• Goal is to create an “Identity Ecosystem”
• IDESG – Identity Ecosystem Steering Group
• Five awards for pilots spanning multiple sectors:
  – Resilient Network Systems, AMA, Aetna, ACC, NeHC, …
  – Criterion Systems, ID/DataWeb, AOL, Experian, Ping Identity, …
  – Daon, Inc., AARP, PayPal, Purdue, …
  – American Assoc. of Motor Vehicle Admins, Microsoft, AT&T, etc…
  – Internet2, Carnegie Mellon, Brown, MIT, U. of Texas, U. of Utah…

Page 28: Internet2 Support for Biomedical Research

Challenge #3: Storage and Compute

• Cloud Computing – many initiatives
  – Private: NCI bake-off to create Cancer Knowledge Clouds
  – Public/Private: AWS EC2 instances –– [100G] –– NCBI repository
  – Open Cloud: BioNimbus Protected Data Cloud
  – Proprietary: BGI EasyGenomics Cloud
• National Cyberinfrastructure
  – XSEDE
  – Internet2
  – NCGAS

Page 29: Internet2 Support for Biomedical Research

NCI: Cancer Knowledge Cloud - RFI

Summary of Community Input

https://wiki.nci.nih.gov/display/NCIPinput/Summary+of+Input+Request%3A+Computational+Needs+to+Support+Large-Scale+Genomics+Investigations

Page 30: Internet2 Support for Biomedical Research

NCBI: Four Different Approaches

• Reduced Data Size
• Incrementally Transfer Large Files
• High Speed Network Connections
• Cloud Access and Support

Source: Don Preuss, NCBI Experiences and Big Data Strategy, presented at 2013 Internet2 Annual Meeting, Arlington, VA

Page 31: Internet2 Support for Biomedical Research

BioNimbus: An Open Cloud with Protected Data

bionimbus.opensciencedatacloud.org

Page 32: Internet2 Support for Biomedical Research

EasyGenomics: BGI’s Cloud Solution

Source: Xu Xing, Managing Big Data: The Genome Center Perspective, presented at Bio-IT World Conference & Expo ‘13, Boston, MA

Page 33: Internet2 Support for Biomedical Research

National Cyberinfrastructure

• XSEDE
  – NSF-funded
  – Supercomputers
  – HPC resources
• Internet2
  – 220 universities
  – XSEDEnet
• NCGAS
  – Indiana University
  – TACC
  – SDSC
  – PSC

Source: https://www.xsede.org/networking

Page 34: Internet2 Support for Biomedical Research

NCGAS Virtual Instrument

[Diagram; labels recovered from the figure:]
• NSF-Funded or XSEDE Allocation
• Federally Funded
• NCGAS Galaxy Portal
• POD Galaxy Portal
• 5 PB D.C.
• 6 PB Storage
• 5.5 PB Storage
• 4 PB Storage
• TACC
• SDSC
• PSC
• Mason
• POD
• Sequencing Center
• NCBI
• 100 Gig Internet2
• 10 Gig NLR
• Indiana University

Source: Barnett, W.K., and R.D. LeDuc, Next Generation Cyberinfrastructures for Next Generation Sequencing and Genome Science, presented at 2013 AAMC GIR Conference, Vancouver, BC

Page 35: Internet2 Support for Biomedical Research

Networking Issues for Life Sciences Research

Focused Technical Workshop on July 17-18, 2013
Lawrence Berkeley National Laboratory
Berkeley, California

• Building on the success of Joint Techs, meeting will bring together technical experts in a smaller setting with domain scientists.
• Workshop will include a slate of invited speakers and panels.
• Format to encourage lively, interactive discussions with the goal of developing a set of tangible next steps for supporting this data-intensive science community
• Four sub-topic areas: Network Architectures, Workflow Engines, Public and Private Cloud Architectures, and Data Movement Tools
• See: http://events.internet2.edu/2013/mw-life-sciences/index.cfm

Page 36: Internet2 Support for Biomedical Research

Resources

• The Fourth Paradigm – Data-Intensive Scientific Discovery
  – http://research.microsoft.com/en-us/collaboration/fourthparadigm/
• Internet2 Network and Innovation Platform
  – http://www.internet2.edu/network/
• Science DMZ
  – http://fasterdata.es.net/science-dmz/
• perfSONAR
  – http://www.perfsonar.net/

Contact

• Internet2 Research Support Center
  – [email protected]
• Internet2 Life Sciences
  – Michael Sullivan, MD, Associate Director
  – [email protected]

Page 37: Internet2 Support for Biomedical Research

Thank You

INTERNET2 SUPPORT FOR BIOMEDICAL RESEARCH
AAMC 2013 Information Technology in Academic Medicine Conference
Vancouver CA, June 5-7, 2013
Michael Sullivan, M.D.
Associate Director, Health Sciences, Internet2

