Internet2 Support for Biomedical Research AAMC 2013 Information Technology in Academic Medicine Conference Vancouver CA June 5-7, 2013 Michael Sullivan, M.D. Associate Director, Health Sciences, Internet2


Page 1: Internet2 Support for Biomedical Research

Internet2 Support for Biomedical Research

AAMC 2013 Information Technology in Academic Medicine Conference
Vancouver CA, June 5-7, 2013
Michael Sullivan, M.D.
Associate Director, Health Sciences, Internet2

Page 2: Internet2 Support for Biomedical Research

Overview

Internet2 Research Support
• Community and Network
• Data-intensive Science
• International Collaboration
• Innovation Platform

Big Data Challenges
• Transport
• Security
• Storage and Compute

2 – 6/7/13, © 2012 Internet2

Page 3: Internet2 Support for Biomedical Research

Internet2 Community
• 220 Universities
• 60 Corporations
• 70 Government agencies
• 38 Regional and state networks
• 65 International R&E networks

Page 4: Internet2 Support for Biomedical Research

Advanced 100G Production and Research Network

Page 5: Internet2 Support for Biomedical Research

Data Tsunami

Physics: Large Hadron Collider
Image by: CERN

Life Sciences: Magnetic Resonance Imager (MRI)

Page 6: Internet2 Support for Biomedical Research

Visualizing Big Data

Physics: LHC – Lead Ion Collision
Source: CERN (ALICE detector)

Life Sciences: MRI – Monkey Brain
Source: Van Wedeen, M.D., Martinos Center and Dept. of Radiology, Massachusetts General Hospital and Harvard University Medical School

Page 7: Internet2 Support for Biomedical Research

Sequencing: Smaller, Faster, Cheaper

Illumina HiSeq 2500/1500
Source: http://www.illumina.com/systems/hiseq_systems/hiseq_2500_1500.ilmn

Handheld USB Sequencer
Image: Oxford Nanopore Technologies

Page 8: Internet2 Support for Biomedical Research

Democratization of Sequencing
2,386 Genome Sequencers Worldwide – 30 May 2013

Source: Map of High-throughput Sequencers

Page 9: Internet2 Support for Biomedical Research

North American Genome Sequencers
998 Sequencers in NA – 30 May 2013

Source: Map of High-throughput Sequencers

Page 10: Internet2 Support for Biomedical Research

Sequencing in Vancouver
13 Sequencers at the Genome Science Center

Source: Map of High-throughput Sequencers

Page 11: Internet2 Support for Biomedical Research


Canarie  Weathermap  

Page 12: Internet2 Support for Biomedical Research

US-based International Exchange Points

US-based Exchange Points
• StarLight, Chicago IL
• MAN LAN, New York NY
• NGIX-East, College Park MD
• AtlanticWave (distributed)
• AMPATH, Miami FL
• PacificWave-S, Los Angeles CA
• PacificWave-N, Seattle WA

Page 13: Internet2 Support for Biomedical Research

GEANT International

Page 14: Internet2 Support for Biomedical Research


APAN  


Page 15: Internet2 Support for Biomedical Research


Synchronized  Genomic  Repositories:  NCBI,  EBI,  DDBJ  

Page 16: Internet2 Support for Biomedical Research

US – China 10 Gbps Link

Sample.fa (24GB)
• Fed Ex: 2 days
• Internet + FTP: 26 hours
• China-US 10G Link: 30 seconds

Dr. Dawei Lin
Dr. Lin Fang
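
As a rough check on the figures above, the arithmetic can be sketched in Python. This is only a sketch: it assumes Sample.fa is 24 decimal gigabytes and ignores protocol overhead, and the function names are illustrative rather than anything from the talk. It computes the ideal transfer time at a full 10 Gbps and the average rates implied by the quoted 30 seconds and 26 hours.

```python
def transfer_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Ideal time to move size_bytes at a sustained bit rate (no overhead)."""
    return size_bytes * 8 / rate_bits_per_s


def implied_rate_bps(size_bytes: float, seconds: float) -> float:
    """Average bit rate implied by moving size_bytes in the given time."""
    return size_bytes * 8 / seconds


SIZE_BYTES = 24e9  # assumption: 24 GB means 24 * 10^9 bytes

ideal_10g = transfer_seconds(SIZE_BYTES, 10e9)
rate_30s = implied_rate_bps(SIZE_BYTES, 30)
rate_26h = implied_rate_bps(SIZE_BYTES, 26 * 3600)

print(f"Ideal time at 10 Gbps: {ideal_10g:.1f} s")
print(f"Rate implied by 30 s:  {rate_30s / 1e9:.1f} Gbps")
print(f"Rate implied by 26 h:  {rate_26h / 1e6:.2f} Mbps")
```

Under these assumptions the ideal time is about 19 seconds, so the quoted 30 seconds corresponds to roughly 6.4 Gbps sustained, while 26 hours corresponds to roughly 2 Mbps.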

Page 17: Internet2 Support for Biomedical Research

Innovation Platform

[Diagram: campus Science DMZ connected to the Internet2 Innovation Platform. Campus labels: 100 GigE Layer 2 Connection; SDN Control Server; Performance Node; Switches, data stores for data-intensive science; High-Performance Layer 2/3 Switch/Router; Traditional L3 Campus Border Security; Traditional Campus Border Router; Campus Enterprise Network; Science DMZ. Backbone labels: Dark Fiber; Optical System; Software Defined Networking Substrate; Traditional Switch Substrate; Innovation Services; Traditional Services; Software Defined Networking; GENI Experiments; GENI; Dynamic Layer 2; Static Layer 2; IP Network Layer 3; R&E IP; TR-CPS. Captions: Your Research; Internet2 innovation backbone delivered as 100G L1; Traditional regional and commodity providers.]

For more information, see fasterdata.es.net

www.internet2.edu

Page 18: Internet2 Support for Biomedical Research

Innovation Platform Pilot Sites

Page 19: Internet2 Support for Biomedical Research

Meeting the Big Data Challenges

Transport
• Science DMZ
• perfSONAR Toolkit
• MaDDash Testing Mesh
• File Transfer Tools

Security
• Science DMZ Hardening
• Federated IdM: InCommon and NSTIC

Storage and Compute
• Storage and Compute

Page 20: Internet2 Support for Biomedical Research

Challenge #1: Transport

Science DMZ

http://fasterdata.es.net/science-dmz/science-dmz-security/

Page 21: Internet2 Support for Biomedical Research


Performance  Monitoring  

Page 22: Internet2 Support for Biomedical Research

MaDDash XSEDE Testing Mesh

Page 23: Internet2 Support for Biomedical Research

File Transfer Tools

Unix LAN Tools
• scp, sftp, rsync – poor choices for WAN (RTT > 25ms)
• scp with HPN patch – better but still has limitations

TCP-based Open Source
• Globus Online – http://www.globusonline.org
  – Uses GridFTP with TCP optimizations
  – Friendly GUI, Fire and Forget, Galaxy integration

UDP-based Commercial
• Aspera: http://www.asperasoft.com/
• Annai Systems: http://www.annaisystems.com
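
One commonly cited reason the Unix tools (scp, sftp, rsync) struggle on long paths, which the slide itself does not spell out, is that a sender limited to a fixed window of unacknowledged data cannot exceed window / RTT in throughput. The following is a minimal sketch of that ceiling, assuming a hypothetical 64 KiB window; the real value depends on the tool, its version, and the host's TCP settings.

```python
def window_limited_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most window_bytes are in flight per RTT."""
    return window_bytes * 8 / rtt_seconds


WINDOW_BYTES = 64 * 1024  # hypothetical fixed window, for illustration only

# Compare a LAN-like RTT with the 25 ms threshold named on the slide.
for rtt_ms in (1, 25):
    ceiling = window_limited_bps(WINDOW_BYTES, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>2} ms -> at most {ceiling / 1e6:.1f} Mbps")
```

With this assumed window, the ceiling falls from roughly 524 Mbps at 1 ms to roughly 21 Mbps at 25 ms, regardless of how fast the underlying link is.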

Page 24: Internet2 Support for Biomedical Research

Tool Speeds

Berkeley, CA ↔ Argonne, IL   RTT=53
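
For a path like this one, the bandwidth-delay product gives the amount of data that must be in flight to keep the link full, which in turn suggests how large transfer buffers need to be. The following is a minimal sketch, assuming the RTT of 53 is in milliseconds and assuming hypothetical link rates of 1 and 10 Gbps (the slide does not state the link rate).

```python
def bdp_bytes(rate_bits_per_s: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_bits_per_s * rtt_seconds / 8


RTT_SECONDS = 0.053  # assumption: the slide's "RTT=53" is in milliseconds

for rate_gbps in (1, 10):  # hypothetical link rates
    needed = bdp_bytes(rate_gbps * 1e9, RTT_SECONDS)
    print(f"{rate_gbps:>2} Gbps x 53 ms -> {needed / 1e6:.2f} MB in flight")
```

Under these assumptions a 10 Gbps path needs on the order of 66 MB in flight, far more than default buffer sizes on many hosts, which is one motivation for tools that tune buffers or use parallel streams.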

Page 25: Internet2 Support for Biomedical Research

Challenge #2: Security

Hardening the Science DMZ
• ESnet Big Data design pattern
• Internet2 Innovation Platform
• NSF CC-NIE grants
• University of Florida
  – HIPAA alignment
  – Efficient encryption
  – Comprehensive logging
  – Robust authentication

Source: www.securearc.com

Page 26: Internet2 Support for Biomedical Research

Federated Identity Management

[Chart: Number of Participants by year, 2004 to 2012 (June); vertical axis from 0 to 450]

Page 27: Internet2 Support for Biomedical Research

NSTIC – National Strategy for Trusted Identities in Cyberspace

• White House initiative administered by NIST
• Goal is to create an “Identity Ecosystem”
• IDESG – Identity Ecosystem Steering Group
• Five awards for pilots spanning multiple sectors:
  – Resilient Network Systems, AMA, Aetna, ACC, NeHC, …
  – Criterion Systems, ID/DataWeb, AOL, Experian, Ping Identity, …
  – Daon, Inc., AARP, PayPal, Purdue, …
  – American Assoc. of Motor Vehicle Admins, Microsoft, AT&T, etc…
  – Internet2, Carnegie Mellon, Brown, MIT, U. of Texas, U. of Utah…

Page 28: Internet2 Support for Biomedical Research

Challenge #3: Storage and Compute

• Cloud Computing – many initiatives
  – Private: NCI bake-off to create Cancer Knowledge Clouds
  – Public/Private: AWS EC2 instances –– [100G] –– NCBI repository
  – Open Cloud: BioNimbus Protected Data Cloud
  – Proprietary: BGI EasyGenomics Cloud
• National Cyberinfrastructure
  – XSEDE
  – Internet2
  – NCGAS

Page 29: Internet2 Support for Biomedical Research

NCI: Cancer Knowledge Cloud - RFI

Summary of Community Input

https://wiki.nci.nih.gov/display/NCIPinput/Summary+of+Input+Request%3A+Computational+Needs+to+Support+Large-Scale+Genomics+Investigations

Page 30: Internet2 Support for Biomedical Research

NCBI: Four Different Approaches

• Reduced Data Size
• Incrementally Transfer Large Files
• High Speed Network Connections
• Cloud Access and Support

Source: Don Preuss, NCBI Experiences and Big Data Strategy, presented at 2013 Internet2 Annual Meeting, Arlington, VA

Page 31: Internet2 Support for Biomedical Research

bionimbus.opensciencedatacloud.org  

BioNimbus:  An  Open  Cloud  with  Protected  Data  

Page 32: Internet2 Support for Biomedical Research

EasyGenomics: BGI’s Cloud Solution

Source: Xu Xing, Managing Big Data: The Genome Center Perspective, presented at Bio-IT World Conference & Expo ‘13, Boston, MA

Page 33: Internet2 Support for Biomedical Research

National Cyberinfrastructure

• XSEDE
  – NSF-funded
  – Supercomputers
  – HPC resources
• Internet2
  – 220 universities
  – XSEDEnet
• NCGAS
  – Indiana University
  – TACC
  – SDSC
  – PSC

Source: https://www.xsede.org/networking

Page 34: Internet2 Support for Biomedical Research

NCGAS Virtual Instrument
Indiana University

[Diagram: NCGAS Virtual Instrument. Labels: NSF-Funded or XSEDE Allocation; Federally Funded; NCGAS Galaxy Portal; POD Galaxy Portal; Mason; POD; TACC; SDSC; PSC; Sequencing Center; NCBI. Storage labels: 5 PB D.C.; 6 PB Storage; 5.5 PB Storage; 4 PB Storage. Network labels: 100 Gig Internet2; 10 Gig NLR.]

Source: Barnett, W.K., and R.D. LeDuc, Next Generation Cyberinfrastructures for Next Generation Sequencing and Genome Science, presented at 2013 AAMC GIR Conference, Vancouver, BC

Page 35: Internet2 Support for Biomedical Research

Networking Issues for Life Sciences Research

Focused Technical Workshop on July 17-18, 2013
Lawrence Berkeley National Laboratory
Berkeley, California

• Building on the success of Joint Techs, meeting will bring together technical experts in a smaller setting with domain scientists.
• Workshop will include a slate of invited speakers and panels.
• Format to encourage lively, interactive discussions with the goal of developing a set of tangible next steps for supporting this data-intensive science community
• Four sub-topic areas: Network Architectures, Workflow Engines, Public and Private Cloud Architectures, and Data Movement Tools
• See: http://events.internet2.edu/2013/ftw-life-sciences/index.cfm

Page 36: Internet2 Support for Biomedical Research

Resources

• The Fourth Paradigm – Data-Intensive Scientific Discovery
  – http://research.microsoft.com/en-us/collaboration/fourthparadigm/
• Internet2 Network and Innovation Platform
  – http://www.internet2.edu/network/
• Science DMZ
  – http://fasterdata.es.net/science-dmz/
• perfSONAR
  – http://www.perfsonar.net/

Contact

• Internet2 Research Support Center
  – [email protected]
• Internet2 Life Sciences
  – Michael Sullivan, MD, Associate Director
  – [email protected]

Page 37: Internet2 Support for Biomedical Research

Thank You

INTERNET2 SUPPORT FOR BIOMEDICAL RESEARCH
AAMC 2013 Information Technology in Academic Medicine Conference
Vancouver CA, June 5-7, 2013
Michael Sullivan, M.D.
Associate Director, Health Sciences, Internet2