Internet2 Support for Biomedical Research
AAMC 2013 Information Technology in Academic Medicine Conference Vancouver CA June 5-7, 2013 Michael Sullivan, M.D. Associate Director, Health Sciences, Internet2
Internet2 Research Support • Community and Network • Data-intensive Science • International Collaboration • Innovation Platform
Big Data Challenges
• Transport • Security • Storage and Compute
2 – 6/7/13, © 2012 Internet2
Overview
Internet2 Community • 220 Universities • 60 Corporations • 70 Government agencies • 38 Regional and state networks • 65 International R&E networks
Advanced 100G Production and Research Network
Physics Large Hadron Collider
Data Tsunami
Life Sciences Magnetic Resonance Imager (MRI)
Image by: CERN
Visualizing Big Data
Physics LHC – Lead Ion Collision
Life Sciences MRI – Monkey Brain
Source: Van Wedeen, M.D., Martinos Center and Dept. of Radiology, Massachusetts General Hospital and Harvard University Medical School
Source: CERN (ALICE detector)
Illumina HiSeq 2500/1500
Sequencing: Smaller, Faster, Cheaper
Handheld USB Sequencer
Image: Oxford Nanopore Technologies
Source: http://www.illumina.com/systems/hiseq_systems/hiseq_2500_1500.ilmn
Democratization of Sequencing 2,386 Genome Sequencers Worldwide – 30 May 2013
Source: Map of High-throughput Sequencers
North American Genome Sequencers 998 Sequencers in NA – 30 May 2013
Source: Map of High-throughput Sequencers
Sequencing in Vancouver 13 Sequencers at the Genome Science Center
Source: Map of High-throughput Sequencers
Canarie Weathermap
US-based International Exchange Points
US-based Exchange Points
• StarLight, Chicago IL • MAN LAN, New York NY • NGIX-East, College Park MD • AtlanticWave (distributed) • AMPATH, Miami FL • PacificWave-S, Los Angeles CA • PacificWave-N, Seattle WA
GEANT International
APAN
Synchronized Genomic Repositories: NCBI, EBI, DDBJ
US – China 10 Gbps Link
Transfer times for Sample.fa (24GB):
• FedEx: 2 days
• Internet + FTP: 26 hours
• China-US 10G Link: 30 seconds
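The gap between these transfer times is easier to see as average throughput. The following is a minimal sketch, not part of the original slides. It assumes that the three times (2 days, 26 hours, 30 seconds) pair with the three methods in the order listed, and that 24GB means decimal gigabytes. The function name `effective_rate_mbps` is hypothetical.

```python
def effective_rate_mbps(size_gb: float, seconds: float) -> float:
    """Average throughput in megabits per second for moving size_gb (decimal GB)."""
    bits = size_gb * 1e9 * 8
    return bits / seconds / 1e6


SIZE_GB = 24  # Sample.fa, as stated on the slide

# Durations in seconds; the pairing of time to method is an assumption
# based on the order in which the slide lists them.
durations = {
    "FedEx": 2 * 24 * 3600,
    "Internet + FTP": 26 * 3600,
    "China-US 10G link": 30,
}

rates = {name: effective_rate_mbps(SIZE_GB, secs) for name, secs in durations.items()}

for name, rate in rates.items():
    print(f"{name:>18}: {rate:10.2f} Mbps")
```

Under these assumptions the 30-second transfer averages 6,400 Mbps, roughly two thirds of the nominal 10 Gbps link rate, while the FTP transfer averages about 2 Mbps.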
Dr. Dawei Lin Dr. Lin Fang
Innovation Platform
Diagram labels: 100 GigE Layer 2 Connection • SDN Control Server • Performance Node • Switches, data stores for data-intensive science • Traditional L3 Campus Border Security • High-Performance Layer 2/3 Switch/Router • Traditional Campus Border Router • Campus Enterprise Network • Science DMZ
For more information, see fasterdata.es.net
Diagram labels: Software Defined Networking • GENI Experiments • Dark Fiber • Optical System • GENI ? Dynamic Layer 2 • IP Network Layer 3 • Static Layer 2 • R&E IP • TR-CPS • Innovation Services • Traditional Services • Software Defined Networking Substrate • Traditional Switch Substrate • Your Research • Internet2 innovation backbone delivered as 100G L1 • Traditional regional and commodity providers
www.internet2.edu
Innovation Platform Pilot Sites
Transport • Science DMZ • perfSONAR Toolkit • MaDDash Testing Mesh • File Transfer Tools
Security • Science DMZ Hardening • Federated IdM: InCommon and NSTIC
Storage and Compute • Storage and Compute
Meeting the Big Data Challenges
Challenge #1: Transport
http://fasterdata.es.net/science-dmz/science-dmz-security/
Science DMZ
Performance Monitoring
MaDDash XSEDE Testing Mesh
• scp, sftp, rsync – poor choices for WAN (RTT > 25ms) • scp with HPN patch – better but still has limitations
• Globus Online – http://www.globusonline.org – Uses GridFTP with TCP optimizations – Friendly GUI, Fire and Forget, Galaxy integration
• Aspera: http://www.asperasoft.com/ • Annai Systems: http://www.annaisystems.com
File Transfer Tools
TCP-based Open Source
UDP-based Commercial
Unix LAN Tools
Tool Speeds
Berkeley, CA ↔ Argonne, IL RTT=53
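One reason single-stream tools such as scp degrade as RTT grows is that a TCP connection cannot move more than one window of data per round trip. The following is a minimal sketch of that bound, not a measurement from the slides. It assumes the RTT of 53 is in milliseconds (the slide gives no unit), and the window sizes and the 10 Gbps link rate are illustrative values chosen here. The function names are hypothetical.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def bdp_bytes(link_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_gbps * 1e9 * (rtt_ms / 1000) / 8


RTT_MS = 53  # Berkeley to Argonne; unit of milliseconds is an assumption

# Illustrative window sizes, not taken from the slides.
for label, window in [("64 KiB", 64 * 1024), ("1 MiB", 1024**2), ("16 MiB", 16 * 1024**2)]:
    bound = max_tcp_throughput_mbps(window, RTT_MS)
    print(f"window {label:>7}: at most {bound:8.2f} Mbps")

# Window needed to fill a hypothetical 10 Gbps path at this RTT.
print(f"BDP at 10 Gbps: {bdp_bytes(10, RTT_MS) / 1e6:.2f} MB")
```

Under these assumptions a 64 KiB window caps a stream at under 10 Mbps regardless of link speed, which is consistent with the slide's advice to prefer tools that tune TCP or run parallel streams over long paths.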
Hardening the Science DMZ • ESnet Big Data design pattern • Internet2 Innovation Platform • NSF CC-NIE grants • University of Florida
– HIPAA alignment – Efficient encryption – Comprehensive logging – Robust authentication
Challenge #2: Security
Source: www.securearc.com
Chart: Number of Participants by year, 2004 through 2012 (June); vertical axis 0 to 450
Federated Identity Management
• White House initiative administered by NIST • Goal is to create an “Identity Ecosystem” • IDESG – Identity Ecosystem Steering Group • Five awards for pilots spanning multiple sectors:
– Resilient Network Systems, AMA, Aetna, ACC, NeHC, … – Criterion Systems, ID/DataWeb, AOL, Experian, Ping Identity, … – Daon, Inc., AARP, PayPal, Purdue, … – American Assoc. of Motor Vehicle Admins, Microsoft, AT&T, etc… – Internet2, Carnegie Mellon, Brown, MIT, U. of Texas, U. of Utah…
NSTIC – National Strategy for Trusted Identities in Cyberspace
• Cloud Computing – many initiatives – Private: NCI bake-off to create Cancer Knowledge Clouds – Public/Private: AWS EC2 instances –– [100G] –– NCBI repository – Open Cloud: BioNimbus Protected Data Cloud – Proprietary: BGI EasyGenomics Cloud
• National Cyberinfrastructure – XSEDE – Internet2 – NCGAS
Challenge #3: Storage and Compute
NCI: Cancer Knowledge Cloud - RFI
Summary of Community Input
https://wiki.nci.nih.gov/display/NCIPinput/Summary+of+Input+Request%3A+Computational+Needs+to+Support+Large-Scale+Genomics+Investigations
Reduced Data Size
Incrementally Transfer Large Files
High Speed Network Connections
Cloud Access and Support
NCBI: Four Different Approaches
Source: Don Preuss, NCBI Experiences and Big Data Strategy, presented at 2013 Internet2 Annual Meeting, Arlington, VA
bionimbus.opensciencedatacloud.org
BioNimbus: An Open Cloud with Protected Data
EasyGenomics: BGI’s Cloud Solution
Source: Xu Xing, Managing Big Data: The Genome Center Perspective, presented at Bio-IT World Conference & Expo ‘13, Boston, MA
• XSEDE – NSF-funded – Supercomputers – HPC resources
• Internet2 – 220 universities – XSEDEnet
• NCGAS – Indiana University – TACC – SDSC – PSC
National Cyberinfrastructure
Source: https://www.xsede.org/networking
NCGAS diagram labels: NSF-Funded or XSEDE Allocation • Federally Funded • NCGAS Galaxy Portal • POD Galaxy Portal • 5 PB D.C. • 6 PB Storage • 5.5 PB Storage • 4 PB Storage • TACC • SDSC • PSC • Mason • POD • Sequencing Center NCBI • 100 Gig Internet2 • 10 Gig NLR • NCGAS Virtual Instrument Indiana University
Source: Barnett, W.K., and R.D. LeDuc, Next Generation Cyberinfrastructures for Next Generation Sequencing and Genome Science, presented at 2013 AAMC GIR Conference, Vancouver, BC
Focused Technical Workshop on July 17-18, 2013
Lawrence Berkeley National Laboratory, Berkeley, California
• Building on the success of Joint Techs, the meeting will bring together technical experts in a smaller setting with domain scientists.
• Workshop will include a slate of invited speakers and panels.
• Format to encourage lively, interactive discussions with the goal of developing a set of tangible next steps for supporting this data-intensive science community
• Four sub-topic areas: Network Architectures, Workflow Engines, Public and Private Cloud Architectures, and Data Movement Tools
• See: http://events.internet2.edu/2013/mw-life-sciences/index.cfm
Networking Issues for Life Sciences Research
• The Fourth Paradigm – Data-Intensive Scientific Discovery – http://research.microsoft.com/en-us/collaboration/fourthparadigm/
• Internet2 Network and Innovation Platform – http://www.internet2.edu/network/
• Science DMZ – http://fasterdata.es.net/science-dmz/
• perfSONAR – http://www.perfsonar.net/
• Internet2 Research Support Center – [email protected]
• Internet2 Life Sciences – Michael Sullivan, MD, Associate Director – [email protected]
Resources
Contact
Thank You