HENP Networks and Grids for Global Virtual Organizations

Transcript

HENP Networks and Grids for Global Virtual Organizations
Harvey B. Newman, California Institute of Technology
TIP2004, Internet2 HENP WG Session, January 25, 2004

The Challenges of Next Generation Science in the Information Age
• Flagship applications:
  - High Energy & Nuclear Physics, astrophysics sky surveys: TByte to PByte block transfers at 1-10+ Gbps
  - eVLBI: many real-time data streams at 1-10 Gbps
  - Bioinformatics, clinical imaging: GByte images on demand
• HEP data example: from Petabytes in 2003, to ~100 Petabytes by 2007-8, to ~1 Exabyte by ~2013-15
• Provide results with rapid turnaround, coordinating large but limited computing and data handling resources, over networks of varying capability in different world regions
• Petabytes of complex data explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams
• Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs, with reliable, quantifiable high performance

Large Hadron Collider (LHC), CERN, Geneva: 2007 Start
• First beams: April 2007; physics runs from Summer 2007
• pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1; 27 km tunnel spanning Switzerland & France
• Experiments: ATLAS and CMS (pp, general purpose); LHCb (B physics); ALICE (heavy ions); TOTEM

Four LHC Experiments: The Petabyte to Exabyte Challenge
• ATLAS, CMS, ALICE, LHCb: Higgs + new particles; quark-gluon plasma; CP violation
• Tens of PB in 2008; to ~1 EB by ~2015
• Hundreds of TFlops, growing to PetaFlops
• 6000+ physicists & engineers; 60+ countries; 250 institutions

LHC: Higgs Decay into 4 Muons (Tracker Only); 1000× the LEP Data Rate
• 10^9 events/sec; selectivity of 1 in 10^13 (comparable to picking out one person from a thousand world populations)

LHC Data Grid Hierarchy (developed at Caltech)
• Experiment and online system: ~PByte/sec off the detector, ~100-1500 MBytes/sec into the Tier 0+1 CERN Center (PBs of disk; tape robot)
• Tier 1 national centers (FNAL, IN2P3, INFN, RAL): linked to CERN at ~10 Gbps
• Tier 2 regional centers: linked to Tier 1 at ~2.5-10 Gbps
• Tier 3 institute servers and Tier 4 workstations with physics data caches: 0.1 to 10 Gbps
• Tens of Petabytes by 2007-8; an Exabyte ~5-7 years later
• CERN/outside resource ratio ~1:2; Tier0 / (Σ Tier1) / (Σ Tier2) ~ 1:1:1
• Emerging vision: a richly structured, global dynamic system

Bandwidth Growth of Int'l HENP Networks (US-CERN Example)
• Rate of progress >> Moore's Law:
  - 9.6 kbps analog (1985)
  - 64-256 kbps digital (1989-1994) [×7-27]
  - 1.5 Mbps shared (1990-93; IBM) [×160]
  - 2-4 Mbps (1996-1998) [×200-400]
  - 12-20 Mbps (1999-2000) [×1.2k-2k]
  - 155-310 Mbps (2001-02) [×16k-32k]
  - 622 Mbps (2002-03) [×65k]
  - 2.5 Gbps (2003-04) [×250k]
  - 10 Gbps (2005) [×1M]
• A factor of ~1M over the period 1985-2005 (a factor of ~5k during 1995-2005)
• HENP has become a leading applications driver, and also a co-developer, of global networks
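The bracketed multipliers above are all taken relative to the 9.6 kbps analog link of 1985. A short, purely illustrative Python check of the headline factors (the milestone values are copied from the list above):

```python
# Growth of the US-CERN link bandwidth relative to the 9.6 kbps analog line of 1985.
BASELINE_BPS = 9.6e3

milestones = {
    "64-256 kbps (1989-94)": 256e3,
    "1.5 Mbps (1990-93)":    1.5e6,
    "622 Mbps (2002-03)":    622e6,
    "2.5 Gbps (2003-04)":    2.5e9,
    "10 Gbps (2005)":        10e9,
}

for label, bps in milestones.items():
    print(f"{label:>24}: x{bps / BASELINE_BPS:,.0f}")

# 10 Gbps / 9.6 kbps ~ 1.04e6, i.e. the quoted factor of ~1M over 1985-2005;
# 10 Gbps / ~2 Mbps (mid-1990s link) ~ 5000, the quoted factor of ~5k over 1995-2005.
```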
HEP is Learning How to Use Gbps Networks Fully: Factor of ~50 Gain in Max. Sustained TCP Throughput in 2 Years, on Some US + Transoceanic Routes
• 9/01: 105 Mbps with 30 streams, SLAC-IN2P3; 102 Mbps in 1 stream, Caltech-CERN
• 5/20/02: 450-600 Mbps, SLAC-Manchester on OC12 with ~100 streams
• 6/1/02: 290 Mbps, Chicago-CERN, one stream on OC12
• 9/02: 850, 1350, 1900 Mbps, Chicago-CERN with 1, 2, 3 GbE streams on a 2.5G link
• 11/02 [LSR]: 930 Mbps in 1 stream, California-CERN and California-AMS; FAST TCP: 9.4 Gbps in 10 flows, California-Chicago
• 2/03 [LSR]: 2.38 Gbps in 1 stream, California-Geneva (99% link utilization)
• 5/03 [LSR]: 0.94 Gbps IPv6 in 1 stream, Chicago-Geneva
• TW & SC2003: 5.65 Gbps (IPv4) and 4.0 Gbps (IPv6) in 1 stream, over 11,000 km

FAST TCP: Baltimore-Sunnyvale
• Measurements 11/02; standard packet size; 4000 km path; utilization averaged over > 1 hr:

    Flows                  1     2     7     9    10
    Average utilization   95%   92%   90%   90%   88%

• 8.6 Gbps; 21.6 TB transferred in 6 hours
• Fast convergence to equilibrium; RTT estimation with a fine-grain timer; delay monitoring in equilibrium; pacing to reduce burstiness; fair sharing; fast recovery
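FAST TCP adjusts its congestion window from the measured queueing delay (RTT minus the propagation delay) rather than from packet loss. The Python sketch below is a simplified, illustrative rendering of the published FAST-style window update; the parameter values and the toy convergence loop are assumptions chosen for the example, not measured settings from the slide.

```python
# Simplified sketch of a FAST-style delay-based window update:
#   w <- min(2w, (1 - gamma) * w + gamma * ((baseRTT / RTT) * w + alpha))
# The window grows while RTT stays near the propagation delay and settles where
# roughly alpha packets are queued in the path. Parameter values are illustrative.

def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One update of the congestion window w (in packets)."""
    target = (base_rtt / rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

# Toy example: a long-haul path with base RTT 170 ms, currently measuring 175 ms.
w = 5000.0
for _ in range(500):
    w = fast_window_update(w, base_rtt=0.170, rtt=0.175)

# Equilibrium is alpha * rtt / (rtt - base_rtt) = 200 * 0.175 / 0.005 = 7000 packets.
print(round(w))
```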
Fall 2003: Transatlantic Ultraspeed TCP Transfers
• Throughput achieved: ×50 in 2 years (with Juniper, HP, Level(3), Telehouse)
• Terabyte transfers by the Caltech-CERN team:
  - Nov 18: 4.00 Gbps IPv6, Geneva-Phoenix (11,500 km)
  - Oct 15: 5.64 Gbps IPv4, Palexpo-L.A. (10,900 km), across Abilene (Internet2) Chicago-LA, sharing with normal network traffic; peaceful coexistence with a joint Internet2-Telecom World VRVS videoconference
  - Nov 19: 23+ Gbps aggregate TCP: Caltech, SLAC, CERN, LANL, UvA, Manchester

HENP Major Links: Bandwidth Roadmap (Scenario), in Gbps

    Year  Production              Experimental             Remarks
    2001  0.155                   0.622-2.5                SONET/SDH
    2002  0.622                   2.5                      SONET/SDH; DWDM; GigE integration
    2003  2.5                     10                       DWDM; 1 + 10 GigE integration
    2005  10                      2-4 × 10                 λ switching; λ provisioning
    2007  2-4 × 10                ~10 × 10; 40 Gbps        1st-generation λ grids
    2009  ~10 × 10 or 1-2 × 40    ~5 × 40 or ~20-50 × 10   40 Gbps λ switching
    2011  ~5 × 40 or ~20 × 10     ~25 × 40 or ~100 × 10    2nd-generation λ grids; Terabit networks
    2013  ~Terabit                ~Multi-Tbps              ~Fill one fiber

• Continuing the trend: ~1000 times bandwidth growth per decade; we are rapidly learning to use multi-Gbps networks dynamically

HENP Lambda Grids: Fibers for Physics
• Problem: extract small data subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte data stores
• Survivability of the HENP global Grid system, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time
• Example: allow 800 seconds to complete the transaction. Then:

    Transaction size (TB)   Net throughput (Gbps)
    1                       10
    10                      100
    100                     1000  (capacity of one fiber today)

• Summary: providing switching of 10 Gbps wavelengths within ~2-4 years, and Terabit switching within 5-8 years, would enable Petascale Grids with Terabyte transactions, to fully realize the discovery potential of major HENP programs, as well as other data-intensive research
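The throughput column follows directly from the transaction size and the 800-second target; a small, illustrative Python check of the arithmetic:

```python
# Required network throughput for a fixed-duration transaction (a check of the
# 800-second example above): throughput = size in bits / transfer time.

def required_gbps(size_tb: float, seconds: float = 800.0) -> float:
    bits = size_tb * 8e12        # 1 TB = 8 * 10^12 bits (decimal units)
    return bits / seconds / 1e9  # bits per second -> Gbps

for size_tb in (1, 10, 100):
    print(f"{size_tb:>4} TB in 800 s -> {required_gbps(size_tb):5.0f} Gbps")

# Output: 10, 100, and 1000 Gbps, matching the table.
```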
National Light Rail Footprint
• NLR starting up now; initially 4 × 10 Gb wavelengths; future: to 40 × 10 Gb waves
• The transition to optical, multi-wavelength R&E networks is beginning now
• Also note: XWIN (Germany); the IEEAF/GEO plan for dark fiber in Europe

GLIF Network 1Q2004: Global Lambda Integrated Facility
• [Map: DWDM SURFnet lambda service paths and IP service paths, mostly 2.5-10 Gbit/s (some 2 × 10 Gbit/s, including IEEAF lambdas), interconnecting Amsterdam, Geneva (CERN), Chicago, New York (MANLAN), London (UKLight), Stockholm (NorthernLight), Prague (CzechLight), Dwingeloo (ASTRON/JIVE), and Tokyo (WIDE, APAN)]

AARNet: SXTransport Project in 2004
• Connect major Australian universities to a 10 Gbps backbone
• Two 10 Gbps research links to the US
• AARNet/USLIC collaboration on network R&D starting now

GLORIAD: Global Optical Ring (US-Russia-China)
• "Little GLORIAD" (OC3) launched January 12; to OC192 in 2004

Germany: 2003, 2004, 2005
• GWIN connects 550 universities, labs, and other institutions
• GWIN: Q4/03; GWIN plan: Q4/04; XWIN: Q4/05 (dark fiber option)

Classical, HENP Data Grids, and Now Service-Oriented Grids
• The original computational and data Grid concepts are largely stateless, open systems, known to be scalable; analogous to the Web
• The classical Grid architecture has a number of implicit assumptions:
  - The ability to locate and schedule suitable resources within a tolerably short time (i.e. resource richness)
  - Short transactions with relatively simple failure modes
• HENP Grids are data intensive and resource-constrained:
  - 1000s of users competing for resources at 100s of sites
  - Resource usage governed by local and global policies
  - Long transactions; some long queues
• The HENP stateful, end-to-end monitored and tracked paradigm has been adopted in OGSA, and now in the WS-Resource Framework

The Grid Analysis Environment (GAE) (Caltech GAE Team)
• The GAE is key to success or failure for physics and Grids in the LHC era: 100s-1000s of tasks, with a wide range of computing, data and network resource requirements, and priorities

The Move to OGSA and then Managed Integration Systems
• [Timeline of increasing functionality and standardization: custom solutions and app-specific services; then Web services + Globus Toolkit, with de facto standards (GGF GridFTP and GSI; X.509, LDAP, FTP); then the Open Grid Services Architecture (GGF OGSI, plus OASIS and W3C), with multiple implementations including the Globus Toolkit; then ~integrated systems that are stateful and managed (Web Services Resource Framework)]

Managing Global Systems: Dynamic Scalable Services Architecture
• MonALISA: http://monalisa.cacr.caltech.edu

Dynamic Distributed Services Architecture (DDSA)
• "Station Server" service engines at sites host dynamic services
• Auto-discovering, collaborative; scalable to thousands of service instances
• Servers interconnect dynamically and form a robust fabric
• Service agents: goal-oriented, autonomous, adaptive
• Adaptable to Web services, on many platforms and working environments (also mobile)
• Caltech / UPB (Romania) / NUST (Pakistan) collaboration
• See http://monalisa.cacr.caltech.edu and http://diamonds.cacr.caltech.edu

GAE Architecture (Caltech GAE Team)
• Analysis clients talk standard protocols to the Grid Services Web Server, a.k.a. the Clarens data/services portal
• A simple Web service API allows analysis clients (simple or complex) to operate in this architecture; typical clients: ROOT, Web browser, IGUANA, COJAC
• The Clarens portal hides the complexity of the Grid services from the client, but can expose it in as much detail as required, e.g. for monitoring
• Key features: global scheduler, catalogs, monitoring, and a Grid-wide execution service
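Since Clarens exposes its services through a simple Web-service API over standard protocols, the Python sketch below shows how a lightweight analysis client might talk to such a portal. The endpoint URL, the dataset path, and the "file.ls" method name are hypothetical placeholders, not the documented Clarens interface; only the system.listMethods introspection call is a standard part of XML-RPC.

```python
# Minimal sketch of an analysis client calling a Clarens-style web-services portal
# over XML-RPC. The URL and "file.ls" method are illustrative assumptions.
import xmlrpc.client

PORTAL_URL = "https://clarens.example.org:8443/clarens"   # hypothetical endpoint

def main() -> None:
    portal = xmlrpc.client.ServerProxy(PORTAL_URL)

    # Discover what the portal offers (standard XML-RPC introspection call).
    for method in portal.system.listMethods():
        print(method)

    # Ask a hypothetical file-catalog service for a dataset listing.
    for entry in portal.file.ls("/store/higgs/4mu"):
        print(entry)

if __name__ == "__main__":
    main()
```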
GAE Architecture: Structured Peer-to-Peer (Caltech GAE Team)
• The GAE, based on Clarens and Web services, easily allows a peer-to-peer configuration to be built, with the associated robustness and scalability features
• Flexible: allows easy creation, use and management of highly complex VO structures
• A typical peer-to-peer scheme would involve the Clarens servers acting as global peers that broker GAE client requests among all the Clarens servers available worldwide (a minimal brokering sketch appears at the end of this transcript)

UltraLight Collaboration: http://ultralight.caltech.edu
• Integrated hybrid experimental network, leveraging transatlantic R&D network partnerships; packet-switched + dynamic optical paths
• 10 GbE across the US and the Atlantic: NLR, DataTAG, TransLight, NetherLight, UKLight, etc.; extensions to Japan, Taiwan, Brazil
• End-to-end monitoring; real-time tracking and optimization; dynamic bandwidth provisioning
• Agent-based services spanning all layers of the system, from the optical cross-connects to the applications
• Partners: Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack, CERN, UERJ (Rio), NLR, CENIC, UCAID, TransLight, UKLight, NetherLight, UvA, UCLondon, KEK, Taiwan; Cisco, Level(3)

ICFA Standing Committee on Interregional Connectivity (SCIC)
• Created by ICFA in July 1998 in Vancouver, following ICFA-NTF
• Charge: make recommendations to ICFA concerning the connectivity between the Americas, Asia and Europe (and the network requirements of HENP)
• As part of the process of developing these recommendations, the committee should:
  - Monitor traffic
  - Keep track of technology developments
  - Periodically review forecasts of future bandwidth needs, and
  - Provide early warning of potential problems
• Create subcommittees when necessary to meet the charge
• The chair of the committee should report to ICFA once per year, at its joint meeting with laboratory directors (today)
• Representatives: major labs, ECFA, ACFA, North American users, South America

ICFA SCIC in 2002-2004: A Period of Intense Activity
• Strong focus on the Digital Divide continuing
• Five reports, presented to ICFA in 2003; see http://cern.ch/icfa-scic
  - Main report: Networking for HENP [H. Newman et al.]
  - Monitoring WG report ...
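Returning to the structured peer-to-peer GAE scheme above, the sketch below shows one simple way a global peer could broker a client request among a set of Clarens-style servers, routing it to the least-loaded one. The server names, URLs, load metric and routing policy are all illustrative assumptions; the actual GAE brokering logic is not specified in these slides.

```python
# Illustrative sketch of a "global peer" brokering a client request among a set of
# Clarens-style servers. The peer list, load query, and routing rule are hypothetical
# stand-ins for whatever discovery and monitoring the real system would use.
from dataclasses import dataclass
from typing import Callable

@dataclass
class PeerServer:
    name: str
    url: str
    get_load: Callable[[], float]      # e.g. queued tasks per CPU, from monitoring

def broker_request(peers: list[PeerServer], task: str) -> str:
    """Pick the least-loaded peer and (in a real system) forward the task to it."""
    best = min(peers, key=lambda p: p.get_load())
    return f"task '{task}' routed to {best.name} ({best.url})"

# Toy usage with fixed load values standing in for live monitoring data.
peers = [
    PeerServer("caltech", "https://tier2.example.edu/clarens", lambda: 0.7),
    PeerServer("cern",    "https://tier0.example.ch/clarens",  lambda: 0.3),
    PeerServer("fnal",    "https://tier1.example.gov/clarens", lambda: 0.5),
]
print(broker_request(peers, "higgs-4mu-analysis"))   # routes to the "cern" entry
```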