High Energy Physics: Networks & Grids Systems for Global Science. Harvey B. Newman.
<ul><li><p>High Energy Physics: Networks & Grids Systems for Global Science. Harvey B. Newman, California Institute of Technology. AMPATH Workshop, FIU, January 31, 2003</p></li><li><p>Next Generation Networks for Experiments: Goals and Needs</p><p>Providing rapid access to event samples, subsets and analyzed physics results from massive data stores: from Petabytes in 2002 and ~100 Petabytes by 2007, to ~1 Exabyte by ~2012.</p><p>Providing analyzed results with rapid turnaround, by coordinating and managing the large but LIMITED computing, data handling and NETWORK resources effectively.</p><p>Enabling rapid access to the data and to the collaboration, across an ensemble of networks of varying capability.</p><p>Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs, with reliable, monitored, quantifiable high performance.</p><p>Large data samples explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams.</p></li><li><p>LHC: Higgs decay into 4 muons (tracker only); 1000X the LEP data rate. 10<sup>9</sup> events/sec; selectivity: 1 in 10<sup>13</sup> (1 person in 1000 world populations).</p></li><li><p>Transatlantic Net WG (H. Newman, L. Price): Bandwidth Requirements [*], in Mbps. [*] BW requirements are increasing faster than Moore's Law. See http://gate.hep.anl.gov/lprice/TAN</p><table><tr><th>Experiment</th><th>2001</th><th>2002</th><th>2003</th><th>2004</th><th>2005</th><th>2006</th></tr><tr><td>CMS</td><td>100</td><td>200</td><td>300</td><td>600</td><td>800</td><td>2500</td></tr><tr><td>ATLAS</td><td>50</td><td>100</td><td>300</td><td>600</td><td>800</td><td>2500</td></tr><tr><td>BaBar</td><td>300</td><td>600</td><td>1100</td><td>1600</td><td>2300</td><td>3000</td></tr><tr><td>CDF</td><td>100</td><td>300</td><td>400</td><td>2000</td><td>3000</td><td>6000</td></tr><tr><td>D0</td><td>400</td><td>1600</td><td>2400</td><td>3200</td><td>6400</td><td>8000</td></tr><tr><td>BTeV</td><td>20</td><td>40</td><td>100</td><td>200</td><td>300</td><td>500</td></tr><tr><td>DESY</td><td>100</td><td>180</td><td>210</td><td>240</td><td>270</td><td>300</td></tr><tr><td>CERN BW</td><td>155-310</td><td>622</td><td>2500</td><td>5000</td><td>10000</td><td>20000</td></tr></table></li><li><p>HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps. Continuing the trend: ~1000 times bandwidth growth per decade; we are rapidly learning to use and share multi-Gbps networks.</p><table><tr><th>Year</th><th>Production</th><th>Experimental</th><th>Remarks</th></tr><tr><td>2001</td><td>0.155</td><td>0.622-2.5</td><td>SONET/SDH</td></tr><tr><td>2002</td><td>0.622</td><td>2.5</td><td>SONET/SDH; DWDM; GigE integration</td></tr><tr><td>2003</td><td>2.5</td><td>10</td><td>DWDM; 1 + 10 GigE integration</td></tr><tr><td>2005</td><td>10</td><td>2-4 X 10</td><td>λ switch; λ provisioning</td></tr><tr><td>2007</td><td>2-4 X 10</td><td>~10 X 10; 40 Gbps</td><td>1st gen. λ grids</td></tr><tr><td>2009</td><td>~10 X 10 or 1-2 X 40</td><td>~5 X 40 or ~20-50 X 10</td><td>40 Gbps λ switching</td></tr><tr><td>2011</td><td>~5 X 40 or ~20 X 10</td><td>~25 X 40 or ~100 X 10</td><td>2nd gen. λ grids; Terabit networks</td></tr><tr><td>2013</td><td>~Terabit</td><td>~Multi-Tbps</td><td>~Fill one fiber</td></tr></table></li><li><p>DataTAG Project [network map: Geneva, Abilene, ESnet, CalREN-2, New York, STAR TAP, StarLight]. An EU-solicited project.
Partners: CERN, PPARC (UK), Amsterdam (NL), and INFN (IT); with US partners (DOE/NSF: UIC, NWU and Caltech). Main aims: ensure maximum interoperability between US and EU Grid projects; provide a transatlantic testbed for advanced network research. A 2.5 Gbps wavelength triangle from 7/02, growing to a 10 Gbps triangle by early 2003.</p></li><li><p>Progress: Maximum Sustained TCP Throughput on Transatlantic and US Links</p><p>8-9/01: 105 Mbps with 30 streams, SLAC-IN2P3; 102 Mbps in 1 stream, Caltech-CERN</p><p>11/5/01: 125 Mbps in one stream (modified kernel), Caltech-CERN</p><p>1/09/02: 190 Mbps for one stream shared on two 155 Mbps links</p><p>3/11/02: 120 Mbps disk-to-disk with one stream on a 155 Mbps link, Chicago-CERN</p><p>5/20/02: 450-600 Mbps, SLAC-Manchester on OC-12 with ~100 streams</p><p>6/1/02: 290 Mbps, Chicago-CERN, one stream on OC-12 (modified kernel)</p><p>9/02: 850, 1350 and 1900 Mbps, Chicago-CERN, with 1, 2 and 3 GbE streams on an OC-48 link</p><p>11-12/02: FAST: 940 Mbps in 1 stream, Sunnyvale-CERN; 9.4 Gbps in 10 flows, Sunnyvale-Chicago</p><p>Also see http://www-iepm.slac.stanford.edu/monitoring/bulk/ and the Internet2 E2E Initiative: http://www.internet2.edu/e2e</p></li><li><p>FAST (Caltech): A Scalable, Fair Protocol for Next-Generation Networks: from 0.1 to 100 Gbps. URL: netlab.caltech.edu/FAST. SC2002, 11/02. C. Jin, D. Wei, S. Low, FAST Team & Partners.</p><p>[Chart: Internet2 Land Speed Record milestones, 2000-2002: 29.3.00 (multiple flows), 22.8.02 (IPv6), 9.4.02 (1 flow), SC2002 (1, 2 and 10 flows), on Baltimore-Geneva, Baltimore-Sunnyvale and Sunnyvale-Geneva paths]</p><p>Highlights of FAST TCP (standard packet size): 940 Mbps in a single flow per GE card (9.4 petabit-m/sec; 1.9 times the I2 LSR); 9.4 Gbps with 10 flows (37.0 petabit-m/sec; 6.9 times the I2 LSR); 22 TB transferred in 6 hours, in 10 flows.</p><p>Implementation: sender-side (only) modifications; delay (RTT) based; stabilized Vegas.</p><p>Next: 10 GbE; 1 GB/sec disk to disk.</p></li><li><p>FAST TCP: Aggregate Throughput (standard MTU; utilization averaged over > 1 hr)</p><table><tr><th>Flows</th><th>1</th><th>2</th><th>7</th><th>9</th><th>10</th></tr><tr><td>Average utilization</td><td>95%</td><td>92%</td><td>90%</td><td>90%</td><td>88%</td></tr></table></li><li><p>HENP Lambda Grids: Fibers for Physics. Problem: extract small data subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte data stores. Survivability of the HENP global Grid system, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time. Example: take 800 seconds to complete the transaction.
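The throughput figures that follow drop out of simple arithmetic: moving S terabytes in 800 seconds requires 8 x S x 10^12 / 800 bits per second. A minimal sketch of the calculation (the helper name is illustrative, not from the talk):

```python
# Required sustained throughput to move a data "transaction" of a given
# size within a fixed completion time (using the slide's 800-second
# example; the function name is a hypothetical helper, not from the talk).

def required_gbps(size_tb: float, seconds: float = 800.0) -> float:
    """Throughput in Gbps needed to move size_tb terabytes in `seconds`."""
    bits = size_tb * 1e12 * 8          # terabytes -> bits
    return bits / seconds / 1e9        # bits/sec -> Gbps

for tb in (1, 10, 100):
    print(f"{tb:>4} TB in 800 s -> {required_gbps(tb):,.0f} Gbps")
```

Even a 1 TB extraction completed in 800 seconds needs a fully dedicated 10 Gbps wavelength, which is why Terabyte transactions are tied to 10 Gbps lambda switching in the summary below.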
Then:</p><table><tr><th>Transaction Size (TB)</th><th>Net Throughput (Gbps)</th></tr><tr><td>1</td><td>10</td></tr><tr><td>10</td><td>100</td></tr><tr><td>100</td><td>1000 (capacity of one fiber today)</td></tr></table><p>Summary: providing switching of 10 Gbps wavelengths within ~3-5 years, and Terabit switching within 5-8 years, would enable Petascale Grids with Terabyte transactions, as required to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields.</p></li><li><p>Data Intensive Grids Now: Large Scale Production</p><p>Efficient sharing of distributed, heterogeneous compute and storage resources. Virtual Organizations and institutional resource sharing. Dynamic reallocation of resources to target specific problems. Collaboration-wide data access and analysis environments.</p><p>Grid solutions NEED to be scalable and robust: they must handle many petabytes per year, tens of thousands of CPUs, and tens of thousands of jobs.</p><p>The Grid solutions presented here are supported in part by GriPhyN, iVDGL, PPDG, EDG, and DataTAG. We are learning a lot from these current efforts. For example: 1M events processed using VDT, Oct.-Dec. 2002.</p></li><li><p>Beyond Production: Web Services for Ubiquitous Data Access and Analysis by a Worldwide Scientific Community</p><p>Web Services: easy, flexible, platform-independent access to data (object collections in databases); well adapted to use by individual physicists, teachers and students.</p><p>SkyQuery example (JHU/FNAL/Caltech), with Web Service based access to astronomy surveys: the surveys can be individually or simultaneously queried via a Web interface. The simplicity of the interface hides considerable server power (from stored procedures etc.).
This is a traditional Web Service, with no user authentication required.</p></li><li><p>COJAC (CMS ORCA Java Analysis Component): Java3D, Objectivity, JNI and Web Services. Demonstrated Caltech-Rio de Janeiro and Caltech-Chile in 2002.</p></li><li><p>LHC Distributed CM: HENP Data Grids Versus Classical Grids</p><p>Grid projects have been a step forward for HEP and the LHC: a path to meet the LHC computing challenges. The Virtual Data concept is applied to large-scale automated data processing among worldwide-distributed regional centers.</p><p>The original Computational and Data Grid concepts are largely stateless, open systems, known to be scalable; analogous to the Web. The classical Grid architecture has a number of implicit assumptions: the ability to locate and schedule suitable resources within a tolerably short time (i.e. resource richness); short transactions; relatively simple failure modes.</p><p>HEP Grids, by contrast, are data-intensive and resource-constrained: long transactions and some long queues; schedule conflicts, policy decisions and task redirection; a lot of global system state to be monitored and tracked.</p></li><li><p>Current Grid Challenges: Secure Workflow Management and Optimization</p><p>Maintaining a global view of resources and system state; coherent end-to-end system monitoring. Adaptive learning: new algorithms and strategies for execution optimization (increasingly automated).</p><p>Workflow: a strategic balance of policy versus moment-to-moment capability to complete tasks. Balance high levels of usage of limited resources against better turnaround times for priority jobs. Goal-oriented algorithms; steering requests according to (yet to be developed) metrics. Handling user-Grid interactions: guidelines; agents. Building higher-level services, and an integrated, scalable user environment for the above.</p></li><li><p>Distributed System Services Architecture (DSSA): CIT/Romania/Pakistan</p><p>Agents: autonomous, auto-discovering, self-organizing, collaborative. Station Servers (static) host mobile Dynamic Services. Servers interconnect dynamically, forming a robust fabric in which mobile agents travel with a payload of (analysis) tasks. Adaptable to Web services (OGSA) and many platforms; adaptable to ubiquitous, mobile working environments.</p><p>Managing global systems of increasing scope and complexity, in the service of science and society, requires a new generation of scalable, autonomous, artificially intelligent software systems.</p></li><li><p>MonALISA: A Globally Scalable Grid Monitoring System. By I. Legrand (Caltech).</p><p>Deployed on the US CMS Grid. Agent-based dynamic information and resource discovery mechanism. Talks with other monitoring systems. Implemented in Java/Jini, with SNMP and WSDL/SOAP with UDDI. Part of a Global Grid Control Room Service.</p></li><li><p>MONARC SONN: 3 Regional Centres Learning to Export Jobs (Day 9). NUST: 20 CPUs; CERN: 30 CPUs; Caltech: 25 CPUs. Links: 1 MB/s at 150 ms RTT; 1.2 MB/s at 150 ms RTT; 0.8 MB/s at 200 ms RTT. Optimized by Day 9, with site values of 0.73, 0.66 and 0.83.</p></li><li><p>NSF ITR: Globally Enabled Analysis Communities</p><p>Develop and build Dynamic Workspaces; build Private Grids to support scientific analysis communities, using agent-based peer-to-peer Web Services. Construct autonomous communities operating within global collaborations. Empower small groups of scientists (and teachers and students) to profit from and contribute to international big science; drive the democratization of science via the deployment of new technologies.</p></li><li><p>Private Grids and P2P Sub-Communities in Global CMS</p></li><li><p>14,600 host devices; 7,800 registered users in 64 countries; 45 network servers; annual growth 2 to 3X.</p></li><li><p>An Inter-Regional Center for Research, Education and Outreach, and CMS CyberInfrastructure</p><p>Foster FIU and Brazil (UERJ) strategic expansion into CMS physics through Grid-based computing. Development and operation, for science, of international networks, Grids and collaborative systems, with a focus on research at the high energy frontier. Developing a scalable Grid-enabled analysis environment, broadly applicable to science and education, made accessible through the
use of agent-based (AI) autonomous systems and Web Services.</p><p>Serving under-represented communities, at FIU and in South America: training and participation in the development of state-of-the-art technologies; developing the teachers and trainers.</p><p>Relevance to science, education and society at large: developing the future science and information S&E workforce; closing the Digital Divide.</p><p>Presentations:</p><p>Berkeley, June 7, 2002 (Bannister, Walrand, Fisher)</p><p>IPAM, Arrowhead, June 10, 2002 (Doyle, Low, Willinger)</p><p>Postech-Caltech Symposium, Pohang, S. Korea, Sept 30, 2002 (Jin Lee)</p><p>Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, Oct 3, 2002 (Hajek, Srikant)</p><p>IEEE Computer Communications Workshop (CCW), Santa Fe, NM, Oct 14, 2002 (Kunniyur), presenter: Dawn</p><p>Lee Center Industry Workshop, Oct 18, 2002, Pasadena (include Cheng Jin's slides)</p></li></ul>
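As background to the single-stream TCP records quoted earlier: a sender can only sustain a given rate if it keeps a full bandwidth-delay product of data in flight, which is why transatlantic single-stream transfers of this era needed modified kernels or many parallel streams. A rough sketch of the calculation (the 120 ms transatlantic RTT is an assumed round number, not a figure from the talk):

```python
# Bandwidth-delay product: the TCP window needed to keep a long, fat
# pipe full. The rate and RTT below are illustrative, not measured values.

def bdp_bytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bytes in flight needed to sustain rate_gbps over a path with rtt_ms RTT."""
    return rate_gbps * 1e9 / 8 * (rtt_ms / 1e3)

# 1 Gbps across the Atlantic at an assumed ~120 ms RTT needs a ~15 MB
# window, far beyond the ~64 KB defaults of early-2000s TCP stacks.
print(f"{bdp_bytes(1.0, 120) / 1e6:.0f} MB")
```

The same arithmetic explains the multi-stream entries in the progress list: N parallel streams effectively multiply the aggregate window by N without retuning each connection.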