Transcript
Page 1: HENP DATA GRIDS and STARTAP HENP DATA GRIDS and STARTAP Worldwide Analysis at Regional Centers Harvey B. Newman (Caltech) HPIIS Review San Diego, October

HENP DATA GRIDS and STARTAP

Worldwide Analysis at Regional Centers
Harvey B. Newman (Caltech)

HPIIS Review
San Diego, October 25, 2000

http://l3www.cern.ch/~newman/hpiis2000.ppt

Page 2

Next Generation Experiments: Physics and Technical Goals

The extraction of small or subtle new “discovery” signals from large and potentially overwhelming backgrounds; or “precision” analysis of large samples

Providing rapid access to event samples and subsets from massive data stores, from ~300 Terabytes in 2001, Petabytes by ~2003, ~10 Petabytes by 2006, to ~100 Petabytes by ~2010.

Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and network resources effectively

Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability, using heterogeneous resources.

Page 3

The Large Hadron Collider (2005-)

A next-generation particle collider

the largest superconductor installation in the world

A bunch-bunch collision will take place every 25 nanoseconds, each generating ~20 interactions

But only one in a trillion may lead to a major physics discovery

Real-time data filtering: Petabytes per second to Gigabytes per second

Accumulated data of many Petabytes/Year

Large data samples explored and analyzed by thousands of geographically dispersed scientists, in hundreds of teams

Page 4

Computing Challenges: LHC Example

Geographical dispersion: of people and resources
Complexity: the detector and the LHC environment
Scale: Tens of Petabytes per year of data

1800 Physicists, 150 Institutes, 34 Countries

Major challenges associated with:
Communication and collaboration at a distance
Network-distributed computing and data resources
Remote software development and physics analysis
R&D: New Forms of Distributed Systems: Data Grids

Page 5

Four LHC Experiments: The Petabyte to Exabyte Challenge

ATLAS, CMS, ALICE, LHCB

Higgs + New particles; Quark-Gluon Plasma; CP Violation

Data written to tape: ~25 Petabytes/Year and UP; 0.25 Petaflops and UP

0.1 to 1 Exabyte (1 EB = 10^18 Bytes) (~2010) (~2015 ?) Total for the LHC Experiments

Page 6

Higgs Search: LEPC, September 2000

Page 7

LHC: Higgs Decay into 4 muons (tracker only); 1000X LEP Data Rate

All charged tracks with pt > 2 GeV

Reconstructed tracks with pt > 25 GeV

(+30 minimum bias events)

10^9 events/sec, selectivity: 1 in 10^13 (1 person in a thousand world populations)
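The rate figures quoted so far can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the 25 ns bunch spacing and ~20 interactions per crossing from the LHC slide and the 1-in-10^13 selectivity from this one, and the variable names are made up for the example. The product of crossing rate and interactions per crossing comes out just under the ~10^9 events/sec quoted above.

```python
# Rough rate arithmetic for the figures quoted on the slides.
# All inputs come from the slides; the names are illustrative only.

BUNCH_SPACING_S = 25e-9          # one bunch-bunch collision every 25 ns
INTERACTIONS_PER_CROSSING = 20   # ~20 interactions per collision
SELECTIVITY = 1e-13              # 1 selected event in 10^13
SECONDS_PER_DAY = 86_400

crossing_rate_hz = 1.0 / BUNCH_SPACING_S
interaction_rate_hz = crossing_rate_hz * INTERACTIONS_PER_CROSSING
selected_per_day = interaction_rate_hz * SELECTIVITY * SECONDS_PER_DAY

print(f"crossing rate:    {crossing_rate_hz:.3g} Hz")
print(f"interaction rate: {interaction_rate_hz:.3g} per second")
print(f"selected events:  {selected_per_day:.2g} per day")
```

At this selectivity the signal arrives at a handful of events per day of continuous running, which is why the filtering described next has to be both fast and highly selective.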

Page 8

On-line Filter System

Large variety of triggers and thresholds: select physics à la carte
Multi-level trigger
Filter out less interesting events
Online reduction 10^7
Keep highly selected events
Result: Petabytes of Binary Compact Data Per Year

40 MHz (1000 TB/sec equivalent)
Level 1 - Special Hardware
75 kHz (75 GB/sec), fully digitised
Level 2 - Processors
5 kHz (5 GB/sec)
Level 3 - Farm of Commodity CPUs
100 Hz (100 MB/sec)
Data Recording & Offline Analysis
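The trigger cascade can be summarised numerically. The following sketch uses the rates from the slide; pairing each rate with the level that produces it, and the 10^7 seconds of running per year, are assumptions made for the illustration rather than statements from the talk.

```python
# Trigger cascade as (stage, output event rate in Hz, data rate in bytes/s).
# Rates are the slide's figures; which level produces which rate is assumed
# from the order of the levels.
STAGES = [
    ("Detector output", 40e6, 1000e12),
    ("Level 1 (special hardware)", 75e3, 75e9),
    ("Level 2 (processors)", 5e3, 5e9),
    ("Level 3 (commodity CPU farm)", 100.0, 100e6),
]

SECONDS_PER_YEAR_RUNNING = 1e7  # assumed live time per year, not from the slide


def rejection_factors(stages):
    """Return (stage name, factor by which that stage cuts the event rate)."""
    factors = []
    for (_, rate_in, _), (name, rate_out, _) in zip(stages, stages[1:]):
        factors.append((name, rate_in / rate_out))
    return factors


event_rate_reduction = STAGES[0][1] / STAGES[-1][1]
data_rate_reduction = STAGES[0][2] / STAGES[-1][2]
recorded_bytes_per_year = STAGES[-1][2] * SECONDS_PER_YEAR_RUNNING

for name, factor in rejection_factors(STAGES):
    print(f"{name}: event rate cut by ~{factor:.0f}x")
print(f"event-rate reduction: {event_rate_reduction:.1e}")
print(f"data-rate reduction:  {data_rate_reduction:.1e}")
print(f"recorded per year:    {recorded_bytes_per_year / 1e15:.1f} PB")
```

The data-rate ratio between 1000 TB/sec and 100 MB/sec is 10^7, matching the online reduction quoted on the slide, and 100 MB/sec over the assumed running time lands at the petabyte-per-year scale.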

Page 9

LHC Vision: Data Grid Hierarchy

[Figure: tiered Data Grid hierarchy]
Experiment; Online System
Tier 0 +1: Offline Farm, CERN Computer Ctr > 20 TIPS
Tier 1: France Centre, FNAL Center, Italy Center, UK Center
Tier 2: Tier2 Centers
Tier 3: Institutes (~0.25 TIPS each), with physics data cache
Tier 4: Workstations
Link labels in the figure: ~PByte/sec; ~100 MBytes/sec; ~2.5 Gbits/sec; ~0.6-2.5 Gbits/sec; ~622 Mbits/sec; 100 - 1000 Mbits/sec

Physicists work on analysis “channels”
Each institute has ~10 physicists working on one or more channels
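To get a feel for what the link speeds in the hierarchy mean in practice, the sketch below estimates how long a dataset takes to cross a link. The 1 TB sample size and the 50% usable-bandwidth factor are assumptions chosen for illustration; only the link speeds are taken from the figure.

```python
def transfer_time_hours(dataset_bytes, link_bits_per_s, efficiency=0.5):
    """Hours needed to move dataset_bytes over a link of the given nominal speed.

    efficiency is the assumed fraction of nominal bandwidth actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_bits_per_s = link_bits_per_s * efficiency
    return dataset_bytes * 8 / usable_bits_per_s / 3600


DATASET_BYTES = 1e12  # hypothetical 1 TB analysis sample

LINKS = {
    "2.5 Gbits/sec": 2.5e9,
    "622 Mbits/sec": 622e6,
    "100 Mbits/sec": 100e6,
}

for label, speed in LINKS.items():
    hours = transfer_time_hours(DATASET_BYTES, speed)
    print(f"{label:>14}: {hours:6.1f} h for 1 TB at 50% efficiency")
```

Under these assumptions a terabyte moves in a couple of hours on the fastest links but takes well over a day at 100 Mbits/sec, which is one way to see why the model caches data close to the institutes that use it.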

Page 10

Why Worldwide Computing? Regional Center Concept Advantages

Managed, fair-shared access for Physicists everywhere
Maximize total funding resources while meeting the total computing and data handling needs
Balance between proximity of datasets to appropriate resources, and to the users
  Tier-N Model
Efficient use of network: higher throughput
  Per Flow: Local > regional > national > international
Utilizing all intellectual resources, in several time zones
  CERN, national labs, universities, remote sites
  Involving physicists and students at their home institutions
Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
  And/or by Common Interests (physics topics, subdetectors,…)
Manage the System’s Complexity
  Partitioning facility tasks, to manage and focus resources

Page 11

Grid Services Architecture [*]

Applns: A Rich Set of HEP Data-Analysis Related Applications
Appln Toolkits: Remote viz toolkit; Remote comp. toolkit; Remote data toolkit; Remote sensors toolkit; Remote collab. toolkit; ...
Grid Services: Protocols, authentication, policy, resource management, instrumentation, discovery, etc.
Grid Fabric: Data stores, networks, computers, display devices,… ; associated local services

[*] Adapted from Ian Foster

Page 12

SDSS Data Grid (In GriPhyN): A Shared Vision

Three main functions:
Raw data processing on a Grid (FNAL)
  Rapid turnaround with TBs of data
  Accessible storage of all image data
Fast science analysis environment (JHU)
  Combined data access + analysis of calibrated data
  Distributed I/O layer and processing layer; shared by whole collaboration
Public data access
  SDSS data browsing for astronomers, and students
  Complex query engine for the public

Page 13

Roles of Projects for HENP Distributed Analysis

RD45, GIOD: Networked Object Databases
Clipper/GC, FNAL/SAM: High speed access to Objects or File data for processing and analysis
SLAC/OOFS: Distributed File System + Objectivity Interface
NILE, Condor: Fault Tolerant Distributed Computing
MONARC: LHC Computing Models: Architecture, Simulation, Strategy, Politics
ALDAP: OO Database Structures & Access Methods for Astrophysics and HENP Data
PPDG: First Distributed Data Services and Data Grid System Prototype
GriPhyN: Production-Scale Data Grids
EU Data Grid

Page 14

GIOD: Globally Interconnected Object Databases

[Figure labels: Hit, Track, Detector]

Multi-TB OO Database Federation; used across LANs and WANs

170 MByte/sec CMS Milestone

Developed Java 3D OO Reconstruction, Analysis and Visualization Prototypes that Work Seamlessly Over Worldwide Networks

Deployed facilities and database federations as testbeds for Computing Model studies

Page 15

The Particle Physics Data Grid (PPDG)

ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U.Wisc/CS

First Round Goal: Optimized cached read access to 10-100 Gbytes drawn from a total data set of 0.1 to ~1 Petabyte

Site to Site Data Replication Service (100 Mbytes/sec)
PRIMARY SITE: Data Acquisition, CPU, Disk, Tape Robot
SECONDARY SITE: CPU, Disk, Tape Robot

Multi-Site Cached File Access Service
PRIMARY SITE: DAQ, Tape, CPU, Disk, Robot
Satellite Site (2 shown): Tape, CPU, Disk, Robot
University (5 shown): CPU, Disk, Users

Matchmaking, Co-Scheduling: SRB, Condor, Globus services; HRM, NWS

Page 16

PPDG WG1: Request Manager

[Diagram labels]
CLIENT (2 shown); Logical Request
REQUEST MANAGER: Request Interpreter; Request Planner (Matchmaking); Request Executor
Event-file Index; Logical Set of Files
Replica catalog; Network Weather Service
Physical file transfer requests; GRID
DRM with Disk Cache (3 shown); HRM with tape system
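The planning (matchmaking) step can be pictured as choosing, for each logical file, the physical replica that should reach the client soonest. The sketch below is a toy model under stated assumptions, not the PPDG implementation: the `Replica` record, the cost formula, the file size and the tape staging penalty are all hypothetical, and a plain dictionary stands in for the replica catalog and the network forecasts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    on_disk: bool          # False means the file must first be staged from tape
    predicted_mbps: float  # forecast throughput to the client (hypothetical input)


def delivery_cost_s(replica, file_mb, stage_penalty_s):
    """Estimated seconds to deliver one file from this replica."""
    transfer_s = file_mb * 8 / replica.predicted_mbps
    return transfer_s + (0.0 if replica.on_disk else stage_penalty_s)


def plan_request(logical_files, replica_catalog, file_mb=1000.0, stage_penalty_s=300.0):
    """Map each logical file name to the replica with the lowest estimated cost."""
    plan = {}
    for lfn in logical_files:
        candidates = replica_catalog.get(lfn, [])
        if not candidates:
            raise LookupError(f"no replica registered for {lfn}")
        plan[lfn] = min(
            candidates, key=lambda r: delivery_cost_s(r, file_mb, stage_penalty_s)
        )
    return plan


# Illustrative catalog; file and site names are invented for the example.
catalog = {
    "run42.evts": [Replica("siteA", True, 40.0), Replica("siteB", False, 400.0)],
    "run43.evts": [Replica("siteA", False, 40.0), Replica("siteC", True, 100.0)],
}
for lfn, replica in plan_request(["run42.evts", "run43.evts"], catalog).items():
    print(f"{lfn} -> {replica.site}")
```

In this toy catalog the slower disk copy wins for the first file because the faster site would have to stage from tape first, which is the kind of trade-off a planner with access to a replica catalog and network forecasts is meant to resolve.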

Page 17

Earth Grid System Prototype: Inter-communication Diagram

[Diagram labels]
LLNL; Client; Request Manager; Disk
ISI: GSI-wuftpd, Disk
SDSC: GSI-pftpd, HPSS
LBNL: GSI-wuftpd, Disk
ANL: GSI-wuftpd, Disk
NCAR: GSI-wuftpd, Disk
LBNL: Disk on Clipper, HPSS, HRM
ANL: Replica Catalog
GIS with NWS
Connections: GSI-ncftp; LDAP Script; LDAP C API or Script; CORBA

Page 18

Grid Data Management Prototype (GDMP)

Distributed Job Execution and Data Handling: Transparency, Performance, Security, Fault Tolerance, Automation

[Diagram: Site A, Site B, Site C; labels: Submit job; Job writes data locally; Replicate data]

Jobs are executed locally or remotely
Data is always written locally
Data is replicated to remote sites

GDMP V1.1: Caltech + EU DataGrid WP2
Tests by CALTECH, CERN, FNAL, Pisa for CMS “HLT” Production 10/2000; Integration with ENSTORE, HPSS, Castor
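The three rules on this slide (jobs run locally or remotely, output is always written locally, data is then replicated to remote sites) can be modelled in a few lines. This is a toy in-memory sketch under stated assumptions, not GDMP itself: the `Site` class, its method names and the subscriber list are hypothetical, and dictionaries stand in for real storage and file transfer.

```python
class Site:
    """Toy stand-in for a grid site with its own local storage."""

    def __init__(self, name):
        self.name = name
        self.files = {}        # file name -> contents held at this site
        self.subscribers = []  # sites that receive copies of new local files

    def add_subscriber(self, other):
        if other is not self and other not in self.subscribers:
            self.subscribers.append(other)

    def run_job(self, job, output_name):
        """Run a job at this site; its output is written to local storage only."""
        self.files[output_name] = job()
        return output_name

    def replicate(self, name):
        """Copy one locally held file to every subscriber; return their names."""
        data = self.files[name]
        for site in self.subscribers:
            site.files[name] = data
        return [site.name for site in self.subscribers]


site_a, site_b, site_c = Site("A"), Site("B"), Site("C")
site_b.add_subscriber(site_a)
site_b.add_subscriber(site_c)

# A job submitted from elsewhere executes at Site B and writes there first.
produced = site_b.run_job(lambda: b"simulated events", "hlt_sample.db")
print("written locally at B:", produced in site_b.files)
print("replicated to:", site_b.replicate(produced))
```

Keeping the write local and making replication a separate step means a job never blocks on a wide-area transfer; the copy to other sites can be retried independently if a link fails.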

Page 19

GriPhyN: Grid Physics Network

Data Intensive Science

A New Form of Integrated Distributed System
Meeting the Scientific Goals of LIGO, SDSS and the LHC Experiments
Focus on Tier2 Centers at Universities
In a Unified Hierarchical Grid of Five Levels
18 Centers; with Four Sub-Implementations
5 Each in US for LIGO, CMS, ATLAS; 3 for SDSS
Near Term Focus on LIGO, SDSS handling of real data; LHC “Data Challenges” with simulated data
Cooperation with PPDG, MONARC and EU DataGrid

http://www.phys.ufl.edu/~avery/GriPhyN/

Page 20

GriPhyN: PetaScale Virtual Data Grids

[Diagram labels]
Users: Production Team; Individual Investigator; Workgroups
Interactive User Tools
Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools
Resource Management Services; Security and Policy Services; Other Grid Services
Transforms
Raw data source
Distributed resources (code, storage, computers, and network)

Page 21

EU DataGrid
http://www.cern.ch/grid

Organized by CERN
HEP Participants: Czech Republic, France, Germany, Hungary, Italy, Netherlands, Portugal, UK; (US)
Industrial participation
Grid forum context
12 Work Packages (One coordinator each)
Middleware: Work scheduling; data management; application monitoring; fabric management; storage management
Infrastructure: Testbeds and demonstrators; advanced network services
Applications: HEP, Earth Observation; Biology

[*] Basic Middleware Framework: Globus

Page 22

EU DataGrid Project Work Packages

Work Package Number | Work Package Title | Lead Contractor
WP1 | Grid Workload Management | INFN
WP2 | Grid Data Management | CERN
WP3 | Grid Monitoring Services | PPARC
WP4 | Fabric Management | CERN
WP5 | Mass Storage Management | PPARC
WP6 | Integration Testbed | CNRS
WP7 | Network Services | CNRS
WP8 | High Energy Physics Applications | CERN
WP9 | Earth Observation Science Applications | ESA
WP10 | Biology Science Applications | INFN
WP11 | Dissemination and Exploitation | INFN
WP12 | Project Management | CERN

Page 23

Emerging Data Grid User Communities

NSF Network for Earthquake Engineering Simulation (NEES)
  Integrated instrumentation, collaboration, simulation
Grid Physics Network (GriPhyN)
  ATLAS, CMS, LIGO, SDSS
  World-wide distributed analysis of Petascale data
Access Grid; VRVS: supporting group-based collaboration

And
Genomics, Proteomics, ...
The Earth System Grid and EOSDIS
Federating Brain Data
Computed MicroTomography
…
NVO, GVO

Page 24

GRIDs In 2000: Summary

Grids are changing the way we do science and engineering
From Computation to Data
Key services and concepts have been identified, and development has started
Major IT challenges remain
An Opportunity & Obligation for HEP/CS Collaboration
Transition of services and applications to production use is starting to occur
In future more sophisticated integrated services and toolsets (Inter- and IntraGrids+) could drive advances in many fields of science & engineering
HENP, facing the need for Petascale Virtual Data, is both an early adopter, and a leading developer of Data Grid technology

Page 25

Bandwidth Requirements Projection (Mbps): ICFA-NTF

Category | 1998 | 2000 | 2005
BW Utilized Per Physicist (and Peak BW Used) | 0.05 - 0.25 (0.5 - 2) | 0.2 - 2 (2 - 10) | 0.8 - 10 (10 - 100)
BW Utilized by a University Group | 0.25 - 10 | 1.5 - 45 | 34 - 622
BW to a Home Laboratory Or Regional Center | 1.5 - 45 | 34 - 155 | 622 - 5000
BW to a Central Laboratory Housing One or More Major Experiments | 34 - 155 | 155 - 622 | 2500 - 10000
BW on a Transoceanic Link | 1.5 - 20 | 34 - 155 | 622 - 5000

Page 26

US-CERN BW Requirements Projection (PRELIMINARY)

Year | 2001 | 2002 | 2003 | 2004 | 2005 | 2006
Installed Link BW in Mbps, Incl. New SLAC Throughput [*] | 310 (120) | 622 (250) | 1600 (400) | 2400 (600) | 4000 (1000) | 6500 [#] (1600)

[#] Includes ~1.5 Gbps Each for ATLAS and CMS, Plus BaBar, Run2 and Other
[*] D0 and CDF at Run2: Needs Presumed to Be Comparable to BaBar
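The growth implied by the projection is easier to see as ratios. The sketch below takes the installed-bandwidth figures for 2001 to 2006 from the table and computes the year-over-year factors and the equivalent constant annual growth factor; the function names are illustrative only.

```python
YEARS = [2001, 2002, 2003, 2004, 2005, 2006]
INSTALLED_MBPS = [310, 622, 1600, 2400, 4000, 6500]  # from the projection table


def year_over_year(values):
    """Ratio of each year's value to the previous year's."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


def annual_growth_factor(first, last, n_years):
    """Constant yearly factor that would take `first` to `last` in n_years."""
    return (last / first) ** (1.0 / n_years)


yoy = year_over_year(INSTALLED_MBPS)
overall = annual_growth_factor(
    INSTALLED_MBPS[0], INSTALLED_MBPS[-1], YEARS[-1] - YEARS[0]
)

for year, factor in zip(YEARS[1:], yoy):
    print(f"{year}: x{factor:.2f} over the previous year")
print(f"equivalent constant growth: x{overall:.2f} per year")
```

The projection amounts to a little under a doubling of installed bandwidth every year over the five-year span, with the steepest steps in the first two years.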

Page 27

Daily, Weekly, Monthly and Yearly Statistics on the 45 Mbps US-CERN Link

Page 28

HEP Network Requirements and STARTAP

Beyond the requirement of adequate bandwidth, physicists in HENP’s major experiments depend on:

Network and user software that will work together to provide high throughput and to manage the bandwidth effectively

A suite of videoconference and high-level tools for remote collaboration that make data analysis from the US (and from other world regions) effective

An integrated set of local, regional, national and international networks that interoperate seamlessly, without bottlenecks

Page 29

Configuration at Chicago with KPN/Qwest

Page 30

HEP Network Requirements and STARTAP

The STARTAP, a professionally managed international peering point with an open HP policy, has been and will continue to be vital for US involvement in the LHC, and thus for the progress of the LHC physics program.

Our development of worldwide Data Grid systems, in collaboration with the European Union and other world regions, will depend on the STARTAP for joint prototyping, tests and developments using next-generation network, software and database technology.

A scalable and cost-effective growth path for the STARTAP will be needed, as a central component of international networks for HENP, and other fields.

An optical STARTAP handling OC-48 and OC-192 links, with favorable peering and transit arrangements across the US would be well-matched to our future plans.

Page 31

US-CERN line connection to ESnet: to HENP Labs Through STARTAP

Page 32

TCP throughput performance: Caltech/CERN Via STARTAP

[Plots: From Caltech to CERN; From CERN to Caltech]

Page 33

CA*net 4 Possible Architecture

[Map nodes: Vancouver, Calgary, Regina, Winnipeg, Ottawa, Montreal, Toronto, Halifax, St. John’s, Fredericton, Charlottetown, Chicago, Seattle, New York, Los Angeles, Miami, Pasadena, Europe]
[Legend: Dedicated Wavelength or SONET channel; OBGP switches; Optional Layer 3 aggregation service; Large channel WDM system]

Page 34

OBGP Traffic Engineering - Physical

[Diagram labels: Tier 1 ISP; Tier 2 ISP; Intermediate ISP; AS 1, AS 2, AS 3, AS 4, AS 5; Dual Connected Router to AS 5; Red Default Wavelength]

Optical switch looks like BGP router and AS1 is direct connected to Tier 1 ISP but still transits AS 5

Router redirects networks with heavy traffic load to optical switch, but routing policy still maintained by ISP

Bulk of AS 1 traffic is to Tier 1 ISP

For simplicity only data forwarding paths in one direction shown

Page 35

Worldwide Computing Issues

Beyond Grid Prototype Components: Integration of Grid Prototypes for End-to-end Data Transport
  Particle Physics Data Grid (PPDG) ReqM
  PPDG/EU DataGrid GDMP for CMS HLT Productions
Start Building the Grid System(s): Integration with Experiment-specific software frameworks
Derivation of Strategies (MONARC Simulation System)
  Data caching, query estimation, co-scheduling
  Load balancing and workload management amongst Tier0/Tier1/Tier2 sites (SONN by Legrand)
  Transaction robustness: simulate and verify
Transparent Interfaces for Replica Management
  Deep versus shallow copies: Thresholds; tracking, monitoring and control

Page 36

VRVS Remote Collaboration System: Statistics

[Chart: Number of Machines and People registered in VRVS, by month from Jan-97 to Sep-00; series: Machines Registered, People Registered; vertical axis 0 to 3400]

30 Reflectors, 52 Countries

Mbone, H.323, MPEG2 Streaming, VNC

Page 37

VRVS: Mbone/H.323/QT Snapshot

VRVS Future evolution/integration (R&D)

Wider Deployment and Support of VRVS.
High Quality video and audio (MPEG1, MPEG2,..).
Shared virtual workspaces, applications, and environment
Integration of H.323 ITU Standard
Quality of Service (QoS) over the network
Improved security, authentication and confidentiality
Remote control of video cameras via a Java applet

Page 38

Demonstrations (HN, J. Bunn, P. Galvez): CMSOO and VRVS

CMSOO: Java 3D Event Display

IGrid2000, Yokohama, July 2000

Page 39

STARTAP: Selected HENP Success Stories (1)

Onset of large scale optimized Production file transfers, involving both HENP Labs & Universities
  BaBar, CMS, ATLAS
  Upcoming D0, CDF at FNAL/Run2; RHIC
Seamless remote access to Object databases
  CMSOO demos: IGrid2000 (Yokohama)
  Now starting on distributed CMS ORCA OO (TB to PB) DB Access
CMS User Analysis Environment (UAE)
  Worldwide Grid-enabled view of the data, along with visualizations, data presentation and analysis
  A User-view across the Data Grid

Page 40

STARTAP: Selected HENP Success Stories (2)

A Principal testbed to develop production Grid systems, of worldwide scope
  Grid Data Management Prototype (GDMP; US/EU)
  GriPhyN: 18-20 University facilities serving CMS, ATLAS, LIGO and SDSS
  Built on a strong foundation of grid security and information infrastructure
  Deploying a Grid Virtual Data Toolkit (VDT)
VRVS: Worldwide-extensible videoconferencing and shared virtual spaces
Future: Forward-looking view of Mobile Agent Coordination Architectures
  Survivable Loosely Coupled Systems with Unprecedented Scalability

