HENP, Grids and the Networks They Depend Upon Shawn McKee ([email protected]) March 2004 National Internet2 Day

HENP, Grids and the Networks They Depend Upon

Shawn McKee ([email protected])

March 2004

National Internet2 Day

March 18, 2004 Internet2 Day - Shawn McKee - University of Michigan Physics 2

Outline

• HENP: Why do physicists care about the network?

• GRIDs and networks in HENP

• Doing physics at the LHC

• Future and Conclusions

Physics and Networks

So, why do physicists care about networks?

• I will try to explain how physics will be done at the LHC and what that implies for network needs

• Networks like Internet2 are critical for globally distributed, data-intensive e-Science collaborations such as physics at the LHC

• Details to follow…

Four LHC Experiments: The Petabyte to Exabyte Challenge

ATLAS, CMS, ALICE, LHCB

Higgs + New particles; Quark-Gluon Plasma; CP Violation

Data stores ~40 Petabytes/Year and UP; CPU 0.3 Petaflops and UP

0.1 to 1.0 Exabytes (1 EB = 10^18 Bytes) (2007) (~2012 ?) for the LHC Experiments
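
To relate these storage figures to network needs, here is a rough back-of-the-envelope sketch that converts a yearly data volume into the average sustained rate needed to move it once. The function name and the uniform 365-day year are illustrative assumptions, not part of the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: uniform year, no downtime


def average_rate_gbit_per_s(petabytes_per_year: float) -> float:
    """Average rate (Gbit/s) needed to move a yearly volume exactly once."""
    bits_per_year = petabytes_per_year * 1e15 * 8  # 1 PB = 10^15 bytes
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    rate = average_rate_gbit_per_s(40)
    print(f"40 PB/year is roughly {rate:.1f} Gbit/s sustained")
```

Under these assumptions, 40 Petabytes/Year corresponds to about 10 Gbit/s averaged over the whole year, before allowing for replication, reprocessing or peak loads.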

How Much Data is Involved?

[Figure: Level 1 rate (Hz) versus event size (bytes) for HEP experiments: LHCB, KLOE, HERA-B, TeV II, CDF/D0, H1, ZEUS, UA1, LEP, NA49, ALICE, ATLAS, CMS. Annotated regions: High Level-1 Trigger (1 MHz); High No. Channels, High Bandwidth (500 Gbit/s); High Data Archive (PetaByte). Source: Hans Hoffman, DOE/NSF Review, Nov 00]

The Problem

The Solution

What is “The Grid”?

• There are many answers and interpretations

• The term was coined in the mid-1990s (in analogy with the power grid) and can be described thus: “The grid provides flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources (virtual organizations: VOs)”

Grid Perspectives

• User's Viewpoint:
  – A virtual computer which minimizes time to completion for my application while transparently managing access to inputs and resources

• Programmer's Viewpoint:
  – A toolkit of applications and APIs which provide transparent access to distributed resources

• Administrator's Viewpoint:
  – An environment to monitor, manage and secure access to geographically distributed computers, storage and networks.

Network Exponentials

• Network vs. computer performance
  – Computer speed doubles every 18 months
  – Network speed doubles every 9 months
  – Difference = order of magnitude per 5 years

• 1986 to 2000
  – Computers: x 500
  – Networks: x 340,000

• 2001 to 2010
  – Computers: x 60
  – Networks: x 4000
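
The doubling times quoted above can be turned into growth factors with a one-line formula. The sketch below is illustrative only: the function names are invented here, and it assumes smooth exponential growth at exactly the stated doubling periods.

```python
COMPUTER_DOUBLING_MONTHS = 18  # from the slide
NETWORK_DOUBLING_MONTHS = 9    # from the slide


def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth after `months` at a fixed doubling period."""
    return 2.0 ** (months / doubling_months)


def network_advantage(months: float) -> float:
    """How much faster networks grow than computers over `months`."""
    return (growth_factor(months, NETWORK_DOUBLING_MONTHS)
            / growth_factor(months, COMPUTER_DOUBLING_MONTHS))


if __name__ == "__main__":
    for label, months in [("5 years", 60), ("1986 to 2000", 168), ("2001 to 2010", 108)]:
        computers = growth_factor(months, COMPUTER_DOUBLING_MONTHS)
        networks = growth_factor(months, NETWORK_DOUBLING_MONTHS)
        print(f"{label}: computers x{computers:,.0f}, networks x{networks:,.0f}, "
              f"ratio x{network_advantage(months):,.1f}")
```

Over 5 years the ratio comes out at about 10, the "order of magnitude" on the slide. For 2001 to 2010 (taken as 9 years) the formula gives x 64 and x 4096, close to the quoted x 60 and x 4000. The 1986 to 2000 figures are historical and match the idealized formula only to within the same order of magnitude.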

The Network

• As the previous transparency shows, it can be argued that the evolution of the network has been the primary motivator for the Grid.

• Ubiquitous, dependable worldwide networks have opened up the possibility of tying together geographically distributed resources

• The success of the WWW for sharing information has spawned a push for a system to share resources

• The network has become the “virtual bus” of a virtual computer.

Doing Physics at the LHC

ATLAS as an example

ATLAS

• A Toroidal LHC Apparatus

• Collaboration
  – 150 institutes
  – 1850 physicists

• Detector
  – Inner tracker
  – Calorimeter
  – Magnet
  – Muon

• United States ATLAS
  – 29 universities, 3 national labs
  – 20% of ATLAS

ATLAS

[Figure: Higgs decay signatures in the ATLAS detector, e.g. H → ZZ* with electrons in the final state]

Discovery Potential for SM Higgs Boson

[Figure: signal significance (S/√B) versus Higgs mass]

• Good sensitivity over the full mass range from ~100 GeV to ~1 TeV

• For most of the mass range at least two channels available

• Detector performance is crucial: b-tag, leptons, γ, E resolution, γ/jet separation, ...

HEP Data Analysis

• Raw data
  – hits, pulse heights

• Reconstructed data (ESD)
  – tracks, clusters…

• Analysis Objects (AOD)
  – Physics Objects
  – Summarized
  – Organized by physics topic

• Ntuples, histograms, statistical data

Data Flow from ATLAS

• Detector output: 40 MHz (~PB/sec)

• level 1 - special hardware: 75 kHz (75 GB/sec)

• level 2 - embedded processors: 5 kHz (5 GB/sec)

• level 3 - PCs: 100 Hz (200-400 MB/sec)

• data recording & offline analysis

ATLAS: 10 PB/y ~ one million PC hard drives!
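
The rates on this slide imply large rejection factors at each trigger level. The sketch below works them out, reading each rate and bandwidth pair as the output of the corresponding trigger level (an interpretation of the slide's diagram). It also assumes decimal units (1 GB = 10^9 bytes) and takes the midpoint of the 200-400 MB/sec range. All names are illustrative.

```python
COLLISION_RATE_HZ = 40_000_000  # 40 MHz input to level 1

# (stage, output event rate in Hz, output bandwidth in bytes/s)
# Assumption: 300e6 is the midpoint of the quoted 200-400 MB/sec range.
TRIGGER_CHAIN = [
    ("level 1", 75_000, 75e9),
    ("level 2", 5_000, 5e9),
    ("level 3", 100, 300e6),
]


def rejection_factors(input_rate_hz, chain):
    """Return {stage: input rate / output rate} for each trigger stage."""
    factors = {}
    rate_in = input_rate_hz
    for stage, rate_out, _bandwidth in chain:
        factors[stage] = rate_in / rate_out
        rate_in = rate_out  # the next stage sees this stage's output
    return factors


def implied_event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Average event size implied by a rate and a bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


if __name__ == "__main__":
    for stage, factor in rejection_factors(COLLISION_RATE_HZ, TRIGGER_CHAIN).items():
        print(f"{stage}: keeps 1 event in {factor:,.0f}")
    for stage, rate, bandwidth in TRIGGER_CHAIN:
        size_mb = implied_event_size_bytes(rate, bandwidth) / 1e6
        print(f"{stage}: about {size_mb:.0f} MB per event")
```

Under this reading, the three levels together keep about one event in 400,000, and the implied event size is on the order of a few megabytes.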

HENP Grid/Network Projects

• Grid Physics Network (GriPhyN)
  – Enabling R&D for advanced data grid systems, focusing in particular on the Virtual Data concept

• iVDGL: A Global Grid Laboratory
  – A global grid laboratory to conduct grid tests “at scale”

• There are numerous other projects focused on various aspects of grids and networks in support of HENP physics…

UltraLight: Exploring Future Networks for e-Science

• UltraLight is a program to explore the integration of cutting-edge network technology with the grid computing and data infrastructure of HEP/Astronomy

• The program intends to explore network configurations ranging from common shared infrastructure (current IP networks) through dedicated point-to-point optical paths.

• A critical aspect of UltraLight is its integration with two driving application domains in support of their national and international eScience collaborations: LHC-HEP and eVLBI-Astronomy

• The Collaboration includes:
  – Caltech
  – Florida Int. Univ.
  – MIT
  – Univ. of Florida
  – Univ. of Michigan
  – UC Riverside
  – BNL
  – FNAL
  – SLAC
  – UCAID/Internet2

The Move to OGSA and then Managed Integration Systems

[Figure: increased functionality and standardization versus time, in stages: Custom solutions; Globus Toolkit (de facto standards; GGF: GridFTP, GSI; built on X.509, LDAP, FTP, …); Open Grid Services Architecture (GGF: OGSI, … (+ OASIS, W3C); multiple implementations, including Globus Toolkit; built on Web services + …); Web Services Resource Framework (stateful; managed); app-specific services; ~integrated systems]

Managing Global Systems: Dynamic Scalable Services Architecture

MonALISA: http://monalisa.cacr.caltech.edu

Grid Analysis Environment

Analysis Clients talk standard protocols to a simple API

The secure Clarens portal hides the complexity

Key features: Global Scheduler, Catalogs, Monitoring, and Grid-wide Execution service

The network underlies and enables this model

[Figure: CLARENS: Web Services Architecture. Analysis Clients connect over HTTP, SOAP, XML/RPC to a Grid Services Web Server. Components shown: Scheduler; Catalogs (Metadata, Virtual Data, Replica); Fully-Abstract Planner; Partially-Abstract Planner; Fully-Concrete Planner; Data Management; Monitoring; Applications; Execution Priority Manager; Grid Wide Execution Service]
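
To illustrate what "standard protocols to a simple API" can look like, here is a minimal XML-RPC sketch using only the Python standard library. It is not the Clarens API: the method name `catalog.lookup`, the dataset name and the site names are all hypothetical. To stay self-contained it hands the encoded request straight to the dispatcher instead of opening a socket; a real service would sit behind an authenticated HTTP server.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

# Hypothetical in-memory replica catalog; all names are invented.
REPLICAS = {"higgs-candidates": ["site-a.example.org", "site-b.example.org"]}


def lookup_replicas(dataset):
    """Return the sites holding a dataset (empty list if unknown)."""
    return REPLICAS.get(dataset, [])


def build_dispatcher():
    """Server side: expose the catalog under a dotted method name."""
    dispatcher = SimpleXMLRPCDispatcher()
    dispatcher.register_function(lookup_replicas, "catalog.lookup")
    return dispatcher


def call(dispatcher, method, *params):
    """Client side: encode an XML-RPC request and decode the response."""
    request = xmlrpc.client.dumps(params, methodname=method).encode("utf-8")
    # Internal stdlib helper, used here only to avoid opening a socket.
    response = dispatcher._marshaled_dispatch(request)
    (result,), _method = xmlrpc.client.loads(response)
    return result


if __name__ == "__main__":
    service = build_dispatcher()
    print(call(service, "catalog.lookup", "higgs-candidates"))
```

The client only needs a method name and plain values. Which catalog, scheduler or storage system answers the call stays hidden behind the web service, which is the point the slide makes about the portal hiding complexity.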

Conclusions

• Networks form the critical basis for the future of e-Science

• LHC Physics will depend heavily on globally distributed resources => the NETWORK is critical!

• Future requirements for grids and networking in support of HENP physics are an open question; investigation is needed to define, develop and deploy the required infrastructure in a timely manner.

For More Information…

• HENP Internet2 SIG
  – henp.internet2.edu

• Global Grid Forum
  – www.ggf.org

• International Virtual Data Grid Laboratory
  – www.ivdgl.org

• Grid Physics Network
  – www.griphyn.org

• UltraLight: ultralight.caltech.edu

Questions?