
T1-NREN

Luca dell’Agnello

CCR, 21-Ottobre-2005

The problem

• Computing for LHC experiments

– Multi tier model (MONARC)

– LHC computing based on grid

– Experiment computing models

• LHC Computing Grid (LCG)

– Project Execution Board (PEB)

– Grid Deployment Board (GDB)

• Transfer LHC data from CERN to external computing facilities

– Players: T0, T1s, NRENs, DANTE

Task forces & activities

• Group of interest on network, set up in 2004 by NRENs

– Some T1s, NRENs, CERN (D. Foster), DANTE

• Service Challenge activity (December 2004 service phase)

– Progressive tests of the data transfer chain in cooperation with the experiments

– Ramp-up to the LHC data-taking phase

• Second group (LHCOPN wg) started by GDB in January 2005

– T1s, CERN, NRENs, DANTE represented

• European NRENs, ESNET, CANARIE, ASNET

– Chair: D. Foster (CERN)

– Meetings:

• 20-21 January 2005 (SARA/NIKHEF)

• 8 April 2005 (SARA/NIKHEF)

• 19 July 2005 (CERN)

• Next: November 14 (SC05, Seattle)

LHCOPN wg activity

• Preparation and control of plans for implementation of the WAN connectivity requested by the LHC experiments’ Computing Models

• Ensure that individual agreements among T1s will provide a coherent infrastructure to satisfy the LHC experiments' Computing Models and that there is a credible solution for the management of the end-to-end network services

• First priority: to plan the networking for the Tier-0 and Tier-1 centres

– should also cover Tier-2 centres, as appropriate information on requirements becomes available

• The group reports regularly to the GDB

• Subgroups

– IP Addressing and Routing

– Monitoring

– Operations

– Security

[Figure: T0/T1/T2 Interconnectivity. The T1s (IN2P3, GridKa, TRIUMF, ASCC, Fermilab, Brookhaven, Nordic, CNAF, SARA, PIC, RAL) are connected to the T0 by dedicated 10 Gbit links; T2s and T1s are inter-connected by the general purpose research networks, and any Tier-2 may access data at any Tier-1.]

LHC “instrument”

• The LHC and its data collection systems

• The data processing and storage units at CERN, i.e. the T0

• The data processing and storage sites called T1

• The data processing and storage sites called T2

• Associated networking between all T0, T1, and T2 sites

T0-T1 10 Gbps permanent light paths form the Optical Private Network (OPN) for the LHC instrument

LHCOPN

Who does what

• CERN will

– provide interfaces for T1’s link terminations at CERN

– host T1's equipment for T0-T1 link termination at CERN (if requested)

• T1s will

– organise physical connectivity from the T1's premises to the T0

– make available network equipment for termination point of T1-T0 link at the T1 side

• light path termination will be at T1 premises (as in INFN-CNAF case) or at NREN POP

– be ready at full bandwidth not later than Q1 2006

• SCs need to test the system (from network to applications) up to full capacity in a production environment

Network architecture (1)

• At least one dedicated 10 Gbit/s light path between the T0 and each T1 (see the capacity sketch below)

– every T0-T1 link should handle only production LHC data

– T1 to T1 traffic via the T0 allowed BUT T1s encouraged to provision direct T1-T1 connectivity

– T1-T2 and T0-T2 traffic will normally be handled by the normal L3 connectivity provided by NRENs

• Backup through L3 paths across NRENs discouraged (potential heavy interference with general purpose Internet connectivity of T0 or the T1s)
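As a rough sense of scale for these dedicated links, here is a minimal back-of-the-envelope sketch in Python; the 80% efficiency factor is an illustrative assumption, not a figure from the slides:

```python
# Rough sustained capacity of one dedicated 10 Gbit/s T0-T1 light path.
# EFFICIENCY is an assumed protocol/transfer overhead factor, not from the slides.
LINK_GBPS = 10
EFFICIENCY = 0.8

usable_bytes_per_s = LINK_GBPS * EFFICIENCY / 8 * 1e9   # Gbit/s -> bytes/s
tb_per_day = usable_bytes_per_s * 86400 / 1e12          # seconds/day -> TB/day
print(f"~{tb_per_day:.0f} TB/day per link")             # ~86 TB/day
```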

[Figure: Tier0, Tier1s and Tier2s connected across L3 backbones, with main and backup connections shown.]

Network architecture (2)

• LHC prefixes

– Every T1 and the T0 must allocate publicly routable IP address space (the “LHC prefixes”) aggregated into a single CIDR block (see the aggregation sketch below)

– T1s and T2s that exchange traffic directly with the T0 must provide the T0 with the list of their LHC prefixes (see next slide)

– LHC prefixes should be dedicated to the LHC network traffic
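A minimal sketch of the aggregation requirement using Python's ipaddress module; the subnets are RFC 5737 documentation prefixes standing in for a site's real address space:

```python
import ipaddress

# Hypothetical subnets a site dedicates to LHC traffic; the requirement is
# that they collapse into one announceable CIDR block (the "LHC prefix").
subnets = [
    ipaddress.ip_network("192.0.2.0/25"),
    ipaddress.ip_network("192.0.2.128/25"),
]
aggregated = list(ipaddress.collapse_addresses(subnets))
assert len(aggregated) == 1, "LHC prefixes must aggregate into a single block"
print(aggregated[0])  # 192.0.2.0/24
```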

• Routing among the T0 and T1 sites will be achieved using eBGP (no static routes!); see the filter sketch below

– No default route must be used in T1-T0 routing

– Every T1 will announce its own LHC prefixes to T0

– T0 will accept from a T1 only its own LHC prefixes plus the prefixes of any T1 or T2 for which that T1 is willing to provide transit

– T0 will announce its LHC prefixes and all the LHC prefixes received in BGP to every peering T1

• inter-T1 traffic through the T0 is allowed but not encouraged

– T1s will accept the T0’s prefixes, plus, if desired, selected T1s’ prefixes

– T0 and T1s should announce their LHC prefixes to their upstream continental research networks (GÉANT2, Abilene, ESnet) in order to allow connectivity towards the T2s

• Every Tier must make sure that any of its own machines within the LHC prefix ranges can reach any essential service (for instance the DNS system)
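The routing rules above amount to a per-peer allow-list at the T0. A toy Python sketch of that filter follows; peer names and prefixes (RFC 5737 documentation ranges) are hypothetical:

```python
import ipaddress

# Per-peer allow-list: the T0 accepts from each T1 only that T1's own LHC
# prefix plus the prefixes of T1s/T2s it has agreed to transit.
ALLOWED = {
    "T1-A": {ipaddress.ip_network("198.51.100.0/24"),   # T1-A's own LHC prefix
             ipaddress.ip_network("203.0.113.0/24")},   # a T2 that T1-A transits
}
DEFAULT = ipaddress.ip_network("0.0.0.0/0")

def t0_accepts(peer, announced):
    net = ipaddress.ip_network(announced)
    if net == DEFAULT:
        return False                  # no default route in T1-T0 routing
    return net in ALLOWED.get(peer, set())

print(t0_accepts("T1-A", "198.51.100.0/24"))  # True: its own prefix
print(t0_accepts("T1-A", "192.0.2.0/24"))     # False: not registered for transit
print(t0_accepts("T1-A", "0.0.0.0/0"))        # False: default never accepted
```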

Network architecture: the T2s

• T2s usually upload and download data via a particular T1

• It is possible to provide transit for a T2 to the T0, if a T1 announces the T2's prefixes to the T0 and the T0 opens all the security barriers for it

– BUT this assumes a “static” allocation of a T2 to a particular T1

• It is assumed that the T1-T2 and T0-T2 traffic will be handled by the normal L3 connectivity provided by NRENs.

• The announcement of prefixes associated with Tier-2 sites is deprecated and any site has the right to ignore such announcements.

Security on LHCOPN: the basics

• Security contact person (+ deputy) needed for each site

– Note that a T1 also has security contact persons for its own NREN, for LCG/EGEE and possibly at the HEPiX level

• Incident handling and reporting

– Local site procedures for security incident handling will take precedence

– Security incident at an OPN site will be reported to the LHCOPN representatives

– The report will provide a description of the incident with an assessment of the risk posed to other OPN sites

– At a site receiving such an incident report, the nominated OPN security representative will share this information and will abide by the requirements of the local security officer

Security on LHCOPN: implementation

• It is not possible to rely on firewalls

– throughput and performance problems

• ACL-based network security acceptable

– It is not a general access network

– Low number of expected LHC prefixes

– Low number of expected (Grid) applications

• ACL general schema (discussion still ongoing); see the filtering sketch below

– Applied inbound and outbound at T0 and T1 borders

– Only traffic originating & directed to LHC prefixes allowed on LHCOPN

– IP based ACLs to control traffic from and to end points

– Transit T1-T1 through T0 for LHC prefixes allowed

– Extended ACLs should also be used where source/destination port numbers can be associated with data flows
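A Python sketch of that ACL logic; the prefixes are RFC 5737 documentation ranges and the port allow-list is illustrative (2811 is the customary GridFTP control port):

```python
import ipaddress

# Border ACL sketch: traffic crosses the LHCOPN only if BOTH endpoints sit
# inside registered LHC prefixes; an optional port check models the
# "extended ACL" case where data flows have known port numbers.
LHC_PREFIXES = [ipaddress.ip_network("192.0.2.0/24"),
                ipaddress.ip_network("198.51.100.0/24")]
ALLOWED_DST_PORTS = {2811}            # illustrative: GridFTP control channel

def in_lhc(addr):
    ip = ipaddress.ip_address(addr)
    return any(ip in prefix for prefix in LHC_PREFIXES)

def permit(src, dst, dst_port=None):
    if not (in_lhc(src) and in_lhc(dst)):
        return False                  # only LHC-prefix traffic on the OPN
    if dst_port is not None:
        return dst_port in ALLOWED_DST_PORTS
    return True

print(permit("192.0.2.10", "198.51.100.20", 2811))  # True
print(permit("192.0.2.10", "8.8.8.8"))              # False: off-OPN endpoint
```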

LHCOPN Operation

• Issues

– Circuits provided and managed by NRENs, GÉANT, …

– IP termination points at T0/T1s

• Need for some sort of coordination and monitoring of the LHCOPN (LHCOPN NCC?)

– Model still under discussion

– Proactive IP level monitoring (see the probe sketch below)

• Coordination of multi circuit problems

• Escalation to circuit provider/NREN

– Tiers contact LHCOPN NCC
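A minimal sketch of what proactive IP-level monitoring with escalation could look like; circuit names and far-end addresses are placeholders, and the actual NCC model was still under discussion:

```python
import subprocess

# Probe the far end of each circuit; a failed probe flags the circuit for
# escalation to the circuit provider/NREN. Addresses are placeholders.
CIRCUITS = {"T0-CNAF": "192.0.2.1", "T0-GridKa": "198.51.100.1"}

def circuit_up(addr, count=3, timeout_s=2):
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), addr],
        capture_output=True)
    return result.returncode == 0     # 0 -> at least one reply received

for name, addr in CIRCUITS.items():
    status = "OK" if circuit_up(addr) else "DOWN -> escalate to provider/NREN"
    print(f"{name}: {status}")
```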

Italian LHCOPN Architecture

[Figure: the CNAF T1 (router RT1.BO1) reaches the T0 at CERN via the GARR backbone (RT.RM1, RT.MI1) and the GÉANT2 PoPs in Italy and Switzerland over 10G leased λ’s, with 10GE-LAN light-path access (GFP-F, STM-64) and STM-64 IP access; eBGP peering between AS137 (GARR) and AS513 (CERN); T2 sites are also shown.]

• LAN connectivity based on 10GE technology with capacity for 10 GE link aggregation

• Data flows will terminate on a disk buffer system (possibly CASTOR, but also other SRM systems under evaluation)

• 1 network prefix for LHCOPN

• Security model will be based on L3 filters (ACLs) within the L3 equipment

• Monitoring via SNMP (presently v2); see the polling sketch below
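For the SNMP bullet, a minimal SNMPv2c polling sketch using the pysnmp library; the hostname, community string and ifIndex are placeholders, as the slides do not detail the actual setup:

```python
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

# Poll a 64-bit interface input-octet counter over SNMPv2c (mpModel=1).
# Target, community and ifIndex=1 are illustrative placeholders.
errorIndication, errorStatus, errorIndex, varBinds = next(getCmd(
    SnmpEngine(),
    CommunityData("public", mpModel=1),
    UdpTransportTarget(("router.example.org", 161)),
    ContextData(),
    ObjectType(ObjectIdentity("IF-MIB", "ifHCInOctets", 1))))

if errorIndication:
    print(errorIndication)            # e.g. request timeout
else:
    for name, value in varBinds:
        print(f"{name} = {value}")    # cumulative input octets
```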


INFN-CNAF WAN connectivity

[Figure: a Cisco 7600 at CNAF connects to the GARR Juniper router with a 10 Gbps LHCOPN link carrying the LHC prefixes (128.142.224.0/20 at CERN, 131.154.128.0/17 at CNAF), while the default route and 131.154.0.0/16 use the general purpose path towards GÉANT; 1 Gbps links serve the T2s; the disk buffer sits behind the CNAF router.]

Glossary

• LHC Network Traffic: the data and control traffic that flows between the T0, the T1s, and the T2s.

• LHC prefixes: IP address space allocated by the T0 and T1s and assigned to the machines connected to the LHC-OPN.

• Light path:

1. a point to point circuit based on WDM technology

2. a circuit-switched channel between two end points with deterministic behaviour based on TDM technology

3. concatenations of (1) and (2).

• NREN: usually National Research and Education Network; here used as the collective name for a network that serves the research community, the education community, or both.