
Page 1: Site Lightning Report: MWT2

Site Lightning Report: MWT2

Mark Neubauer
University of Illinois at Urbana-Champaign

US ATLAS Facilities Meeting @ UC Santa Cruz
Nov 14, 2012

Page 2: Site Lightning Report: MWT2

Midwest Tier-2


A three-site Tier-2 consortium


The Team: Rob Gardner, Dave Lesny, Mark Neubauer, Sarah Williams, Ilija Vukotic, Lincoln Bryant, Fred Luehring

Page 3: Site Lightning Report: MWT2

Midwest Tier-2


Focus of this talk: Illinois Tier-2

Page 4: Site Lightning Report: MWT2

Tier-2 @ Illinois


History of the project:
– Fall 2007 onward: Development/operation of a T3gs
– 08/26/10: Torre’s US ATLAS IB talk
– 10/26/10: Tier2@Illinois proposal submitted to US ATLAS Computing Mgmt
– 11/23/10: Proposal formally accepted
– 10/5/11: First successful test of ATLAS production jobs run on the Campus Cluster (CC)
  • Jobs read data from our Tier3gs cluster

Page 5: Site Lightning Report: MWT2

Tier-2 @ Illinois


History of the project (cont.):
– 03/1/12: Successful T2@Illinois pilot
  • Squid proxy cache, Condor head node, job flocking from UC
– 4/4/12: First hardware into the Taub cluster
  • 16 compute nodes (dual X5650, 48 GB memory, 160 GB disk, IB): 192 cores
  • 60 2 TB drives in a DDN array: 120 TB raw
– 4/17/12: perfSONAR nodes online

Page 6: Site Lightning Report: MWT2

Illinois Tier-2


T2onTaub

History of the project (cont.):
– 4/18/12: T2@Illinois in production

Page 7: Site Lightning Report: MWT2

Illinois Tier-2


Stable operation: Last two weeks

Page 8: Site Lightning Report: MWT2

Illinois Tier-2


Last day on MWT2:

Page 9: Site Lightning Report: MWT2

Why at Illinois?


• National Center for Supercomputing Applications (NCSA)

• National Petascale Computing Facility (NPCF): Blue Waters

• Advanced Computation Building (ACB)
  – 7000 sq. ft. with 70” raised floor
  – 2.3 MW of power capacity
  – 250 kW UPS
  – 750 tons of cooling capacity

• Experience in HEP computing

[Photos: NCSA Building, ACB, NPCF]

Page 10: Site Lightning Report: MWT2

Tier-2 @ Illinois


• Deployed in a shared campus cluster (CC) in ACB
  – “Taub” is the first instance of the CC
  – Tier2@Illinois on Taub is in production within MWT2
• Pros (ATLAS perspective)
  – Free building, power, cooling, and core infrastructure support, with plenty of room for future expansion
  – Pool of expertise, heterogeneous HW
  – Bulk pricing important given DDD (Dell Deal Demise)
  – Opportunistic resources
• Challenges
  – Constraints on hardware, pricing, architecture, timing

Page 11: Site Lightning Report: MWT2

Tier-2 @ Illinois


Current CPU and disk resources:
• 16 compute nodes (taubXXX)
  – dual X5650, 48 GB memory, 160 GB disk, IB: 192 cores, ~400 job slots
• 60 2 TB drives in a Data Direct Networks (DDN) array: 120 TB raw, ~70 TB usable

Page 12: Site Lightning Report: MWT2

Tier-2 @ Illinois


• Utility nodes / services (.campuscluster.illinois.edu):
  – Gatekeeper (mwt2-gt)
    • Primary schedd for the Taub Condor pool
      – Flocks other jobs to the UC and IU Condor pools (see the sketch below)
  – Condor head node (mwt2-condor)
    • Collector and Negotiator for the Taub Condor pool
      – Accepts flocked jobs from other MWT2 gatekeepers
  – Squid (mwt2-squid)
    • Proxy cache for CVMFS and Frontier for Taub (backup for IU/UC)
  – CVMFS replica server (mwt2-cvmfs)
    • CVMFS replica of the master CVMFS server
  – dCache s-node (mwt2-s1)
    • Pool node for GPFS data storage (installed; dCache in progress)
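
To make the flocking layout above concrete, the sketch below asks the Taub pool’s Collector which schedds it currently knows about (the local mwt2-gt plus any remote MWT2 gatekeepers flocking jobs in) and how many running jobs each one has. This is a minimal sketch and not part of the original slides; it assumes the htcondor Python bindings are available and that the head node’s fully qualified name is mwt2-condor.campuscluster.illinois.edu.

```python
import htcondor

# Collector/Negotiator live on the Condor head node (mwt2-condor); the fully
# qualified host name is an assumption based on the domain given above.
COLLECTOR_HOST = "mwt2-condor.campuscluster.illinois.edu"

coll = htcondor.Collector(COLLECTOR_HOST)

# Every schedd advertising to this pool appears here: the local gatekeeper
# (mwt2-gt) plus any remote MWT2 gatekeepers whose jobs flock into Taub.
for ad in coll.locateAll(htcondor.DaemonTypes.Schedd):
    schedd = htcondor.Schedd(ad)
    running = schedd.query(
        constraint="JobStatus == 2",              # 2 == running
        projection=["ClusterId", "ProcId", "Owner"],
    )
    print(f"{ad['Name']}: {len(running)} running jobs")
```

Flocked-in schedds appear because a gatekeeper’s FLOCK_TO setting makes it advertise to the remote pool’s Collector, which mirrors the relationship described above between mwt2-gt and the UC/IU pools, and between the other MWT2 gatekeepers and mwt2-condor.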

Page 13: Site Lightning Report: MWT2

Next CC Instance (to be named) Overview


• Mix of Ethernet-only and Ethernet + InfiniBand connected nodes
  – assume 50-100% will be IB-enabled
• Mix of CPU-only and CPU+GPU nodes
  – assume up to 25% of nodes will have GPUs
• New storage device and support nodes
  – added to the shared storage environment
  – allow for other protocols (SAMBA, NFS, GridFTP, GPFS)
• VM hosting and related services
  – persistent services and other needs directly related to use of compute/storage resources

Page 14: Site Lightning Report: MWT2

Next CC Instance (basic configuration)


• Dell PowerEdge C8220, 2-socket Intel Xeon E5-2670
  – 8-core Sandy Bridge processors @ 2.60 GHz
  – 1 “sled”: 2 SB processors
  – 8 sleds in 4U: 128 cores
• Memory configuration options:
  – 2 GB/core, 4 GB/core, 8 GB/core
• Options:
  – InfiniBand FDR (GigE otherwise)
  – NVIDIA M2090 (Fermi GPU) accelerators
  – Storage via DDN SFA12000; can add in 30 TB (raw) increments

[Photo: Dell C8220 compute sled]

Page 15: Site Lightning Report: MWT2

Summary and Plans


• New Tier-2 @ Illinois
  – Modest (currently) resource integrated into MWT2 and in production use
  – Cautious optimism: deploying a Tier-2 within a shared campus cluster is a success
• Near-term plans
  – Buy into the 2nd campus cluster instance
    • $160k of FY12 funds with a 60/40 CPU/disk split
  – Continue dCache deployment
  – LHCONE @ Illinois due to turn on 11/20/12
  – Virtualization of Tier-2 utility services
  – Better integration into MWT2 monitoring