CISL Update: Operations and Services
CISL HPC Advisory Panel Meeting
29 October 2009
Anke Kamrath
[email protected]
Operations and Services Division
Computational and Information Systems Laboratory
CHAP Meeting, 14 May 2009
Overview
• Staff Comings and Goings in OSD
• Computing, Storage, Data and Software Updates
• University Workshops
• Preparations for NWSC already underway
  – Security Review
  – User Surveys
  – Accounting/Allocation Systems
  – HPSS Migration
  – RFP Process for NWSC Procurement
Staff Comings and Goings
• Changes
  – Director of Operations and Services
    • Anke Kamrath, joined NCAR July 2009
  – Consulting Services Group (CSG)
    • Si Liu, consultant, replaced Wei Huang (moved to VETS)
  – Computer Production Group (CPG)
    • Ronald Reed, operator, replaced Ken Albertson (moved to RAL)
  – Data Analysis Systems Group (DASG)
    • Kendall Southwick, student assistant on Vapor
  – Supercomputer Systems Group (SSG)
    • Irfan Elahi, promoted to group head
    • Russell Gonsalves
    • Nate Rini
• Openings
  – CSG consulting position, SE III (Mike Page replacement)
  – DASG Vapor position, SE I
  – Co-location Manager position
Computing Update
• Strong demand for Bluefire
  – 2 large-memory Bluefire nodes redeployed into the batch pool; Bluefire now has 120 32-processor batch nodes
  – IPCC AR5 ramping up (Sep 2009 - Dec 2010)
    • 12M GAUs needed to complete
    • Will generate 1.3 PB of data
  – High utilization, low queue-wait times
• Decommissioning Lightning and Pegasus
• 3 racks added to Frost (BlueGene/L system)
  – TeraGrid resource (>20 TFLOPs)
High Utilization, Low Queue Wait Times
• Bluefire system utilization is routinely >90%
• Average queue-wait time < 1 hr

[Chart: Bluefire system utilization (daily average), Nov 2008 - Nov 2009; dips annotated for the Spring ML powerdown and the May 3 firmware & software upgrade.]
[Chart: Average queue-wait times for user jobs at the NCAR/CISL Computing Facility; number of jobs per queue-wait-time bin (<2m through >2d) for the Premium, Regular, Economy, and Standby queues.]

Average queue-wait times for all jobs (hh:mm):

             bluefire       lightning      blueice         bluevista
             since Jun '08  since Dec '04  Jan '07-Jun '08 Jan '06-Sep '08
Peak TFLOPs  >77.0          1.1            12.2            4.4
Premium      00:05          00:11          00:11           00:12
Regular      00:36          00:16          00:37           01:36
Economy      01:52          01:10          03:37           03:51
Stand-by     01:18          00:55          02:41           02:14
Archive Update
• NCAR MSS exceeded 8 PB total data stored in Sep 2009
• MSS transition to HPSS
  – Target date Jan 2011
  – Erich to present more on this
• AMSTAR data ooze (migrating data to new tape technology)
  – Optimized ooze software deployed July 2009; streams tape to tape, creating two copies in parallel
  – Oozing on average 20+ TB per day; 6 PB of data to ooze, of which 4 PB are unique
  – Estimated completion date Mar-Apr 2010, well ahead of the Dec 2010 EOL on the Powderhorn tape libraries
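As a rough cross-check of the ooze schedule above (a sketch: the 6 PB volume, 20+ TB/day rate, and July 2009 deployment come from the slide; the steady-rate assumption and decimal PB-to-TB conversion are mine):

```python
from datetime import date, timedelta

# Figures from the slide: 6 PB left to ooze at a sustained 20+ TB/day.
remaining_tb = 6 * 1000           # 6 PB expressed in TB (decimal units assumed)
rate_tb_per_day = 20              # conservative end of "20+ TB per day"
days_needed = remaining_tb / rate_tb_per_day

start = date(2009, 7, 1)          # optimized ooze software deployed July 2009
finish = start + timedelta(days=days_needed)

print(days_needed)   # 300 days at the conservative rate
print(finish)        # 2010-04-27, consistent with the Mar-Apr 2010 estimate
```

At exactly 20 TB/day the job takes 300 days and lands in late April 2010; the actual "20+" rate explains the slightly earlier Mar-Apr estimate.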
Storage Update: Data Services Redesign
• Pending management review
• Near-term goals:
  – Creation of a unified and consistent data environment for NCAR HPC
  – High-performance availability of the central filesystem from many projects/systems (RDA, ESG, CAVS, Bluefire, TeraGrid)
• Longer-term goals:
  – Filesystem cross-mounting
  – Global WAN filesystems (GPFS and/or Lustre)
• Tentative schedule:
  – Phase 1: March 2010
Data Collections Update: Major update of the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) in the RDA
• Release 2.5 covers 1662-2007
• Many data sources added over the full period of record
• NEW monthly automated update procedure adds preliminary data, now for 2008 - Sep 2009
• ICOADS is a 25-year collaborative project with NOAA
Annual distribution (1937-2007) of major platform types in Release 2.5 shown as millions of reports per year.
Software Update
• VAPOR version 1.5.0 released
  – Numerous new features aimed at the WRF community
    • Geo-referenced imagery
    • Integration with NCL
    • Moving domains
  – Block-structured AMR grids
  – Ease-of-use improvements
• New funding streams
  – NSF XD Vis Award
    • NCAR, TACC, Utah, Purdue
    • 2009-2012, $386k
    • Focus: petascale DAV
  – TeraGrid GIG Award
    • 2009-2011, $112k
    • Focus: progressive-access data models

[Figure: This Typhoon Jangmi image from MMM's Bill Kuo and Wei Wang illustrates several new capabilities. A moving, nested domain is tracked by VAPOR. Satellite imagery and plots generated by NCL are correctly positioned in the 4D scene with geo-referencing.]
Support for Summer ‘09 Workshops
• Climate Modeling Primer, July 27-31
  – Organized by CGD (Gettelman lead) and held at NCAR
  – 41 participants, mostly graduate students
  – Ran low-resolution versions of CAM; some work with CCSM
  – Provided 10-15 dedicated nodes and an overview of using Bluefire to participants
Support for Summer ‘09 Workshops
• Scaling to Petascale, August 3-7
  – Organized by the Great Lakes Consortium for Petascale Computation
  – Supported by the NSF through the Blue Waters Project (NCSA and others)
  – Participants joined in from four locations using high-definition streaming video
  – Ran a variety of applications
  – Provided 16 dedicated nodes during hands-on sessions
Support for Summer ‘09 Workshops (continued)
• Ice Sheet Modeling Workshop, August 3-14
  – Organized by Christina Hulbe (Portland State University), Jesse Johnson (U of Montana), and Cornelis van der Veen (U of Kansas); supported by NSF
  – Brought current and future ice-sheet scientists together to develop better models for the projection of future sea-level rise
  – Provided accounts for each participant and a dedicated share queue on Bluefire, since participants were running single-processor codes
  – Some participants will continue to use Bluefire based on Johnson's NSF award for this workshop
http://websrv.cs.umt.edu/isis/index.php/Summer_Modeling_School
Preparations for NWSC Already Underway
– Storage environment
  • HPSS migration
  • Data Services Redesign (DSR)
  • Data workflow requirements analysis
– Security review
  • Balancing “science and security”
  • Phase 1: underway for DSR
  • Phase 2: global filesystems
– Accounting/allocation systems
  • Current system built in the ’70s
  • Reviewing requirements
  • Leveraging other products (e.g., Gold, AmieGold)
– RFP process for NWSC procurement
  • Tom Engel to discuss further
– User survey
User Survey Plans
• Q1 2010 survey of active users
  – Use to baseline the current environment for NWSC
  – Understand where we’re doing well and where we can improve
    • Investigate any area with >10% dissatisfaction
  – Evaluate future needs
  – Topics
    • Ease of access within the current security environment
    • Ease of use (initially and ongoing)
    • Consulting (including methodology of service delivery)
    • Documentation for users
    • Training desired and method of delivery
• Follow-up survey ~4 months after the NWSC computer is available to users
On the Road to a PetaFlop Computer…
• And a 6-15 petabyte filesystem
• And a 100+ petabyte archive
[Chart: Peak TFLOPs at NCAR (all systems), Jan 2000 - Jan 2010. Current NCAR computing exceeds 100 TFLOPs. Systems shown: IBM POWER3 (blackforest, babyblue); SGI Origin3800/128; IBM POWER4 (bluedawn); IBM POWER4/Federation (thunder); IBM POWER4/Colony (bluesky); IBM Opteron/Linux (lightning, pegasus); IBM BlueGene/L (frost); IBM POWER5/p575/HPS (bluevista); IBM POWER5+/p575/HPS (blueice); IBM POWER6/Power575/IB (bluefire, firefly). Procurement phases marked: ARCS Phases 1-4, ICESS Phases 1-2.]
NWSC HPC Projection
[Chart: Peak PFLOPs at NCAR, Jan 2004 - Jan 2014, projecting the NWSC HPC system (with an uncertainty band) toward roughly 1-2 PFLOPs, alongside bluefire, blueice, bluevista, frost, pegasus, lightning, and bluesky; ARCS Phase 4 and ICESS Phases 1-2 marked.]
NCAR Data Archive Projection
[Chart: Total data in the NCAR archive (actual and projected), Jan 1999 - Jan 2015, showing total and unique data in petabytes, with the projection exceeding 100 PB.]
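Combined with the 8 PB total reported on the Archive Update slide for Sep 2009, the 100+ PB projection implies archive growth of roughly 60% per year (a back-of-the-envelope sketch: the 8 PB and 100 PB figures come from the slides, while the Jan 2015 endpoint and steady exponential growth are my assumptions):

```python
# Implied compound annual growth of the NCAR archive, assuming steady
# exponential growth from ~8 PB (Sep 2009) to ~100 PB (assumed Jan 2015).
start_pb, target_pb = 8.0, 100.0
years = 5.33  # Sep 2009 to Jan 2015, approximately

annual_factor = (target_pb / start_pb) ** (1 / years)
print(f"implied annual growth: {annual_factor:.2f}x")  # ~1.6x per year
```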