Information Technology
IT Briefing July 19, 2007
  Core Website Redesign: John Mills
  IT Sourcing: Huron Consulting
  High Performance Computing Cluster: Keven Haynes
  NetCom Updates: Paul Petersen
  CTS Updates: Karen Jenkins
IT Hardware Initiative Discussion
  Kevin McClean, Huron
  John Scarbrough, Emory
  David Wright, Emory
Discussion Outline
  Background & Objectives
  Project Scope
  Next Steps
  Service Expectations & Concerns
  Questions
Background & Objectives
  Background:
    Emory-wide initiative
    $10M in annual spend reviewed
    Scope: PCs, “small” departmental servers, printers, peripherals, software
    Completed initial data analysis; identified opportunity
    Other
  Objectives:
    Maintain or improve product quality and service levels
    Cost savings
    Leverage Emory-wide IT spend
    Evaluate current contract (expires 1/08)
    Evaluate IT hardware suppliers / industry
    Evaluate PC market & potential options
    Assess potential for further IT consolidation
Project Scope - Category Spend
($'s in 000's)
Category              Est Annual Quantity   Est Annual Spend   % of Total
Desktops                            3,532             $3,892          38%
Notebooks                           1,630             $2,553          25%
Peripherals/Printers               12,642             $1,785          18%
Software                                                $800           8%
Servers                                83               $640           6%
Other                                                   $455           4%
Total                                                $10,125         100%
Source: Based on A/P & P-Card spend for University, Hospital, (April 06 – March 07) and Clinic (FY06), Supplier reporting
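The percentage column can be re-derived from the spend column. The sketch below is a minimal check, assuming the dollar figures in the table are exact to the nearest thousand; the names `SPEND_K` and `spend_shares` are illustrative and not part of the briefing.

```python
# Category spend in thousands of dollars, copied from the table above.
SPEND_K = {
    "Desktops": 3892,
    "Notebooks": 2553,
    "Peripherals/Printers": 1785,
    "Software": 800,
    "Servers": 640,
    "Other": 455,
}


def spend_shares(spend):
    """Return (total, {category: share rounded to a whole percent})."""
    total = sum(spend.values())
    shares = {name: round(100 * value / total) for name, value in spend.items()}
    return total, shares


total_k, shares = spend_shares(SPEND_K)
print(f"Total: ${total_k:,}K")
for name, pct in shares.items():
    print(f"{name:<22}{pct:>4}%")
```

The rounded shares add up to 99 rather than 100, which is ordinary rounding drift and matches the per-category figures on the slide.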
Next Steps
  Finalize supplier strategy / Determine suppliers to engage
  Send introduction letter with core requirement to select suppliers to solicit proposals - 7/20
  Responses due: 8/03
  Analyze initial supplier proposals
  Conduct supplier meetings to discuss proposals - Week of 8/13
  Determine need for additional supplier proposals and meetings
  Finalize new agreement - 9/15
Service Expectations & Concerns
  All bundles must meet minimum recommendations set by DeskNet
  Dedicated technical account manager / support engineer
  On-site/local spares
  Web based ability to order parts / Next day delivery
  Escalated entry into support organization
  Option to expedite delivery (for set fee)
  MAC addresses emailed to requester on ship
  Load pre-defined image on system
  Option to change boot order (PXE boot)
  Quarterly review of product roadmap
  Evaluation of systems required prior to changing any bundle agreement
  Consolidated packaging of system
What does High Performance Computing (HPC) mean?
  Computing used for scientific research
  A.k.a. “Supercomputing”
  Highly calculation-intensive tasks (e.g., weather forecasting, molecular modeling, string matching)
What is an HPC cluster?
  A (large?) collection of computers, connected via high speed network or fabric
  Sometimes acts/viewed as one computer
  Sometimes share common storage
  Sometimes run identical instances of the same operating system
  Definition of cluster is fluid
What is an HPC cluster, again?
  Uses multiple CPUs to distribute computational load, aggregate I/O.
  Computation runs in parallel.
  Not necessarily designed for fail-over, High Availability (HA) or load-balancing
  Different from a Grid
  Work managed via a “job scheduler”
Our new cluster (overview):
  256 dual-core, dual-socket AMD Opteron-based compute nodes - 1024 cores total
  8 GB RAM/node, 2 GB RAM/core
  250 GB local storage per node
  ~8 TB global storage (parallel file system)
  Gigabit Ethernet, with separate management network
  11 additional servers
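The headline numbers on this slide follow from the per-node figures. A minimal sketch of the arithmetic, using only values stated above (the constant names are illustrative):

```python
# Per-node figures taken from the cluster overview slide.
NODES = 256
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 2
RAM_GB_PER_NODE = 8
LOCAL_DISK_GB_PER_NODE = 250

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = NODES * cores_per_node
ram_gb_per_core = RAM_GB_PER_NODE / cores_per_node
total_local_disk_gb = NODES * LOCAL_DISK_GB_PER_NODE

print(f"{cores_per_node} cores/node, {total_cores} cores total")
print(f"{ram_gb_per_core:g} GB RAM per core")
print(f"{total_local_disk_gb:,} GB of local disk across all nodes")
```

The local disk total is node-local scratch space and is separate from the ~8 TB global parallel file system.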
Our cluster: Compute Nodes
  256 Sun x2200s
  AMD Opteron 2218 processors
  CentOS 4 Linux (whitebox Red Hat)
  8 GB DDR2 RAM, except “Fat” Nodes with 32 GB RAM, local 250 GB SATA drive
  Single gigabit data connection to switch
  Global filesystem (IBRIX) mounted
Our cluster: Networking
  Separate Data and Management networks
  Data Network: Foundry BI-RX 16
  Management network: 9 Foundry stackables
  MRV console switches
  Why ethernet? Open, supported, easy, cheap.
Our cluster: Cluster-wide Storage
  Global, parallel file system: IBRIX
  Sun StorEdge 6140, five trays of 16 15K rpm FC drives, connected via 4 Gb fibre connections.
  Five Sun x4100 file-system servers: one IBRIX Fusion Mgr, four Segment servers w/four bonded ethernet connections.
The IBRIX file system
  Looks like an ext3 file system, because it is (not NFS 4) - Segmenting ext3.
  Scales (horizontally) to thousands of servers, hundreds of petabytes
  Efficient with both small and large I/O
  Partial online operation, dynamic load balancing
  Will run on any hardware (Linux only)
The Scheduler: Sun Grid Engine
  Users submit work to cluster via SGE (‘qsub’ command) and ssh
  SGE can manage up to 200,000 job submissions
  Distributed Resource Management (DRM)
  Policy-based resource allocation algorithms (queues)
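As an illustration of what a `qsub` submission can look like, the sketch below generates the text of an SGE array-job script. The `#$` directives (`-N`, `-S`, `-cwd`, `-j`, `-t`) and the `SGE_TASK_ID` variable are standard Sun Grid Engine features; the job name, the `./run_one.sh` command, and the task count are hypothetical and do not describe this cluster's actual queue setup.

```python
def build_sge_script(job_name, command, tasks):
    """Return the text of an SGE array-job script with `tasks` tasks."""
    if tasks < 1:
        raise ValueError("tasks must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",   # job name shown by qstat
        "#$ -S /bin/bash",     # shell used to run the job
        "#$ -cwd",             # run from the submission directory
        "#$ -j y",             # merge stderr into stdout
        f"#$ -t 1-{tasks}",    # array job: one task per index
        # SGE sets SGE_TASK_ID to the index of the running task.
        f"{command} $SGE_TASK_ID",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: 100 independent runs of ./run_one.sh.
script = build_sge_script("sim", "./run_one.sh", 100)
print(script)
```

Saved to a file such as `sim.sh` (a hypothetical name), the script would be submitted with `qsub sim.sh`.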
Cluster-based Work
  Cluster designed as “beowulf-style”, for high-throughput “serial/batch” processing.
  “Embarrassingly Parallel” jobs best
  MPI-based parallel processing possible, but difficult due to multiple-core architecture
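"Embarrassingly parallel" work splits into tasks that share nothing, so each task can run as an ordinary serial job. The sketch below shows one way to divide a list of inputs among such tasks. The file names are hypothetical, and the 1-based task index is an assumption modeled on how batch schedulers commonly number array tasks.

```python
def chunk_for_task(items, task_id, num_tasks):
    """Return the slice of `items` handled by the 1-based `task_id`."""
    if not 1 <= task_id <= num_tasks:
        raise ValueError("task_id must be in 1..num_tasks")
    base, extra = divmod(len(items), num_tasks)
    # The first `extra` tasks take one additional item each.
    start = (task_id - 1) * base + min(task_id - 1, extra)
    size = base + (1 if task_id <= extra else 0)
    return items[start:start + size]


# Hypothetical input files split across four independent tasks.
inputs = [f"sample_{i:03d}.dat" for i in range(10)]
chunks = [chunk_for_task(inputs, task, 4) for task in range(1, 5)]
for task, chunk in enumerate(chunks, start=1):
    print(task, chunk)
```

Because no task reads another task's chunk, the tasks need no communication, which is what makes this style a better fit for the cluster than MPI-based jobs.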
Applications
  MATLAB
  Geant4
  Genesis (Neuroscience)
  Soon: iNquiry (BioInformatics)
  GCC compilers (soon: PGI compilers)
  More…
Performance
  Estimated ~3 Teraflops at 80% efficiency (theoretical)
  Achieved 2 GB/sec writes over the network
  10 minutes of cluster operation = ~7 days on a fast desktop
  8.5 hours -> entire year of 24-hour days
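The desktop comparisons can be reproduced from the core count. The sketch below assumes that a fast desktop is equivalent to a single cluster core and that work scales perfectly across all 1024 cores; both are simplifications made for illustration, not measurements.

```python
TOTAL_CORES = 1024  # from the cluster overview


def desktop_days(cluster_hours, cores=TOTAL_CORES):
    """Days a single-core desktop would need for the same core-hours."""
    return cluster_hours * cores / 24


ten_minutes = desktop_days(10 / 60)
full_shift = desktop_days(8.5)
print(f"10 minutes of cluster time ~ {ten_minutes:.1f} desktop days")
print(f"8.5 hours of cluster time ~ {full_shift:.0f} desktop days")
```

Under those assumptions the results are about 7.1 days and about 363 days, in line with the "~7 days" and "entire year" figures on the slide.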
Project Status
  Cluster went “live” July 1st
  We are converting over billing arrangements: Annual -> $/CPU hour
  Software installation, hardware replacement, developing processes
  Much testing…
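Under a $/CPU-hour model, a job's charge is its core count times its wall-clock time times the rate. The sketch below shows that arithmetic only; the rate and the example job are hypothetical placeholders, since the briefing does not state a price.

```python
def cpu_hours(cores, wallclock_hours):
    """CPU hours consumed by a job that holds `cores` for `wallclock_hours`."""
    return cores * wallclock_hours


def job_charge(cores, wallclock_hours, rate_per_cpu_hour):
    """Charge in dollars, rounded to cents, at a given $/CPU-hour rate."""
    return round(cpu_hours(cores, wallclock_hours) * rate_per_cpu_hour, 2)


# Hypothetical job: 64 cores for 2.5 hours at a made-up rate of $0.10.
example_hours = cpu_hours(64, 2.5)
example_charge = job_charge(64, 2.5, 0.10)
print(f"{example_hours:g} CPU hours -> ${example_charge:.2f}")
```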
Contact Info
  ELLISPE is managed by the HPC Group:
    Keven Haynes, [email protected]
    Michael Smith, [email protected]
    Ken Guyton, [email protected]
  Website soon…
Agenda
  Single Voice Platform
    Phase I Complete
    Phase II Starting
  Backbone and Firewall
    Firewall Status
    Multicasting
    Border Changes
  Wireless
    NATing
    iPhones
Single Voice Platform
  Single Voice Platform: name given to the project that consolidates Emory’s three phone switches into one
  This project also sets Emory’s direction for VoIP/IP Telephony
  Project began March 2006 with a formal RFQ process
  Avaya was selected
Single Voice Platform
  Phase 1 – Consolidate TEC & ECLH Switches
    Upgrade to the latest Avaya switch
    Upgrade to IP Connect (provides redundancy)
    Consolidate the TEC & ECLH switch databases
    Phase I completed on May 18th
  Phase 2 – Convert the rest of EHC to SVP
    Transition Nortel phones in EHC (EUH & WHSCAB) to Avaya
    Approved and completely funded
  Phase 3 – Convert remainder of Nortel phones to new Platform
Firewall and Backbone
  Firewall
    ResNet Firewall – October 2006
    HIPAA Firewall – March 2007
    Academic Firewall – April 2007
    Admin Core/DMZ Firewall – Attempted May 6th
  5.4.eo5 Code
    Premature Session Timeouts
    Layer2 Pointer Crash (lab only)
    ASIC Optimizations
    Software Policy Lookups Crash (lab only)
    SLU engine/ASIC Chip resets
  Academic/ResNet Cluster Upgraded – July 12th
  HIPAA Cluster Upgraded – July 19th
Multicasting
  Multicasting with Virtual Routing
    Supported in version 3.5 of router code
    NetCom has been testing Beta version for a month
    Also provides Hitless Upgrades
    Successfully imaged two workstations using Ghost and multicasting across two router hops with the College
  Official version of 3.5 to be released this week
  Tentatively scheduled to upgrade router core on August 1st.
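For multicast traffic to cross routers, the sender's IP TTL has to be large enough, because each router decrements the TTL and does not forward a packet whose TTL would reach zero. The sketch below works out the minimum TTL for the two-hop test described above and shows how a sender would set it with the standard `IP_MULTICAST_TTL` socket option. The group address `239.1.1.1` is a hypothetical example, the sketch assumes no per-interface TTL thresholds are configured, and it is not a description of how Ghost itself configures its traffic.

```python
import ipaddress
import socket


def is_multicast_group(address):
    """True if `address` falls in the IP multicast range."""
    return ipaddress.ip_address(address).is_multicast


def min_ttl_for_router_hops(router_hops):
    """Smallest TTL that still reaches receivers behind `router_hops` routers."""
    if router_hops < 0:
        raise ValueError("router_hops must be non-negative")
    # TTL 1 stays on the local subnet; each router crossed needs one more.
    return router_hops + 1


def make_multicast_sender(ttl):
    """Create a UDP socket whose multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


GROUP = "239.1.1.1"  # hypothetical, administratively scoped group address
ttl_needed = min_ttl_for_router_hops(2)
print(GROUP, is_multicast_group(GROUP), ttl_needed)
```

A sender built with `make_multicast_sender(ttl_needed)` would then address its datagrams to the group rather than to individual workstations.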
Border Changes
  Converging Emory’s Border Network
    Merged Healthcare and University borders (4/25)
    Converted Internet2 to 10gig and changed AS (6/26)
    Moved Global Crossing to new border routers (7/10)
    Moved Level3 to new border router & changed AS (7/17)
  Next Steps:
    Change in Global Crossing and Level3 contracts
    Atlanta Internet Exchange (AIX)
Wireless
  NATing Wireless?
    Proliferation of Wireless Devices
    Strain on University IP Address space
    Downside – Lose some tracking abilities
    Testing with NetReg
    Goal would be to implement before start of school
  The iPhone
    Update on the problem at Duke
    WPA Enterprise/Guest Access
    Official statement on Support
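The NAT trade-off raised above can be seen in a toy port-address-translation table: many private wireless clients share one public address, so only the translation table can map outside traffic back to a device. This is a minimal sketch, not NetCom's design. The class name, the starting port, and all addresses are hypothetical (`192.0.2.1` comes from the range reserved for documentation).

```python
import ipaddress
import itertools


class NatTable:
    """Toy port-address translation: many private hosts, one public IP."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = ipaddress.ip_address(public_ip)
        self._ports = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def translate(self, private_ip, private_port):
        """Map a private source to (public_ip, public_port)."""
        source = ipaddress.ip_address(private_ip)
        if not source.is_private:
            raise ValueError("only private addresses are translated")
        key = (source, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return str(self.public_ip), self._out[key]

    def lookup(self, public_port):
        """Recover the private source behind a public port."""
        source, port = self._back[public_port]
        return str(source), port


nat = NatTable("192.0.2.1")
first = nat.translate("10.1.0.5", 51000)
second = nat.translate("10.1.0.6", 51000)
print(first, second, nat.lookup(second[1]))
```

Both clients appear to the outside as `192.0.2.1`, which relieves pressure on the public address space; tracing activity back to a device then depends on keeping the translation records.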
HealthCare Exchange
  32 scheduled seminars – over 700 attendees
  SMTP flip completed; GAL updated
  Information on project website continuing to expand
  Problems with beta users (Zantaz & VDT)
  One outstanding Zantaz + VDT problem
  Current Schedule
    Pre-Pilot ~7/23
    Pilot ~8/6
    Production ~8/13