OU Supercomputing Center for Education & Research
Henry Neeman, Director, OU Supercomputing Center for Education & Research
OU Information Technology, University of Oklahoma
OU Supercomputing Center for Education & Research, OneNet CIO Meeting, Thursday August 11 2005
People
Things
What is Supercomputing?
Supercomputing is the biggest, fastest computing right this minute.
Likewise, a supercomputer is one of the biggest, fastest computers right this minute.
So, the definition of supercomputing is constantly changing.
Rule of Thumb: a supercomputer is typically at least 100 times as powerful as a PC.
Jargon: supercomputing is also called High Performance Computing (HPC).
What is Supercomputing About?
Size Speed
What is Supercomputing About?
Size: many problems that are interesting to scientists and engineers can’t fit on a PC, usually because they need more than a few GB of RAM, or more than a few hundred GB of disk.
Speed: many problems that are interesting to scientists and engineers would take a very long time to run on a PC: months or even years. But a problem that would take a month on a PC might take only a few hours on a supercomputer.
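As a rough worked example of the speed claim, the sketch below applies the rule of thumb quoted earlier (a supercomputer is at least 100 times as powerful as a PC) to a job that takes one month on a PC. The 30-day month and the assumption that the job speeds up by the full factor of 100 are illustrative simplifications, not measurements; the function name is hypothetical.

```python
# Rough estimate only: assumes the job speeds up by the full factor.
HOURS_PER_DAY = 24


def supercomputer_hours(pc_days: float, speedup: float = 100.0) -> float:
    """Estimate run time in hours on a machine `speedup` times faster than a PC."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return pc_days * HOURS_PER_DAY / speedup


if __name__ == "__main__":
    # One month (taken as 30 days) on a PC, with the 100x rule of thumb.
    print(supercomputer_hours(30))  # 7.2
```

Under these assumptions a month-long PC job takes about 7 hours, which is consistent with "a few hours".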
What is HPC Used For?
Simulation of physical phenomena, such as
  Weather forecasting
  Galaxy formation
  Oil reservoir management
Data mining: finding needles of information in a haystack of data, such as
  Gene sequencing
  Signal processing
  Detecting storms that could produce tornados
Visualization: turning a vast sea of data into pictures that a scientist can understand
Figure: Moore, OK tornadic storm, May 3 1999 [2]
[3] [1]
What is OSCER?
Multidisciplinary center
Division of OU Information Technology
Provides:
  Supercomputing education
  Supercomputing expertise
  Supercomputing resources: hardware, storage, software
For:
  Undergrad students
  Grad students
  Staff
  Faculty
  Their collaborators (including off campus)
Who is OSCER? Academic Depts
  Aerospace & Mechanical Engr
  Biochemistry & Molecular Biology
  Biological Survey
  Botany & Microbiology
  Chemical, Biological & Materials Engr
  Chemistry & Biochemistry
  Civil Engr & Environmental Science
  Computer Science
  Economics
  Electrical & Computer Engr
  Finance
  History of Science
  Industrial Engr
  Geography
  Geology & Geophysics
  Library & Information Studies
  Mathematics
  Meteorology
  Petroleum & Geological Engr
  Physics & Astronomy
  Radiological Sciences
  Surgery
  Zoology
More than 150 faculty & staff in 23 depts in Colleges of Arts & Sciences, Business, Engineering, Geosciences and Medicine, with more to come!
Who is OSCER? Organizations
  Advanced Center for Genome Technology
  Center for Analysis & Prediction of Storms
  Center for Aircraft & Systems/Support Infrastructure
  Cooperative Institute for Mesoscale Meteorological Studies
  Center for Engineering Optimization
  OU Information Technology
  Fears Structural Engineering Laboratory
  Geosciences Computing Network
  Great Plains Network
  Human Technology Interaction Center
  Institute of Exploration & Development Geosciences
  Instructional Development Program
  Laboratory for Robotic Intelligence and Machine Learning
  Langston University Mathematics Dept
  Microarray Core Facility
  National Severe Storms Laboratory
  NOAA Storm Prediction Center
  OU Office of the VP for Research
  Oklahoma Climatological Survey
  Oklahoma EPSCoR
  Oklahoma Medical Research Foundation
  Oklahoma School of Science & Math
  St. Gregory’s University Physics Dept
  Sarkeys Energy Center
  Sasaki Applied Meteorology Research Institute
Biggest Consumers
Center for Analysis & Prediction of Storms: daily real time weather forecasting
Oklahoma Center for High Energy Physics: particle physics simulation and data analysis using Grid computing
Who Are the Users?
Over 225 users so far:
  over 50 OU faculty
  over 50 OU staff
  over 100 students
  about 20 off campus users
  … more being added every month.
Comparison: National Center for Supercomputing Applications (NCSA), after 20 years of history and hundreds of millions in expenditures, has about 2100 users.*
* Unique usernames on cu.ncsa.uiuc.edu and tungsten.ncsa.uiuc.edu
What Does OSCER Do? Teaching
Science and engineering faculty from all over America learn supercomputing at OU by playing with a jigsaw puzzle (NCSI @ OU 2004).
What Does OSCER Do? Rounds
OU undergrads, grad students, staff and faculty learn how to use supercomputing in their specific research.
Current OSCER Hardware
Aspen Systems Pentium4 Xeon 32-bit Linux Cluster
  270 Pentium4 Xeon CPUs, 270 GB RAM, 1.08 TFLOPs
Aspen Systems Itanium2 cluster
  66 Itanium2 CPUs, 132 GB RAM, 264 GFLOPs
IBM Regatta p690 Symmetric Multiprocessor
  32 POWER4 CPUs, 32 GB RAM, 140.8 GFLOPs
IBM FAStT500 FiberChannel-1 Disk Server
Qualstar TLS-412300 Tape Library
Coming OSCER Hardware (2005)
NEW! Dell Pentium4 Xeon 64-bit Linux Cluster
  1024 Pentium4 Xeon CPUs, 2240 GB RAM, 6.55 TFLOPs
Aspen Systems Itanium2 cluster
  66 Itanium2 CPUs, 132 GB RAM, 264 GFLOPs
COMING! 2 x 16-way Opteron Cluster
  16 AMD Opteron CPUs, 96 GB RAM, 128 GFLOPs
COMING! Condor Pool: 750 student lab PCs
COMING! National Lambda Rail
Qualstar TLS-412300 Tape Library
Hardware: IBM p690 Regatta
32 POWER4 CPUs (1.1 GHz)
32 GB RAM
218 GB internal disk
OS: AIX 5.1
Peak speed: 140.8 GFLOP/s*
Programming model: shared memory multithreading (OpenMP) (also supports MPI)
* GFLOP/s: billion floating point operations per second
sooner.oscer.ou.edu
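The peak speed figure can be reproduced with simple arithmetic: number of CPUs, times clock rate, times floating point operations per cycle. The sketch below assumes 4 floating point operations per cycle per CPU (two fused multiply-add units); that per-cycle figure is an assumption, not stated on the slide, although it matches the quoted 140.8 GFLOP/s. The function name is hypothetical.

```python
def peak_gflops(cpus: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOP/s: CPUs x clock (GHz) x flops per cycle."""
    return cpus * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    # 32 POWER4 CPUs at 1.1 GHz; 4 flops/cycle is an assumption.
    print(round(peak_gflops(32, 1.1, 4), 1))  # 140.8
```

The same formula with 56 CPUs at 1.0 GHz gives 224 GFLOP/s, matching the Itanium2 cluster figures. Peak speed is an upper bound; real applications typically sustain only a fraction of it.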
Hardware: Pentium4 Xeon Cluster
270 Pentium4 Xeon DP CPUs
270 GB RAM
~10,000 GB disk
OS: Red Hat Linux Enterprise 3
Peak speed: 1.08 TFLOP/s*
Programming model: distributed multiprocessing (MPI)
* TFLOP/s: trillion floating point operations per second
boomer.oscer.ou.edu
Hardware: Itanium2 Cluster
56 Itanium2 1.0 GHz CPUs
112 GB RAM
5,774 GB disk
OS: Red Hat Linux Enterprise 3
Peak speed: 224 GFLOP/s*
Programming model: distributed multiprocessing (MPI)
* GFLOP/s: billion floating point operations per second
schooner.oscer.ou.edu
New! Pentium4 Xeon Cluster
1,024 Pentium4 Xeon CPUs
2,240 GB RAM
20,000 GB disk
Infiniband & Gigabit Ethernet
OS: Red Hat Linux Enterprise 3
Peak speed: 6.5 TFLOPs*
Programming model: distributed multiprocessing (MPI)
* TFLOPs: trillion calculations per sec
topdawg.oscer.ou.edu
DEBUTED AT #54 WORLDWIDE, #9 AMONG US UNIVERSITIES, #4 EXCLUDING BIG 3 NSF CENTERS
www.top500.org
Coming! National Lambda Rail
The National Lambda Rail (NLR) is the next generation of high performance networking.
Coming! Condor Pool
Condor is a software package that allows number crunching jobs to run on idle desktop PCs.
OU IT is deploying a large Condor pool (750 desktop PCs) over the course of Spring 2005.
When deployed, it’ll provide a huge amount of additional computing power, more than is available in all of OSCER today.
And, the cost is very low.
What is Condor?
Condor is grid computing technology:
  it steals compute cycles from existing desktop PCs;
  it runs in the background when no one is logged in.
Condor is like SETI@home, but better:
  it’s general purpose and can work for any “loosely coupled” application;
  it can do all of its I/O over the network, not using the desktop PC’s disk;
  it can use the academic research community’s Grid middleware such as Globus, but it doesn’t have to.
Supercomputing at Night
Desktop PCs tend to be very active during the workday.
But at night, during most of the year, they’re idle.
So they can be used for number crunching.
Condor
Condor is like SETI@home:
  it steals compute cycles from existing desktop PCs;
  it runs in the background when no one is sitting at the desk.
Condor is better than SETI@home:
  it’s general purpose and can work for any loosely coupled application;
  it can do all of its I/O over the network, not using the desktop PC’s local disk.
How is Condor Helpful?
Low cost: either
  nothing (if you have many Linux PCs), OR
  very little (if you have many Windows PCs).
Repurpose idle time on existing desktop PCs: you’ve already paid for the hardware, and most of the software is FREE!
Enable research that involves lots of computing.
Share multiple independent Condor pools among various institutions (flocking).
It provides a quick Grid computing resource, to get regional campuses used to Grids.
Windows vs. Linux
Condor runs best on Unix flavors (e.g., Linux).
The Windows version is clipped: it doesn’t support automatic checkpointing or automatic job migration.
However, desktop PC users demand a pure Windows experience.
So, to use Condor in OU IT student PC labs, we need Windows and Linux at the same time.
Solution: VMware
VMware
VMware is virtual machine software that lets a host OS run one or more guest OSes.
VMware is commercial software, not free.
For Condor, OU runs Linux as the host OS and Windows as the guest OS.
This way, Condor can run directly on Linux and therefore have the maximal set of features, but desktop users can have a pure Windows experience.
The lab PCs are configured to boot directly to the Windows login page, so desktop users never know.
Linux/VMware/Windows/Condor
[Diagram: Linux (host OS) runs VMware and Condor; VMware runs Windows, which runs Desktop Applications; Condor runs Science Applications]
But I Don’t Want to Manage Linux!
[Diagram labels: Windows, VMware, Desktop Applications, Condor, Science Applications]
Condor Setup at OU
[Diagram components: Lab PCs, Condor Users, Mgmt Nodes, Firewall Device, Login Nodes]
Current Status at OU
Pool of test machines in dorm PC lab
Submit/management from Neeman’s desktop PC
Rollout to multiple labs during summer
Total rollout to 750 PCs by end of 2005
What Does OU Plan to Do w/Condor?
Loosely coupled problems: many small, independent jobs
  High Energy Physics: D-Zero experiment
  Nanotechnology: Monte Carlo simulation
  Computational Chemistry: molecular dynamics
  Aerospace Engineering: parameter space searches
  ... and many others.
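Loosely coupled work of this kind maps naturally onto Condor's `queue N` statement, which launches N independent copies of a job and numbers them through the `$(Process)` macro. The Python sketch below only builds the text of a submit description file; the executable name `sweep.sh`, the output file naming scheme, and the helper `make_submit_description` are hypothetical, chosen for illustration. The resulting file would then be handed to `condor_submit`.

```python
def make_submit_description(executable: str, num_jobs: int) -> str:
    """Build a Condor submit description for `num_jobs` independent runs.

    Each run receives its own index (0 .. num_jobs-1) as its only argument,
    which a parameter sweep can map to one point in the parameter space.
    """
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        "arguments  = $(Process)",
        "output     = job_$(Process).out",
        "error      = job_$(Process).err",
        "log        = sweep.log",
        f"queue {num_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sweep of 100 independent runs.
    print(make_submit_description("sweep.sh", 100))
```

Because the runs never talk to each other, each one can land on whichever idle PC is available, which is what makes these problems a good fit for a desktop pool.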
NSF CI-TEAM Program
The NSF Cyberinfrastructure TEAM program is a brand new program.
It is providing grants of up to $250,000 for up to 2 years.
One of CI-TEAM’s goals is to expand Cyberinfrastructure – for example, supercomputing – to institutions and people that traditionally haven’t had much access.
Our NSF CI-TEAM Project
The University of Oklahoma (OU) is leading an NSF CI-TEAM project, submitted May 27 2005.
The focus is on setting up Condor pools across the Great Plains region, and beyond.
The kickstart application will be BLAST (bioinformatics), but these Condor pools will be available for any appropriate application.
Most of the money in OU’s CI-TEAM proposal will go to institutions other than OU, for VMware licenses and PCs to manage the Condor pools.
CI-TEAM Participants So Far
At OU
  OSCER/IT
  Arts & Sciences: Botany & Microbiology; Chemistry & Biochemistry; Mathematics; Physics & Astronomy; Zoology
  Engineering: Aerospace & Mechanical Engineering; Civil Engineering & Environmental Science; Chemical, Biological & Materials Engineering; Computer Science; Electrical & Computer Engineering; Industrial Engineering
  Medicine: Surgery; Radiological Sciences
Other Academic Institutions in Oklahoma: Langston U. (minority serving), Oklahoma Baptist U. (4 year), Oklahoma School of Science & Mathematics (high school), St. Gregory’s U. (4 year), U. Central Oklahoma (Masters-granting)
Academic Institutions outside Oklahoma: Contra Costa College of CA (2 year), Kansas State U. (PhD), U. Arkansas Fayetteville (PhD), U. Arkansas Little Rock (PhD), U. Kansas (PhD), U. Nebraska (PhD), U. Northern Iowa (Masters)
An Added Bonus
OSCER’s user eligibility policy is that anyone can have access to OSCER’s supercomputers, if they are on a project that has an OU faculty or staff member as the Principal or Co-Principal Investigator.
So, as a bonus, everyone at participating institutions who is on the CI-TEAM project – and their students – will get FREE access to OSCER’s supercomputers!
Join our CI-TEAM!
If OU’s CI-TEAM proposal is fully funded, it will create one of the largest Condor flocks in the world: almost 4,000 PCs.
YOU CAN JOIN US!
All you need is to commit some PCs: you create your own Condor pool with OU’s help, and then let the rest of the CI-TEAM institutions use them.
CI-TEAM will pay for the VMware licenses, and for small schools also a PC as Condor manager.