Batch Scheduling at LeSC with Sun Grid Engine
David McBride <dwm@doc.ic.ac.uk>
Systems Programmer
London e-Science Centre
Department of Computing, Imperial College
Overview
● End-user requirements
● Brief description of compute hardware
● Sun Grid Engine software deployment
● Tweaks to the default SGE configuration
● Future changes
● References for more information and questions
End-User Requirements
● We have many different users: high-energy physicists, bioinformaticians, chemists, parallel software researchers.
● Jobs are many and varied:
– Some users run relatively few long-running tasks; others submit large batches of shorter jobs.
– Some require several cluster nodes to be co-allocated at runtime (16, 32+ MPI hosts); others simply use a single machine.
– Some require lots of RAM (1, 2, 4, 8GB+ per machine).
● In general users are fairly happy so long as they get a reasonable response time.
Hardware
● Saturn: 24-way 750MHz UltraSPARC III Sun E6800
– 36GB RAM, ~20TB online RAID storage
– 24TB tape library to support long-term offline backups
– Running Solaris 8
● Viking cluster: 260-node dual P4 Xeon 2GHz+
– 128 machines with Fast Ethernet; 2x64 machines also with Myrinet
– 2 front-end nodes & 2 development nodes
– Running RedHat Linux 7.2 (plus local additions and updates)
● Mars cluster: 204-node dual AMD Opteron 1.8GHz+
– 128 machines with Gigabit Ethernet; 72 machines also with Infiniband
– Running RedHat Enterprise Linux 3 (plus local refinements)
– 4 front-end interactive nodes
Sun Grid Engine Deployment
● Two separate logical SGE installations
– Saturn acts as the master node for both cells.
– However, Viking is running SGE 5.3 and Mars is running SGE 6.0.
● Mars is still ‘in beta’; Viking is still providing the main production service.
● When Mars’s configuration is finalized, end-users will be migrated to Mars; Viking will then be reinstalled with the new configuration.
Changes to Default Configuration
● Issue 1:
– If all the available worker nodes are running long-lived jobs, then a new short-lived job added to the queue will not execute until one of the long-lived jobs has completed. (SGE does not provide a job checkpoint-and-preempt facility.)
– Resolution: A subset of nodes is configured to run only short-lived jobs.
– This trades slightly reduced cluster utilization for a shorter average-case response time for short-lived jobs.
– End users only benefit if they specify at submission time that the job will finish quickly.
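The last point can be sketched in a few lines of Python. This is a minimal illustration, not LeSC's actual setup: it assumes the short-job nodes are selected by the hard run-time limit `h_rt` requested through `qsub -l`, and the one-hour cut-off `SHORT_JOB_LIMIT_S` and both helper functions are hypothetical names introduced here.

```python
from typing import List, Optional

# Hypothetical cut-off: jobs declaring at most one hour count as "short".
SHORT_JOB_LIMIT_S = 3600


def format_h_rt(seconds: int) -> str:
    """Format a run-time limit as H:MM:SS for an `-l h_rt=` request."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def qsub_args(script: str, runtime_s: Optional[int] = None) -> List[str]:
    """Build a qsub command line, declaring a run-time limit if one is known."""
    args = ["qsub"]
    if runtime_s is not None:
        args += ["-l", f"h_rt={format_h_rt(runtime_s)}"]
    args.append(script)
    return args


def eligible_for_short_nodes(runtime_s: Optional[int]) -> bool:
    """A job with no declared run time cannot be placed on short-job nodes."""
    return runtime_s is not None and runtime_s <= SHORT_JOB_LIMIT_S
```

A job submitted with `qsub_args("job.sh", 1800)` declares a 30-minute limit and so qualifies for the short-job nodes in this sketch; the same script submitted without a limit does not, however quickly it actually finishes.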
Changes to Default Configuration
● Issue 2:
– Clusters are internally heterogeneous; e.g. some nodes have more memory, faster processors, or bigger local disks than others.
– Sometimes a low-requirement job will be allocated to one of these more capable machines unnecessarily because the submitter has not specified the job’s requirements.
– This can prevent a job which does have high requirements from being run as quickly.
– Resolution: We are experimenting with changing the SGE configuration so that a job will, by default, request only the resources of the least-capable node.
– Again, this places the onus on the user to request extra resources if needed.
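The idea of defaulting to the least-capable node can be illustrated as follows. This is a sketch under stated assumptions, not the SGE configuration itself: the node inventory, the resource names `mem_gb` and `disk_gb`, and all three functions are hypothetical.

```python
from typing import Dict, List

Resources = Dict[str, float]

# Hypothetical inventory; the real clusters' per-node figures are not given here.
NODES: Dict[str, Resources] = {
    "node-a": {"mem_gb": 1, "disk_gb": 40},
    "node-b": {"mem_gb": 2, "disk_gb": 40},
    "node-c": {"mem_gb": 8, "disk_gb": 120},
}


def default_request(nodes: Dict[str, Resources]) -> Resources:
    """Per-resource minimum over all nodes: what the least-capable node offers."""
    keys = set.intersection(*(set(caps) for caps in nodes.values()))
    return {k: min(caps[k] for caps in nodes.values()) for k in keys}


def effective_request(nodes: Dict[str, Resources], user: Resources) -> Resources:
    """Start from the defaults, then apply whatever the user asked for explicitly."""
    request = default_request(nodes)
    request.update(user)
    return request


def eligible_nodes(nodes: Dict[str, Resources], request: Resources) -> List[str]:
    """Names of nodes that satisfy every requested resource."""
    return sorted(
        name
        for name, caps in nodes.items()
        if all(caps.get(k, 0) >= v for k, v in request.items())
    )
```

With this inventory, a job submitted with no explicit requests is accounted as needing 1GB of memory and 40GB of disk, so every node can take it, whereas a job that asks for `{"mem_gb": 4}` is only eligible for `node-c`.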
Changes to Default Configuration
● Issue 3:
– If a job is submitted that requires the co-allocation of several cluster nodes simultaneously (e.g. for a 16-way MPI job), then that job can be starved by a larger number of single-node jobs.
– Resolution (SGE 5.3): Manually intervene to manipulate queues so that the large 16-way job will be scheduled.
– Resolution: Upgrade to SGE 6, which uses a more advanced scheduling algorithm (advance reservation with backfill).
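The starvation problem, and the way reservation with backfill avoids it, can be shown with a toy simulation. This is a simplified sketch and not SGE's scheduler: time advances in whole ticks, every job declares its run time exactly, list order stands in for priority, and the `Job` class, `simulate` and `demo_jobs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Job:
    name: str
    nodes: int      # nodes that must be co-allocated
    runtime: int    # declared run time, in ticks
    arrival: int = 0


def simulate(jobs: List[Job], total_nodes: int, reserve: bool,
             horizon: int = 1000) -> Dict[str, int]:
    """Return the start tick of each job; list order is priority order."""
    pending = list(jobs)
    running: List[Tuple[int, int]] = []   # (end tick, nodes held)
    starts: Dict[str, int] = {}
    for now in range(horizon):
        running = [(end, n) for end, n in running if end > now]
        free = total_nodes - sum(n for _, n in running)
        reserved_at: Optional[int] = None  # start tick held for the first blocked job
        for job in [j for j in pending if j.arrival <= now]:
            fits = job.nodes <= free
            # With a reservation in place, only jobs that finish in time may backfill.
            in_time = reserved_at is None or now + job.runtime <= reserved_at
            if fits and in_time:
                starts[job.name] = now
                running.append((now + job.runtime, job.nodes))
                free -= job.nodes
                pending.remove(job)
            elif reserve and reserved_at is None and not fits:
                # Earliest tick at which enough nodes will have been released.
                avail = free
                for end, n in sorted(running):
                    avail += n
                    if avail >= job.nodes:
                        reserved_at = end
                        break
        if not pending:
            break
    return starts


def demo_jobs() -> List[Job]:
    """One 4-node job competing with a steady stream of single-node jobs."""
    singles = [Job(f"s{i}", nodes=1, runtime=4, arrival=i) for i in range(12)]
    wide = Job("mpi", nodes=4, runtime=2, arrival=1)
    tiny = Job("tiny", nodes=1, runtime=2, arrival=1)
    return [singles[0], wide, *singles[1:], tiny]
```

On a 4-node cluster, `simulate(demo_jobs(), 4, reserve=False)` keeps starting single-node jobs into every free slot, so the 4-node `mpi` job waits until the stream dries up. With `reserve=True` the scheduler holds a start time for `mpi` and lets only `tiny`, which finishes before that time, run ahead of it.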
Future Changes: LCG
● We are participating in the Large Hadron Collider Computing Grid (LCG) as part of the London Tier-2.
● This has been non-trivial; the standard LCG distribution only supports PBS-based clusters.
– We’ve developed SGE-specific Globus JobManager and Information Reporter components for use with LCG.
– We have also been working with the developers to address issues with running on 64-bit Linux distributions.
● Currently deploying front-end nodes (CE, SE, etc.) to expose Mars as an LCG compute site.
● We are also joining the LCG Certification Testbed to provide an SGE-based test site to help ensure future support.
References
● London e-Science Centre homepage:
– http://www.lesc.ic.ac.uk/
● SGE integration tools for Globus Toolkit 2, 3, 4 and LCG:
– http://www.lesc.ic.ac.uk/projects/sgeindex.html
Q&A