Page 1

Batch Scheduling at LeSC with Sun Grid Engine

David McBride <[email protected]>

Systems Programmer
London e-Science Centre

Department of Computing, Imperial College

Page 2

Overview

● End-user requirements
● Brief description of compute hardware
● Sun Grid Engine software deployment
● Tweaks to the default SGE configuration
● Future changes
● References for more information and questions

Page 3

End-User Requirements

● We have many different users: high-energy physicists, bioinformaticians, chemists, and parallel-software researchers.

● Jobs are many and varied (sketched below):
– Some users run relatively few long-running tasks; others submit large batches of shorter jobs.
– Some require several cluster nodes to be co-allocated at runtime (16, 32+ MPI hosts); others simply use a single machine.
– Some require lots of RAM (1, 2, 4, 8GB+ per machine).

● In general, users are fairly happy so long as they get a reasonable response time.
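This variety maps onto quite different submission commands. A minimal sketch (the parallel environment name “mpi” and the resource values are illustrative assumptions, not our actual site settings):

  # Hypothetical examples of the job mix described above.
  qsub myjob.sh                    # simple single-machine job, default resources
  qsub -pe mpi 16 mpijob.sh        # co-allocate 16 MPI slots (PE name assumed)
  qsub -l h_vmem=4G bigmem.sh      # request 4GB of virtual memory per machine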

Page 4

Hardware

● Saturn: 24-way 750MHz UltraSPARC III Sun E6800
– 36GB RAM, ~20TB online RAID storage
– 24TB tape library to support long-term offline backups
– Running Solaris 8

● Viking cluster: 260-node dual P4 Xeon 2GHz+
– 128 machines with Fast Ethernet; 2×64 machines also with Myrinet
– 2 front-end nodes & 2 development nodes
– Running RedHat Linux 7.2 (plus local additions and updates)

● Mars cluster: 204-node dual AMD Opteron 1.8GHz+
– 128 machines with Gigabit Ethernet; 72 machines also with InfiniBand
– 4 front-end interactive nodes
– Running RedHat Enterprise Linux 3 (plus local refinements)

Page 5

Sun Grid Engine Deployment

● Two separate logical SGE installations (selected as sketched below):
– Saturn acts as the master node for both cells.
– However, Viking is running SGE 5.3 and Mars is running SGE 6.0.
● Mars is still ‘in beta’; Viking continues to provide the main production service.
● When Mars’s configuration is finalized, end-users will be migrated to Mars; Viking will then be reinstalled with the new configuration.
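For illustration, a user would target one cell or the other via the standard SGE_CELL environment variable; the cell names here are assumptions for the sketch, not necessarily our real ones:

  # SGE selects its configuration directory from $SGE_CELL (hypothetical names).
  export SGE_CELL=viking   # the SGE 5.3 production cell
  qstat                    # queries Saturn, the master node for this cell
  export SGE_CELL=mars     # the SGE 6.0 beta cell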

Page 6

Changes to Default Configuration

● Issue 1:
– If all the available worker nodes are running long-lived jobs, a new short-lived job added to the queue will not execute until one of the long-lived jobs has completed. (SGE does not provide a job checkpoint-and-preempt facility.)
– Resolution: a subset of nodes is configured to run only short-lived jobs (sketched below).
– This trades slightly reduced cluster utilization for a shorter average-case response time for short-lived jobs.
– End users only benefit if they specify at submission time that the job will finish quickly.
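A minimal sketch of this arrangement, assuming the short-job queue is named “short.q” with a one-hour cutoff (both values illustrative, not our exact settings):

  # Administrator side: cap the runtime of the hypothetical short.q queue.
  #   qconf -mq short.q            # set h_rt to 1:00:00 in the queue definition
  # User side: declare a short runtime so the job is eligible for short.q.
  qsub -l h_rt=0:30:00 myjob.sh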

Page 7

Changes to Default Configuration

● Issue 2:
– The clusters are internally heterogeneous; e.g. some nodes have more memory, faster processors, or bigger local disks than others.
– Sometimes a low-requirement job is allocated to one of these more capable machines unnecessarily, because the submitter has not specified the job’s requirements.
– This can prevent a job which genuinely has high requirements from running as quickly.
– Resolution: experiment with changing the SGE configuration so that, by default, a job only requests the resources of the least-capable node (see the sketch below).
– Again, this places the onus on the user to request extra resources if needed.
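For example, site-wide defaults can be placed in SGE’s default request file so that an unannotated job claims only what the least-capable node offers; the 1GB figure below is an illustrative assumption:

  # $SGE_ROOT/<cell>/common/sge_request -- options prepended to every submission.
  # Assume, for this sketch, that the least-capable node has 1GB of RAM:
  -l mem_free=1G
  # A job that genuinely needs more must now say so explicitly:
  #   qsub -l mem_free=8G bigjob.sh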

Page 8

Changes to Default Configuration

● Issue 3:
– If a job is submitted that requires the co-allocation of several cluster nodes simultaneously (e.g. a 16-way MPI job), it can be starved by a larger number of single-node jobs.
– Resolution (SGE 5.3): manually intervene to manipulate the queues so that the large 16-way job will be scheduled.
– Resolution (SGE 6): upgrade to SGE 6, which uses a more advanced scheduling algorithm (advance reservation with backfill; see the sketch below).
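Under SGE 6, a large parallel job can request a reservation so that backfilled single-node jobs cannot starve it; the parallel environment name “mpi” is an assumption in this sketch:

  # -R y asks SGE 6 to reserve slots for the job as they become free.
  qsub -pe mpi 16 -R y bigjob.sh
  # The scheduler must also permit reservations, e.g. max_reservation > 0
  # in the scheduler configuration (editable via qconf -msconf).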

Page 9

Future Changes: LCG

● We are participating in the LHC Computing Grid (LCG) as part of the London Tier-2.
● This has been non-trivial; the standard LCG distribution only supports PBS-based clusters.
– We’ve developed SGE-specific Globus JobManager and Information Reporter components for use with LCG.
– We have also been working with the developers to address issues with running on 64-bit Linux distributions.
● We are currently deploying front-end nodes (CE, SE, etc.) to expose Mars as an LCG compute site.
● We are also joining the LCG Certification Testbed to provide an SGE-based test site, helping to ensure future support.

Page 10

References

● London e-Science Centre homepage:
– http://www.lesc.ic.ac.uk/

● SGE integration tools for Globus Toolkit 2, 3, 4 and LCG:
– http://www.lesc.ic.ac.uk/projects/sgeindex.html

Page 11

Q&A