Resource Management and Balancing
ECI, July 2005

RMS – Overview
- Resource management
- Job management
- Monitoring
- Resource balancing
- Information dissemination

Job Management
- The need
  - The operating system offers job and resource management services for a single computer
  - Batch job control on multi-user mainframes was performed outside the operating system
- Main advantages are:
  - Structured resource utilization planning and control
  - Abstraction that is easy for the user to understand and use
  - A vendor-independent user interface

Manager vs. Scheduler
- Resource manager
  - Locating and allocating resources
  - Authentication
  - Process creation and migration
- Resource scheduler
  - Queuing applications
  - Driving the manager (enforcing policy)

Job Management - Requirements
- A typical job management system offers
  - Heterogeneous support
  - Batch support
  - Parallel support
  - Interactive support
  - Checkpointing and process migration
  - Load balancing
  - Job run-time limits
  - GUI

RMS Architecture
- Prerequisites
  - Multi-user and multitasking capabilities
  - A homogeneous OS is not required
- In practice
  - "Similar" operating systems run on all machines
  - UNIX (in all its variants) is very common where an RMS is used

Resource Description
- Requirements
  - Easy to generate simple descriptions
  - Powerful enough to generate complex descriptions
  - Portable representation
  - Attributed components
- RDL: language to specify resources
  - Administrator: describes what is available
  - User: describes what is required
  - Hierarchical

RDL Example
A 1024-node transputer with a Unix front end:

DECLARATION
BEGIN PROC Transputer
  DYNAMIC; EXCLUSIVE;
  DECLARATION
    BEGIN PROC Backend
      DECLARATION
        { PROC; CPU=T8; MEMORY=4; SPEED=30; REPEAT=1024; }
        { PORT; REPEAT=4; }
    END PROC
    BEGIN PROC Frontend
      DECLARATION
        { PROC; OS=Unix; Repeat = 4; }
    END PROC
  CONNECTION
    FOR I = 0 to 3 DO
      Backend LINK i Frontend LINK i
    OD
END PROC

RMS Components
- User interface
  - At the minimum, a command-line user interface
  - GUI becoming indispensable
  - Typical commands
    - Job submission, to register a job for execution
    - Status display, to monitor progress or failure of a job
    - Job deletion, to cancel jobs no longer needed

RMS Components (contd)
- Administrative environment
  - Specify node characteristics
  - Define feasible job classes and map them to hosts
  - Define user access permissions
  - Specify resource limitations for users and jobs
  - Specify policies for the assignment of jobs
  - Control and ensure proper operation of the RMS
  - Analyze accounting data to tune the system

RMS Entities
- Queues
  - Queues bound to hosts, jobs assigned to queues
- Hosts
  - Compute hosts, control hosts
- Users
  - Capabilities, permissions, priorities
- Jobs
- Resources
- Policies

RMS Entities – Jobs
- Job: a collection of computational tasks
  - A single program, or several interacting programs
- In the context of RMS
  - Batch jobs: require no manual interaction once started
  - Interactive jobs: require input during runtime
  - Parallel jobs: subtasks spread across several hosts in a cluster
  - Checkpointing jobs: periodically save their status to the file system and can be aborted at any time

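The checkpointing job type can be illustrated with a minimal sketch. It assumes a toy computation (summing integers) and a JSON checkpoint file. The function name `run_with_checkpoints` and the `stop_after` parameter, which simulates an abort, are hypothetical and do not come from any particular RMS.

```python
import json
import os
import tempfile

def run_with_checkpoints(path, total_steps, every, stop_after=None):
    """Sum 0..total_steps-1, saving progress to `path` every `every` steps.

    If a checkpoint file exists, the job resumes from it. `stop_after`
    simulates the job being aborted after that many executed steps.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)          # resume from the last saved status
    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            return None                   # simulated abort, work since last checkpoint is lost
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        if state["step"] % every == 0:
            tmp = path + ".tmp"           # write then rename, so a crash never leaves a torn file
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)
    return state["acc"]

with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "job.ckpt")
    first = run_with_checkpoints(ckpt, total_steps=10, every=5, stop_after=7)
    second = run_with_checkpoints(ckpt, total_steps=10, every=5)
```

The first run is aborted after seven steps. The second run restarts from the checkpoint taken at step five and still produces the full result.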
RMS Entities – Jobs
- Batch jobs
  - Dispatch jobs according to policy and availability
  - Suspend/resume and checkpoint/restart
- Interactive jobs
  - Need to maintain a terminal connection
  - "Watchdog" monitor: withdraw from pool
- Parallel jobs
  - Need to integrate with the parallel environment
  - Scheduling policy is more complex

RMS Entities - Resources
- Available memory, CPU time, network bandwidth, peripheral devices, and licenses
- Jobs declare resource requirements
- RMS enforces resource consumption
  - ensures quality of service
  - prevents over-subscription
  - detects over-usage

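A minimal sketch of how declared requirements might be enforced on one host. The `Host` class, its method names, and the resource keys `memory_mb` and `licenses` are assumptions made for illustration only.

```python
class Host:
    """Tracks free resources on one host (hypothetical model)."""

    def __init__(self, memory_mb, licenses):
        self.free = {"memory_mb": memory_mb, "licenses": licenses}
        self.granted = {}

    def admit(self, job, request):
        """Admit the job only if every declared requirement fits (no over-subscription)."""
        if any(request.get(k, 0) > self.free[k] for k in self.free):
            return False
        for k in self.free:
            self.free[k] -= request.get(k, 0)
        self.granted[job] = dict(request)
        return True

    def over_usage(self, job, measured):
        """Return the resources where measured consumption exceeds what was granted."""
        granted = self.granted[job]
        return [k for k, used in measured.items() if used > granted.get(k, 0)]

host = Host(memory_mb=1024, licenses=1)
a = host.admit("j1", {"memory_mb": 512, "licenses": 1})   # fits
b = host.admit("j2", {"memory_mb": 256, "licenses": 1})   # rejected: no license left
c = host.admit("j3", {"memory_mb": 512})                  # fits, memory now exhausted
```

What the RMS does once over-usage is detected (suspend, kill, or lower priority) is a policy decision and is left out of the sketch.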
RMS Entities - Policies
- Abstract mechanisms to automate control
  - Imbalanced load is common in clusters
  - Important/urgent work may be starved
  - Unauthorized users may take advantage
  - Users may exceed desired resource usage over time
- Resource utilization policies
  - Monitor resource consumption
  - Dispatch of new jobs
- Scheduling policies
  - Dispatch of new jobs
  - Relocation of jobs

Resource Utilization Policies
- Share based
  - Resource "credit" is assigned to users, depts…
  - Hierarchical share tree defines sharing
  - Establish entitlements within a time frame
  - Fair distribution of resources
- Functional
  - Assignment by functional importance (priority)
  - Past usage is not taken into account
- Deadline
  - Time-critical applications
- Manual override
  - Administrators like power…

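The share-based policy can be sketched as follows. The tree layout, the names `entitlements` and `most_deserving`, and the sample departments and users are all hypothetical. A real RMS would also decay past usage over the time frame, which this sketch omits.

```python
def entitlements(tree, total=1.0):
    """Flatten a share tree into per-leaf fractions of `total`.

    `tree` maps a name to (shares, subtree); subtree is None for a leaf.
    """
    result = {}
    share_sum = sum(shares for shares, _ in tree.values())
    for name, (shares, children) in tree.items():
        portion = total * shares / share_sum
        if children:
            result.update(entitlements(children, portion))  # split this node's portion further
        else:
            result[name] = portion
    return result

def most_deserving(entitled, usage):
    """Pick the leaf whose past usage is lowest relative to its entitlement."""
    return min(entitled, key=lambda u: usage.get(u, 0) / entitled[u])

tree = {
    "physics": (60, {"alice": (1, None), "bob": (3, None)}),
    "biology": (40, None),
}
entitled = entitlements(tree)
next_user = most_deserving(entitled, {"alice": 10, "bob": 10, "biology": 50})
```

Here `bob` is entitled to 45% of the cluster but has used little of it, so his jobs would be favored next.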
Scheduling Policies
- Dispatch time – who, where
  - First-Come-First-Served
  - Select-Least-Loaded
  - Select-Fixed-Sequence
  - Combinations of the above
- Relocation – who, when, where
  - Dynamic resource balancing

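One possible combination of the dispatch policies above is First-Come-First-Served for "who" and Select-Least-Loaded for "where". The sketch below assumes each job carries a numeric load estimate; the function name `dispatch` and the sample data are illustrative.

```python
from collections import deque

def dispatch(jobs, hosts):
    """Place jobs in arrival order on the currently least-loaded host.

    jobs:  iterable of (name, load) in arrival order
    hosts: dict mapping host name to its current load
    """
    queue = deque(jobs)
    loads = dict(hosts)                                   # work on a copy
    placement = []
    while queue:
        name, cost = queue.popleft()                      # FCFS: oldest job first
        host = min(loads, key=lambda h: (loads[h], h))    # least loaded, ties broken by name
        loads[host] += cost
        placement.append((name, host))
    return placement

plan = dispatch([("j1", 3), ("j2", 1), ("j3", 1)], {"a": 0, "b": 1})
```

Select-Fixed-Sequence would replace the `min` call with a scan of the hosts in a configured order, taking the first one that can accept the job.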
Scheduling of Parallel Processes
- Gang scheduling
  - Requires tight coupling (MPPs)
- Co-scheduling
  - Demand-based
    - False priority
    - Concurrent applications
  - Implicit
    - Busy wait to avoid relinquishing the CPU

RMS Challenges
- Open interfaces
  - Export load balancing/distributed capabilities
  - Export status info (load, job status, queues)
  - Control/assistance from application
  - Integration with other environments (MPI)
  - Extend functionality for special cases
  - API must be: simple, usable, abstract, robust

RMS Example: CODINE
- CODINE/GRD
  - cod_qmaster: master daemon
  - cod_schedd: scheduler daemon
  - cod_execd: execution daemon
- Continuously match utilization with policies
  - GRD monitors and adjusts resource usage correlated to all processes of a job
  - Feedback to adjust shares towards changing requirements

Static Scheduling Scheme
Dynamic Scheduling Scheme
RMS Example: PBS
- Portable Batch System
  - Scheduler – job to node mapping, queues
  - Server – communications, logs
  - Control daemon (per node) – executive agent
- Scope – single node
- Job arrays
- Task Management interface

RMS Example: Condor
- Condor: a distributed job scheduler
  - Harvest idle workstations
  - Job scheduling and migration
- Advertising mechanism
  - Both job and workstation (W/S) advertise presence
  - Jobs advertise requirements (job description file)
  - W/S advertise their capabilities

Condor: Example JDF
universe = vanilla                 # select runtime environment
executable = some_job
requirements = (Arch=="INTEL" && OpSys=="LINUX")
rank = (Memory * 10000) + KFlops   # target
arguments = -verbose
input = in.dat                     # redirect to stdin
output = out.dat                   # redirect to stdout
log = log.txt
Queue                              # add job to queue

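How the `requirements` and `rank` expressions drive matching can be sketched in Python. This is not Condor's implementation or API: the machine advertisements below are made-up dictionaries, and `match` is a hypothetical helper that filters by the requirements and then picks the highest rank.

```python
def match(job, machines):
    """Return the name of the best-ranked machine that satisfies the job, or None."""
    candidates = [m for m in machines if job["requirements"](m)]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])["Name"]

# Hypothetical workstation advertisements (capabilities).
machines = [
    {"Name": "ws1", "Arch": "INTEL", "OpSys": "LINUX", "Memory": 256, "KFlops": 50000},
    {"Name": "ws2", "Arch": "INTEL", "OpSys": "LINUX", "Memory": 512, "KFlops": 20000},
    {"Name": "ws3", "Arch": "SPARC", "OpSys": "SOLARIS", "Memory": 1024, "KFlops": 90000},
]

# The job advertisement mirrors the requirements and rank lines of the JDF above.
job = {
    "requirements": lambda m: m["Arch"] == "INTEL" and m["OpSys"] == "LINUX",
    "rank": lambda m: m["Memory"] * 10000 + m["KFlops"],
}

best = match(job, machines)
```

With this rank expression memory dominates, so `ws2` is preferred over the faster `ws1`, and `ws3` is excluded by the requirements.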
RMS: Condor (contd)
- Universe
  - Vanilla: sequential apps (shared FS)
  - MPI, PVM: integrated with parallel environment
  - Globus: grid computing environment
  - Standard: enables process migration
- Process migration
  - Reschedule higher priority job
  - User reclaims her W/S
  - Must be linked with a special library

RMS: Condor (contd)
- Access to data
  - Shared file system
  - Condor file transfer mechanism
    - Automatically prefetch, postfetch
  - Remote I/O calls (in standard universe)
- Architecture
  - Central manager
  - Server on each node

Known Condor Pools
Monitoring
- Design choices
  - Centralized vs. decentralized
  - Periodic vs. request driven
  - Flat vs. hierarchical
  - Resolution of information
  - Focused view

Monitoring Example: Parmon
- Features:
  - Online creation of Node and Group database
  - Component, Node, Group, or entire Cluster level
  - Monitoring of CPU, memory, disk and network, processes, log files, etc.
  - Facility to define events and automatic notification
  - Misc: message broadcast, remote admin, GUI

Load Balancing
- Application vs. system
- Static vs. dynamic vs. adaptive
- Centralized vs. decentralized
- Receiver initiated vs. sender initiated
- Parallel applications
- On-line nature

LB: Application Level
- Application level
  - Round robin
  - Randomized
  - Recursive bisection
  - Other optimization
- Hard to estimate execution times
  - Indeterminate no. of steps
  - Unpredictable load
  - Communication delays

LB: System Level
- System level
  - Round robin
  - Randomized
- Estimate run time
  - Specified by job description
  - Estimate from past experience

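The two run-time estimates can be combined in a small sketch. It assumes an exponentially weighted average of past runs, which is one common choice and not necessarily what any given system uses. The name `estimate_runtime` and the weight `alpha` are illustrative.

```python
def estimate_runtime(declared, history, alpha=0.5):
    """Estimate a job's run time.

    Falls back to the value declared in the job description when there is
    no history; otherwise smooths past run times, weighting recent runs more.
    """
    if not history:
        return declared
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate

no_history = estimate_runtime(100, [])
with_history = estimate_runtime(100, [10, 20, 40])
```

A larger `alpha` tracks recent behavior more closely, while a smaller one smooths out noisy runs.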
LB Example: MOSIX
- Decentralized
- Symmetric
- Deterministic
- Responsive
- Stable
- Competitive
- Resources: CPU, memory, I/O

Load Balancing Over Network
- Distribute workload or network traffic load across the cluster
  - Nodes may be interconnected among themselves
  - Must be connected to the balancing device
- Processing nodes provide status information
  - current processor load
  - the application system load
  - number of active users
  - the availability of network protocol buffers
  - other specific resources

Load Balancing Over Network
- Balancing device
  - monitors the status of all nodes
  - dictates where to direct the next job
  - is a single unit or a group in a tree hierarchy
  - uses one or more algorithms or methods
  - has a static or dynamic setting
- Decides which node gets the next incoming connection request

Factors in Network Balancing
- Wire-speed processing
- Node operating system limitations
  - Packet processing, no. of connections, interrupts
- Balancing device limitations
  - Tables, memory
- Session based traffic, non-session UDP
- Application dependencies (affinity)

Simple Balancing Methods
- Weighting
  - Assign weights to nodes of different capacities
- Randomization
  - Works well in an environment of identical nodes
- Round-Robin
  - Commonly used by itself in DNS (address caching)
  - Effective where all the nodes in the cluster are identical in capacity and performance
- Hashing
  - Packets from the same source address will always get assigned to the same server

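Three of the methods above can be sketched in a few lines. The function names are illustrative, the weighted variant uses the simplest possible expansion of integer weights, and CRC32 stands in for whatever hash a real balancing device would use.

```python
import itertools
import zlib

def round_robin(nodes):
    """Cycle through the nodes in a fixed order."""
    return itertools.cycle(nodes)

def weighted_round_robin(weights):
    """Cycle through nodes so each appears in proportion to its integer weight."""
    expanded = [node for node, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)

def hash_select(source_addr, nodes):
    """Map a source address to a node; the same address always yields the same node."""
    return nodes[zlib.crc32(source_addr.encode()) % len(nodes)]

rr = round_robin(["a", "b", "c"])
rr_picks = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({"big": 2, "small": 1})
wrr_picks = [next(wrr) for _ in range(6)]

nodes = ["a", "b", "c"]
first_pick = hash_select("10.0.0.7", nodes)
second_pick = hash_select("10.0.0.7", nodes)
```

Note that hashing keeps a client on one server only while the node list is unchanged. Adding or removing a node remaps most addresses under this simple modulo scheme.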
Simple Balancing Methods
- Least Connections
  - Assigns to the node which currently has the fewest connections ( ≠ least load )
- Minimum Misses
  - Assigns to the node which has processed the fewest incoming requests in its history
- Fastest Response
  - Assigns to the node with the fastest response
  - Requires active monitoring of the individual nodes
    - Sending ICMP packets with the 'ping' command
    - Proprietary mechanism based upon UDP packets

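A minimal sketch of Least Connections and Fastest Response. The class and function names are hypothetical, and the response times are passed in as plain numbers; the sketch does not perform any actual probing with ping or UDP.

```python
class LeastConnections:
    """Tracks active connections per node and picks the node with the fewest."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}

    def acquire(self):
        # Ties go to the node listed first.
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node):
        self.active[node] -= 1

def fastest_response(latencies_ms):
    """Pick the node with the lowest measured response time."""
    return min(latencies_ms, key=latencies_ms.get)

lb = LeastConnections(["a", "b"])
picks = [lb.acquire() for _ in range(3)]   # a, b, then a again on the tie
lb.release("a")
lb.release("a")
after_release = lb.acquire()               # a is idle again

fastest = fastest_response({"a": 12.0, "b": 3.5})
```

As the slide notes, the connection count is only a proxy: a node with few but expensive connections can still be the most heavily loaded.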
Advanced Balancing Methods
- Primary optimization vectors
  - Node traffic – predict volume of traffic
  - Network traffic – monitor node state
  - Node-load based balancing – (which load?)
  - DNS load balancing – simple
  - Topology-based – reduce latency
  - Application-specific performance
  - Policy based optimization
    - Application, bandwidth, admin, security

Common Errors
- There are four common errors
  - Overflow
  - Underflow
  - Routing errors
  - Induced network errors
- May destabilize efficient network clustering

Information Dissemination
- Central vs. decentralized
- Load incurred on system
  - Processing load
  - Network load
- Partial knowledge – gossip algorithms
  - Example: finding average load
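
The average-load example can be sketched with pairwise gossip: in each round two randomly chosen nodes exchange their current values and both adopt the mean. This is one simple variant, simulated in a single process. The function name, the fixed seed, and the sample loads are assumptions for illustration.

```python
import random

def gossip_average(loads, rounds, seed=0):
    """Simulate pairwise gossip averaging over `rounds` random exchanges.

    Every node ends up holding an estimate of the global average load
    without any node ever seeing all the values.
    """
    rng = random.Random(seed)
    values = list(loads)
    n = len(values)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)              # two distinct nodes gossip
        values[i] = values[j] = (values[i] + values[j]) / 2
    return values

estimates = gossip_average([8, 0, 4, 0], rounds=200)
```

Each exchange preserves the sum of all values, so the estimates can only converge to the true average, here 3.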