Runtime Environment DB
• Problem:
  – Many Grid jobs assume that the binaries and libraries are already available
  – Even if this is the case, where are those files placed?
• Solution
  – Build a database for maintaining runtime environments
  – Define rules for environment settings
  – Allow automatic testing for correctness of those settings
Runtime Environment DB
• Example:
  – POV-Ray
  – Define POV_RAY_EXE_PATH
    • /usr/local/bin/pov34/bin
  – Define POV_RAY_LIB_PATH
    • /usr/local/bin/pov34/include
  – Test correctness
    • which povray34 = /usr/local/bin/povray34/bin/povray34?
    • find $POV_RAY_LIB_PATH colors.inc = lib/colors.inc
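The idea of a database entry that carries both its settings and its own correctness test can be sketched in Python. This is only an illustration under assumptions: the class name `RuntimeEnvironment`, its fields, and the rule format (variable name plus a file that must exist in that directory) are hypothetical and not part of MiG.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class RuntimeEnvironment:
    """One database entry: environment variables plus files that must exist."""
    name: str
    variables: dict                                      # variable -> directory
    required_files: list = field(default_factory=list)   # (variable, file name)

    def verify(self):
        """Return a list of problems; an empty list means the settings pass."""
        problems = []
        for variable, filename in self.required_files:
            directory = self.variables.get(variable)
            if directory is None:
                problems.append(f"{variable} is not defined")
            elif not os.path.isfile(os.path.join(directory, filename)):
                problems.append(f"{filename} not found in {variable}={directory}")
        return problems


# Usage on a throwaway directory tree, so the check runs on any machine.
with tempfile.TemporaryDirectory() as root:
    bin_dir = os.path.join(root, "bin")
    lib_dir = os.path.join(root, "include")
    os.makedirs(bin_dir)
    os.makedirs(lib_dir)
    open(os.path.join(bin_dir, "povray34"), "w").close()

    povray = RuntimeEnvironment(
        name="POV-Ray",
        variables={"POV_RAY_EXE_PATH": bin_dir, "POV_RAY_LIB_PATH": lib_dir},
        required_files=[("POV_RAY_EXE_PATH", "povray34"),
                        ("POV_RAY_LIB_PATH", "colors.inc")],
    )
    demo_problems = povray.verify()   # colors.inc was never created
```

Here the executable is found but `colors.inc` is missing, so `verify()` reports exactly one problem. A real resource would point the variables at its actual install directories instead of a temporary tree.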
Remote File Access Proxy
• Problem
  – Not all systems have HTTPS access from the nodes
• Solution
  – Assuming that the nodes and the front-end can communicate, place a proxy server on the front-end
Remote File Access Proxy
• Issues
  – Security token handling
  – Performance
• Potential
  – Caching
  – Prefetching without CPU interference
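The caching and prefetching potential can be sketched as a small wrapper on the front-end. This is a minimal sketch, not a proxy server: the class `CachingFetcher` and its `fetch` callback are hypothetical names, the real HTTPS transfer is injected by the caller, and security token handling is left out entirely.

```python
from typing import Callable, Dict, Iterable


class CachingFetcher:
    """Front-end side cache: fetch each URL once, then serve it from memory."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch                  # performs the real remote transfer
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        """Return the file for a node, contacting the remote side only on a miss."""
        if url in self._cache:
            self.hits += 1
            return self._cache[url]
        self.misses += 1
        data = self._fetch(url)
        self._cache[url] = data
        return data

    def prefetch(self, urls: Iterable[str]) -> None:
        """Warm the cache on the front-end before the nodes ask for the files."""
        for url in urls:
            self.get(url)
```

Because `prefetch` runs on the front-end, the compute nodes spend no CPU time on the transfer; they only see a cache hit when they later call `get`.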
Resource specification detection
• Problem
  – A resource is defined by a large set of parameters:
    • Architecture, memory, disk space, …
    • Access rights, user-id, node-access, queue-system
    • Runtime environments
• Solutions
  – Have the sysadmin add all the information automatically
  – Run a program that identifies as many components as possible
Resource specification detection
• Examples
  – OS = `uname`
  – if OS == 'Linux'
    • cat /proc/cpuinfo | grep CPU | awk '{print $4}'
    • tempdrive = `mount | grep /tmp`
    • if tempdrive == '' tempdrive = '/'
    • space = `df $tempdrive`
    • gcc_ver = `gcc -v`
    • if gcc_ver != '' gcc_env = gcc_ver
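The detection steps above can also be written as a small portable program. The following is a sketch using only the Python standard library; the function name `detect_resource` and the keys of the returned dictionary are hypothetical, and it covers only a few of the parameters listed earlier.

```python
import os
import platform
import shutil
import subprocess


def detect_resource(temp_dir="/tmp"):
    """Collect a few resource parameters; unknown values are reported as None."""
    spec = {
        "os": platform.system(),             # same information as `uname`
        "architecture": platform.machine(),
        "cpu_count": os.cpu_count(),
    }

    # Free space on the temporary drive; fall back to the root directory.
    if not os.path.isdir(temp_dir):
        temp_dir = os.path.abspath(os.sep)
    spec["temp_dir"] = temp_dir
    spec["temp_free_bytes"] = shutil.disk_usage(temp_dir).free

    # Report a gcc runtime environment only if gcc is actually installed.
    spec["gcc_version"] = None
    gcc_path = shutil.which("gcc")
    if gcc_path is not None:
        result = subprocess.run([gcc_path, "--version"],
                                capture_output=True, text=True)
        lines = result.stdout.splitlines()
        if result.returncode == 0 and lines:
            spec["gcc_version"] = lines[0]
    return spec
```

Calling `detect_resource()` returns a dictionary that could be stored as the resource specification; entries the program cannot determine stay `None` so that a sysadmin can fill them in.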
Monitor
• We would like to make nice presentations of the state of MiG
  – # users
  – # jobs
  – # resources
    • ID of resources that are not anonymous
  – Estimated time to start execution
• All sorted, filtered and presented as the user requests
Accounting
• We need to do reliable accounting
• When a job is submitted to a queue the server must ask a bank to deposit credits corresponding to the maximum use
• After execution the server must ask to be given the credits corresponding to the resources that were actually used
Accounting
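[Diagram: message exchange between Server and Bank]
• Job: 10h, 1GB mem
• Server to Bank: Reserve (10,1), answered by Confirm
• Run Job
• Actual use (<= (10,1))
• Server to Bank: Debit (x,y), answered by Confirm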
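The reserve-then-debit exchange can be sketched as a toy bank. This is an illustration under assumptions: the class `Bank`, its method names, and the pricing rule of one credit per GB-hour are hypothetical, and the check compares total price rather than each component of (hours, memory) separately.

```python
class Bank:
    """Toy bank: credits are reserved up front and debited after the job ran."""

    def __init__(self, balance):
        self.balance = balance        # credits the user can still spend
        self.reserved = {}            # job id -> credits held for that job

    @staticmethod
    def price(hours, gigabytes):
        # Assumed pricing rule, not from the slides: one credit per GB-hour.
        return hours * gigabytes

    def reserve(self, job_id, hours, gigabytes):
        """Hold the credits for the maximum use when the job is queued."""
        cost = self.price(hours, gigabytes)
        if cost > self.balance:
            raise ValueError("insufficient credits")
        self.balance -= cost
        self.reserved[job_id] = cost

    def debit(self, job_id, hours, gigabytes):
        """Charge the actual use and release the rest of the reservation."""
        held = self.reserved.pop(job_id)
        used = self.price(hours, gigabytes)
        if used > held:
            self.reserved[job_id] = held      # keep the hold, refuse the debit
            raise ValueError("actual use exceeds the reservation")
        self.balance += held - used
        return used


bank = Bank(balance=100)
bank.reserve("job-1", hours=10, gigabytes=1)           # balance is now 90
charged = bank.debit("job-1", hours=6, gigabytes=1)    # charged 6, balance 94
```

The sketch keeps everything in memory, so it says nothing about what happens when one of the parties crashes between the two messages.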
Accounting
• Secure
• Reliable
• What happens if
  – The job crashes?
  – The server crashes?
  – The bank crashes?
Grid Units
• We need to be able to define the performance of a system
  – Processing speed
  – IO performance
  – Networking performance
• Units:
  – Generic single CPU
    • Balanced CPU speed and IO
  – Generic MPP
    • Balance of all 3
  – Each of the 3 individually
Grid Units
• The definition of a system should be determined automatically by a program
• A user should be able to run their application and get an idea of the Grid units it uses
  – time a.out
    • Tells us disk need and CPU need
    • Determining network dependency is harder!!!
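A measurement in the spirit of `time a.out` can be sketched with the standard library. This is a rough sketch for Unix-like systems only (the `resource` module is not available on Windows); the function name `measure` and the returned keys are hypothetical, and how these numbers would map to Grid units is left open.

```python
import resource
import subprocess
import sys
import time


def measure(command):
    """Run a command and report wall time, CPU time and block I/O counts."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True)       # waits for the child to finish
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall_seconds": wall,
        "cpu_seconds": (after.ru_utime - before.ru_utime)
                       + (after.ru_stime - before.ru_stime),
        "blocks_read": after.ru_inblock - before.ru_inblock,
        "blocks_written": after.ru_oublock - before.ru_oublock,
    }


# Usage: measure a short CPU-bound child process.
usage = measure([sys.executable, "-c", "sum(range(200000))"])
```

Wall time well above CPU time suggests the application waits on disk or network, but as the slide notes, separating out the network share needs more than this.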
Dalton
• Very important application in chemistry
• Fairly small input files
• Fairly small output files
• Huge runtime
• Local expertise
• Very well suited for a Web-portal!!!
POV-Ray
• Popular
• Simple
• Can be parallelized using Grid
• Fairly small input
• Medium to small output
• Very well suited for a Web-portal!!!
BLAST
• Very important
  – Right now these guys eat a lot of the time on Horseshoe
• National expertise
• Large input files
• Small output-files
• Should be scriptable
• But portals are also interesting
Shared data-structures for Grid
• There are many scenarios where Grid jobs could communicate through shared data-structures
• Examples
  – Single variables
  – Bounded buffers
  – Arrays
  – Objects
• All access must be secure!!!
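As an illustration of one of these structures, a bounded buffer can be sketched with a condition variable. This sketch shares the buffer between threads in one process; on a Grid the same put/get interface would have to sit behind an authenticated service, which is not shown. The class name `BoundedBuffer` is hypothetical.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity FIFO: put blocks when full, get blocks when empty."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            while len(self._items) >= self._capacity:
                self._cond.wait()             # wait until a consumer makes room
            self._items.append(item)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()             # wait until a producer adds data
            item = self._items.popleft()
            self._cond.notify_all()
            return item


# Usage: one producer job and one consumer job exchanging ten values.
buffer = BoundedBuffer(capacity=2)
producer = threading.Thread(
    target=lambda: [buffer.put(i) for i in range(10)], daemon=True)
producer.start()
received = [buffer.get() for _ in range(10)]
producer.join(timeout=5)
```

The capacity of 2 forces the producer to wait for the consumer, which is the behaviour that distinguishes a bounded buffer from a plain queue.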
Interfacing with other Grid Implementations
• It is interesting for MiG to accept other Grids as
  – Users
  – Resources
• Examples:
  – NorduGrid
  – gLite
  – Gridbus
  – Unicore
  – OfficeGrid
Supporting more Queuing systems
• Different resources use different queuing systems
• Examples
  – PBS/Torque
  – LSF
  – LoadLeveler
  – OfficeGRID
Programmers API
• It is interesting for programmers to be able to Grid-enable their applications directly
  – Access Grid files
  – Submit jobs
  – Retrieve results
• For this a library with these features must be designed and implemented
Statistics
• Just like monitoring, it is interesting to obtain statistics on Grid
• Examples
  – Usage
  – # users
  – Turn-over time
  – Activation time
  – # resources
  – etc…