GridChem Refactoring: Workflows in GridChem
Sudhakar
Acknowledgements
• Rion Dooley, TACC
• Suresh Marru, IU
• Yang Liu, NCSA
• Raman Sandhu, IU
• Nikhil Singh, NCSA
• Gaurang Mehta, ISI
NSF and TeraGrid
GridChem Refactoring: Outline
• GridChem functioning
• New needs and requirements
• Refactoring
• ParamChem
• Future usage
GridChem Functioning
• CCG Virtual Organization: Allocations, Consulting, User services
• Three-Tier Architecture: Client, Middleware, and Resources
• Job-based interactions with services
• Data archived on Storage resources
A previous talk has some details:
http://teragridforum.org/mediawiki/images/e/ed/Sudhakar.NCSA12.4.08.ppt
GridChem Usage
[Charts: CCG Jobs Using more than 1000 SUs; CCG Jobs Using 100-1000 SUs; Jobs Using less than 100 SUs]
[Chart: Quarterly Increase in Jobs, 3rd quarter 2005 to 3rd quarter 2009]
[Chart: Quarterly Increase in Usage (SUs), 3rd quarter 2005 to 3rd quarter 2009]
Science from GridChem Gateway: Diversity of User Research
NH3 on Si Surfaces
CytP450 Catalysis
Zeolite Chemistry
Phosphinoborane pericyclics
Semiquinone reactions
Si Surface IR
Disulfide cleavage by P-
Thiolate –SS interchange
V in photocatalysts
PES of diphenylbutadienes
FTIR of Heptanedione on Si
GridChem New Needs
• New projects and Collaborations
– Multiscale modeling for Material Science, Prof. Duane Johnson, UIUC: QM-QMC Coupled applications
– ParamChem, Profs. MacKerell (UMB), Roitberg (UF), Connolly (UKy), Pamidighantam (UIUC/NCSA): QM-MM-MD Coupled applications
– Advanced Users requiring Potential Energy Hypersurface Computations, Ab initio MD, and Parameter Sweeps
Workflows
GridChem Refactoring
• Code refactoring is the process of changing a computer program's source code without modifying its external functional behavior in order to improve some of the nonfunctional attributes of the software. Advantages include improved code readability and reduced complexity to improve the maintainability of the source code, as well as a more expressive internal architecture or object model to improve extensibility.
• “By continuously improving the design of code, we make it easier and easier to work with. This is in sharp contrast to what typically happens: little refactoring and a great deal of attention paid to expediently adding new features. If you get into the hygienic habit of refactoring continuously, you'll find that it is easier to extend and maintain code.” (Joshua Kerievsky, Refactoring to Patterns [1])
http://en.wikipedia.org/wiki/Code_refactoring
Client New Features
• The new client has no server dependencies.
• The client library contains the new service stubs, which need Axis2 dependencies.
• Axis2 handles fast communication with the service and streaming data.
• DTO (Data Transfer Object) classes are now implemented as the service beans that hold information sent from the service.
– They are a little less redundant.
– For example, with the new software bean, the client no longer has to parse through abstract data formats to figure out the multitude of application/HPC resource combinations.
GridChem Client Nanocad 3D
GridChem Refactoring: Preprocessing Tools
GDIS
Tubegen
JMol
MolGen
GridChem-Xbaya Composer
Server New Features
• The Axis2 implementation also makes the service available via HTTP, SOAP, and RPC.
• The database underwent a cleanup and schema changes.
This leads to better performance in the form of quicker services.
Server New Features
• Single service distribution into Tomcat 5.5.
• Uses JNDI and Tomcat’s connection pooling for much better stability. JNDI provides standardised naming for Directory Services.
• Removed GAT; the service now uses the Java CoG Kit based GridFTP client directly, for a performance increase on remote directory listings, file transfers, and I/O operations.
Server New Features
• Multiple file upload support with service-side user cache.
– Several applications and workflows require multiple files for execution and
• Users can upload input files, browse previously uploaded files, and retrieve previously uploaded files.
• Job queue prediction via QBets is available on the client side. QBets provides a way to select resources automatically.
• Resource monitoring plugs into the TeraGrid GPIR and IIS instances for accurate, effort-free resource monitoring, discovery, and access.
GPIR and IIS provide system and service information for various resources (HPC systems).
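The automatic resource selection mentioned above can be sketched as picking the resource with the shortest predicted queue wait. This is only an illustration: the resource names, the dictionary of predictions, and the function `pick_resource` are hypothetical, and the sketch does not call QBets itself.

```python
def pick_resource(predicted_wait_s, available):
    """Return the available resource with the smallest predicted queue wait.

    predicted_wait_s: hypothetical mapping of resource name -> predicted wait (seconds),
                      standing in for whatever a queue predictor would report.
    available:        names of resources the user is allowed to run on.
    """
    candidates = {r: w for r, w in predicted_wait_s.items() if r in available}
    if not candidates:
        return None  # nothing to choose from
    return min(candidates, key=candidates.get)


# Hypothetical predictions for three made-up resources.
predictions = {"cluster-a": 5400.0, "cluster-b": 600.0, "cluster-c": 120.0}
choice = pick_resource(predictions, available={"cluster-a", "cluster-b"})
```

Here `cluster-c` has the shortest wait but is not in the user's available set, so it is skipped.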
Server New Features
• Job updates are done via a RESTful trigger service. The batch scripts call out to the service with a secret service key, and the service updates the job status immediately.
This provides State and Session preservation.
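A minimal sketch of the trigger idea described above, assuming an in-memory job table and one secret key per job. The names `JOB_STATUS`, `JOB_KEYS`, `handle_trigger`, and the status values are hypothetical and are not taken from the GridChem service; a real service would sit behind an HTTP endpoint.

```python
import hmac

# Hypothetical in-memory stand-ins for the job table and the per-job secret keys.
JOB_STATUS = {"job-42": "QUEUED"}
JOB_KEYS = {"job-42": "s3cr3t-key"}
ALLOWED_STATES = {"QUEUED", "RUNNING", "FINISHED", "FAILED"}


def handle_trigger(job_id: str, key: str, new_status: str) -> int:
    """Process one callout from a batch script; return an HTTP-style status code."""
    expected = JOB_KEYS.get(job_id)
    if expected is None:
        return 404  # unknown job
    # Constant-time comparison avoids leaking key contents through timing.
    if not hmac.compare_digest(expected.encode(), key.encode()):
        return 403  # wrong key: the update is ignored
    if new_status not in ALLOWED_STATES:
        return 400  # unrecognized state
    JOB_STATUS[job_id] = new_status  # status changes as soon as the callout arrives
    return 200
```

The point of the design is that the batch script pushes each state change, so the service does not have to poll the queue to learn that a job started or finished.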
• Database access times on all queries have been improved to sub-second performance.
• The overall service memory footprint has been cut by two orders of magnitude relative to the current production version.
• Support has been added for ingesting TeraGrid users, their profile, project, and resource information, and for allowing them to use the CCG infrastructure with their current allocations.
This will provide these services across TeraGrid user communities.
Server New Features
• CGI scripts are now bundled with the service and deployed into Tomcat rather than an Apache server.
• Software access control has been implemented. The BlackList table holds a list of users who are denied access to a Software record.
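The blacklist check can be sketched as a set lookup. The table contents, the user names, the software name NAMD, and the helper functions below are hypothetical; only the idea of a table of denied (user, software) pairs comes from the slide.

```python
# Hypothetical rows of the BlackList table: (user name, software name) pairs.
BLACKLIST = {("alice", "Gaussian"), ("bob", "CHARMM")}


def can_use(user, software, blacklist=BLACKLIST):
    """A user may use a software record unless the pair is blacklisted."""
    return (user, software) not in blacklist


def visible_software(user, catalog, blacklist=BLACKLIST):
    """Filter a software catalog down to the entries the user may access."""
    return [s for s in catalog if can_use(user, s, blacklist)]


allowed_for_bob = visible_software("bob", ["Gaussian", "CHARMM", "NAMD"])
```

A deny-list keeps the default open: any user not listed retains access to every software record.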
Workflow Selection/Execution
• XBaya is a graphical client program for workflow composition, monitoring, and more.
• Different Web services can be invoked at different steps.
• Data from intermediate steps is stored in databases for future reuse.
• Each step on the workflow can be monitored.
Integration of Xbaya and GridChem Data
ParamChem Middleware Requirements
• Broad Goals
– Parameterization
– User (community) management
– Parameterization process management
– Workflows
– Data management
– Archival and Retrieval requirements
Cyberenvironments for Parameterization: Computational Reference Data Generation
Molecular Force Field Cyberenvironments: Parameter Initialization and Optimization Workflow
Parameter definitions
Workflow For Empirical Parameter Optimization
Model/Reference Data Definition
Merit Function Specification
Consistency Checker
Optimization Methods Choice
Optimization Job Launcher
Update Parameter Database with new set
Job Manager
Optimization Incomplete?
Parameter testing Model
Successful Testing
Optimization Monitor
Optimization Job Completed?
Parameter Sensitivity Analysis
Notification of End of Workflow
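The core loop of the workflow above (reference data, merit function, optimization, completion check) can be illustrated with a toy fit. Everything here is an assumption made for illustration: the sum-of-squares merit function, the naive coordinate search, and the straight-line "model" are stand-ins, not the methods used by ParamChem.

```python
def merit(params, reference, model):
    """Merit function: sum of squared deviations between model and reference data."""
    return sum((model(x, params) - y) ** 2 for x, y in reference)


def optimize(params, reference, model, step=0.5, tol=1e-8, max_iter=10_000):
    """Naive coordinate search: try +/- step on each parameter, halve step when stuck."""
    params = list(params)
    best = merit(params, reference, model)
    for _ in range(max_iter):              # plays the role of "Optimization Incomplete?"
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params.copy()
                trial[i] += delta
                score = merit(trial, reference, model)
                if score < best:           # keep the trial only if the merit improves
                    params, best, improved = trial, score, True
        if not improved:
            step /= 2
            if step < tol:                 # plays the role of "Optimization Job Completed?"
                break
    return params, best


def line(x, p):
    """Toy model with two parameters: slope p[0] and intercept p[1]."""
    return p[0] * x + p[1]


reference_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # generated from y = 2x + 1
fitted, final_merit = optimize([0.0, 0.0], reference_data, line)
```

In the workflow, the fitted set would then go through testing and sensitivity analysis before the parameter database is updated.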
Expert Interface
Parameterization Menus and Data structures
Charmm-Gaussian Workflow in Xbaya-workflow management system
ParamChem Web Services
Client Objects Database Interaction
[Diagram labels: Client, WS Resources, DTO Objects, Business Model, DAO, Hibernate, hb.xml, Database]
• DTO (Data Transfer Object): serialized for transfer through XML
• DAO (Data Access Object): how to get the DB objects
• hb.xml (Hibernate Data Map): describes object/column data mapping
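The DTO/DAO split can be sketched in a few lines. This is a stand-in, not the ParamChem code: it uses Python's `sqlite3` in place of Hibernate, the class names `JobDTO` and `JobDAO` and the `jobs` table are hypothetical, and the hard-coded column list plays the role that hb.xml plays in the real stack.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class JobDTO:
    """Plain data carrier passed between service and client."""
    job_id: int
    job_name: str
    cost: float

    def to_xml(self) -> str:
        """Serialize the fields as child elements of a <job> element."""
        root = ET.Element("job")
        for name, value in asdict(self).items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class JobDAO:
    """Hides the SQL; callers only ever see JobDTO objects."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(job_id INTEGER PRIMARY KEY, job_name TEXT, cost REAL)"
        )

    def save(self, dto):
        self.conn.execute(
            "INSERT INTO jobs VALUES (?, ?, ?)", (dto.job_id, dto.job_name, dto.cost)
        )

    def find(self, job_id):
        row = self.conn.execute(
            "SELECT job_id, job_name, cost FROM jobs WHERE job_id = ?", (job_id,)
        ).fetchone()
        return JobDTO(*row) if row else None


conn = sqlite3.connect(":memory:")   # throwaway in-memory database
dao = JobDAO(conn)
dao.save(JobDTO(1, "water_opt", 12.5))
xml_text = dao.find(1).to_xml()
```

Keeping the DTO free of database code means the same object can be sent to the client as XML without exposing how it is stored.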
ParamChem Data Models
Entities: Users, ParamProjects, Resources, UserProjectResource, Workflows, WF Node/Job
Resource types: SoftwareResources, ComputeResources, NetworkResources, StorageResources, DataResources
Resources: resourceID, Type, hostName, IPAddress, siteID
UserProjectResource: userID, paramprojectID, resourceID, loginName, SUsLocalUserUsed
Workflows: WFID, WFName, userID, projID, RegWFID, cost
WF Node/Job: JobID, JobName, userID, projID, softID, cost
ParamChem Resource following CCG Class Dependencies
ParamChem Middleware Services (PMS) Use Cases
• Authentication
• Workflow Selection/Creation
• Workflow Configuration
• Workflow Submission
• Workflow Resource Monitoring
• Workflow Monitoring
• Data Resource Monitoring/Organization
• Workflow Results Retrieval/Organization
• …
PMS Authentication (follows GridChem Middleware Services)
• WSDL (Web Services Description Language) is a language for describing how to interface with XML-based services. It describes network services as a set of endpoints operating on messages with either document-oriented or procedure-oriented information.
• The service interface is called the port type.
• WSDL FILE:
<?xml version="1.0" encoding="UTF-8"?>
<definitions name="MathService"
targetNamespace="http://www.globus.org/namespaces/examples/core/MathService_instance" xmlns="http://schemas.xmlsoap.org/wsdl/" …
http://www.gridchem.org:8668/space/GMS/usecase
[Diagram: message exchange between ParamChem Client and PMS]
Retrieve UserProjects (GetResourceProperty Port Type [PT])
Contact PMS
Creates Session, Session RP and EPR, Ind.User.Comm
Sends EPR (like a cookie, but more than that)
Login Request (username:passwd)
Validates; loads UserProjects, Data, WFRegistries
Sends acknowledgement
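The exchange above (login, session handle returned to the client, later requests made against that handle) can be sketched as follows. This is a simplified stand-in: a random token plays the role of the EPR, and the user store, salt, project names, and function names are all hypothetical. It does not use WSRF.

```python
import hashlib
import hmac
import secrets


def _hash(password, salt):
    """Derive a password hash with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user store and project list; a real service would read these from its DB.
_SALT = b"demo-salt"
USERS = {"jdoe": (_SALT, _hash("hunter2", _SALT))}
USER_PROJECTS = {"jdoe": ["proj-A", "proj-B"]}
SESSIONS = {}  # session handle -> username


def login(username, password):
    """Validate credentials and return a session handle, or None on failure."""
    record = USERS.get(username)
    if record is None:
        return None
    salt, stored = record
    if not hmac.compare_digest(stored, _hash(password, salt)):
        return None
    handle = secrets.token_hex(16)   # opaque handle the client sends back later
    SESSIONS[handle] = username
    return handle


def get_user_projects(handle):
    """Return the projects tied to a session; reject unknown handles."""
    username = SESSIONS.get(handle)
    if username is None:
        raise PermissionError("unknown or expired session")
    return USER_PROJECTS[username]
```

As with the EPR in the slide, the client never resends the password; every later request is tied to the server-side session state through the handle.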
PMS Authentication follows GMS_WS
http://www.gridchem.org:8668/space/GMS/usecase
Selects project
LoadVO port type (with MAC address)
Verifies user/project/MAC address
Load UserResources RP
Retrieve UserResources [as userVO/Profile] (GetResourceProperty port type [PT])
ParamChem Client PMS
Validates; loads UserProjects, Data, WFs
Sends acknowledgement
PMS Workflow Submission
Create WF object
PredictWFStartTime PT + WF DTO ----> Node/Job DTOs
Node/Job Start Prediction RP
PT = portType; RP = Resource Properties; DTO = Data Transfer Object
Completion: Email from batch system to PMS server; cron@PMS DB
Submission: Xbaya GFAC, CoGKit, GAT, “gsi-ssh”
If decision OK, Submit Workflow PT + WFDTO and JobDTOs
Create WF object
API: Submit
Store WF Object
Send Acknowledgement
Need to check that allocation time is available and that the workflow is sane and executable.
ParamChem Client PMS
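The pre-submission check noted above can be sketched as two tests: the job dependencies must form an executable order, and the estimated cost must fit the allocation balance. Reading "sane" as "no undefined jobs and no dependency cycles" is an assumption, as are the function name `check_workflow` and the job names and costs in the example.

```python
from graphlib import CycleError, TopologicalSorter


def check_workflow(deps, job_cost, balance):
    """Check a workflow before submission.

    deps:     mapping of job name -> set of jobs it must wait for
    job_cost: mapping of job name -> estimated cost in SUs (hypothetical estimates)
    balance:  remaining allocation in SUs
    Returns (ok, execution_order, problems).
    """
    problems = []
    unknown = {d for ds in deps.values() for d in ds} - set(deps)
    if unknown:
        problems.append(f"undefined jobs: {sorted(unknown)}")
    try:
        # static_order() yields each job after all of its prerequisites.
        order = list(TopologicalSorter(deps).static_order())
    except CycleError:
        problems.append("dependency cycle")
        order = []
    total = sum(job_cost.get(j, 0.0) for j in deps)
    if total > balance:
        problems.append(f"needs {total} SUs, only {balance} available")
    return (not problems), order, problems


ok, order, problems = check_workflow(
    {"gaussian": set(), "charmm": {"gaussian"}},
    {"gaussian": 50.0, "charmm": 20.0},
    balance=100.0,
)
```

Running the check on the service side before storing the workflow object means a rejected workflow never consumes queue time.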
ParamChem Middleware Services Monitoring
Parse XML, Display
PT = portType; RP = Resource Properties; DTO = Data Transfer Object; DB = Data Base
cron@PMS server; Xbaya Monitoring Server; cron@HPC Servers; Job Launcher Notifications; VO Admin email; parses email; DB (status + cost)
Request for Job, Resource Status, Alloc. Balance
UserResource RP Updated from DB
ParamChem Client PMS Resources/Kits/DB
Send info
Discover Applications (Software Resources)
Discover Data (Data Resources)
Monitor Workflow Schedulers (Primary)
Monitor System (Secondary)
Monitor Queues (Tertiary)
Workflow Status Updates (automated); Node/Job Status; Display
PMS Workflow Status
Workflow Status; WFDTO.status; WF XBaya Launcher
Status Update
Estimate Start time
Scheduler emails/ notifications
Notifications: Client, Management, email, IM
ParamChem Client PMS Resources/Kits/DB
PMS DATA Organization Retrieval (MSS)
GetResourceProperty PT; FileDTO(?); LoadFile PT (project folder + job)
Validates project folder owned by user. Send new listing
PT = portType; RP = Resource Properties; DTO = Data Transfer Object; MSS = Mass Storage System
Job Completion, Workflow Completion: Send Output to MSS
LoadFile PT; MSS query; UserFiles RP + FileDTO object
Retrieve Root Dir. Listing on MSS with CoGKit or GAT or “gsi-ssh”
API file request; Store locally; Create FileDTO; Load into UserData RP
RetrieveFiles PT (+ file rel. path)
Retrieve file: CoGKit or GAT or “gsi-ssh”
GetResourceProperty PT
ParamChem Client PMS Resources/Kits/DB
ParamChem File Retrieval
Create FileDTO (?); Load into UserData RP
RetrieveJobOutput PT (+ JobDTO)
Job Record from DB. Running: from Resource. Complete: from MSS
Retrieve file: CoGKit or GAT or “gsiftp”
GetResourceProperty PT
ParamChem Client
PMS Resources/Kits/DB
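The retrieval rule in the diagram (output of a running job comes from the compute resource, output of a completed job comes from the MSS) can be sketched as a small dispatch on job state. The job record layout, the field names, and the host names below are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    RUNNING = "running"
    COMPLETE = "complete"


def output_location(job):
    """Decide where to fetch a job's output, based on the job record's state.

    job is a hypothetical record (dict) as it might be read from the DB.
    """
    state = JobState(job["state"])   # raises ValueError for an unknown state
    if state is JobState.RUNNING:
        # Still running: files live in the working directory on the compute resource.
        return ("resource", job["host"], job["work_dir"])
    # Complete: output has already been sent to mass storage.
    return ("mss", job["mss_host"], job["archive_dir"])


record = {
    "state": "complete",
    "host": "hpc.example.org",
    "work_dir": "/scratch/jdoe/job-42",
    "mss_host": "mss.example.org",
    "archive_dir": "/archive/jdoe/proj-A/job-42",
}
source = output_location(record)
```

The returned tuple would then be handed to whichever transfer client is in use (CoGKit, GAT, or “gsiftp” in the diagram).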
ParamChem Web Services: WSRF (Web Services Resource Framework) Compliant
WSRF Specifications:
• WS-ResourceProperties (WSRF-RP)
• WS-ResourceLifetime (WSRF-RL)
• WS-ServiceGroup (WSRF-SG)
• WS-BaseFaults (WSRF-BF)
% ps -aux | grep ws
/usr/java/jdk1.5.0_05/bin/java \
  -Dlog4j.configuration=container-log4j.properties \
  -DGLOBUS_LOCATION=/usr/local/globus \
  -Djava.endorsed.dirs=/usr/local/globus/endorsed \
  -DGLOBUS_HOSTNAME=derrick.tacc.utexas.edu \
  -DGLOBUS_TCP_PORT_RANGE=62500,64500 \
  -Djava.security.egd=/dev/urandom \
  -classpath /usr/local/globus/lib/bootstrap.jar:/usr/local/globus/lib/cog-url.jar:/usr/local/globus/lib/axis-url.jar \
  org.globus.bootstrap.Bootstrap org.globus.wsrf.container.ServiceContainer -nosec
Callouts:
• Logging Configuration (-Dlog4j.configuration)
• Where to find Globus (-DGLOBUS_LOCATION)
• Where to get random seed for encryption key generation (-Djava.security.egd)
• Classpath (required jars)
GMS_WS packages: model, dto, credential, job, notification, file, file.task, job.task, user, exceptions, resource, persistence, synch, query, test, util, dao, gpir, crypt, enumerators, gat, proxy, client, audit, pms
Package descriptions (diagram callouts):
• Classes for WSRF service implementation (PT)
• Command-line tests to mimic client requests
• Data Access Obj: queries DB via persistent classes (Hibernate)
• Data Transfer Obj: (Job, File, Hardware, Software, User) XML
• How to handle errors (exceptions)
• CCG Service business model (how to interact)
• Contains user’s credentials for job submission, file browsing, …
• “Oversees correct” handling of user data (get/put file)
• Define Job, util, and enumerations (SubmitTask, KillTask, …)
• CCGResource and Util, synched by GPIR; abstract classes NetworkRes., ComputeRes., SoftwareRes., StorageRes., VisualizationRes.
• User (has attributes: Preference/Address)
• DB operations (CRUD), OR maps, pool management, DB session
• Classes that communicate with other web services
• Periodically update DB with GPIR info (GPIR calls)
• JUnit service test (gms.properties): authentication, VO retrieval, Res. Query, Synch, Job Mgmt, File Mgmt, Notification
• Contains utility and singleton classes for the service
• Encryption of login password
• Mapping from GMS_WS enumeration classes to DB
• GAT util classes: GATContext and GAT Preferences generation
• Classes that deal with CoGKit configuration
• Autonomous notification via email, IM, text message