IT-SDC : Support for Distributed Computing
The Data Bridge
Laurence Field, IT/SDC
6 March 2015
BOINC and Virtualization

Test4Theory Model
- Avoid restarting the VM for every job
  - Reduces CVMFS-related network traffic
- Co-Pilot support challenges
  - Dependencies not available in the standard repositories
- Reduce the operational cost
  - New standardized components available for some functions
- Separation of VM management and job management
  - In line with the cloud model
  - Can reuse cloud-related tooling
[Diagram labels: Volunteer, VM, Agent, Job Wrapper, BOINC Server, Co-Pilot, Job Agent, Storage Agent, Job Description, Data I/O]

The Challenge
[Diagram labels: Workload Manager, VO (X509/VOMS), Infrastructure, Grid, Volunteer, BOINC Auth, VM, Job Wrapper]

Authentication
- How to authenticate BOINC users?
  - In the VM, the credential is provided via /dev/fd0
    - BOINC_ID
    - BOINC_AUTHENTICATOR
- The BOINC project DB is the user Identity Provider (IDP)
  - MySQL user table
- mod_auth_mysql
  - Maps username/password to a DB table
    - AuthMysqlUserTable user
    - AuthMySQLNameField id
    - AuthMySQLPasswordField authenticator
- Enables reuse of Apache-based HTTP technology
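
A minimal sketch of how a client inside the VM might turn the two credential values into a request header that mod_auth_mysql could check. It assumes the credential arrives as KEY=VALUE lines and that the server uses HTTP Basic authentication; neither detail is stated on the slide, and the function names are hypothetical.

```python
import base64


def parse_credentials(text):
    """Parse KEY=VALUE lines (an assumed format) into a dict."""
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        creds[key.strip()] = value.strip().strip('"')
    return creds


def basic_auth_header(creds):
    """Build a Basic auth header: BOINC_ID as user, BOINC_AUTHENTICATOR as password."""
    pair = f"{creds['BOINC_ID']}:{creds['BOINC_AUTHENTICATOR']}"
    token = base64.b64encode(pair.encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The mapping mirrors the directives above: the username is matched against the `id` column and the password against the `authenticator` column of the `user` table.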

The Architecture
[Diagram labels: Workload Manager, Messaging Service, Data Bridge, VO (X509/VOMS) Infrastructure, Volunteer BOINC Auth, VM, Volunteer, Agent, Job Wrapper, Job Description, FTS, Grid, Pull, Push, Plugin, Data I/O]

The Data Bridge
- Spans authentication domains
  - BOINC user’s credential
  - Grid X509 credentials
- Scalable data I/O
  - With sandboxing capabilities
    - Data isolation
- Simple Apache-based prototype
  - Supports HTTP PUT/GET
  - mod_auth_mysql to validate the BOINC user’s credential
  - mod_auth_ssl to validate the WMS X509 credential
- HTTP federation
  - Possibility to reuse standard DM tools
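
The PUT/GET interface can be illustrated with a short sketch that only builds the requests. The endpoint URL and the bucket/object path layout are assumptions made for illustration; the prototype's real URL scheme is not given here.

```python
import urllib.parse
import urllib.request

BRIDGE_URL = "https://databridge.example.org"  # hypothetical endpoint


def object_url(bucket, name, base=BRIDGE_URL):
    """Return the URL of an object, assuming a /bucket/name layout."""
    return f"{base}/{urllib.parse.quote(bucket)}/{urllib.parse.quote(name)}"


def put_request(bucket, name, payload, auth_header):
    """Build an authenticated HTTP PUT that uploads payload (bytes)."""
    return urllib.request.Request(
        object_url(bucket, name),
        data=payload,
        headers=dict(auth_header),
        method="PUT",
    )


def get_request(bucket, name, auth_header):
    """Build an authenticated HTTP GET that downloads an object."""
    return urllib.request.Request(
        object_url(bucket, name),
        headers=dict(auth_header),
        method="GET",
    )
```

Passing either request to `urllib.request.urlopen` would send it; the same two helpers serve both the volunteer side and the WMS side, with only the credential differing.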

Dynamic HTTP Federations
- Dynafed implements federated storage over HTTP
  - In testing in LHCb and Canada (ATLAS)
  - Federates WebDAV or S3 enabled storage systems
  - Apache front end
- Can be used as a data bridge
  - S3 storage backend(s)
  - Acts as a security gateway between X509 or BOINC Auth
- Clients are then redirected directly to the storage
  - Great scalability potential
- Global system, smart replica selection (availability, proximity)
http://svnweb.cern.ch/trac/lcgdm/wiki/Dynafeds
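
To make "replica selection (availability, proximity)" concrete, here is a toy sketch: drop unavailable replicas, then prefer the nearest one. This is not Dynafed's actual algorithm; the `Replica` type, its fields, and the distance metric are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    url: str            # where the client would be redirected
    available: bool     # result of some health check (assumed)
    distance_km: float  # proximity to the client (assumed metric)


def pick_replica(replicas):
    """Return the nearest available replica, or None if none is available."""
    candidates = [r for r in replicas if r.available]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.distance_km)
```

A front end following this idea would answer the client with an HTTP redirect to the chosen replica's URL, so the data transfer itself bypasses the gateway.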

The Data Bridge
[Diagram labels: Apache (ssl, mysql), DynaFed, FTS, S3, WMS, BOINC User, Grid, PUT/GET, HTTP redirect & sign]

Message Queue
- Messaging service does not support BOINC authentication
  - Not clear if it is possible or worthwhile to provide this functionality
- Standard Apache web server approach
  - mod_auth_mysql to validate the BOINC user’s credential
  - mod_auth_ssl to validate the WMS X509 credential
- Two simple CGI scripts
  - put-job.cgi
  - get-job.cgi
- Simple file-based queue
  - python-dirq
- Job descriptions from the WM
  - Supports arbitrary file types
    - Garbage in, garbage out
  - Extensible
[Diagram labels: Web Server, dirq, put, get]
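
The idea of a file-based queue behind the put and get scripts can be sketched with the standard library alone. This is a simplified stand-in, not python-dirq's API or on-disk layout; the class name and directory names are hypothetical. It relies on atomic rename so that each job description is handed to one consumer only.

```python
import itertools
import os
import tempfile
import time
import uuid


class DirQueue:
    """Toy directory queue: one file per message, claimed by atomic rename."""

    def __init__(self, path=None):
        # Without a path, use a fresh temporary directory (handy for demos).
        self.path = path or tempfile.mkdtemp(prefix="jobqueue-")
        self.tmp = os.path.join(self.path, "tmp")
        self.new = os.path.join(self.path, "new")
        os.makedirs(self.tmp, exist_ok=True)
        os.makedirs(self.new, exist_ok=True)
        self._counter = itertools.count()

    def put(self, payload):
        """Store payload (bytes of any file type) and return its name."""
        name = f"{time.time_ns():020d}-{next(self._counter):06d}-{uuid.uuid4().hex}"
        tmp_path = os.path.join(self.tmp, name)
        with open(tmp_path, "wb") as fh:
            fh.write(payload)
        # Publish only complete files: rename is atomic within one filesystem.
        os.rename(tmp_path, os.path.join(self.new, name))
        return name

    def get(self):
        """Claim and return one payload (roughly oldest first), or None if empty."""
        for name in sorted(os.listdir(self.new)):
            claimed = os.path.join(self.tmp, name)
            try:
                os.rename(os.path.join(self.new, name), claimed)
            except FileNotFoundError:
                continue  # another consumer claimed this one first
            with open(claimed, "rb") as fh:
                payload = fh.read()
            os.remove(claimed)
            return payload
        return None
```

In this picture, a put script would call `put()` with the request body and a get script would return the result of `get()`, with Apache handling authentication in front of both.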

Implementation
[Diagram labels: Workload Manager, Message Queue, Data Bridge, VO (X509/VOMS) Infrastructure, Volunteer BOINC Auth, VM, Volunteer, Agent, Job Wrapper, GET, PUT, Plugin, FTS, S3, Grid]

Adoption
- Building upon vLHC@home and the data bridge as a platform
- Require:
  - WM to POST the job description to the data bridge message queue
  - Stage input data to the data bridge’s input bucket (if needed)
  - CernVM3 image
    - Contextualized, including CVMFS configuration
    - Credentials read from /dev/fd0
      - BOINC_ID and BOINC_AUTHENTICATOR
  - Job Agent
    - GET the job description
    - GET the input from the data bridge
    - Run the job
    - PUT the output on the data bridge
  - Read data from the data bridge’s output bucket
    - Similar to HLT
      - Filter bad data etc.
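
The job agent's four steps can be sketched as a single cycle with the transport injected as callables, so the logic stays independent of how GET and PUT are performed. The JSON job description and its `id` and `input` fields are assumptions for illustration; the queue itself accepts arbitrary file types.

```python
import json


def run_one_job(get_job, get_input, run, put_output):
    """Run one agent cycle; return the job id, or None if no job is queued.

    get_job()            -> job description text, or None (GET from the queue)
    get_input(name)      -> input bytes (GET from the input bucket)
    run(job, data)       -> output bytes (the payload execution)
    put_output(name, b)  -> None (PUT to the output bucket)
    """
    raw = get_job()
    if raw is None:
        return None
    job = json.loads(raw)              # assumed JSON job description
    data = get_input(job["input"])     # hypothetical "input" field
    result = run(job, data)
    put_output(job["id"] + ".out", result)  # hypothetical naming scheme
    return job["id"]
```

An agent would call this in a loop and sleep when it returns None; on the other side, the workload manager reads the results back from the output bucket.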

Rollout Plans
- CMS@Home
  - Pioneered the adoption of the data bridge
  - Will hopefully enter the beta testing phase soon
- Test4Theory
  - Plan to migrate to the data bridge within the next 2 months
    - To address Co-Pilot support issues
- Beauty@Home
  - Currently integrating the data bridge
    - Will solve their X509 credential distribution issue
    - Open the project up to the public

Summary
- The Data Bridge spans auth domains
  - Grid and volunteer computing
- Reuses HTTP federation component for S3
  - Added BOINC authentication
- A simple message delivery function
  - For the job description
- Just provide an image along with a job agent
  - And interact with the data bridge
- Towards a platform for volunteer computing