SMBL and Blast

Joe Rinkovsky

Unix Systems Support Group, Indiana University

Introduction

IU has around 2000 Windows PCs in public Student Technology Centers

Condor is used to harvest unused cycles

Simple Message Brokering Library (SMBL) is used for parallelizing applications on Windows

Web portal for user interaction

Project History

SETI@home was used as an initial test of Condor

SMBL was created to address the lack of a general-purpose parallel library on Windows that could tolerate sporadically available systems

FastDNAml was ported to SMBL

Web portal created

Other apps ported to SMBL (MEME, BLAST)

System Architecture

Condor “server” running on Linux

BLAST databases served via Samba on a second Linux machine

Apache/MySQL/PHP web portal

Windows “clients”

What is SMBL?

Simple Message Brokering Library

Open source (http://smbl.sf.net)

Uses master/worker model

Process and Port Manager (PPM) manages SMBL servers and master processes

Number of master/foreman processes is different for each application

SMBL workers contact the SMBL master to get work

SMBL server terminates workers when they are no longer needed

Condor and SMBL

Condor is used as the scheduling and delivery system for SMBL workers

SMBL workers contact the SMBL server when they start running to begin receiving work

SMBL server separates the work into smaller pieces depending on the number of workers

Work is redistributed if a worker is “lost”

SMBL server terminates workers when there is no work left
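The split-and-redistribute behaviour described on this slide can be sketched roughly as follows. This is a minimal in-process illustration, not SMBL's actual API: the names `split_work`, `Master`, `request_work`, `report_done`, and `worker_lost` are hypothetical, and real SMBL workers talk to the server over the network rather than through method calls.

```python
from collections import deque


def split_work(items, n_workers):
    """Split items into at most n_workers contiguous chunks of near-equal size."""
    if not items:
        return []
    n = max(1, min(n_workers, len(items)))
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


class Master:
    """Hypothetical stand-in for the SMBL server's bookkeeping."""

    def __init__(self, items, n_workers):
        self.pending = deque(split_work(items, n_workers))
        self.assigned = {}  # worker id -> chunk currently in progress
        self.results = []

    def request_work(self, worker_id):
        """Hand out the next chunk, or None when nothing is queued.

        A real server would keep idle workers waiting while chunks are
        still in flight; this sketch simply returns None.
        """
        if not self.pending:
            return None
        chunk = self.pending.popleft()
        self.assigned[worker_id] = chunk
        return chunk

    def report_done(self, worker_id, result):
        """Record a finished chunk."""
        self.assigned.pop(worker_id)
        self.results.append(result)

    def worker_lost(self, worker_id):
        """Put a lost worker's chunk back in the queue for redistribution."""
        chunk = self.assigned.pop(worker_id, None)
        if chunk is not None:
            self.pending.append(chunk)

    def finished(self):
        """True once no chunk is queued or in progress."""
        return not self.pending and not self.assigned
```

In this sketch a chunk is only considered complete after `report_done`, so a pool machine that is reclaimed mid-job costs one chunk of repeated work rather than the whole run.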

Applications using SMBL

FastDNAml – Generates phylogenetic trees from molecular data

MEME – Detects patterns in nucleotide and protein sequences

NCBI BLAST (blastall) – Query molecular sequences against sequence databases

The Challenges of porting BLAST to SMBL

BLAST relies on the availability of large database files

Files are too large for efficient delivery via Condor

Local copies of databases on pool machines would be difficult to manage

Sharing DB files via Samba is the best solution

Samba was moved to a separate server to increase performance
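As a rough illustration of reading the databases from a share instead of shipping them with each job, a worker could build its `blastall` command line against a UNC path. The share name `\\dbserver\blast` and the helper `blastall_command` are assumptions made for this sketch; the slides do not name the actual server or share. The `-p`, `-d`, `-i`, and `-o` options are the standard legacy `blastall` options for program, database, query file, and output file.

```python
def blastall_command(program, db_name, query_file, out_file,
                     share=r"\\dbserver\blast"):
    """Build a blastall argument list that reads the database from a share.

    `share` is a hypothetical UNC path to the Samba export; only the small
    query and result files would need to travel with the job.
    """
    db_path = share.rstrip("\\") + "\\" + db_name
    return [
        "blastall",
        "-p", program,     # BLAST program, e.g. blastn or blastp
        "-d", db_path,     # database, read directly from the share
        "-i", query_file,  # query sequences delivered with the job
        "-o", out_file,    # results file returned to the submitter
    ]
```

The returned list could be passed to `subprocess.run` on a pool machine that can reach the share.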

The Challenges of porting BLAST to SMBL (cont.)

BLAST jobs take more time to complete than FastDNAml and MEME

Disappearing worker problem

Pool machines would end up in CLAIMED/IDLE state

Size of our Condor pool made the problem hard to track

Only jobs taking more than 30 minutes were affected

Problem was determined to be state table “sessions” timing out on the machine room firewall

Machines were removed from the firewall and switched to a host-based iptables firewall

Web portal

Apache/MySQL/PHP based

Jobs are submitted via portal ONLY

Condor submit files are dynamically generated based on user input

Status of jobs can be checked using the portal

Results retrieved from the portal
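Generating a Condor submit file from user input might look roughly like the sketch below. The portal itself is PHP based; Python is used here only for illustration. The function `build_submit_file`, the file naming scheme, and the `OpSys == "WINDOWS"` requirement are assumptions (the exact `OpSys` value depends on the Condor version and Windows release), while the submit commands themselves are standard Condor ones.

```python
def build_submit_file(executable, arguments, n_workers, job_id):
    """Return the text of a Condor submit description for n_workers workers."""
    if not isinstance(n_workers, int) or n_workers < 1:
        raise ValueError("n_workers must be a positive integer")
    # User input must not be able to inject extra submit commands.
    for value in (executable, arguments, job_id):
        if "\n" in value or "\r" in value:
            raise ValueError("line breaks are not allowed in submit values")
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        'requirements = (OpSys == "WINDOWS")',  # assumed value, see lead-in
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"output = {job_id}.$(Process).out",
        f"error = {job_id}.$(Process).err",
        f"log = {job_id}.log",
        f"queue {n_workers}",
    ]
    return "\n".join(lines) + "\n"
```

The resulting text would be written to a file and handed to `condor_submit`; rejecting line breaks in user-supplied values keeps a form field from adding its own submit commands.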

Questions?