Improving the Research Bootstrap of Condor High Throughput Computing for Non-Cluster Experts Based on Knoppix Instant Computing Technology
RIKEN Genomic Science Center
Fumikazu KONISHI
Condor Week 2006
Background
• Biologists need a high-performance computing system for their research. However, they do not know how to build a cluster system by themselves.
Meet Chie-san.
She is a biologist with a big problem.
I borrowed slides from Condor.
Chie-san’s Application …
Run a sequence sweep of InterProScan over mouse cDNAs, 103,000 clones in total.
– InterProScan takes 1 minute on average to compute on a “typical” workstation (total = 103,000 × 1 = 103,000 minutes ≈ 1,717 hours).
– InterProScan requires a 6 GB public database set for each installation.
http://www.ebi.ac.uk/interpro/README1.html
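The workload arithmetic on this slide can be checked with a short sketch. The sequence count and the one-minute average come from the slide; the pool sizes in the loop are hypothetical, and the wall-clock estimate is an idealized lower bound that ignores scheduling overhead and database transfer time.

```python
import math

N_SEQUENCES = 103_000        # clones, from the slide
MINUTES_PER_SEQUENCE = 1.0   # average on a "typical" workstation, from the slide


def serial_hours(n_sequences: int, minutes_each: float) -> float:
    """Hours needed when one workstation runs every sequence in turn."""
    return n_sequences * minutes_each / 60.0


def ideal_wall_clock_hours(n_sequences: int, minutes_each: float, n_workers: int) -> float:
    """Idealized hours when sequences are spread evenly over n_workers.

    Assumes identical workers and no scheduling or I/O overhead.
    """
    sequences_per_worker = math.ceil(n_sequences / n_workers)
    return sequences_per_worker * minutes_each / 60.0


if __name__ == "__main__":
    total = serial_hours(N_SEQUENCES, MINUTES_PER_SEQUENCE)
    print(f"one workstation: {total:.0f} h ({total / 24:.1f} days)")
    for workers in (10, 50, 100):  # hypothetical pool sizes
        hours = ideal_wall_clock_hours(N_SEQUENCES, MINUTES_PER_SEQUENCE, workers)
        print(f"{workers:>3} workers: {hours:.1f} h")
```

Under these assumptions, a weekend on a few dozen borrowed lab machines is enough for a job that would occupy a single workstation for more than two months.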
Policy Barrier
Technical Skill Barrier
I have 103,000 sequences to search for gene functional domains, and I am not a cluster expert.
Who will help me?
Getting Knoppix for InterProScan High Throughput Computing Edition
• Available as a free download: search Google for “fumikazu”.
• Download the image file.
• The image includes:
– InterProScan 4.1
– Condor 6.6.10
– PVFS2 1.2
– Ganglia 3.0.1
Chie-san can boot the lab’s machines from an Instant High Throughput Computing image that includes her application…
She can borrow the lab’s computers over the weekend without installing any software.
Goal
• The goal of this research is to provide an instant high-performance bioinformatics research workbench for all biology researchers, and to allow easy setup in collaborative projects without side effects on the local system.
Instant Setup Technologies
• Install-Based Deploy System
– RPM-based automatic configuration technology (Red Hat)
– NPACI Rocks toolkit (UCSD)
• Image-Based Deploy System
– Live-CD technology (Knoppix)
Key Solutions
• Knoppix
– A GNU/Linux distribution that boots a machine without hard disk installation.
• Parallel File System
– PVFS is a high-performance parallel file system intended for cluster computing. It provides high-bandwidth access and a large-volume storage area.
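To illustrate why a parallel file system gives both aggregate bandwidth and aggregate capacity, here is a minimal sketch of round-robin striping, the general technique such systems use to spread one file over several storage servers. It is a generic illustration and not PVFS code; the function names and the 64 KiB strip size are assumptions chosen for the example.

```python
from typing import List, Tuple

STRIP_SIZE = 64 * 1024  # illustrative strip size in bytes (assumption)


def locate(offset: int, n_servers: int, strip_size: int = STRIP_SIZE) -> Tuple[int, int]:
    """Map a byte offset in the logical file to (server index, offset on that server)."""
    strip_index = offset // strip_size
    server = strip_index % n_servers        # strips are dealt out round-robin
    local_strip = strip_index // n_servers  # how many earlier strips this server holds
    return server, local_strip * strip_size + offset % strip_size


def bytes_per_server(file_size: int, n_servers: int, strip_size: int = STRIP_SIZE) -> List[int]:
    """Return how many bytes of a file of file_size each server ends up storing."""
    full_strips, tail = divmod(file_size, strip_size)
    sizes = [(full_strips // n_servers) * strip_size] * n_servers
    for server in range(full_strips % n_servers):
        sizes[server] += strip_size
    sizes[full_strips % n_servers] += tail  # partial last strip, if any
    return sizes


if __name__ == "__main__":
    print(locate(0, 6))               # first byte lives on server 0
    print(locate(STRIP_SIZE, 6))      # next strip lives on server 1
    print(bytes_per_server(10**9, 6)) # a 1 GB file spread over 6 servers
```

Because consecutive strips land on different servers, a large sequential read is served by all servers at once, and no single machine has to hold the whole file.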
Parallel File System on RAM Disk
Knoppix for InterProScan 4.1 High Throughput Computing Edition
[Figure: cluster layout. Labels: 500 MByte (×6); Private Network; Intra Network; Inter Network; Service Sharing; Head Node; Worker Node; PXE Boot; Database download server.]
Step 1: Boot the image
Boot the head node. The IP address leased by the DHCP server is displayed after the boot sequence.
Step 2: After a successful boot, two setup options, EASY and ADVANCED, are displayed on the screen.
Step 3: Boot worker nodes
All nodes must support PXE boot. The system must automatically assess whether sufficient resources are available to hold the InterProScan 4.1 database.
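The resource assessment in this step can be pictured as a simple capacity check. The sketch below is not the image's actual logic. It assumes the 6 GB database size quoted earlier (read as decimal gigabytes) and treats the 500 MByte labels in the layout figure as the RAM disk space each node contributes; both the function names and that reading of the figure are assumptions.

```python
from typing import Iterable

DATABASE_BYTES = 6 * 10**9   # "6G bytes" from the slides, decimal units assumed
PER_NODE_BYTES = 500 * 10**6 # assumed per-node RAM disk contribution (500 MByte)


def min_nodes(database_bytes: int, per_node_bytes: int) -> int:
    """Smallest number of equal-sized nodes whose pooled space holds the database."""
    return -(-database_bytes // per_node_bytes)  # integer ceiling division


def has_capacity(node_bytes: Iterable[int], database_bytes: int) -> bool:
    """True when the pooled space of all booted nodes can hold the database."""
    return sum(node_bytes) >= database_bytes


if __name__ == "__main__":
    print("nodes needed:", min_nodes(DATABASE_BYTES, PER_NODE_BYTES))
    booted = [PER_NODE_BYTES] * 8  # hypothetical: eight worker nodes have booted so far
    print("enough space yet?", has_capacity(booted, DATABASE_BYTES))
```

A real check would also have to leave memory for the operating system and the running jobs, which this sketch ignores.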
Step 4: Build the cluster system
Download InterProScan database set
Testing
The system submits a single test job, which completes in a few minutes. The Condor job status is displayed in the browser, and Ganglia provides detailed monitoring information on all nodes. All configurations can be tested in this phase.
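For readers new to Condor, a test job like this is described by a small submit description file. The sketch below generates one in Python. The submit commands (universe, executable, arguments, output, error, log, queue) are standard Condor ones, but the wrapper script name, the input file naming, and the output file names are hypothetical and are not taken from the image.

```python
def submit_description(executable: str, arguments: str, n_jobs: int = 1) -> str:
    """Build the text of a Condor submit description file for n_jobs jobs.

    $(Process) is expanded by Condor to 0, 1, ..., n_jobs - 1, so one file
    can describe either a single test job or a whole sequence sweep.
    """
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        f"arguments  = {arguments}",
        "output     = job.$(Process).out",
        "error      = job.$(Process).err",
        "log        = jobs.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical wrapper script and input naming, for illustration only.
    text = submit_description("run_interproscan.sh", "seq.$(Process).fasta", n_jobs=1)
    with open("test.sub", "w") as handle:
        handle.write(text)
    print(text)
```

The resulting file would be handed to `condor_submit test.sub`, and `condor_q` then shows the job's status from the command line. Raising `n_jobs` turns the same description into a sweep over many input files.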
Results
Web site
http://big.gsc.riken.jp/index_html/Members/fumikazu/htc
Questions