Improving the Research Bootstrap of Condor High Throughput Computing for Non-Cluster Experts Based on Knoppix Instant Computing Technology

RIKEN Genomic Science Center, Fumikazu KONISHI

Page 1: RIKEN Genomic Science Center Fumikazu KONISHI

Improving the Research Bootstrap of Condor High Throughput Computing for Non-Cluster Experts Based on Knoppix Instant Computing Technology

RIKEN Genomic Science Center

Fumikazu KONISHI

Page 2

Condor Week 2006

Background

• Biologists need a high-performance computing system for their research. However, they do not know how to build a cluster system by themselves.

Page 3

Meet Chie-san.

She is a biologist with a big problem.

I borrowed slides from Condor.

Page 4

Chie-san’s Application …

Run a sequence sweep of InterProScan over mouse cDNAs, 103,000 clones in total.

– InterProScan takes on average 1 minute per clone on a “typical” workstation (total = 103,000 × 1 = 103,000 minutes ≈ 1,717 hours, about 72 days).
– InterProScan requires a 6 GB public database set for each.

http://www.ebi.ac.uk/interpro/README1.html
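The arithmetic on this slide can be checked with a short sketch. It takes the clone count and the one-minute average from the slide; the worker counts are hypothetical, and the estimate is idealized in that it ignores scheduling and data-transfer overhead.

```python
import math

CLONES = 103_000          # number of mouse cDNA clones (from the slide)
MINUTES_PER_CLONE = 1.0   # average InterProScan run time (from the slide)


def sweep_hours(workers: int) -> float:
    """Idealized wall-clock hours when jobs are spread evenly over `workers`."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # The busiest worker handles ceil(CLONES / workers) one-minute jobs.
    jobs_per_worker = math.ceil(CLONES / workers)
    return jobs_per_worker * MINUTES_PER_CLONE / 60.0


if __name__ == "__main__":
    for n in (1, 10, 50):  # hypothetical worker counts, not from the slides
        print(f"{n:3d} worker(s): {sweep_hours(n):8.1f} hours")
```

With a single workstation the sweep takes roughly 1,717 hours; spreading the same jobs over more machines shortens the wall-clock time roughly in proportion, which is the motivation for a high-throughput setup.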

Page 5

Policy Barrier
Technical Skill Barrier

I have 103,000 sequences to search for gene functional domains, and I am not a cluster expert.

Who will help me?

Page 6

Getting Knoppix for InterProScan High Throughput Computing Edition

• Available as a free download: Google search “fumikazu”.
• Download the image file.
• The image includes:
  – InterProScan 4.1
  – Condor 6.6.10
  – PVFS2 1.2
  – Ganglia 3.0.1

Page 7

Chie-san can boot the lab’s machines from an Instant High Throughput Computing image that includes her application…

She can borrow the lab’s computers over the weekend without installing any software.

Page 8

Goal

• The goal of this research is to provide an instant high-performance bioinformatics research workbench for all biology researchers, and to allow easy setup in collaborative projects without side effects on the local system.

Page 9

Instant Setup Technologies

• Install-Based Deploy System
  – RPM-based automatic configuration technology (Red Hat)
  – NPACI Rocks toolkit (UCSD)

• Image-Based Deploy System
  – Live-CD technology (Knoppix)

Page 10

Key Solutions

• Knoppix
  – A GNU/Linux distribution that sets up a machine without hard disk installation.

• Parallel File System
  – PVFS is intended as a high-performance parallel file system for cluster computing. It provides high-bandwidth access and a large storage area.

Page 11

Parallel File System on RAM Disk

Page 12

Knoppix for InterProScan4.1 High Throughput Computing Edition

Page 13

[Figure: network layout diagram. Recoverable labels: six boxes labeled 500 MByte each; Private Network; Intra Network; Inter Network; Service Sharing; Head Node; Worker Node; PXE Boot; Database download server.]

Page 14

Step 1: Booting image

Boot the head node. The IP address leased by the DHCP server is displayed after the boot sequence.

Page 15

Step 2: After a successful boot, two setup options (EASY and ADVANCED) are displayed on the screen.

Page 16

Step 3: Boot worker nodes

All nodes must support PXE boot. The system must automatically assess whether sufficient resources are available for the database arrangement of InterProScan 4.1.
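One way to picture such a resource check is sketched below. It rests on assumptions that the slides do not state: that each worker contributes a fixed amount of RAM disk to the parallel file system (500 MB here, borrowing the "500MByte" label from the network diagram slide), and that the whole 6 GB database set must fit into the pooled space. The function names are hypothetical and this is not the system's actual logic.

```python
import math

DATABASE_MB = 6 * 1024       # 6 GB database set (from the slides), taken as 6 x 1024 MB
RAM_DISK_PER_NODE_MB = 500   # ASSUMPTION: per-node contribution to the pooled RAM disk


def nodes_required(database_mb: int = DATABASE_MB,
                   per_node_mb: int = RAM_DISK_PER_NODE_MB) -> int:
    """Smallest number of nodes whose pooled RAM disk holds the database."""
    return math.ceil(database_mb / per_node_mb)


def has_sufficient_resources(node_count: int,
                             database_mb: int = DATABASE_MB,
                             per_node_mb: int = RAM_DISK_PER_NODE_MB) -> bool:
    """True when the pooled RAM disk is at least as large as the database."""
    return node_count * per_node_mb >= database_mb


if __name__ == "__main__":
    print("nodes required:", nodes_required())
    for n in (6, 12, 13):  # hypothetical cluster sizes
        print(n, "nodes ->", "OK" if has_sufficient_resources(n) else "insufficient")
```

A real check would also have to account for file system overhead and memory needed by the jobs themselves, so the node count here is a lower bound under the stated assumptions.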

Page 18

Page 19

Download InterProScan database set

Page 20

Testing

The system submits a single test job, which completes within a few minutes. The Condor job status is displayed in the browser, and Ganglia provides detailed information on all nodes. All configurations can be tested in this phase.
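For readers unfamiliar with Condor, the sketch below shows what a submit description for such a test job might look like, generated from Python. The submit commands used (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) and the `$(Process)` macro are standard Condor features, but everything else is an assumption: `run_iprscan.sh` is a hypothetical wrapper script, the `seq_N.fasta` input naming is invented for illustration, and the sketch assumes a shared file system so that no file-transfer settings are needed. It is not the submit file shipped in the image.

```python
def build_submit_description(executable: str, job_count: int) -> str:
    """Return Condor submit description text for `job_count` independent jobs."""
    if job_count < 1:
        raise ValueError("job_count must be at least 1")
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",            # hypothetical wrapper script
        "arguments  = seq_$(Process).fasta",     # ASSUMED input naming scheme
        "output     = iprscan_$(Process).out",   # per-job standard output
        "error      = iprscan_$(Process).err",   # per-job standard error
        "log        = sweep.log",                # shared Condor event log
        f"queue {job_count}",                    # $(Process) runs 0..job_count-1
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A single test job, as in the testing phase described above.
    with open("test.sub", "w") as handle:
        handle.write(build_submit_description("run_iprscan.sh", 1))
    print(open("test.sub").read())
```

The resulting file would be handed to `condor_submit test.sub`, and `condor_q` then lists the queued job. Raising `job_count` turns the same description into a full sequence sweep with one job per input file.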

Page 21

Results

Page 22

Page 23

Web site

http://big.gsc.riken.jp/index_html/Members/fumikazu/htc

Page 24

Questions