
Computer Systems Lab

The University of Wisconsin - Madison, Department of Computer Sciences

Linux Clusters

David Thompson

thomas@cs.wisc.edu

http://www.cs.wisc.edu/~thomas/madlug


Overview

• The Computer Systems Lab (CSL)

• Clusters

• The condor/db cluster

• Scalable Linux Administration




Computer Systems Lab

• Purpose

• Staff

• Resources


Purpose

“To support the research and teaching missions of the Department of Computer Sciences”


Staff

• 8 Full Time

• 12 - 20 Part Time


Responsibilities

• Networks
  – Gigabit, 100BaseT, ATM, FDDI
  – Cisco, Foundry routers
  – 3Com, HP, Cisco switches


Responsibilities (cont.)

• Operating Systems
  – Solaris, Linux, Digital Unix, AIX, IRIX, NT

• Applications
  – compilers, dbs, simulators, email, image processing, ...


Responsibilities (cont.)

• 641 software packages installed
  – 69 Gbytes
  – multiple versions
  – each package installed for several architectures
  – several thousand builds


Responsibilities (cont.)

• Workstations
  – 600 PCs (including cluster)
  – 200 Sparcs
  – 15 Alphas
  – others

• 5600 user home directories
  – 69 Gbytes


Responsibilities (more)

• AFS
  – 1 Tbyte of ubiquitous file space
  – 14 File Servers, 3 db Servers
  – 95% client cache hit rate

• Backups
  – 2-week epoch cycle (1 Tbyte)
  – Daily incrementals




Clusters

• Definitions

• Architectures

• Example

• Applications


Definitions

• NOW - Network of workstations

• COW - Cluster of workstations
  – “Some degree of network isolation”
  – “Dedicated function”


Architectures

• N-dimensional arrays
  – “previous & next” neighbor
  – hypercube

• Simple Network
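
To make the two array topologies above concrete, here is a minimal Python sketch. It assumes the 2^d nodes of a d-dimensional hypercube are numbered 0 to 2^d - 1, so that two nodes are neighbors exactly when their numbers differ in one bit. It also assumes the “previous & next” arrangement wraps around like a ring. Both function names are illustrative and do not come from the talk.

```python
from __future__ import annotations


def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the neighbors of `node` in a dim-dimensional hypercube."""
    if not 0 <= node < (1 << dim):
        raise ValueError("node id out of range for this dimension")
    # Flipping each of the `dim` bits in turn gives one neighbor per dimension.
    return [node ^ (1 << bit) for bit in range(dim)]


def ring_neighbors(node: int, size: int) -> tuple[int, int]:
    """Return (previous, next) in a 1-D array; wraparound is an assumption."""
    return ((node - 1) % size, (node + 1) % size)


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))  # node 101b in a 3-cube
    print(ring_neighbors(0, 8))
```

Each node in a hypercube has only d links, yet any two of the 2^d nodes are at most d hops apart. This is why the topology was popular for parallel machines.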


Architectures

• Distributed
  – MPI
  – PVM
  – condor


Examples

• The Hive

• http://newton.gsfc.nasa.gov/thehive/


Examples - The Hive


Examples - The Hive (cont.)


Redundant Networks

http://einstein.drexel.edu/beowulf/Beowulf.html


http://www.cs.nmsu.edu/pcl/


Cluster Applications

• Image Analysis
  – http://newton.gsfc.nasa.gov/thehive/thehive_dir/tilton.html

• Parallel Virtual File System (PVFS)
  – http://ece.clemson.edu/parl/pvfs/

• Speech Recognition
  – http://noel.feld.cvut.cz/magi/


Cluster Applications (cont.)

• Physics
  – Viscoelasticity
    • http://www.meca.ucl.ac.be/memawww/deepflow/
  – Seismology
    • http://weland.esd.mun.ca/index.html
  – Big Bang
    • http://www.phy.duke.edu/~muller/BRAHMA/index.html


Cluster Applications (cont.)

• Physics (cont.)
  – Laser Interferometer Gravitational-Wave Observatory (LIGO)
    • http://www.ligo.caltech.edu/
  – NA49 (??)
    • http://na49info.cern.ch/
  – Large Acceptance Hadron Detector for an Investigation of Pb-induced Reactions at the CERN SPS




Computer Science Cluster

• Two connected clusters
  – Dual Xeon 550 MHz, 512k cache, 1 Gig RAM, Ultra 2 SCSI 9 Gig boot disk, tulip network
  – 64 node compute cluster
  – 36 node db cluster with 4 extra 9 Gig disks and GNIC-II Gigabit ethernet
  – Red Hat Linux 6.1, kernel 2.2.12
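
The aggregate size of the two clusters follows from the per-node figures by simple arithmetic. The Python sketch below works it out, assuming the per-node specification (dual CPU, 1 Gig RAM, 9 Gig boot disk) applies to every node in both clusters, which the slide implies but does not state outright. The `NodeGroup` class and `totals` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    name: str
    nodes: int
    cpus_per_node: int = 2      # "Dual Xeon"
    ram_gig_per_node: int = 1   # "1 Gig RAM"
    extra_disks: int = 0        # data disks beyond the boot disk
    disk_gig: int = 9           # every disk on the slide is 9 Gig


# Node counts taken from the slide; shared per-node spec is an assumption.
GROUPS = [
    NodeGroup("compute", nodes=64),
    NodeGroup("db", nodes=36, extra_disks=4),
]


def totals(groups):
    """Sum nodes, CPUs, RAM, and disk capacity over all node groups."""
    return {
        "nodes": sum(g.nodes for g in groups),
        "cpus": sum(g.nodes * g.cpus_per_node for g in groups),
        "ram_gig": sum(g.nodes * g.ram_gig_per_node for g in groups),
        "boot_disk_gig": sum(g.nodes * g.disk_gig for g in groups),
        "extra_disk_gig": sum(g.nodes * g.extra_disks * g.disk_gig for g in groups),
    }


if __name__ == "__main__":
    for key, value in totals(GROUPS).items():
        print(f"{key}: {value}")
```

Under these assumptions the installation comes to 100 nodes and 200 processors, with 1296 Gig of additional data disk on the db side.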


Cluster Architecture


Cluster Picture




Scalable Linux Administration

• What

• Why

• Installation

• Maintenance


Scalable Admin - What

• Leverage

• Control systems

• Remote monitoring

• Operating system upgrades

• Centralized Services
  – Kerberos, AFS, logging


Scalable Admin - Why

• Consistent user view
  – Available applications
  – Stability

• Predictable Admin Environment

• Security


Scalable Admin - Installation

• Red Hat Kickstart
  – Configuration file
    • network config, nfs locations, disk layout, RPMs to install
  – Boot disk, nfs, or bootp/dhcp
  – Post-install script
  – redhat-6.1/i386/doc/HOWTO/KickStart-HOWTO


Sample Kickstart Script

# $Id: ks.cfg,v 1.3 1999/10/07 18:57:24 thomas Exp $
lang en_US
network --bootproto bootp
nfs --server pinstall.cs.wisc.edu --dir /install/redhat-6.0/i386
keyboard us
zerombr yes
clearpart --all
part / --size 100
#part /tmp --size 300
part /var --size 75
part /usr --size 570
part swap --size 127
part /var/vice/cache --size 120
part /local --size 2 --grow --maxsize 4000
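
A partition layout like this one can be sanity-checked before it is rolled out to many machines. The following Python sketch is not part of the original talk. It parses the active `part` lines and adds up the minimum disk space they require, treating `--size` as megabytes, which is the Kickstart convention. The helper names `parse_parts` and `minimum_disk_mb` are hypothetical.

```python
import shlex

# Partition lines copied from the sample script above.
SAMPLE_KS = """\
part / --size 100
#part /tmp --size 300
part /var --size 75
part /usr --size 570
part swap --size 127
part /var/vice/cache --size 120
part /local --size 2 --grow --maxsize 4000
"""


def parse_parts(text):
    """Return (mountpoint, size_mb, grows) for each active `part` line."""
    parts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out partitions
        tokens = shlex.split(line)
        if tokens[0] != "part":
            continue  # ignore other Kickstart directives
        mount = tokens[1]
        size = int(tokens[tokens.index("--size") + 1])
        parts.append((mount, size, "--grow" in tokens))
    return parts


def minimum_disk_mb(parts):
    """Smallest disk that fits every partition at its base size."""
    return sum(size for _mount, size, _grows in parts)


if __name__ == "__main__":
    parts = parse_parts(SAMPLE_KS)
    print(minimum_disk_mb(parts), "MB minimum")
    print([mount for mount, _size, grows in parts if grows], "will grow")
```

For the sample script the fixed sizes add up to 994 MB, and `/local` then grows into whatever space remains, up to its `--maxsize`.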


Scalable Admin - Maintenance

• Update RPMs
  – Create list of RPMs, versions, and files to install
  – Each computer updates based on list

• Special files
  – package (afs)
  – cfengine (gnu)
  – config files (filedist)
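
The list-driven update scheme above can be sketched in a few lines of Python. This is an illustration of the idea only, not the CSL's actual tool. The list format (one `name version` pair per line), the function names, and the package names in the example are all assumptions.

```python
def parse_list(text):
    """Parse 'name version' lines into a dict, ignoring blanks and # comments."""
    wanted = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, version = line.split()
        wanted[name] = version
    return wanted


def plan_updates(wanted, installed):
    """Return the packages that are missing or at a different version."""
    return {
        name: version
        for name, version in wanted.items()
        if installed.get(name) != version
    }


if __name__ == "__main__":
    # Hypothetical central list and hypothetical local state.
    master = parse_list("examplepkg 1.0-1\notherpkg 2.0-3  # security fix\n")
    local = {"examplepkg": "1.0-1", "otherpkg": "2.0-1"}
    for name, version in plan_updates(master, local).items():
        print(f"would install {name}-{version}")
```

On a real node, the `installed` mapping could be built from the output of `rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n'`. Because each machine computes its own difference against one central list, adding machines does not add administrative work, which is the point of the slide.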


Linux Clusters

David Thompson

thomas@cs.wisc.edu

http://www.cs.wisc.edu/~thomas/madlug