Introduction to HPCC for CEM956
Chun-Min ChangResearch Consultant
Institute for Cyber-Enabled Research
Download this presentation: https://wiki.hpcc.msu.edu/display/TEAC/2016-09-13+CSM+956
How this workshop works
• We are going to cover some basics with hands on exampes.
• Exercises are denoted by the following icon in this presents:
• Bold italic means commands which I expect you type on your terminal in most cases.
Green and Red Sticky
• Use the sticky notes provided to help me help you.- No Sticky = I am working
- Green = I am done and ready to move on (yea!)
- Red = I am stuck and need more time and/or some help
If you have not followed the set up instructions: https://wiki.hpcc.msu.edu/x/uQViAQ
Please do now
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board
• Summary– Terms– Commands– Websites
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board
• Summary– Terms– Commands– Websites
iCER Overview
• iCER is a research unit at MSU. We:– Maintain the university’s supercomputer– Organize Training/workshops– Provide 1-on-1 consulting– Help with grant proposal
iCER Overview
• What does iCER stand for?– Institute for Cyber-Enabled Research– Established in 2009 to encourage and support the
application of advanced computing resources and techniques by MSU researchers.
• Mission– iCER’s mission to help researchers with computational
components of their research
Funding From…
• The Vice President office for Research and Graduate Studies (VPRGS)
• Engineering College, College of Natural Science and College of Social Science
• This allows us to provides services and resources for FREE!!!
Why use HPCC? (Latency and Throughput)
• Your own computer is too slow• You have large amounts of data to process (hundreds of
gigabytes)• Your program requires massive amounts of memory ( 64gb - 6 TB )• Your work could be done in parallel • You need specialized software (in a Linux environment)• You need licensed software (Stata, Matlab, Gaussian, etc)• Your work takes a long time (days, weeks)• You want to collaborate using shared programs and data• Software that can take advantage of accelerator cards
(Graphics Processors)
HPC vs PCEach computing node in a cluster is a single computer
Your Computer
2014 Cluster 2016 Cluster
Processors 1 2 per node 2 per node
Cores per processor
4-8 20 per node 28 per node
Speed 2.5 - 3 ghz 2.5 ghz 2.4 ghz
Connection to other computers
Campus ethernet1,000 MBit/sec
"Infiniband" 50,000 MBit/sec
"Infiniband" 50,000 MBit/sec
Computers(nodes)
1 => 8 cores total
223 =>4,676 cores total
367 =>10,276 cores total
Users 1 ~1,200 ~1,200
Schedule On Demand 24/7, Queue 24/7, Queue
High Performance => Running work in parallel, for long periods of time
MSU HPC System
• Large Memory Nodes (up to 6TB!)• GPU Accelerated cluster (K20, k80)• Phi Accelerated clusters (5110p)• Over 590 nodes, 15000 computing cores• New 2PB high speed parallel scratch file space• 50GB replicated file spaces (upgragtable to 1TB)• Access to large open-source software stack
Shared !!! (& More nodes coming)
Cluster Resources (for Computing Nodes)
Year Name Description ppn Memory Nodes Total Cores
2016 intel16 Intel Xeon E5-2680 v4 (2.4 GHz) 28 128GB+ 367 10,276
2011 intel11 Intel Xeon E7-8837 (2.67 GHz) 32 512GB 2 64
32 1TB 1 32
64 2TB 2 128
2014 intel14 Intel Xeon E5-2670 v2 (2.5 GHz) 20 64GB 128 2560
20 256GB 23 460
2 NVIDIA K20 GPUs 20 128GB 39 780
2 Xeon Phi 5110P 20 128GB 27 540
Large Memory 48 or 96 1 - 6 TB 6 336
595 15,176
Available Software
• Compilers, debuggers, and profilers– Intel compilers, Impi, openmpi, mvapich, totalview, DDD,
DDT…
• Libraries– BLAS, FFTW, LaPACK, MKL, PETSc, Trilinos…
• Commercial Software– Gaussian, GaussView, Mathematica, FLUENT, Abaqus,
MATLAB…
NOTE: Not include software installed privately under users’ own space
Online Resources
• icer.msu.edu: iCER Home– hpcc.msu.edu: HPCC Home
• wiki.hpcc.msu.edu: HPCC User Wiki– Documentation and user manual
• For workshops and registration details, visit:http://icer.msu.edu/events
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board– Available Services
• Summary– Terms– Commands– Websites
Student Accounts
• Accounts created for classroom use are added to a Class group (cem956_fs16_s1). These accounts are usually kept active for 6 months and then automatically disabled unless a membership change is requested by a faculty member. When students have dropped the course. should have their HPCC accounts disabled.
• TA/faculty will answer students’ questions. • Students can attend iCER monthly Introduction to HPCC
workshop www.icer.msu.edu/upcoming-workshops
Types of space: Home space /mnt/home/$USER
Scratch space /mnt/scratch/$USER
Get general Accounts
• PIs can request accounts (for each group member) at http://contact.icer.msu.edu/account
• Each account has access up to:– 50 GB of replicated file space (/mnt/home/userid)– 520 processing cores– 2 PB of high-speed scratch space (/mnt/scratch/userid)
• Also available: shared group folder upon request by PI
Three types of space: Home space /mnt/home/$USERGroup space /mnt/research/<group>Scratch space /mnt/scratch/$USER
/mnt/ls15/scratch/groups/<group>
Connecting to the HPCC using ssh
• Windows: we recommand “mobaXterm” as connection client.– Insert the thumb drive– Open appropriate folder– Looking for MobaXterm– Install MobaXterm– Server: hpcc.msu.edu or rdp.hpcc.msu.edu– Username=your netid; password = your netid password
• OSX: – Access spotlight from the menu bar icon (or ⌘ SPACE)– Type terminal– In terminal, type: ssh [email protected]
Our supercomputer
is a “remote service”
Log in to gateway
dev-intel16
dev-intel16-k80
dev-intel14-k20
dev-intel14-phi
dev-intel14
HPCC message
Hostname is “Gateway”
Command prompt at bottom
(Note: file list on left side does not present in Mac terminal)
Successful connection HPCC Gateway
Gateway Nodes
• Shared: accessible by anyone with an MSU-HPCC account.
• Shared resource -- hundreds of users on the gateway nodes
• Only means of accessing HPCC compute resources.• Useful information (status, file space, messages)• ** DO NOT RUN ANYTHING ON THESE NODES! **
Our supercomputer
is a “remote service”
ssh to a dev node
dev-intel16
dev-intel16-k80
dev-intel14-k20
dev-intel14-phi
dev-intel14
All the board!
• If you’re not connected to the HPCC, please do so now.• From gateway, please connect to a development node:
ssh dev-intel16-k80 or ssh dev-intel14-phi• ssh == secure shell (allows you to securely connect to
other computers)
• Take a look at what you have in your account (ls)
• Take a look at what available on system (ml)
• Jump to other dev-node and exit
MobaXterm (ssh to a dev node)
Developement Nodes
• Shared: accessible by anyone with an MSU-HPCC account
• Meant for testing / short jobs.• Currently, up to 2 hours of CPU time• Development nodes are “identical” to compute
nodes in the cluster • Node name descriptive (name= feature, #=year)• http://wiki.hpcc.msu.edu/x/pwNe (HPCC Quick Reference Sheet)
Compute Nodes
• Use them by job submission (reservation)
• Dedicated: you request # cores, memory, walltime(also for advanced users: accelerators, temporary file space, licenses)
• Queuing system: when those resources are available for your job, those resources are assigned to you.
• Two modes– batch mode: generate a script that runs a series of
commands (use pbs job scription)– Interactive mode (qsub –I)
Nodes & Clusters
A “node” is a single computer, usually with a few processors and many cores. Each node has it’s own host name/address like 'dev-intel14'
A “cluster” is composed of dozens (or hundreds) of nodes, tied together with high speed network
Types of Nodes: gateway : computer available to outside world for you to login ; doorware
only; not enough to run programs. From here, one can connect to other nodes inside the HPCC network
dev-node : computer you choose and log-in to do setup, development, compile, and testing your programs using similar configuration as cluster nodes for the most accurate testing. dev-nodes includes dev-intel10, dev-intel14, dev-intel14-k20 (which includes k20 GPU add-in card).
compute-nodes: run actual computation; accessible via the scheduler/queue; can’t log in directly
gpu, phi: nodes within a cluster that have special-purpose hardware. ‘fat-nodes’ have large RAM
Basic navigation commands: ls
Use the Command with the Options to look into your home directory:
ls list files and directories
Some options for ls command
-a list all files and directories
-h print sizes in human readable format (e.g., 1k, 2.5M, 3.1G)
-l list with a long listing format
-t sort by modification time
Basic navigation commands: cd
Use the Command with the Options to try:
cd directory_name change to named directory
cd change to home-directory
cd ~ change to home-directory
cd .. change to parent directory
cd - change to the previous directory
pwd display the path of the current directory
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board– Available Services
• Summary– Terms– Commands– Websites
Old fashion
• scp source destination
scp == secure copy
SFTP
Use hostname: rsync.hpcc.msu.edu
File Transfer & Loading Software
• Transfer to/from your computer– MobaXterm– Cyberduck (thumb drive)
– Mapping to HPCC https://wiki.hpcc.msu.edu/display/hpccdocs/Mapping+HPC+drives+to+a+campus+computer
• Load to/from HPCC– Module – Installation– From shared space
MobaXterm (Sessions -> New session -> SFTP)
Transferring Files
• SFTP program– download and install
cyberduck– https://cyberduck.io/
• Hostname: rsync.hpcc.msu.edu
• Username: your netid• Password: your netid password• Port : 22
Cyberduck
Mapping HPC drives to a campus computer
You can connect your laptop to the HPCC home & research directories Can be very convenient and work on Windows 10!
For more information see : https://wiki.hpcc.msu.edu/x/VYCe
1. determine your file system using the command df -h ~Filesystem Size Used Avail Use% Mounted onufs-10-b.i:/b10/u/home/billspat
2. server path Mac: smb: //ufs-10-b.hpcc.msu.edu/billspat Windows: \\ufs-10-b.hpcc.msu.edu\billspat
3. connectMac : Finder | connect to server | put in server path | netid and PasswordWindows : My Computer | Map Network Drive | select letterWindows fix: user name = hpcc\netid (same password)
Summary of MSU HPCC File SystemsGLOBAL function path (mount point) Env
Home directories your personal /mnt/home/$USER $HOME
Research Space Shared with working group /mnt/research/<group>
Scratch temporary computational workspace
/mnt/scratch/$USER/mnt/ls15/scratch/groups/<group>
$SCRATCH
Software (Gaussian) software installed /opt/software $PATH
LOCAL
Operating system
local disk, don't use these /etc, /usr/lib /usr/bin
temp or local fast in each single node only, some programs use /tmp
/tmp or /mnt/local
Directories
/mnt/home/$USERbin data mnt tmp
/
root
home scratch research
colbrydi gmason doortiCMICH
Example: File Manipulation
• Try Commands
• See the contents of your “.bashrc” file> cat .bashrc
• Make a directory called “hpccworkshop”, change to that directory and list the contents.> mkdir hpccworkshop
> cd ./hpccworkshop
mkdir make directorycp copy filecat display contents of text filerm remove file
Exercise!
Please type (in your terminal):module load powertoolsgetexample intro_workshop
•This will copy some example files (for this workshop) to your directory. •Exercise: try to find the file ‘youfoundit.txt’ in the “hidden” directory.
(Advanced Tip)
• Tab Completion short cut– Typing out directory/file/program names is
time consuming.– When you start typing out the name, hit tab
key. The shell will try to fill in the rest of the directory name
– E.g., return to home directorycd type “cd i”, then press tab
(Advanced tip 2)
• Access previous command by pressing up arrows• See full history by typing
history• Repeat previous command in history by using the
exclamation sign, e.g.[ongbw@dev-intel10] % history…302 cd scripts
!302 will call the command “cd scripts”
Examining Files
• Print partial contents of a file using head, tailhead filename tail filename
• When the file is really big, use “less”less filename
– Use arrow keys to scroll by line– Space to go to the next page– “b” to go backwards– “g” to go to the beginning– “G” to go to the end– “q” to quit– “/” to search
Creating / Editing text files
• You “could” create/edit files on your local system and transfer them using SFTP– Downside: slow– Need to run command “dos2unix” if you are
running a windows system• Much “faster” in the long term to learn how to
edit files in the command line
Editors
• Choice of editor is really a religion– emacs (my favorite, powerful)– vi / vim (popular, powerful) – Nano (simple, easy to learn)
• Today, we will learn nano. To start, typenano newfile.txt
• Commands are in the bottom. (^ == control key)• Create a file that says: this is a text file!
Nano
• Nano does not leave any output on the screen after it exists.
• But ls now shows that we have a new file called newfile
• Lets tidy up by deleting this new file:rm newfile
** NOTE: no undelete in Linux (unlike windows)
File Systems and static link
Files can have aliases or shortcuts, so one file or folder can be accessed multiple way, e.g.
/mnt/scratch is actually a static link to
/mnt/ls15/scratch/users
try this: ls –l /mnt/scratchcd /mnt/scratch/$USER # replace netid with YOUR netidpwd
What do you see? Note that "ls15" stands for "Lustre Scratch 2015",
meaning the scratch lustre file system installed in 2015.
Home directories are private; Research folders are shared by group cd ~ls -al # -rw------- 1 billspat staff-np 3360642 Apr 9 10:35 myfiles.sam
Group Example: (you may not have a research group)ls -al /mnt/research/holekamplab #this won't work for you-rw-rw-r-- 1 billspat holekamplab 3360642 Apr 9 10:35 eg1.sam
The Difference is read and write (rw) by group
# /mnt/scratch/netid may be public to system unless you make it privateTo share a file using your scratch directory:
touch ~/testfilecp ~/testfile /mnt/scratch/<netid>/testfilechmod g+r /mnt/scratch/<netid>/testfile
Private and Shared files
Can copy files from system to laptop with secure ftp = sftp or secure copy = scp
Example1. create a file to copy, please type
cd ~/classecho 'Hello from $USER' > greeting.txtecho 'Created on `date`' >>greeting.txtcat greeting.txt# make sure you are in the directory with pwd command, and it’s lower case2. exit from the HPCCexit; exit3. in terminal or Moba issue following command scp => secure copyscp [email protected]:class/testfile.txt testfile.txtopen testfile.txt
Windows: MobaXTerm auto connected, files are listed on the side
Copying files to/from HPCC
Loading Software
• Load to/from your computer– MobaXterm– Cyberduck– Mapping to HPCC
https://wiki.hpcc.msu.edu/display/hpccdocs/Mapping+HPC+drives+to+a+campus+computer
• Load to/from HPCC– Module – Installation– From shared space
Module System
• For maximizing the different types of software and system configurations that are available to the users, HPCC uses a Module system to set Paths and Libraries.
• Key Commands– module list : List currently loaded modules– module load modulename : Load a module– module unload modulename : Unload a module– module spider keyword : Search modules for a keyword– module show modulename : Show what is changed by a
module – module purge : Unload all modules
Using Modules
module list # show modules currently loaded in your profile
module purge # clear all modules
module list # none loaded
# which version of Matlab do you need?module spider Gaussian
module spider Python/2.7.2
# after purging (no modules), find modules using module list
module load Gaussian/g09
module unload Gaussian
Some Software requires other base programs to be loaded
module purge # remove all modulesmodule load R/3.1.1# … error: why? base modules not loaded
# to find out what is needed, please use spider command for detailed outpu
module spider R/3.1.1
module load gnumodule load openmpimodule load R/3.1.1
Default Modules re-load every time you connect
module purge # remove all modulesmodule listexit # return to gatewayssh dev-intel10module list
# … what do you see? default modules
module load Mathematica; module listssh dev-intel14module list
Need to load all modules you need for every session or jobor look into auto loading of modules using profile (covered later)
Powertools
• Powertools is a collection of software tools and examples that allows researchers to better utilize High Performance Computing (HPC) systems.
• module load powertools
• quota # list your home directory disk usage • poweruser # Set up your account to load powertools by default• priority_status #IF part of a buy-in group; Shows the status of an individuals
(buy-in nodes)• userinfo <$USER> # get details of a user (from public MSU directory)
Load supply from HPCC
• PleaseLoad Software modules type (in your terminal):module load powertoolsgetexample intro2hpcc
See the changes before and after module load•This will copy some example files (for this workshop) to your directory. •Exercise: try to find the file ‘youfoundit.txt’ in the “hidden” directory.
getexample
• You can obtain a lot of examples through getexample. Take advantage of it!
Useful Powertools commands
• getexample : Download user examples• powertools : Display a list of current powertools or a
long description• licensecheck : Check software licenses• avail : Show currently available node resources
• For more information, refer to this webpage (HPCC Powertools)
• https://wiki.hpcc.msu.edu/x/p4Bn
getexample
• If you do not load powertools, please do it now:• Download the helloworld example using
getexample• Check what you downloaded. What is the biggest
file?
ssh dev-intel10mkdir classcd classmodule load powertoolsgetexample getexample helloworldcd helloworldlscat READMEcat hello.ccat hello.qsub
Using MSU HPCC 'getexample' tool
connect to dev nodecreate directorychange directoryHPCC scriptslist all examplescopy one examplechange directorylist filesshow README fileshow c programshow qsub program
We will test using a simple program from HPCC examples.So, after connecting to hpcc, please type :
The README file has instructions for compiling and runninggcc hello.c -o hello # compile and output to new program
'hello'./hello # test run this hello program. Does it work?qsub hello.qsub # submit to queue to run on cluster
This 'job' was assigned the 'job' id 29444624 and put into the 'queue.'Now, we wait for it to be scheduled, run, and output files created...
Compile, Test, and Queue
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board– Available Services
• Summary– Terms– Commands– Websites
CLI vs. GUI
• CLI: Command Line Interface: So far, we used CLI
• GUI: Graphical User Interface
Run Matlab on HPCC
• Run Matlab in command line mode• Can you Run Matlab in GUI ?
If you have problem open GUI, you may need some more tools
What is X11 and what is needed for X11?
• Method for running Graphical User Interface (GUI) across a network connection
SSH
X11
What is needed for X11?
• X11 server running on your personal computer• SSH connection with X11 enabled• Fast network connection
– Preferably on campus
Run X11 and RDP on Windows
• Install MobaXterm• http://mobaxterm.mobatek.net/download.html• at MobaXterm,
– ssh -X [email protected]
Run remote desktop onrdp.hpcc.msu.edu
Run X11 & rdp on Mac (OSX)
• Mac users:• Install XQuartz from iCER thumb drive• Run XQuartz• Quit terminal• Run terminal again• ssh -X [email protected]
Download remote desktop from Apple store (free) and run it on
rdp.hpcc.msu.edu
Test GUI using X11
• run X11• Try one of the following commands
xeyesor/andfirefox
Batch vs. Interactive
• Send job to scheduler and run in batch queue
• Interactive
Running Jobs on the HPC
• The developer (dev) nodes are used to compile, test and debug programs
• Two ways to run jobs– submission job scripts– interactive way
• Submission scripts are used to run (heavy/many) jobs on the cluster.
Advantages of running on dev-nodes
• Yo do not need to write a submission script• Yo do not need to wait in the queue• You can provide input to and get feedback from
your programs as they are running
Disadvantages of running on dev-nodes
• All the resources on developer nodes are shared between all uses
• Any single process is limited to 2 hours of cpu time. If a process runs longer than 2 hours it will be killed.
• Programs that overutilize the resources on a developer node (preventing other to use the system) can be killed without warning.
Programs that can use GUI
• GaussView• MATLAB• Mathematica• totalView : C/C++/fortran debugger especially for
multiple processors• DDD (Data Display Debugger) : graphical front-end
for command-line debugger• Etc, etc, and etc
Batch commands
• We’ve focused so far on working interactively (command line)
• Need dedicated resources for a computationally intensive task? – List resource requests in a job script– List commands in the job script– Submit job to the scheduler– Check results when job is finished
The Queue
The system is busy and used by many peoplemay want to run one program for a long timemay want to run many programs all at once on multiple nodes or
cores
To share we have a 'resource manager' that takes in requests to run programs, and allocates them
you write a special script that tells the resource manager what you need, and how to run your program, and 'submit to the queue' (get in line) with the qsub command
We submitted our job to the queue using the command qsub hello.qsub
And the 'qsub file' had the instructions to the scheduler on how to run
A Submission Script is a mini program for the scheduler that has :
1.List of required resources2.All command line instructions needed to run the computation
including loading all modules3.Ability for script to identify arguments4.collect diagnostic output (qstat -f)
Script tells the scheduler how to run your program on the cluster. Unless you specify, your program can run on any one of the 7,500 cores available.
Submission of Job to the Queue
Submission Script
• List of required resourceshttps://wiki.hpcc.msu.edu/display/hpccdocs/Scheduling+Jobs
• All command line instructions needed to run the computation
Typical Submission Script
#!/bin/bash -login
### define resources needed:#PBS -l walltime=00:01:00#PBS -l nodes=5:ppn=1#PBS -l mem=2gb
### you can give your job a name for easier identification#PBS -N name_of_job
### reserve license for conmercial software#PBS –W x=gres:MATLAB
### load necessary modules, e.g.module load MATLAB
### change to the working directory where your code is locatedcd ${PBS_O_WORKDIR}
### call your executablematlab –nodisplay –r “test(10)”
# Record Job execution informationqstat –f ${PBS_JOBID}
Submit a job
• Go to the helloworld directorycd ~/helloworld
• Create a simple submission scriptnano hello.qsub
• See next slide to edit the file…
hello.qsub
#!/bin/bash –login#PBS –l walltime=00:05:00#PBS –l nodes=1:ppn=1
cd ${PBS_O_WORKDIR}./hello
qstat –f ${PBS_JOBID}
Details about job script
• “#” is normally a comment except– “#!” special system commands– #!/bin/bash– #PBS instructions to the scheduler#PBS -l nodes=1:ppn=1#PBS -l walltime=hh:mm:ss#PBS -l mem=2gb (!!! Not per core but total)
• http://wiki.hpcc.msu.edu/x/Np-T (Scheduling Jobs)
Submitting and monitoring
• Once job script created, submit the file to the queue: qsub hello.qsub
• Record job id number (######) and wait around 30 seconds
• Check jobs in the queue with: qstat -u netid & showq -u netid
• What about using the Commands qs & sq ?• Status of a job: qstat –f jobid• Delete a job in a queue: qdel jobid
Common Commands
• qsub “submission script”– Submit a job to the queue
• qdel “job id”– Delete a job from the queue
• showq - u “user id”– Show the current job queue of the user
• checkjob -v “job id”– Check the status of the current job
• showstart -e all “job id”– Show the estimated start time of the job
Scheduling Priorities
• NOT First Come First Serve!• Jobs that use more resources get higher priority
(because these are hard to schedule)• Smaller jobs are backfilled to fit in the holes created
by the bigger jobs• Eligible jobs acquire more priority as they sit in the
queue• Jobs can be in three basic states: (check states by showq)
– Blocked, eligible or running– Jobs in BatchHold
Scheduling Tips
• Requesting more resources does not make a job runfaster unless you are running a parallel program.
• The more resources you request, the “harder” it is for the scheduler to reserve those resources.
• First time: over-estimate how many resources you need, and then modify appropriately.
• (qstat –f ${PBS_JOBID} at the bottom of your scripts will give you resources information when the job is done)
• Job finished time = Job waiting time + Job running time
Advanced Scheduling Tips
• Resources– A large proportion of the cluster can only run jobs that are
four hours or less– Most nodes have at least 24 GB of memory– Half have at least 64 GB of memory– Few have more than 64 GB of memory– Maximum running time of jobs: 7 days (168 hours)– Maximum memory that can be requested: 6 TB
• Scheduling Limit– total 520 cores using on HPCC at a time– 15 eligible jobs at a time– 1000 submitted jobs
Job Completion
• By default the job will automatically generate two files when it completes:– Standard Output:
• Ex: jobname.o5945571– Standard Error:
• Ex: jobname.e5945571• You can combine these files if you add the join
option in your submission script:#PBS -j oe
• You can change the output file name#PBS -o /mnt/home/netid/myoutputfile.txt
Other Job Properties
• resources (-l)– Walltime, memory, nodes, processor, network, etc.https://wiki.hpcc.msu.edu/display/hpccdocs/Advanced+Specification+of++Resources
#PBS –l feature=gpgpu,gbe#PBS –l nodes=2:ppn=8:gpu=2#PBS –l mem=16gb
• Email address (-M)#PBS –M [email protected]
- Email Options (-m)#PBS –m abe
Many others, see the wiki: http://wiki.hpcc.msu.edu/
Advanced Environment Variables
• The scheduler adds a number of environment variables that you can use in your script:
https://wiki.hpcc.msu.edu/display/hpccdocs/HPCC+Quick+Reference+Sheet
PBS_JOBID• The job number for the current job.
PBS_O_WORKDIR• The original working directory which the job was
submittedhttps://wiki.hpcc.msu.edu/display/hpccdocs/Advanced+Scripting+Using+PBS+Environment+Variables
• Examplemkdir ${PBS_O_WORKDIR}/${PBS_JOBID}
Agenda
• Introduction – iCER & MSU HPC system
• How to Use the HPC– Get on – Load your supply– Work on board
• Summary– Terms– Commands– Websites
Terms
• ICER, HPC• Account: login, directory (folder) • Node: Laptop, gateway, dev-nodes, comp-nodes• Cores: nodes=2:ppn=4 • Memory: total, per node, mem=2gb• Storage: home, research, scratch, disk• Module: list, load, spider, unload, purge, show,……• Files: software, data, job script • Path: • Queue: qsub, qstat, qdel.• Jobs: JobID,
Commands
• ssh• cd, ls, pwd, • module sub_comand• rm, mv, cp, scp• more, less, cat • nano, vi, • qsub, qstat, qdel, showq• echo, export • Mkdir, • tar, unzip, zip
Websites
• HPCC user wiki:http://wiki.hpcc.msu.edu
• Training informationhttp://icer.msu.edu/upcoming-workshops
Linux Commands:• An exhaustive list
http://ss64.com/bash/• A useful cheatsheet
http://fosswire.com/post/2007/08/unixlinux-command-cheat-sheet/• Explain a command given to you
http://explainshell.com/
Q & A
Thanks!
When would I use the HPC?
• Takes too long for computation• Runs out of memory• Needs licensed software• Needs advanced interface (visualization/database)• Read/write Lots of data
(Latency and Throughput)
Goals of this short workshop : Understanding of...
• the nature of high-performance computing (HPC) and when to use it;• the overall structure the MSU HPC Center (HPCC);• Understand the different ways (Xterm, ssh, RDP) of connecting to the
HPCC• the steps to run software on the MSU HPCC (workflow, module system);• managing files and data• working environment, using or developing software, and testing;• how to run programs on the HPC in 'batch' via the scheduler;• using GUI programs on the HPC for dev or computation• how to get help, training, or consultation
In short, Use Computation to "Reduce the Mean Time to Science" - Dirk Colbry, former director MSU HPCC
Connecting to the HPCC using ssh
Transferring Files
• Hostname: rsync.hpcc.msu.edu• Username: your netid• Password: your netid password• Port : 22
Exercise: download this presentation from/mnt/scratch/changc81/intro2hpcc/intro2hpcc.pdfto your local system