Slurm
Basic Usage
- sbatch - Submit batch jobs. (Similar to qsub in SGE.)
- sinfo - Show information on available machines and machine status.
- squeue - Show information on running and pending jobs. (Similar to qstat in SGE.)
- scancel - Cancel a running or pending job. (Similar to qdel in SGE.)
- salloc - Request an interactive allocation. (Similar to qsh in SGE.)
- srun - Launch compute jobs within an allocation. Jobs not launched with srun will execute on the master node.
Comprehensive man pages for these and more are available on the clusters.
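For an interactive session, salloc grants an allocation and srun runs commands on it. A minimal sketch (the partition, processor count, and command are illustrative):
salloc -p debug -n 4      # request 4 processors in the debug partition
srun hostname -s          # runs on the allocated compute nodes
exit                      # release the allocation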
Quick Walkthrough
Submit a job to the normal queue, requesting 16 processors, and running a job script called slurmrun.sh:
sbatch -p normal -n 16 slurmrun.sh
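Here slurmrun.sh is just a placeholder name; a minimal job script might look like the following (the program path is hypothetical):
#!/bin/bash
# Launch the program on the allocated processors; commands not run through srun
# execute on the master node
srun /path/to/your/program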
Check on the job status and node availability:
squeue
sinfo
Cancel it if necessary:
scancel <jobid from squeue>
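Both squeue and scancel also accept a user filter, which is handy on a shared cluster:
squeue -u $USER       # show only your own jobs
scancel -u $USER      # cancel all of your jobs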
Example Job Script for MPICH
MPICH requires a hostfile to launch. The following example script builds the hostfile for you and launches the job with mpirun:
#!/bin/bash -x

# Build a hostfile listing the nodes assigned to this job
HOSTFILE=/tmp/hosts.$SLURM_JOB_ID
srun hostname -s > $HOSTFILE

# Derive the process count if SLURM_NPROCS is not set
if [ -z "$SLURM_NPROCS" ] ; then
  if [ -z "$SLURM_NTASKS_PER_NODE" ] ; then
    SLURM_NTASKS_PER_NODE=1
  fi
  SLURM_NPROCS=$(( $SLURM_JOB_NUM_NODES * $SLURM_NTASKS_PER_NODE ))
fi

# Launch the executable with mpirun, then clean up the hostfile
/usr/local/mpich/latest/ch_p4/bin/mpirun -machinefile $HOSTFILE -np $SLURM_NPROCS /bigtmp/wickbt/a.out
rm /tmp/hosts.$SLURM_JOB_ID
Submit the job to run with 8 processors (-n <number of processors>):
sbatch -p normal -n 8 slurmjob.sh
Please note that extra processors on a node will not be assigned to other jobs. Only one job is allowed on a node at a time to avoid resource contention. With this in mind, please always use a multiple of 4 processors (on borg) or 8 processors (on hydra) when submitting your job to avoid wasting resources.
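The sbatch options can also be embedded in the job script itself with #SBATCH directives, so the submission line stays short. A minimal sketch, assuming hydra's 8 processors per node as noted above:
#!/bin/bash
#SBATCH -p normal
#SBATCH -n 16        # a multiple of 8 on hydra, so no processors are wasted
srun hostname -s     # replace with your actual compute job
The script can then be submitted with sbatch and no extra command-line options.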
Chaining Jobs Together
Slurm has a variety of methods to link jobs together; man sbatch gives more details on specific usage. For example, the following script runs a short test job that can be chained:
#!/bin/bash

# Build a hostfile listing the nodes assigned to this job
HOSTFILE=/tmp/hosts.$SLURM_JOB_ID
srun hostname -s > $HOSTFILE

# Derive the process count if SLURM_NPROCS is not set
if [ -z "$SLURM_NPROCS" ] ; then
  if [ -z "$SLURM_NTASKS_PER_NODE" ] ; then
    SLURM_NTASKS_PER_NODE=1
  fi
  SLURM_NPROCS=$(( $SLURM_JOB_NUM_NODES * $SLURM_NTASKS_PER_NODE ))
fi

export MPICH_PROCESS_GROUP=no

echo 'Starting job'
date
time /usr/local/mpich/latest/ch_p4/bin/mpirun -machinefile $HOSTFILE -np $SLURM_NPROCS /bin/sleep 60
echo 'Job completed'
date
You could then submit this job multiple times with:
sbatch --dependency=singleton --job-name=<name> (more options) (job script name)
and only one would run at a time; the singleton dependency makes each submission wait until all earlier jobs with the same name and user have finished.
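As a sketch, submitting the same script several times under a shared job name queues the runs back to back (the script and name here are placeholders):
for i in 1 2 3 4; do
  sbatch --dependency=singleton --job-name=chain -p normal -n 8 chain.sh
done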
Partitions (Queues)
Both Hydra and Borg have debug and normal partitions.
- The debug queue is the default (if you do not use a -p <partition name> argument with sbatch or salloc) and has a one-hour time limit.
- The normal queue has a 12-hour time limit.
These time limits are automatically enforced. Please make sure that your jobs will complete within the allocated time. If your simulation will not be able to complete under these limits please email help@scorec.rpi.edu to arrange access to longer-running queues.
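If your job needs less than the partition limit, you can say so explicitly with the -t (--time) option so the scheduler can plan around it; for example, requesting a 2-hour limit within the normal partition's 12-hour cap:
sbatch -p normal -n 8 -t 02:00:00 slurmjob.sh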
Other References
- http://wiki.ccni.rpi.edu/index.php/SLURM - Slurm Documentation at CCNI
- http://schedmd.com/slurmdocs/ - Slurm Home Page
- http://schedmd.com/slurmdocs/quickstart.html - Slurm Quickstart Users Guide