SLURM: Scheduling and Managing Jobs

For a printable list of SLURM commands, download the ACCRE Cheat Sheet. SchedMD, the creators of SLURM, have a printable reference as well.

SLURM (Simple Linux Utility for Resource Management) is a software package for submitting, scheduling, and monitoring jobs on large compute clusters. This page details how to use SLURM for submitting and monitoring jobs on ACCRE’s Vampire cluster. New cluster users should consult our Getting Started pages, which are designed to walk you through the process of creating a job script, submitting a job to the cluster, monitoring jobs, checking job usage statistics, and understanding our cluster policies. SLURM has been in use for job scheduling since early 2015; previously Torque and Moab were used for that purpose.

This page describes the basic commands of SLURM. For more advanced topics, see the page on GPUs, Parallel Processing and Job Arrays. ACCRE staff have also created a number of utilities to assist you in scheduling and managing your jobs.

All the examples on this page can be downloaded from ACCRE’s GitHub page by issuing the following commands from a cluster gateway:

module load GCC git
git clone https://github.com/accre/SLURM.git

Batch Scripts

The first step for submitting a job to SLURM is to write a batch script, as shown below. The script includes a number of #SBATCH directive lines that tell SLURM details about your job, including its resource requirements. The example below is a simple Python job requesting 1 node, 1 CPU core, 500 MB of RAM, and 10 minutes of wall time. Note that the node count (#SBATCH --nodes=1) and the CPU core count (#SBATCH --ntasks=1) must be specified on two separate lines in SLURM.

#!/bin/bash
#SBATCH --mail-user=myemail@vanderbilt.edu
#SBATCH --mail-type=ALL
#SBATCH --nodes=1    # comments allowed 
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --mem=500M
#SBATCH --output=python_job_slurm.out

# These are comment lines
# Load the Anaconda distribution of Python, which comes
# pre-bundled with many of the popular scientific computing tools like
# numpy, scipy, pandas, scikit-learn, etc.
module load Anaconda2

# Pass your Python script to the Anaconda2 python interpreter for execution
python vectorization.py

Note that a SLURM batch script must begin with the #!/bin/bash directive on the first line. The subsequent lines begin with the SLURM directive #SBATCH followed by a resource request or other pertinent job information. Email alerts will be sent to the specified address when the job begins, aborts, and ends. Below the #SBATCH directives are the Linux commands needed to run your program or analysis. Once your job has been submitted via the sbatch command (details shown below), SLURM will match your resource requests with idle resources on the cluster, run your specified commands on one or more compute nodes, and then email you (if requested in your batch script) when your job begins, ends, and/or fails.

Here is a list of basic #SBATCH directives:

#SBATCH Directive | Description
--nodes=[count] | Node count
--ntasks-per-node=[count] | Processes per node
--ntasks=[count] | Total processes (across all nodes)
--cpus-per-task=[count] | CPU cores per process
--nodelist=[nodes] | Job host preference
--exclude=[nodes] | Job host to avoid
--time=[min] or --time=[dd-hh:mm:ss] | Wall clock limit
--mem=[count] | RAM per node
--mem-per-cpu=[count][M or G] | RAM per CPU core
--output=[file_name] | Standard output file
--error=[file_name] | Standard error file
--array=[array_spec] | Launch job array
--mail-user=[email_address] | Email for job alerts
--mail-type=[BEGIN or END or FAIL or REQUEUE or ALL] | Email alert type
--account=[account] | Account to charge
--dependency=[state:job_id] | Job dependency
--job-name=[name] | Job name
--constraint=[attribute] | Request node attribute (westmere, sandy_bridge, haswell, eight, twelve, sixteen)
--partition=[name] | Submit job to specified partition (production (default), debug, maxwell, fermi)

Note that the --constraint option allows a user to target certain processor families.
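The difference between --mem (RAM per node) and --mem-per-cpu (RAM per CPU core) in the table above can be made concrete with a little arithmetic. The sketch below is only an illustration: the helper name total_mem_mb and the request values are made up for this example, and it assumes SLURM allocates exactly ntasks × cpus-per-task cores.

```shell
#!/bin/bash
# Total RAM implied by a --mem-per-cpu request:
# (number of tasks) x (CPU cores per task) x (RAM per core, in MB).
# total_mem_mb is a hypothetical helper, not a SLURM command.
total_mem_mb() {
    local ntasks="$1" cpus_per_task="$2" mem_per_cpu_mb="$3"
    echo $(( ntasks * cpus_per_task * mem_per_cpu_mb ))
}

# Hypothetical request: --ntasks=4 --cpus-per-task=2 --mem-per-cpu=2000M
total_mem_mb 4 2 2000   # prints 16000
```

If all eight cores land on a single node, this request is roughly comparable to asking for --mem=16000M on that node.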

Partitions (Queues)

All non-GPU groups on the cluster have access to the production and debug partitions. The purpose of the debug partition is to allow users to quickly test a representative job before submitting a larger number of jobs to the production partition (which is the default partition on our cluster). Wall time limits and other policies for each of our partitions are shown below.

Partition | Max Wall Time | Max Running Jobs | Max Submitted Jobs | Resources
production | 14 days | n/a | n/a | 6000-6500 CPU cores
debug | 30 minutes | 2 | 5 | 8 CPU cores
pascal | 5 days | n/a | n/a | 80 CPU cores, 40 Pascal GPUs
maxwell | 5 days | n/a | n/a | 144 CPU cores, 48 Maxwell GPUs

Commands

SLURM offers a number of helpful commands for tasks ranging from job submission and monitoring to modifying resource requests for jobs that have already been submitted to the queue. Below is a list of SLURM commands:

Command | Function
sbatch [job_script] | Job submission
squeue | Job/Queue status
scancel [JOB_ID] | Job deletion
scontrol hold [JOB_ID] | Job hold
scontrol release [JOB_ID] | Job release
sinfo | Cluster status
salloc | Launch interactive job
srun [command] | Launch (parallel) job step
sacct | Displays job accounting information

sbatch

The sbatch command is used for submitting jobs to the cluster. sbatch accepts a number of options either from the command line, or (more typically) from a batch script. An example of a SLURM batch script (called simple.slurm) is shown below:

#!/bin/bash
#SBATCH --mail-user=myemail@vanderbilt.edu
#SBATCH --mail-type=ALL
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:15:00     # 15 minutes
#SBATCH --output=my.stdout
#SBATCH --job-name=just_a_test

# Put commands for executing job below this line
# This example is loading the Anaconda distribution of Python and
# writing out the version of Python
module load Anaconda2
python --version

To submit this batch script, a user would type:

sbatch simple.slurm

This job (called just_a_test) requests 1 compute node, 1 task (by default, SLURM will assign 1 CPU core per task), 1 GB of RAM per CPU core, and 15 minutes of wall time (the maximum time the job is allowed to run). Note that these are the defaults for any job, but it is good practice to include these lines in a SLURM script in case you need to request additional resources.

Optionally, any #SBATCH line may be replaced with an equivalent command-line option. For instance, the #SBATCH --ntasks=1 line could be removed and a user could specify this option from the command line using:

sbatch --ntasks=1 simple.slurm

The commands needed to execute a program must be included beneath all #SBATCH directives. Lines beginning with the # symbol (other than the #!/bin/bash line and the #SBATCH directives) are comment lines that are not executed by the shell. The example above simply prints the version of Python loaded in a user’s path. It is good practice to include any module load commands in your SLURM script. A real job would likely do something more complex than the example above, such as read in a Python file for processing by the Python interpreter.
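When a job is submitted, sbatch prints a confirmation line of the form "Submitted batch job <id>". The sketch below shows one way to pull the job ID out of that line so it can be passed to other commands such as scancel or scontrol. The helper name job_id_from_sbatch_output and the sample ID are made up for illustration.

```shell
#!/bin/bash
# Extract the numeric job ID from sbatch's confirmation line,
# which has the form "Submitted batch job <id>".
# job_id_from_sbatch_output is a hypothetical helper, not a SLURM command.
job_id_from_sbatch_output() {
    local line="$1"
    # Strip everything up to and including the last space
    echo "${line##* }"
}

# Sample confirmation line with a made-up job ID
job_id_from_sbatch_output "Submitted batch job 1234567"   # prints 1234567

# On the cluster, the helper could wrap a real submission:
#   job_id=$(job_id_from_sbatch_output "$(sbatch simple.slurm)")
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which avoids the string handling when you only need the ID.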

For more information about sbatch see: http://slurm.schedmd.com/sbatch.html

squeue

squeue is used for viewing the status of jobs. By default, squeue will output the following information about currently running jobs and jobs waiting in the queue: Job ID, Partition, Job Name, User Name, Job Status, Run Time, Node Count, and Node List. There are a large number of command-line options available for customizing the information provided by squeue. Below is a list of examples:

Command | Meaning
squeue --long | Provide more job information
squeue --user=USER_ID | Provide information for USER_ID’s jobs
squeue --account=ACCOUNT_ID | Provide information for jobs running under ACCOUNT_ID
squeue --states=running | Show running jobs only
squeue --Format=account,username,numcpus,state,timeleft | Customize output of squeue
squeue --start | List estimated start time for queued jobs
squeue --help | Show all options

For more information about squeue see: http://slurm.schedmd.com/squeue.html
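Because squeue only lists jobs that are still pending or running, it can also be used to wait for a job to leave the queue. The following is a minimal sketch, not an ACCRE utility: the function name wait_for_job is made up, and it relies on the squeue options --jobs and --noheader to print one line per matching job and nothing once the job is gone.

```shell
#!/bin/bash
# Block until the given job no longer appears in squeue output.
# Usage: wait_for_job JOB_ID [POLL_SECONDS]
# wait_for_job is a hypothetical helper, not a SLURM command.
wait_for_job() {
    local job_id="$1" interval="${2:-60}"
    # Empty output means the job has left the queue
    # (it finished, failed, or was cancelled)
    while [ -n "$(squeue --jobs="$job_id" --noheader 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "Job $job_id is no longer in the queue"
}
```

A call such as wait_for_job 1234567 120 would check every two minutes. Keep the polling interval generous, since frequent squeue calls add load to the scheduler. Note that the sketch cannot tell a finished job apart from a squeue failure, as both produce empty output.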

sacct

This command is used for viewing information for completed jobs. This can be useful for monitoring job progress or diagnosing problems that occurred during job execution. By default, sacct will report Job ID, Job Name, Partition, Account, Allocated CPU Cores, Job State, and Exit Code for all of the current user’s jobs that completed since midnight of the current day. Many options are available for modifying the information output by sacct:

Command | Meaning
sacct --starttime 12.04.14 | Show information since midnight of Dec 4, 2014
sacct --allusers | Show information for all users
sacct --accounts=ACCOUNT_ID | Show information for all users under ACCOUNT_ID
sacct --format="JobID,user,account,elapsed,Timelimit,MaxRSS,ReqMem,MaxVMSize,ncpus,ExitCode" | Show listed job information
sacct --help | Show all options

The --format option is particularly useful, as it allows a user to customize the output of job usage statistics. We suggest creating an alias for running a customized version of sacct. For instance, the elapsed and Timelimit arguments allow for a comparison of actual vs. allocated wall time. MaxRSS and MaxVMSize show the maximum RAM and virtual memory usage for a job, respectively, while ReqMem reports the amount of RAM requested.
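One possible form of such an alias is sketched here. The name sacct_usage is only a suggestion; the field list is the one from the table above.

```shell
# Hypothetical alias name; the fields are the ones listed in the table above.
# Adding this line to ~/.bashrc makes the alias available in every new shell.
alias sacct_usage='sacct --format="JobID,user,account,elapsed,Timelimit,MaxRSS,ReqMem,MaxVMSize,ncpus,ExitCode"'
```

Other sacct options can still be appended when the alias is used, for example sacct_usage --starttime 12.04.14.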

For more information about sacct see: http://slurm.schedmd.com/sacct.html

scontrol

scontrol is used for monitoring and modifying queued jobs, as well as holding and releasing jobs. One of its most powerful options is the scontrol show job option. Below is a list of useful scontrol commands:

Command | Meaning
scontrol show job JOB_ID | Show information for queued or running job
scontrol hold JOB_ID | Place hold on job
scontrol release JOB_ID | Release hold on job
scontrol show nodes | Show hardware details for nodes on cluster
scontrol update JobID=JOB_ID Timelimit=1-12:00:00 | Change wall time to 1 day 12 hours
scontrol update JobID=JOB_ID Dependency=afterany:OTHER_JOB_ID | Add job dependency so that JOB_ID only starts after OTHER_JOB_ID completes
scontrol --help | Show all options

Please note that the time limit or memory of a job can only be adjusted for pending jobs, not for running jobs.
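The Timelimit value shown above, 1-12:00:00, uses the days-hours:minutes:seconds format. As a worked example of how such a value translates into a duration, the sketch below converts it to seconds. The function name slurm_time_to_seconds is made up, and the function only handles the [days-]hours:minutes:seconds form, not the other time formats SLURM accepts.

```shell
#!/bin/bash
# Convert a SLURM time string of the form [days-]hours:minutes:seconds
# into seconds. slurm_time_to_seconds is a hypothetical helper.
slurm_time_to_seconds() {
    local spec="$1" days=0 hours minutes seconds
    # Split off the optional "days-" prefix
    if [[ "$spec" == *-* ]]; then
        days="${spec%%-*}"
        spec="${spec#*-}"
    fi
    IFS=: read -r hours minutes seconds <<< "$spec"
    # The 10# prefix forces base 10 so values like "08" are not read as octal
    echo $(( 10#$days * 86400 + 10#$hours * 3600 + 10#$minutes * 60 + 10#$seconds ))
}

slurm_time_to_seconds 1-12:00:00   # prints 129600 (36 hours)
slurm_time_to_seconds 0-00:15:00   # prints 900 (15 minutes)
```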

For more information about scontrol see: http://slurm.schedmd.com/scontrol.html

salloc

The function of salloc is to launch an interactive job on compute nodes. This can be useful for troubleshooting/debugging a program or if a program requires user input. To launch an interactive job requesting 1 node, 2 CPU cores, and 1 hour of wall time, a user would type:

salloc --nodes=1 --ntasks=2 --time=1:00:00

This command will execute and then wait for the allocation to be obtained. Once the allocation is granted, an interactive shell is initiated on the allocated node (or one of the allocated nodes, if multiple nodes were allocated). At this point, a user can execute normal commands and launch their application as usual.

Note that all of the sbatch options are also applicable for salloc, so a user can insert other typical resource requests, such as memory. Another useful feature in salloc is that it enforces resource requests to prevent users or applications from using more resources than were requested. For example:

[bob@vmps12 ~]$ salloc --nodes=1 --ntasks=2 --time=1:00:00
salloc: Pending job allocation 1772833
salloc: job 1772833 queued and waiting for resources
salloc: job 1772833 has been allocated resources
salloc: Granted job allocation 1772833
[bob@vmp586 ~]$ hostname
vmp586
[bob@vmp586 ~]$ srun -n 2 hostname
vmp586
vmp586
[bob@vmp586 ~]$ srun -n 4 hostname
srun: error: Unable to create job step: More processors requested than permitted
[bob@vmp586 ~]$ exit
exit
srun: error: vmp586: task 0: Exited with exit code 1
salloc: Relinquishing job allocation 1772833
salloc: Job allocation 1772833 has been revoked.
[bob@vmps12 ~]$

In this example, srun -n 4 failed because only 2 tasks were allocated for this interactive job (for details on srun, see the srun section below). Also note that typing exit during the interactive session will kill the interactive job, even if the allotted wall time has not been reached.

For more information about salloc see: http://slurm.schedmd.com/salloc.html

xalloc

Similar to salloc, this command provides an interactive shell on a compute node, with the added possibility of running programs with a graphical user interface (GUI) directly on the compute node. To display the GUI correctly on your monitor, you first need to connect to the cluster’s gateway with X11 forwarding enabled, as follows:

[bob@bobslaptop ~]$ ssh -X bob@login.accre.vanderbilt.edu

Then from the gateway request the interactive job with X11 forwarding as in the following example:

[bob@vmps12 ~]$ xalloc --nodes=1 --ntasks=2 --time=1:00:00
srun: job 12555243 queued and waiting for resources
srun: job 12555243 has been allocated resources
[bob@vmp586 ~]$

At this point, when you launch GUI-based software, its interface should appear on your monitor.

sinfo

sinfo allows users to view information about SLURM nodes and partitions. A partition is a set of nodes (usually a cluster) defined by the cluster administrator. Below are a few example uses of sinfo:

Command | Meaning
sinfo -Nel | Displays info in a node-oriented format
sinfo --partition=gpu | Get information about GPU nodes
sinfo --states=IDLE | Displays info about idle nodes
sinfo --help | Show all options

For more information about sinfo see: http://slurm.schedmd.com/sinfo.html

sreport

sreport is used for generating reports of job usage and cluster utilization. It queries the SLURM database to obtain this information. By default information will be shown for jobs run since midnight of the current day. Some examples:

Command | Meaning
sreport cluster utilization | Show cluster utilization report
sreport user top | Show top 10 cluster users based on total CPU time
sreport cluster AccountUtilizationByUser start=2014-12-01 | Show account usage per user dating back to December 1, 2014
sreport job sizesbyaccount PrintJobCount | Show number of jobs run on a per-group basis
sreport --help | Show all options

For more information about sreport see: http://slurm.schedmd.com/sreport.html

srun

Finally, srun is used to launch job steps, typically parallel tasks, within a job allocation (as in the srun -n 2 hostname call in the salloc example above). More information about srun is available in GPUs, Parallel Processing and Job Arrays.

Environment Variables

Variable | Meaning
SLURM_JOBID | Job ID
SLURM_SUBMIT_DIR | Job submission directory
SLURM_SUBMIT_HOST | Name of host from which job was submitted
SLURM_JOB_NODELIST | Names of nodes allocated to job
SLURM_ARRAY_TASK_ID | Task id within job array
SLURM_JOB_CPUS_PER_NODE | CPU cores per node allocated to job
SLURM_NNODES | Number of nodes allocated to job

Each of these environment variables can be referenced from a SLURM batch script using the $ symbol before the name of the variable (e.g. echo $SLURM_JOBID). A full list of SLURM environment variables can be found here: http://slurm.schedmd.com/sbatch.html#lbAF
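As an illustration, the sketch below prints a few of the variables from the table at the start of a job, which can help when reading output files later. The function name print_job_info is made up. The ${VAR:-unset} form is an addition so that the snippet also runs outside of a SLURM job, where these variables are not defined.

```shell
#!/bin/bash
# Print a short summary of the job's SLURM environment.
# print_job_info is a hypothetical helper; outside of a SLURM job
# each variable falls back to the text "unset".
print_job_info() {
    echo "Job ID:           ${SLURM_JOBID:-unset}"
    echo "Submit directory: ${SLURM_SUBMIT_DIR:-unset}"
    echo "Submit host:      ${SLURM_SUBMIT_HOST:-unset}"
    echo "Nodes allocated:  ${SLURM_NNODES:-unset}"
    echo "Node list:        ${SLURM_JOB_NODELIST:-unset}"
}

print_job_info
```

Placed below the #SBATCH directives in a batch script, this writes the summary to the job's standard output file.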