Basic SLURM Commands


Table of Contents

  1. Checking Cluster Info
  2. Running a Program
  3. Checking the Queue
  4. Kill, stop or restart a job
  5. Check utilization history

Checking Cluster Info

sinfo

To see the status of the cluster and its partitions, including availability, time limit, and number of nodes, run:

sinfo 
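When the full listing is too long, sinfo can be narrowed to one partition. The helper functions below are a minimal sketch: the function names are made up for illustration, and the partition name is whatever your cluster defines (quick is used later on this page).

```shell
# Show a one-line summary for a single partition.
# Usage: partition_summary PARTITION
partition_summary() {
    sinfo --summarize --partition="$1"
}

# List every node of a partition individually, with extended details.
# Usage: partition_nodes PARTITION
partition_nodes() {
    sinfo --Node --long --partition="$1"
}
```

For example, partition_summary quick prints the summary line for the quick partition only.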

Click here for more information on the cluster partitions and architecture

Running a Program

Using srun

To quickly run a small program for prototyping or testing, run:

srun -N 1 --partition=quick ./executable 

This allocates one node on the quick partition for the default amount of time. We suggest using the quick partition: its jobs are limited to 10 minutes each, so an allocation is generally easier to get.
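Rather than relying on the default time, the limit can be stated explicitly with the --time option. The wrapper below is only a sketch; quick_run is a hypothetical name, and the time is given in minutes.

```shell
# Run a command on one node of the quick partition with an explicit
# time limit in minutes.
# Usage: quick_run MINUTES COMMAND [ARGS...]
quick_run() {
    local minutes="$1"
    shift
    srun --nodes=1 --partition=quick --time="$minutes" "$@"
}
```

For example, quick_run 5 ./executable asks for one node for at most five minutes.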

Click here for more information on the srun command and parameters.

Using sbatch

For larger and more complex jobs, it is best practice to write a run script that automates module loading, passing environment variables, generating input and output files, and running the executable.

Click here for more information on writing batch scripts for sbatch.

Submit the script with:

sbatch run_script.sh 
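A run script of the kind described above might look like the following sketch. The job name, resource values, environment variable, and the commented-out module name are placeholders chosen for illustration, not settings taken from this cluster. The snippet writes the script to run_script.sh so it can be submitted with the sbatch command above.

```shell
# Write a small example batch script to run_script.sh.
# All values below are illustrative placeholders; adjust them as needed.
cat > run_script.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=quick
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --output=%x-%j.out

# Load whatever modules the program needs (hypothetical module name).
# module load gcc

# Pass settings to the program through environment variables.
export OMP_NUM_THREADS=1

# Run the executable on the allocated node.
srun ./executable
EOF
```

In the output file name, %x expands to the job name and %j to the job ID, so each submission writes to its own file.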

Checking the Queue

squeue

To see the entire current job queue, including both running and pending jobs, use:

squeue

To check a particular user's jobs in the queue, use:

squeue -u username
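The queue listing can also be filtered by job state or condensed into counts. The functions below are a sketch with hypothetical names; they use the squeue options --states, --noheader, and --format, where %T prints the job state in its long form.

```shell
# Count a user's jobs per state (for example RUNNING or PENDING).
# Usage: count_jobs_by_state USERNAME
count_jobs_by_state() {
    squeue --noheader --user="$1" --format="%T" | sort | uniq -c
}

# List only the pending jobs of a user.
# Usage: pending_jobs USERNAME
pending_jobs() {
    squeue --user="$1" --states=PENDING
}
```

For example, count_jobs_by_state username prints one line per state with the number of that user's jobs in it.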

Kill, stop or restart a job

To kill a specific job by job ID:

scancel jobId

To kill all the jobs belonging to a user:

scancel -u username 

To kill only the pending jobs belonging to a user:

scancel -u username --state=pending

To stop (pause) a running job without killing it:

scancel -s SIGSTOP jobId

To restart (resume) a stopped job:

scancel -s SIGCONT jobId
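The stop and restart commands above can be wrapped in small helpers that refuse to run without a job ID. This is a sketch with hypothetical function names. Because scancel only delivers the signal, a stopped job presumably still counts as running to the scheduler, so it likely keeps its nodes and its time limit keeps counting.

```shell
# Pause a job by sending it SIGSTOP.
# Usage: pause_job JOBID
pause_job() {
    [ -n "${1:-}" ] || { echo "usage: pause_job JOBID" >&2; return 1; }
    scancel -s SIGSTOP "$1"
}

# Resume a previously paused job by sending it SIGCONT.
# Usage: resume_job JOBID
resume_job() {
    [ -n "${1:-}" ] || { echo "usage: resume_job JOBID" >&2; return 1; }
    scancel -s SIGCONT "$1"
}
```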

Check utilization history

sacct

To check the utilization history of your jobs:

sacct
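Run without options, sacct typically reports only recent jobs. To look further back or to choose the columns, the --starttime and --format options can be added. The function below is a sketch; job_history is a hypothetical name and the column list is just one reasonable selection.

```shell
# Show a user's job history since a given date, with selected columns.
# Usage: job_history USERNAME START_DATE   (date as YYYY-MM-DD)
job_history() {
    sacct --user="$1" --starttime="$2" \
          --format=JobID,JobName,Partition,State,Elapsed,ExitCode
}
```

For example, job_history username 2024-01-01 lists that user's jobs from the start of 2024 onward.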