Find out the CPU time and memory usage of a Slurm job

If your job has finished, sacct is the command you're looking for. If it is still running, look into sstat instead. With sacct, the --format switch is the other key element. If you run this command:

sacct -e

you'll get a printout of the fields that can be passed to the --format switch. Each field is described in the Job Accounting Fields section of the man page. For CPU time and memory, CPUTime and MaxRSS are probably what you're looking for. Keep in mind that CPUTime is the elapsed time multiplied by the number of allocated CPUs, whereas TotalCPU reports the CPU time the job actually consumed. CPUTimeRAW can be used instead if you want the number in seconds rather than in the usual Slurm time format.

sacct --format="CPUTime,MaxRSS"
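
If you already have a value in Slurm's time format (for example from CPUTime or Elapsed) and want seconds, a small helper can do the conversion. This is only a sketch: slurm_to_seconds is a made-up name, not part of Slurm, and it assumes the [DD-][HH:]MM:SS layout, truncating any fractional seconds.

```shell
# Hypothetical helper (not a Slurm command): convert a Slurm duration
# of the form [DD-][HH:]MM:SS[.fff] into whole seconds.
slurm_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; t = $0
        # Split off the optional "DD-" day prefix.
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        # Read fields from the right so seconds are always last.
        s = p[n]
        m = (n >= 2) ? p[n-1] : 0
        h = (n >= 3) ? p[n-2] : 0
        # %d truncates fractional seconds.
        printf "%d\n", days * 86400 + h * 3600 + m * 60 + s
    }'
}

slurm_to_seconds 1-02:03:04   # prints 93784
```

When the accounting database is available, asking sacct for CPUTimeRAW directly is simpler than converting after the fact.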

The other answers all detail output formats for sacct, which is great for looking at multiple jobs aggregated in a table.

However, sometimes you want to look at a specific job in more detail, so you can tell whether your job efficiently used the allocated resources. For that, seff is very useful. The syntax is simply seff <Jobid>. For example, here's a recent job of mine (that failed):

$ seff 15780625

Job ID: 15780625
Cluster: mycluster
User/Group: myuser/mygroup
State: OUT_OF_MEMORY (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 12:06:01
CPU Efficiency: 85.35% of 14:10:40 core-walltime
Job Wall-clock time: 00:53:10
Memory Utilized: 1.41 GB
Memory Efficiency: 70.47% of 2.00 GB

Note that the key CPU metric, CPU Utilized, corresponds to the TotalCPU field from sacct, while Memory Utilized corresponds to MaxRSS.
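
The CPU Efficiency figure in the output above is consistent with dividing the CPU time used by the core-walltime (wall-clock time multiplied by the number of cores). Here is a minimal sketch of that arithmetic, assuming you have already converted the times to seconds; cpu_efficiency is a hypothetical helper, not something seff or sacct provides.

```shell
# Hypothetical helper: CPU efficiency as a percentage.
#   $1 = CPU time used, in seconds (TotalCPU)
#   $2 = wall-clock time, in seconds (Elapsed)
#   $3 = number of allocated cores
cpu_efficiency() {
    awk -v used="$1" -v wall="$2" -v cores="$3" \
        'BEGIN { printf "%.2f\n", 100 * used / (wall * cores) }'
}

# Values from the seff output above:
# 12:06:01 = 43561 s of CPU time, 00:53:10 = 3190 s of wall-clock time, 16 cores.
cpu_efficiency 43561 3190 16   # prints 85.35
```

Here 3190 s times 16 cores gives 51040 s, which is the 14:10:40 core-walltime that seff reports.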
