Such questions are the basic questions for any new tool to be used in batch jobs. We usually advise launching a few test jobs with representative parameterization1). Based on their results, a setup for further, productive jobs can be chosen that includes a safety margin for wall time and memory, without being so generous that it throttles your own throughput2).
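As a minimal sketch of this approach, the following Python snippet derives a time and memory request from a few test jobs. The function name `suggest_limits`, the 20% margin and the test-job numbers are assumptions made for illustration; only `--time` and `--mem` are actual sbatch options.

```python
import math


def suggest_limits(wall_seconds, peak_mem_gb, margin=0.2):
    """Return a (time limit, memory in whole GB) pair with a safety margin.

    wall_seconds and peak_mem_gb hold the values observed in test jobs.
    The margin of 20% is an assumed default, not a recommendation.
    """
    seconds = math.ceil(max(wall_seconds) * (1 + margin))
    mem_gb = math.ceil(max(peak_mem_gb) * (1 + margin))
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}", mem_gb


# Hypothetical results of three test jobs (wall time in s, peak memory in GB)
limit, mem = suggest_limits([329, 341, 318], [13.05, 12.80, 13.40])
print(f"#SBATCH --time={limit}")
print(f"#SBATCH --mem={mem}G")
```

A larger margin makes the job safer against being killed, but longer and bigger requests tend to wait longer in the queue, which is the trade-off described above.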
SLURM provides an on-board script, seff, which can be used to evaluate jobs that have finished. To invoke it, run
$ seff <jobid>
It will give an output like:
Job ID: <given job ID>
Cluster: <cluster>
User/Group: <user>/<group>
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 64
CPU Utilized: 05:04:22
CPU Efficiency: 86.73% of 05:50:56 core-walltime
Job Wall-clock time: 00:05:29
Memory Utilized: 13.05 GB
Memory Efficiency: 11.60% of 112.50 GB
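If you want to process such a report in a script, a minimal Python sketch could split it into key/value pairs. The helper name `parse_seff` is hypothetical, and the sketch assumes one `Key: value` pair per line, as seff prints it; the sample text repeats the example above, placeholders included.

```python
def parse_seff(text):
    """Split the 'Key: value' lines of a seff-style report into a dict."""
    report = {}
    for line in text.splitlines():
        # Split at the first colon only, so time values like 05:04:22 stay intact
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            report[key.strip()] = value.strip()
    return report


sample = """\
Job ID: <given job ID>
Cluster: <cluster>
User/Group: <user>/<group>
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 64
CPU Utilized: 05:04:22
CPU Efficiency: 86.73% of 05:50:56 core-walltime
Job Wall-clock time: 00:05:29
Memory Utilized: 13.05 GB
Memory Efficiency: 11.60% of 112.50 GB
"""

report = parse_seff(sample)
cpu_efficiency = float(report["CPU Efficiency"].split("%")[0])
print(report["State"], cpu_efficiency)
```

On a cluster, the text could come from running seff through Python's `subprocess.run` with `capture_output=True, text=True`, assuming seff is available on the PATH.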
The fields have the following meaning:
|Job ID||the ID of the job that was queried|
|Cluster||the cluster name|
|User||the user name the job ran under|
|Group||a unix group3)|
|State||the final state of the job, e.g. COMPLETED as in the example|
|Nodes||number of nodes reserved for the job|
|Cores per node||number of cores per node for the job|
|CPU Utilized||the overall CPU time consumed (used time per CPU * No. of CPUs)|
|CPU Efficiency||an apparent computation efficiency (CPU Utilized divided by the core-walltime); the core-walltime is the elapsed time of the job, including setup and cleanup, multiplied by the number of reserved cores|
|Job Wall-clock time||elapsed time of the job|
|Memory Utilized||peak memory usage, as sampled by SLURM|
|Memory Efficiency||Memory Utilized as a fraction of the requested memory; see below for an explanation|
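The two efficiency figures can be checked by hand. The following Python sketch recomputes them from the other values in the example report; the helper `hms_to_seconds` is hypothetical, and its support for a leading `D-` day count is an assumption about how longer jobs are reported.

```python
def hms_to_seconds(hms):
    """Convert 'HH:MM:SS' (or, by assumption, 'D-HH:MM:SS') to seconds."""
    days = 0
    if "-" in hms:
        day_part, hms = hms.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Values taken from the example report
nodes, cores_per_node = 1, 64
wall = hms_to_seconds("00:05:29")      # Job Wall-clock time
cpu_used = hms_to_seconds("05:04:22")  # CPU Utilized

# core-walltime: elapsed time multiplied by the number of reserved cores
core_walltime = wall * nodes * cores_per_node

cpu_efficiency = 100 * cpu_used / core_walltime
memory_efficiency = 100 * 13.05 / 112.50  # Memory Utilized / requested memory

print(f"core-walltime: {core_walltime} s")
print(f"CPU Efficiency: {cpu_efficiency:.2f}%")
print(f"Memory Efficiency: {memory_efficiency:.2f}%")
```

The 21056 seconds of core-walltime correspond to the 05:50:56 shown in the report, and both percentages match the reported values.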
Obviously, the CPU efficiency should not be too low. In the example, about 13% of the reserved CPU time apparently went unused. Is this good or bad? And the reported “Memory Efficiency” of 11.60% is way below anything that could be considered “efficient”, right?
Still, the reported “Memory Efficiency” can be an important measure of the memory actually used. If you want to know your peak memory usage, it can give you a hint.
However, please note that SLURM samples the memory usage in intervals. Hence, usage peaks may be missed.
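To illustrate why sampled values can miss peaks, here is a small Python sketch with a made-up memory trace. The 30-second interval and all numbers are assumptions; the actual sampling interval depends on the cluster configuration.

```python
# Hypothetical memory usage (GB) of a 120 s job, one value per second
trace = [4.0] * 120
for second in (47, 48, 49):
    trace[second] = 13.0  # a short spike lasting three seconds

interval = 30                # assumed sampling interval in seconds
sampled = trace[::interval]  # samples taken at t = 0, 30, 60, 90

true_peak = max(trace)
sampled_peak = max(sampled)

print(f"true peak:    {true_peak} GB")
print(f"sampled peak: {sampled_peak} GB")
```

In this sketch the spike falls between two samples, so the sampled peak stays at the baseline. This is one more reason to keep a safety margin on the memory request.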
Genuine profiling in order to optimize an application is not the purpose of post-hoc job analysis. We offer various tools for this purpose and provide a wiki page on this topic.