Information on Jobs
List job(s) … | Command |
---|---|
for you (or a different user) | squeue -u $USER |
in <partition> | squeue -u $USER -p <partition> |
priority | sprio -l |
running | squeue -u $USER -t RUNNING |
pending | squeue -u $USER -t PENDING |
details | scontrol show jobid -dd <jobid> |
status info | sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j <jobid> --allsteps |
statistics on completed (per job) | sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed |
statistics on completed (per username) | sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed |
Completed jobs can only be seen with sacct.
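
sacct typically prints MaxRSS with a unit suffix (for example K, M, or G), which is awkward to compare against a memory request given in MB. The following is a minimal sketch, assuming values of that suffixed form; `to_mb` is a hypothetical helper name for illustration, not part of Slurm.

```shell
# Hypothetical helper (not part of Slurm): convert a suffixed size such as
# "2048K", "512M" or "1.5G" to whole megabytes, rounding up.
to_mb() {
    printf '%s\n' "$1" | awk '{
        unit = toupper(substr($0, length($0), 1))      # last character
        if (unit ~ /[0-9]/) {                          # no suffix: assume plain bytes
            num = $0; unit = "B"
        } else {
            num = substr($0, 1, length($0) - 1)
        }
        value = num + 0
        if      (unit == "K") mb = value / 1024
        else if (unit == "M") mb = value
        else if (unit == "G") mb = value * 1024
        else if (unit == "T") mb = value * 1024 * 1024
        else                  mb = value / (1024 * 1024)
        rounded = int(mb)
        if (rounded < mb) rounded++                    # round up to a whole MB
        print rounded
    }'
}

# On a cluster (requires Slurm): -n drops the header, -P gives '|'-separated output.
# sacct -j <jobid> -n -P --format=JobID,MaxRSS | while IFS='|' read -r id rss; do
#     [ -n "$rss" ] && echo "$id $(to_mb "$rss") MB"
# done
```

The converted number gives a rough lower bound for a job's memory request; adding some headroom on top is usually sensible.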
Controlling Jobs
To… job(s) | Command |
---|---|
cancel one | scancel <jobid> |
cancel all | scancel -u <username> |
cancel all the pending | scancel -t PENDING -u <username> |
cancel one or more by name | scancel --name <myJobName> |
hold one (keep a pending job from starting) | scontrol hold <jobid> |
release a held one | scontrol release <jobid> |
requeue one | scontrol requeue <jobid> |
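
Before cancelling several jobs at once, it can help to preview exactly what would be cancelled. The following is a minimal dry-run sketch; `build_scancel` is a hypothetical helper name, and it only prints the command rather than running it. Job IDs are passed through unchanged, so array-job notation is not treated specially.

```shell
# Hypothetical helper (not part of Slurm): read job IDs (one per line) on
# stdin and print the scancel command that would cancel them.
build_scancel() {
    # Collapse newlines and other whitespace into single spaces, then trim.
    ids=$(tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
    if [ -z "$ids" ]; then
        echo "nothing to cancel"
    else
        echo "scancel $ids"
    fi
}

# On a cluster (requires Slurm): -h drops the header, -o %i prints only job IDs.
# squeue -u "$USER" -t PENDING -h -o %i | build_scancel
```

Once the printed command looks right, it can be copied and run by hand.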
Modifying Pending Jobs
Sometimes squeue --start indicates a wrong requirement specification, e.g. BadConstraints. In this case, the mismatch can be identified with scontrol show job <jobid> (which might require some experience). Wrong requirements can be corrected as follows:
To correct a job's | Command |
---|---|
memory requirement (per node) | scontrol update job <jobid> MinMemoryNode=<mem in MB> |
memory requirement (per CPU) | scontrol update job <jobid> MinMemoryCPU=<mem in MB> |
number of requested CPUs |