slurm_manage

This is an old revision of the document!


Information on Jobs

List job(s) …                           Command
for you (or a different user)           squeue -u $USER
in <partition>                          squeue -u $USER -p <partition>
priority                                sprio -l
running                                 squeue -u $USER -t RUNNING
pending                                 squeue -u $USER -t PENDING
details                                 scontrol show jobid -dd <jobid>
status info                             sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j <jobid> --allsteps
statistics on completed (per job)       sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed
statistics on completed (per username)  sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed

Completed jobs can only be seen with sacct.
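As a rough illustration of how the queries above can be combined, the sketch below wraps two of them in small shell functions. The function names jobs_by_state and finished_since are made up for this page; -h (no header), -o '%T' (print only the job state) and -S (start time) are standard squeue and sacct options, but please check them against the Slurm version installed at your site.

```shell
# Count queued jobs per state (RUNNING, PENDING, ...) for a user.
# Defaults to $USER when no username is given.
# -h suppresses the header line, -o '%T' prints only the job state.
jobs_by_state() {
    squeue -u "${1:-$USER}" -h -o '%T' | sort | uniq -c
}

# Show accounting data for jobs since a given date (YYYY-MM-DD).
# The second argument is an optional username (defaults to $USER).
finished_since() {
    sacct -u "${2:-$USER}" -S "$1" --format=JobID,JobName,State,MaxRSS,Elapsed
}
```

Calling jobs_by_state without arguments summarizes your own jobs; finished_since 2017-07-01 would list your jobs from that (arbitrarily chosen) date onwards.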

Controlling Jobs

To … job(s)                   Command
cancel one                    scancel <jobid>
cancel all                    scancel -u <username>
cancel all pending            scancel -t PENDING -u <username>
cancel one or more by name    scancel --name <myJobName>
hold one (keep it pending)    scontrol hold <jobid>
release a held one            scontrol release <jobid>
requeue one                   scontrol requeue <jobid>
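Because scancel acts immediately, it can help to preview the selection with squeue first. The following is a minimal sketch, assuming a hypothetical helper name (cancel_pending_named) and a hypothetical DRY_RUN switch; only the squeue and scancel options themselves (-u, -t, -n/--name) come from Slurm.

```shell
# Show which of your pending jobs carry a given name, then cancel them.
# Set DRY_RUN=1 (an illustrative switch, not a Slurm feature) to preview only.
cancel_pending_named() {
    name="$1"
    # Preview: list the pending jobs that match the name.
    squeue -u "$USER" -t PENDING -n "$name"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "dry run: nothing cancelled"
        return 0
    fi
    # Cancel only your own pending jobs with that name.
    scancel -u "$USER" -t PENDING --name "$name"
}
```

Restricting the call to -u "$USER" and -t PENDING keeps running jobs with the same name untouched.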

Modifying Pending Jobs

Sometimes squeue --start reports a reason such as BadConstraints, which indicates that the job's requirements were specified incorrectly. In that case you can track down the mismatch with scontrol show job <jobid> (interpreting the output may take some experience). Incorrect requirements of a pending job can be fixed as follows:

To correct a job's                Command
memory requirement (per node)     scontrol update jobid=<jobid> MinMemoryNode=<mem in MB>
memory requirement (per CPU)      scontrol update jobid=<jobid> MinMemoryCPU=<mem in MB>
number of requested CPUs
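
The workflow described in this section (find the pending reason, inspect the job, correct the request) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the jobid=<jobid> form is the one given in the scontrol man page, and the field parsing assumes values that contain no spaces.

```shell
# Print job ID, state and pending reason for a user's pending jobs.
# %i = job ID, %T = state, %r = reason (e.g. BadConstraints).
pending_reasons() {
    squeue -u "${1:-$USER}" -t PENDING -h -o '%i %T %r'
}

# Extract one Key=Value field from 'scontrol show job' output,
# e.g. job_field <jobid> Reason
job_field() {
    scontrol show job "$1" | tr ' ' '\n' | sed -n "s/^$2=//p"
}

# Set the per-node memory request (in MB) of a pending job,
# then print the value Slurm reports afterwards.
set_job_mem_node() {
    jobid="$1"
    mem_mb="$2"
    # Reject anything that is not a plain integer before calling scontrol.
    case "$mem_mb" in
        ''|*[!0-9]*) echo "memory must be an integer number of MB" >&2; return 1 ;;
    esac
    scontrol update jobid="$jobid" MinMemoryNode="$mem_mb" &&
        scontrol show job "$jobid" | tr ' ' '\n' | grep '^MinMemoryNode='
}
```

For example, job_field 12345 Reason would print the pending reason of a (made-up) job 12345, and set_job_mem_node 12345 4096 would request 4096 MB per node for it.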
slurm_manage.1499084261.txt.gz · Last modified: 2017/07/03 14:17 by meesters