slurm_manage

Differences

This shows you the differences between two versions of the page.

slurm_manage [2019/03/21 07:40]
meesters [Pending Reasons]
slurm_manage [2019/05/06 10:17] (current)
meesters [Pending Reasons]
Line 43:
 ====== Pending Reasons ======
  
-So, why do my jobs not start? SLURM may list a number of reasons for pending jobs (those labelled ''PD'' when ''squeue'' is run).
+So, why do my jobs not start? SLURM may list a number of reasons for pending jobs (those labelled ''PD'' when ''squeue'' is run). Here are some of the more frequent reasons:
  
 ^ Reason ^ Brief Explanation ^
Line 54:
 | ''QOSGrpCpuLimit'' | the requested partition is limited in the fraction of cluster resources it may use, and this limit has been reached: running jobs need to end before new ones may start.|
 | ''Resources'' | while the partition may allow you to take the resources you requested, it cannot, at the moment, provide the nodes to run on (e.g. because of a memory request which cannot be satisfied).|
+| ''ReqNodeNotAvail'' | simply means that no node with the required resources is available. SLURM will list //all// non-available nodes, which can be confusing. This reason is similar to ''Priority'' as it means that a specific job has to wait for a resource to be released.|
  
 And then there are limitations due to the number of jobs a user or group (a.k.a. account) may run at a given time. More information on partitions can be found [[partitions|on their respective wiki page]].
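
To get a quick overview of why your own jobs are pending, the reasons reported by ''squeue'' can be tallied. The following is only a minimal sketch, not part of the cluster's tooling: the function names, the ''%i|%r'' output format (job ID and reason, separated by a pipe) and the sample lines are assumptions chosen for illustration. It also assumes that ''ReqNodeNotAvail'' may be followed by a comma and a node list, so only the leading keyword is counted.

```python
import subprocess
from collections import Counter

# Hypothetical sample in the assumed "jobid|reason" format; not real cluster output.
SAMPLE = """\
1001|Priority
1002|Resources
1003|ReqNodeNotAvail, UnavailableNodes:node[001-004]
1004|Resources
"""


def fetch_pending(user):
    """Return raw squeue output for a user's pending jobs (needs SLURM installed)."""
    result = subprocess.run(
        # -h: no header, -u: user, -t PD: pending only, -o: job ID and reason
        ["squeue", "-h", "-u", user, "-t", "PD", "-o", "%i|%r"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def count_pending_reasons(squeue_output):
    """Count pending jobs per reason keyword from 'jobid|reason' lines."""
    counts = Counter()
    for line in squeue_output.splitlines():
        line = line.strip()
        if not line:
            continue
        _job_id, _, reason = line.partition("|")
        # Drop anything after the first comma (e.g. a list of unavailable nodes).
        keyword = reason.split(",")[0].strip()
        counts[keyword] += 1
    return counts


if __name__ == "__main__":
    # Uses the sample text; on a cluster, pass fetch_pending("your_login") instead.
    for reason, number in count_pending_reasons(SAMPLE).most_common():
        print(f"{number:3d}  {reason}")
```

Stripping the node list keeps all ''ReqNodeNotAvail'' jobs in one group, which avoids the confusion caused by the long list of non-available nodes.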
slurm_manage.1553150408.txt.gz · Last modified: 2019/03/21 07:40 by meesters