start:working_on_mogon:slurm_manage

Differences

start:working_on_mogon:slurm_manage [2020/07/01 17:35]
meesters [Pending Reasons]
start:working_on_mogon:slurm_manage [2020/07/28 12:59] (current)
pkeller2
Line 52:
 | ''QOSMaxJobsPerAccountLimit'' | For certain partitions the number of running jobs per account is limited. |
 | ''QOSGrpGRESRunMinutes'' | For certain partitions the generic resources (e.g. GPUs) are limited. See [[:start:working_on_mogon:partitions#gpu_queues|GPU Queues]] |
-| ''QOSGrpMemLimit'' | the requested partition is limited in the fraction of resources it can take from the cluster and this amount has been reached: jobs need to end, before new may start.|
+| ''QOSGrpMemLimit'' | The requested partition is limited in the fraction of resources it can take from the cluster and this amount has been reached: running jobs need to end before new ones may start.|
-| ''QOSGrpCpuLimit'' | the requested partition is limited in the fraction of resources it can take from the cluster and this amount has been reached: jobs need to end, before new may start.|
+| ''QOSGrpCpuLimit'' | The requested partition is limited in the fraction of resources it can take from the cluster and this amount has been reached: running jobs need to end before new ones may start.|
-| ''Resources'' | while the partition may allow to take the resources you requested, it cannot not -- at the time -- provide the nodes to run on (e.g. because of a memory request which cannot be satisfied).|
+| ''Resources'' | The job is eligible to run, but resources aren't available at this time. This usually just means that your job will start next once nodes are done with their current jobs.|
-| ''ReqNodeNotAvail'' | simply means that no node with the required resources is available. SLRUM will list //all// non-available nodes, which can be confusing. This reason is similar to ''Priority'' as it means that a specific job has to wait for a resource to be released.|
+| ''ReqNodeNotAvail'' | Simply means that no node with the required resources is available. SLURM will list //all// non-available nodes, which can be confusing. This reason is similar to ''Resources'' as it means that a specific job has to wait for a resource to be released.|
  
 And then there are limitations due to the number of jobs a group (a.k.a. account) may run at a given time. More information on partitions can be found [[:start:working_on_mogon:partitions|on their respective wiki site]].
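
To see which of the pending reasons from the table above applies to your own jobs, the reason column of ''squeue'' can be matched against the table. The following is only a minimal sketch, not an official tool of this wiki: the helper names (''parse_squeue'', ''explain'', ''pending_jobs''), the ''|'' separator and the lookup dictionary are assumptions made for illustration. It also assumes that ''squeue'' is on your ''PATH'' and that a reason such as ''ReqNodeNotAvail'' may be followed by a comma and the list of unavailable nodes.

```python
import subprocess

# Explanations copied from the table above; only a subset of all reasons.
PENDING_REASONS = {
    "QOSMaxJobsPerAccountLimit": "The number of running jobs per account is limited.",
    "QOSGrpGRESRunMinutes": "The generic resources (e.g. GPUs) are limited.",
    "QOSGrpMemLimit": "The partition has reached its share of the cluster.",
    "QOSGrpCpuLimit": "The partition has reached its share of the cluster.",
    "Resources": "The job is eligible to run, but resources aren't available yet.",
    "ReqNodeNotAvail": "No node with the required resources is available.",
}

NOT_LISTED = "Not covered by the table above."


def parse_squeue(output):
    """Turn lines of the form '<jobid>|<reason>' into (jobid, reason) pairs."""
    jobs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines
        job_id, _, reason = line.partition("|")
        jobs.append((job_id.strip(), reason.strip()))
    return jobs


def explain(reason):
    """Look up a reason, ignoring anything after the first comma (assumed node list)."""
    key = reason.split(",")[0].strip()
    return PENDING_REASONS.get(key, NOT_LISTED)


def pending_jobs(user):
    """Ask squeue for the pending jobs of one user: job id (%i) and reason (%r)."""
    result = subprocess.run(
        ["squeue", "-h", "-u", user, "-t", "PENDING", "-o", "%i|%r"],
        capture_output=True, text=True, check=True,
    )
    return parse_squeue(result.stdout)


if __name__ == "__main__":
    import getpass

    for job_id, reason in pending_jobs(getpass.getuser()):
        print(f"{job_id}: {reason} -> {explain(reason)}")
```

Run on a login node, this prints one line per pending job. Reasons that are not part of the table above are reported as not covered rather than guessed.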
  • start/working_on_mogon/slurm_manage.txt
  • Last modified: 2020/07/28 12:59
  • by pkeller2