training_and_outreach:ticket_system

Differences

This shows you the differences between two versions of the page.

training_and_outreach:ticket_system [2019/04/01 10:59]
meesters created
training_and_outreach:ticket_system [2019/04/01 11:06]
meesters
Line 9: Line 9:
 If you write an eMail, please try to include **as much** information as possible, **including**, but not limited to:
 
-  * The //job command line// or the //job script// (''sbatch ...'')
+  * The //job command line// or the //job script// (the one invoked with ''sbatch ...'')
   * JobIDs
   * The //environment// you started the job in (at least the output of ''module list'', maybe even ''env | sort'')
   * If possible, the whole job //output//, or if it is too big, the relevant output from the batch system (at the beginning and the end) and any //error messages// you encounter.
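The items in the list above could be gathered into one file with a small helper like the following. This is only a sketch: the function name ''collect_ticket_info'' and the default file name ''ticket_info.txt'' are assumptions made for illustration and are not part of the original page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original page): collects the job script,
# the loaded modules and the sorted environment into a single report file.
collect_ticket_info() {
    local job_script="$1"                 # path to the script given to sbatch
    local report="${2:-ticket_info.txt}"  # output file name is an assumption
    {
        echo "== Job script: ${job_script} =="
        cat -- "${job_script}"
        echo
        echo "== Loaded modules =="
        # 'module' is usually a shell function and often prints to stderr,
        # so stderr is redirected into the report as well.
        if type module >/dev/null 2>&1; then
            module list 2>&1
        else
            echo "(module command not available in this shell)"
        fi
        echo
        echo "== Environment =="
        env | sort
    } > "${report}"
    # Print the name of the report so the caller can attach it to the mail.
    echo "${report}"
}
```

Run it in the same shell you submitted the job from, for example ''collect_ticket_info myjob.sh'' (where ''myjob.sh'' is a placeholder), and skim the resulting file before sending it, since the environment may contain values you would rather not share.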
 +
 +===== Investigating a Running Job =====
 +
 +Sometimes a particular job causes problems. For us to investigate a running job, it has to run during working hours and we need to be able to start tracking it at a time of our choosing. You can make this possible by submitting your job with the ''--hold'' flag, added on the command line:
 +
 +<code bash>
 +$ sbatch --hold ...
 +</code>
 +
 +If you then notify us by mail at [[hpc@uni-mainz.de]], we can release the job at any time and investigate it. Be sure to include the path to the expected job output, too.
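To have the JobID at hand for the mail, the held submission can be wrapped as sketched below. The function names are hypothetical and only serve as illustration; the sketch assumes that ''sbatch --parsable'' prints the job ID, optionally followed by '';'' and the cluster name.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not from the original page.

# Keep only the job ID from "jobid" or "jobid;clustername".
parse_job_id() {
    printf '%s\n' "${1%%;*}"
}

# Submit in held state; all arguments are passed on to sbatch unchanged.
submit_held() {
    local out
    out="$(sbatch --parsable --hold "$@")" || return 1
    parse_job_id "${out}"
}

# Print job ID, state and pending reason; a held job should stay PENDING.
show_job_state() {
    squeue -j "$1" -o "%i %T %r"
}
```

A call such as ''jobid="$(submit_held myjob.sh)"'' followed by ''show_job_state "$jobid"'' (with ''myjob.sh'' as a placeholder) should list the job as pending, typically with a reason like ''JobHeldUser'', until it is released with ''scontrol release''.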
 +
 +<WRAP center round info 90%>
 +If you suspect that a particular node is causing problems (perhaps because of a hardware fault) and would like us to investigate((We do have automated failure detection in place, but such systems have their weak spots.)), you can submit with:
 +<code bash>
 +$ sbatch --hold -w <nodename> ...
 +</code>
 +</WRAP>
 +
 +Thank you for your cooperation!
 +
training_and_outreach/ticket_system.txt · Last modified: 2019/04/29 10:07 by meesters