training_and_outreach:ticket_system

Differences

This shows you the differences between two versions of the page.

training_and_outreach:ticket_system [2019/04/01 10:59]
meesters created
training_and_outreach:ticket_system [2019/04/29 10:07] (current)
meesters
Line 1: Line 1:
- 
- 
 ====== How to get support ======
  
Line 7: Line 5:
 **Please do not** send eMail to our staff directly - using the ticketing system helps us with appropriate scheduling and load balancing while supporting you and every other user.
  
-If you write an eMail, please try to include **as much** information as possible, **including**, but not limited to:
 +===== Issues with Batch Jobs =====
  
-  * The //job command line// or the //job script// (''sbatch ...'')
 +
 +If you write an eMail about a job (e.g. to report issues), please try to include **as much** information as possible, **including**, but not limited to:
 +
 +  * The //job command line// and the path to the //job script// (the one invoked with ''sbatch ...'')
   * JobIDs
   * The //environment// you started the job in (at least the output of ''module list'', maybe even ''env | sort'')
   * If possible, the whole job //output//, or if it is too big, the relevant output from the batch system (at the beginning and the end) and any //error messages// you encounter.
 +
 +===== Issues with Job Submissions (when starting a Job) =====
 +
 +Your job will not start because SLURM complains? Please be sure to supply the following information:
 +
 +  * The //job command line// and the path to the //job script// (i.e. the one invoked with ''$ sbatch ...'')
 +  * The //environment// you started the job in (at least the output of ''module list'', maybe even ''env | sort'')
 +  * If possible, **all** output, not just the error line.
 +  * If working on non-public login-nodes, please indicate the respective login-node, too.
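The items requested in the lists above could be gathered in one go. The following is only a minimal sketch, not an official tool: the function name ''collect_support_info'' and the default report file name are hypothetical, and it assumes that ''module'' is available in the shell you submitted the job from (the function should therefore be sourced into that shell rather than run as a separate script).

```bash
# Hypothetical helper (not provided by the HPC team): write the node name,
# the loaded modules and the sorted environment into one text file.
collect_support_info() {
    local report="${1:-support_report.txt}"   # file name is an assumption
    {
        echo "== Node =="
        uname -n                               # name of the (login) node
        echo "== Loaded modules =="
        if type module >/dev/null 2>&1; then
            # 'module list' often writes to stderr, so merge both streams
            module list 2>&1
        else
            echo "module command not available in this shell"
        fi
        echo "== Environment =="
        env | sort
    } > "$report"
    echo "$report"                             # print the path of the report
}
```

A call such as ''collect_support_info my_report.txt'' would then produce a file that can be attached to the request. Since ''env | sort'' may include sensitive values (tokens, passwords), it is worth reviewing the file before sending it.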
 +
 +===== Investigating a running Job =====
 +
 +Sometimes particular jobs cause issues. For us to investigate a running job, the job has to run during working hours and we need to be able to track it on demand. This can be accomplished if you submit your job with the ''--hold'' flag, by adding it on the command line:
 +
 +<code bash>
 +$ sbatch --hold ...
 +</code>
 +
 +If you subsequently notify us with a mail to [[hpc@uni-mainz.de]] we can release the job at any time and investigate it. Be sure to give the path to the expected job output, too.
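The held submission and the details for the mail could be combined as sketched below. This is an illustration under assumptions, not a prescribed workflow: the function name ''submit_held'' is hypothetical, and it relies only on the ''--hold'' and ''--parsable'' options of ''sbatch''.

```bash
# Hypothetical helper (not part of the original page): submit a job script in
# held state and print the details worth including in the mail.
submit_held() {
    local script="$1"
    local jobid
    # --parsable makes sbatch print only the job ID (optionally ";cluster")
    jobid="$(sbatch --hold --parsable "$script")" || return 1
    jobid="${jobid%%;*}"    # strip a trailing ";cluster" part if present
    echo "JobID: ${jobid}"
    echo "Job script: $(cd "$(dirname "$script")" && pwd)/$(basename "$script")"
    echo "Submitted from: $(pwd)"
}
```

For example, ''submit_held my_job.sh'' (a hypothetical script name) prints the JobID, the absolute path of the job script and the submit directory. Where the job output ends up depends on the ''--output'' setting of the job; without one, Slurm writes ''slurm-<jobid>.out'' into the submit directory.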
 +
 +<WRAP center round info 90%>
 +If you suspect that a particular node is causing issues (perhaps there is a hardware problem) and would like us to investigate((we do have automated failure detection in place, but such systems have their weak spots)), you can submit with:
 +<code bash>
 +$ sbatch --hold -w <nodename> ...
 +</code>
 +</WRAP>
 +
 +Thank you for your cooperation!
 +
 +===== Anything else? =====
 +
 +Are you aware that we [[training_and_outreach:workshop|offer workshops]] for more complex issues?
  • training_and_outreach/ticket_system.1554109180.txt.gz
  • Last modified: 2019/04/01 10:59
  • by meesters