How to get support

If you have questions or need help with high performance computing at the ZDV, send an email to hpc@uni-mainz.de. This creates a new ticket in our ticketing system.

Please do not email our staff directly. Using the ticketing system helps us schedule work appropriately and balance the load while supporting you and every other user.

If you write to us about a job (e.g. because of a problem with it), please include as much information as possible, including, but not limited to:

  • The job command line and path to the job script (the one invoked with $ sbatch …)
  • JobIDs
  • The environment you started the job in (at least the output of module list, maybe even env | sort)
  • If possible, the whole job output or, if it is too big, the relevant output from the batch system (at the beginning and the end) and any error messages you encounter.
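
The items above can be gathered into one file that you attach to your mail. The following is only a minimal sketch: the function name collect_ticket_info, its arguments and the default file name ticket_info.txt are made up for illustration and are not a tool provided by the ZDV.

```shell
# Hypothetical helper: write job script path, JobID, host name, loaded modules
# and the sorted environment into a single report file.
# Usage idea: collect_ticket_info <job-script> <jobid> [report-file]
collect_ticket_info() {
    jobscript=$1
    jobid=$2
    report=${3:-ticket_info.txt}   # assumed default file name
    {
        echo "Job script: $jobscript"
        echo "JobID:      $jobid"
        echo "Host:       $(uname -n)"
        echo
        echo "== module list =="
        # 'module' is usually a shell function and may print to stderr
        if command -v module >/dev/null 2>&1; then
            module list 2>&1
        else
            echo "(module command not available in this shell)"
        fi
        echo
        echo "== env | sort =="
        env | sort
    } > "$report"
    # Print the report path so the caller knows what to attach
    echo "$report"
}
```

Run it in the same shell session you submitted the job from, so that the module list and the environment match what the job saw.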

Your job will not start because SLURM complains? Please be sure to supply the following information:

  • The job command line and path to the job script (e.g. the one invoked with $ sbatch …)
  • The environment you started the job in (at least the output of module list, maybe even env | sort)
  • If possible, all of the output, not just the error line.
  • If you are working on a non-public login node, please tell us which one.
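
One way to keep the complete output rather than just the error line is to redirect both stdout and stderr of the failing command into a file. This is a sketch only: capture_submit, sbatch_error.txt and myjob.sh are hypothetical names chosen for the example.

```shell
# Hypothetical helper: run a command and save the command line, the login node
# name and everything the command prints (stdout and stderr) to a log file.
# Usage idea: capture_submit <log-file> <command> [args...]
capture_submit() {
    log=$1
    shift
    {
        echo "Command: $*"
        echo "Login node: $(uname -n)"
        "$@" 2>&1
        echo "Exit status: $?"
    } > "$log"
}

# Example (not executed here): capture_submit sbatch_error.txt sbatch myjob.sh
```

The resulting file can be attached to the mail as it is.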

Sometimes a particular job causes problems. For us to investigate a running job, it has to run during working hours and start at a time when we can track it. You can make this possible by submitting the job with the --hold flag, added on the command line:

$ sbatch --hold ...

If you then notify us with a mail to hpc@uni-mainz.de, we can release the job at any time and investigate it. Be sure to include the path to the expected job output, too.
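
The submit-and-notify steps can be combined in a small wrapper. This is a sketch under assumptions: submit_held and the mail layout are invented for illustration, and the wrapper relies on sbatch's --parsable option, which makes sbatch print only the job ID (optionally followed by ";" and the cluster name).

```shell
# Hypothetical helper: submit a job in held state and print a mail draft.
# Usage idea: submit_held <job-script> <expected-output-path>
submit_held() {
    jobscript=$1
    outfile=$2
    # Submit held; stop if sbatch itself reports an error
    jobid=$(sbatch --parsable --hold "$jobscript") || return 1
    # Strip an optional ";cluster" suffix from the parsable output
    jobid=${jobid%%;*}
    cat <<EOF
To: hpc@uni-mainz.de
Subject: Held job $jobid ready for investigation

Job ID:          $jobid
Job script:      $jobscript
Expected output: $outfile
Submitted from:  $(uname -n)
EOF
}
```

The function only prints the draft; copy the text into your mail client and add a description of the problem before sending.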

If you suspect that a particular node is causing problems (perhaps because of a hardware fault) and would like us to investigate 1), you can submit with:

$ sbatch --hold -w <nodename> ...

Thank you for your cooperation!

Did you know that we also offer workshops for more complex issues?


1)
We do have automated failure detection in place, but such systems have their weak spots.
  • Last modified: 2019/04/29 10:07
  • by meesters