How to get support

If you have questions or need help with high performance computing at the ZDV, send an email to hpc@uni-mainz.de. This automatically creates a new ticket in our ticketing system.

Please do not email our staff directly. Using the ticketing system helps us schedule requests and balance the workload while supporting you and every other user.

Issues with Batch Jobs

If you email us about a job (e.g. to report an issue), please include as much information as possible, including, but not limited to:

  • The job command line and path to the job script (the one invoked with $ sbatch …)
  • The job ID(s)
  • The environment you started the job in (at least the output of module list, maybe even env | sort)
  • If possible, the whole job output. If it is too big, send the relevant output from the batch system (at the beginning and the end) and any error messages you encounter.
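The environment details from the list above can be gathered into one file and attached to the email. The following is only a minimal sketch: the function name collect_support_info and the file name support_info.txt are hypothetical, and it assumes that your shell provides the module command (the sketch falls back to a note if it does not).

```shell
# Hypothetical helper: write environment details for a support ticket to a file.
collect_support_info() {
    outfile="$1"
    {
        echo "== Date =="
        date
        echo "== Host =="
        uname -n
        echo "== Loaded modules =="
        # 'module' is usually a shell function and may print to stderr,
        # so stderr is redirected into the report as well.
        if type module >/dev/null 2>&1; then
            module list 2>&1
        else
            echo "module command not available in this shell"
        fi
        echo "== Environment =="
        env | sort
    } > "$outfile"
}

# Assumed output file name; choose any path you like.
collect_support_info "support_info.txt"
```

Run this in the same shell session from which you submit the job, so that the recorded environment matches the one the job was started in. Please review the file before sending it, since environment variables may contain information you do not want to share.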

Issues with Job Submissions (when starting a Job)

Does your job fail to start because SLURM complains? Please be sure to supply the following information:

  • The job command line and path to the job script (e.g. the one invoked with $ sbatch …)
  • The environment you started the job in (at least the output of module list, maybe even env | sort)
  • If possible, all of the output, not just the error line.
  • If you are working on a non-public login node, please tell us which one.

Investigating a running Job

Sometimes a particular job causes problems. For us to investigate a running job, it has to run during working hours, at a time when we can track it. You can make this possible by submitting your job with the --hold flag on the command line:

$ sbatch --hold ...

If you then notify us by email at hpc@uni-mainz.de, we can release the job at a suitable time and investigate it. Be sure to include the path to the expected job output, too.
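As an illustration, the following sketch submits a held job, records its job ID and prints the details to include in the email. The script name my_job.sh is hypothetical. The sketch also assumes that the job script does not set its own --output option, in which case SLURM writes to slurm-<jobid>.out in the directory you submitted from.

```shell
# Hypothetical job script name; replace with your own.
JOBSCRIPT="my_job.sh"

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints only the job ID (optionally followed by ';cluster').
    JOBID=$(sbatch --parsable --hold "$JOBSCRIPT" | cut -d';' -f1)
    if [ -n "$JOBID" ]; then
        # Show job ID, state and pending reason of the held job.
        squeue -j "$JOBID" -o "%i %T %r"
        HELD_MSG="Held job $JOBID, output expected in $PWD/slurm-$JOBID.out"
    else
        HELD_MSG="Submission failed; please include the sbatch error output"
    fi
else
    HELD_MSG="sbatch not found; run this on a login node"
fi

# Line to copy into the email.
echo "$HELD_MSG"
```

The job stays pending until it is released, so it does not consume any resources in the meantime.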

If you suspect that a particular node is causing problems (perhaps there is a hardware problem) and want us to investigate 1), you can submit with:

$ sbatch --hold -w <nodename> ...

Thank you for your cooperation!

Anything else?

Did you know that we also offer workshops for more complex issues?

1)
We do have automated failure detection in place, but such systems have their weak spots.
training_and_outreach/ticket_system.txt · Last modified: 2019/04/29 10:07 by meesters