training_and_outreach:ticket_system

# Differences

This shows you the differences between two versions of the page.

training_and_outreach:ticket_system [2019/04/29 10:07] meesters
(current)

  * 2020/09/02 13:45 jrutte02 removed
  * 2019/04/29 10:07 meesters
  * 2019/04/01 13:03 meesters [Investigatin a running Job]
  * 2019/04/01 11:06 meesters
  * 2019/04/01 11:06 meesters
  * 2019/04/01 10:59 meesters created

Line 1:

- ====== How to get support ======
-
- If you have any questions or need help regarding high performance computing at the ZDV, send an email to [[hpc@uni-mainz.de]] (which will create a new ticket in our ticketing system). \\
-
- **Please do not** email our staff directly. Using the ticketing system helps us schedule and balance the workload while supporting you and every other user.
-
- ===== Issues with Batch Jobs =====
-
- If you email us about a job (e.g. to report an issue), please include **as much** information as possible, **including**, but not limited to:
-
-   * The //job command line// and the path to the //job script// (the one invoked with ''$ sbatch ...'')
-   * JobIDs
-   * The //environment// you started the job in (at least the output of ''module list'', maybe even ''env | sort'')
-   * If possible, the whole job //output//, or, if it is too big, the relevant output from the batch system (at the beginning and the end) and any //error messages// you encounter.
-
- ===== Issues with Job Submissions (when starting a Job) =====
-
- Your jobs will not start because SLURM complains? Please be sure to supply the following information:
-
-   * The //job command line// and the path to the //job script// (e.g. the one invoked with ''$ sbatch ...'')
-   * The //environment// you started the job in (at least the output of ''module list'', maybe even ''env | sort'')
-   * If possible, **all** output, not just the error line.
-   * If you are working on a non-public login node, please name that login node, too.
-
- ===== Investigating a running Job =====
-
- Sometimes a particular job causes problems. For us to investigate a running job, it has to run during working hours, at a time we can control. You can arrange this by submitting the job with the ''--hold'' flag on the command line:
-
-   $ sbatch --hold ...
-
- If you then notify us by mail to [[hpc@uni-mainz.de]], we can release the job at any time and investigate it. Be sure to give the path to the expected job output, too.
-
- If you suspect that a particular node is causing problems (perhaps a hardware fault) and would like us to investigate((We do have automated failure detection in place, but such systems have their weak spots.)), you can submit with:
-
-   $ sbatch --hold -w ...
-
- Thank you for your cooperation!
-
- ===== Anything else? =====
-
- Are you aware that we [[training_and_outreach:workshop|offer workshops]] for more complex issues?
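The details requested above for a support mail (JobID, job script path, login node, ''module list'', ''env | sort'') can be gathered into a single file. The following is only a minimal sketch: the function name ''collect_ticket_info'', its three arguments, and the layout of the report are assumptions made for illustration and are not part of any ZDV tooling.

```shell
#!/bin/sh
# Hypothetical helper (not an official ZDV tool): write the details the
# support team asks for into one report file that can be attached to a mail.
collect_ticket_info() {
    job_id=$1        # SLURM JobID as printed by sbatch
    job_script=$2    # path to the job script that was passed to sbatch
    report=$3        # file the report is written to

    {
        echo "JobID: $job_id"
        echo "Job script: $job_script"
        echo "Login node: $(uname -n)"
        echo "--- module list ---"
        # 'module' only exists where an environment-modules system is set up;
        # it prints to stderr, so redirect that into the report as well.
        if command -v module >/dev/null 2>&1; then
            module list 2>&1
        else
            echo "(module command not available)"
        fi
        echo "--- env | sort ---"
        env | sort
    } > "$report"
}
```

A call such as ''collect_ticket_info 123456 /path/to/job.sh ticket_info.txt'' (JobID and path are placeholders) writes the report to ''ticket_info.txt''. Review the file before sending it, because the environment dump may contain values you do not want to share.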