io_odds_and_ends [2017/11/29 19:35] (current)
 +====== Input and Output on HPC Systems ======
 +This page comments on general aspects of I/O workloads on Mogon I/II. More information on the file systems can be found [[filesystems|here]].
 +===== Issues which may arise =====
 +Scientific applications perform I/O to a parallel file system primarily in one of two ways:
 +  * Shared-file (N-to-1): A single file is created, and all application tasks write to that file (usually to completely disjoint regions)
 +    * This increases usability: the application has only one file to keep track of
 +    * It may create lock contention and hinder performance
 +  * File-per-process (N-to-N): Each instance of an application / each task creates or reads a separate file and writes only to that file.
 +    * It may avoid lock contention on the application level, but it increases the risk of file system stress when many tasks write to one destination, thereby triggering locking on the file system level
 +    * It is generally not possible to restart these applications with a different number of tasks
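The two access patterns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`write_shared`, `write_per_task`, `demo`), the fixed `RECORD_SIZE` and the file names are hypothetical, the "tasks" run one after another in a single process, and a temporary directory stands in for the parallel file system. Real applications typically use MPI-IO or a higher-level I/O library for the shared-file case.

```python
import os
import tempfile

RECORD_SIZE = 16  # bytes reserved per task in the shared file (illustrative value)


def write_shared(path, rank, payload):
    """N-to-1: every task writes its record into its own disjoint region of one file."""
    record = payload.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE]
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # The offset depends only on the rank, so the regions never overlap.
        os.pwrite(fd, record, rank * RECORD_SIZE)
    finally:
        os.close(fd)


def write_per_task(directory, rank, payload):
    """N-to-N: every task writes to its own file."""
    path = os.path.join(directory, f"task_{rank:04d}.dat")
    with open(path, "wb") as handle:
        handle.write(payload)
    return path


def demo(n_tasks=4):
    """Run both patterns sequentially; a real job would call them from parallel tasks."""
    with tempfile.TemporaryDirectory() as workdir:
        shared = os.path.join(workdir, "shared.dat")
        for rank in range(n_tasks):
            payload = f"rank {rank}".encode()
            write_shared(shared, rank, payload)
            write_per_task(workdir, rank, payload)
        shared_size = os.path.getsize(shared)
        n_files = len([f for f in os.listdir(workdir) if f.startswith("task_")])
    return shared_size, n_files
```

With four tasks, the shared-file variant leaves one file of `4 * RECORD_SIZE` bytes, while the file-per-process variant leaves four separate files. The second layout is what ties the output to the number of tasks that wrote it.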
 +===== How may/can I - as a user - analyze issues? =====
 +Currently, if you suspect I/O problems, you should [[|contact the HPC team]]. There is no straightforward method available to analyze I/O problems on the user level (particularly for third-party applications).
 +<WRAP center round todo 90%>
 +We may provide more tools in the foreseeable future.
 +</WRAP>
 +===== Which solution may solve which issue? =====
 +The statements above may seem a little abstract, particularly when third-party applications have to be used and no decision can be made about the application architecture.
 +However, a few rules of thumb can be given:
 +  * [[node_local_scheduling|Pooling short jobs]] is generally a good idea with respect to scheduling and organizing your workflow. If all of these application instances read identical input files, staging the input in to a [[slurm_localscratch|node local scratch]] or even into [[ramdisk|RAM]] may solve performance issues: with a temporary local copy of the input, the global file system no longer needs to keep track of accesses to these particular files.
 +  * Avoid having many processes keep file handles open (within one directory). Violating this rule may cause delays, because the global file system needs to coordinate every writing process. A possible solution is to write into the job directory (see [[slurm_localscratch|node local scratch]]) and to copy the output to the global file system once a writing process has finished or released the file.
 +  * Avoid writing too many small files: the overhead of keeping track of the meta information for millions of small files can exceed the size of the files themselves. The global file system is not optimized for this.
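The rules of thumb above can be combined into one stage-in / stage-out pattern, sketched here in Python. Everything in this sketch is an assumption made for illustration: the function names (`stage_in`, `stage_out`, `demo`), the file names, and the use of two subdirectories of a temporary directory as stand-ins for the global file system and the node-local scratch. The actual scratch locations on Mogon are described on the linked pages and are not reproduced here.

```python
import os
import shutil
import tarfile
import tempfile


def stage_in(global_input, local_dir):
    """Copy a shared input file to node-local storage once; return the local path."""
    os.makedirs(local_dir, exist_ok=True)
    return shutil.copy2(global_input, local_dir)


def stage_out(local_results, local_dir, global_dir, archive_name="results.tar"):
    """Bundle all local result files into one archive, then copy it out once."""
    local_archive = os.path.join(local_dir, archive_name)
    with tarfile.open(local_archive, "w") as archive:
        for name in sorted(os.listdir(local_results)):
            archive.add(os.path.join(local_results, name), arcname=name)
    os.makedirs(global_dir, exist_ok=True)
    # A single large write to the global file system instead of many small ones.
    return shutil.copy2(local_archive, global_dir)


def demo(n_instances=100):
    with tempfile.TemporaryDirectory() as root:
        global_dir = os.path.join(root, "global")  # stand-in for the global file system
        local_dir = os.path.join(root, "local")    # stand-in for node-local scratch
        os.makedirs(global_dir)
        global_input = os.path.join(global_dir, "input.dat")
        with open(global_input, "w") as handle:
            handle.write("42\n")

        local_input = stage_in(global_input, os.path.join(local_dir, "in"))

        results = os.path.join(local_dir, "out")
        os.makedirs(results)
        for i in range(n_instances):
            # Every instance reads the local copy and writes a small local file.
            with open(local_input) as src:
                value = int(src.read())
            with open(os.path.join(results, f"result_{i:03d}.txt"), "w") as dst:
                dst.write(f"{value + i}\n")

        archive = stage_out(results, local_dir, global_dir)
        with tarfile.open(archive) as tar:
            n_members = len(tar.getnames())
        return n_members, sorted(os.listdir(global_dir))
```

In this sketch the global directory sees one read of the input and one write of the archive, regardless of how many instances run. The many small result files exist only on the local side.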