Input and Output on HPC Systems

This page comments on general I/O aspects of workloads on Mogon I/II. More information about the file systems can be found here.

Issues which may arise

Scientific applications perform I/O to a parallel file system primarily in one of two ways:

  • Shared-file (N-to-1): A single file is created, and all application tasks write to that file (usually to completely disjoint regions).
    • This increases usability: the application has only one file to keep track of.
    • It may create lock contention and hinder performance.
  • File-per-process (N-to-N): Each instance of an application (each task) creates or reads a separate file and writes only to that file.
    • It may avoid lock contention at the application level, but it increases the risk of file system stress when many tasks write to one destination (such as a single directory), which triggers locking at the file system level.
    • Restarting these applications with a different number of tasks is generally not possible.
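
The two patterns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tasks are simulated by a loop instead of real parallel processes, the names `make_record`, `write_shared` and `write_per_task` are hypothetical, and a fixed record size is assumed so that the regions in the shared file stay disjoint. Real applications typically use MPI-IO or a library built on top of it.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed size per task, keeps the regions disjoint


def make_record(rank):
    """Build a fixed-size payload for one task (hypothetical data)."""
    return f"task {rank:04d}".encode().ljust(RECORD_SIZE, b".")


def write_shared(path, rank):
    """N-to-1: every task writes its own region of one shared file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # The offset depends on the rank only, so regions never overlap.
        os.pwrite(fd, make_record(rank), rank * RECORD_SIZE)
    finally:
        os.close(fd)


def write_per_task(directory, rank):
    """N-to-N: every task writes a separate file."""
    path = os.path.join(directory, f"out.{rank:04d}")
    with open(path, "wb") as handle:
        handle.write(make_record(rank))
    return path


def demo(ntasks=4):
    """Run both patterns for ntasks simulated tasks in a temporary directory."""
    with tempfile.TemporaryDirectory() as workdir:
        shared = os.path.join(workdir, "shared.dat")
        for rank in range(ntasks):
            write_shared(shared, rank)
            write_per_task(workdir, rank)
        shared_size = os.path.getsize(shared)
        nfiles = len([n for n in os.listdir(workdir) if n.startswith("out.")])
    return shared_size, nfiles
```

With four tasks, `demo()` yields one shared file of 64 bytes and four separate output files. The sketch also shows why the second pattern is tied to the task count: the number and names of the files are fixed by the number of tasks that wrote them.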

How can I, as a user, analyze issues?

Currently, if you suspect I/O problems, you should contact the HPC team. There is no straightforward method for analyzing the I/O problems of third-party applications at the user level.

We may provide more tools in the foreseeable future.

Which solution may solve which issue?

The statements above may seem a little abstract, particularly when third-party applications have to be used and no decision can be made about the application architecture.

However, a few rules of thumb can be given:

  • Pooling short jobs is generally a good idea with respect to scheduling and organizing your workflow. If all of those application instances read identical input files, staging the input in to a node-local scratch or even into RAM may solve performance issues: with a temporary local copy of the input, the global file system no longer needs to keep track of accesses to these particular files.
  • Avoid having many processes keep file handles open (within one directory). Violating this rule may cause delays, because the global file system needs to coordinate every writing process. A possible solution is to write into the job directory (see node local scratch) and to copy the output to the global file system once a writing process has finished with a file or released it.
  • Avoid writing too many small files: the overhead of keeping track of the metadata for millions of small files can be larger than the files themselves. The global file system is not optimized for this.
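
The rules of thumb above can be combined into a stage-in / stage-out pattern. The following Python sketch is a minimal illustration, not a site-specific recipe: the function names are hypothetical, and the locations of the node-local scratch and the global file system are passed in as parameters because the actual paths on Mogon are not given on this page. In `demo`, two temporary directories merely stand in for them.

```python
import os
import shutil
import tarfile
import tempfile


def stage_in(shared_input, local_scratch):
    """Copy one input file from the global file system to local scratch."""
    os.makedirs(local_scratch, exist_ok=True)
    return shutil.copy2(shared_input, local_scratch)


def write_results(local_scratch, count):
    """Write many small result files locally, not on the global file system."""
    outdir = os.path.join(local_scratch, "results")
    os.makedirs(outdir, exist_ok=True)
    for index in range(count):
        with open(os.path.join(outdir, f"part_{index:05d}.txt"), "w") as handle:
            handle.write(f"result {index}\n")
    return outdir


def stage_out(outdir, shared_dir):
    """Bundle the local results into a single archive on the global file system."""
    os.makedirs(shared_dir, exist_ok=True)
    archive = os.path.join(shared_dir, "results.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(outdir, arcname="results")
    return archive


def demo():
    """Run the pattern once with placeholder directories."""
    with tempfile.TemporaryDirectory() as root:
        shared = os.path.join(root, "global")    # stands in for the global file system
        scratch = os.path.join(root, "scratch")  # stands in for node-local scratch
        os.makedirs(shared)
        source = os.path.join(shared, "input.dat")
        with open(source, "w") as handle:
            handle.write("parameters\n")

        local_input = stage_in(source, scratch)
        with open(local_input) as handle:
            content = handle.read()

        outdir = write_results(scratch, 100)
        archive = stage_out(outdir, shared)
        with tarfile.open(archive) as tar:
            members = [m for m in tar.getmembers() if m.isfile()]
    return content, len(members)
```

In this sketch the global file system sees one read of the input and one write of a single archive, while the 100 small files exist only on the local scratch. Whether compression (`"w:gz"`) is worthwhile depends on the data; a plain `"w"` archive already avoids the many-small-files problem.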
io_odds_and_ends.txt · Last modified: 2017/11/29 19:35 (external edit)