Checking the quality of experimental data is crucial to data analysis.
One of the best-known tools for estimating the sequencing quality (and providing summary statistics and plots) is FastQC.
We provide this software as modules under:
A wrapper that eases your workflow is described below.
You can view the HTML files created by FastQC with firefox, which is installed on both clusters.
deepTools provides a number of assessment tools. It is available as the module file(s):
MultiQC is an assessment tool to gather the output of multiple quality indicators and to visualize them.
It is available as a module file:
Both can be run inside or outside a job context.
We provide a wrapper module on Mogon in order to aggregate jobs and to integrate the quality check into a workflow.
The wrapper script is available as a module:
The code is under version control and hosted internally, here.
The wrapper script submits a job itself: it is not meant to be run inside a SLURM job, but rather creates one.
QAWrapper -h displays a help message with all the options the script provides. Likewise, the call QAWrapper --credits displays credits and a version history.
The script, after loading the module, can then be run like:
$ QAWrapper --executable=<executable> [options] <inputdir>
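As a hedged illustration of the call pattern above, the following sketch assembles a command line from a few of the wrapper's options and prints it for review instead of submitting anything. The directory names raw_reads and qc_results and the run limit of 120 minutes are made-up placeholders. The --option=value spelling for --runlimit and --outdir is assumed by analogy with --executable=; check QAWrapper -h for the accepted forms.

```shell
#!/bin/sh
# Placeholder values (assumptions, not site defaults); adjust to your data.
inputdir="raw_reads"     # directory containing the FASTQ files
outdir="qc_results"      # where the reports should be written
runlimit=120             # minutes

# Build the command as a string and print it for inspection.
# Nothing is submitted here; paste the printed line into your shell
# after loading the wrapper module.
cmd="QAWrapper --executable=fastqc --runlimit=$runlimit --outdir=$outdir $inputdir"
echo "$cmd"
```

Printing the command first makes it easy to verify paths and limits before a job is actually created.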
Different meanings of the selected executable
fastqc determines the quality of the “raw” data in FASTQ format. Other executables are available as well, such as samtools, which is invoked on .bam files to summarize the quality of the mapping.
See below for a detailed description. If a particular executable is not supported, please contact us.
inputdir needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string unpaired are ignored; this is to support preprocessing with the quality assessment module.
QAWrapper attempts to deduce your SLURM account. This may fail, in which case -A, --account needs to be supplied.
--executable, defaults to fastqc. Other options: samtools. The option is case-insensitive.
-l, --runlimit, defaults to 300 minutes.
-p, --partition, the default is parallel on Mogon II; no smp partition should be chosen.
--args, arguments not otherwise set by the wrapper; for unset arguments, the defaults of the chosen executable apply.
-d, --dependency, comma-separated list of job IDs that the job will wait for before it starts.
-o, --outdir, output directory path (default is the current working directory).
--constraint, on Mogon II only: defaults to
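The dependency option expects its job IDs as a single comma-separated list. A minimal sketch of building such a list in the shell follows; the helper name join_jobids, the job IDs, and the input directory raw_reads are hypothetical, and the resulting command is only printed, not executed.

```shell
#!/bin/sh
# Join all arguments with commas.
# The function body runs in a subshell, so changing IFS does not leak out.
join_jobids() (
    IFS=,
    printf '%s\n' "$*"
)

# Hypothetical job IDs of earlier preprocessing jobs.
deps=$(join_jobids 1234567 1234568)

# Print the call that would wait for those jobs (assumes the
# --dependency=value spelling is accepted; see QAWrapper -h).
echo "QAWrapper --executable=fastqc --dependency=$deps raw_reads"
```

In a real workflow the IDs would typically come from the output of the preceding job submissions rather than being typed by hand.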
| | FASTQ files ending on | assess quality of raw or trimmed data |
| | mapped | assess quality of mapped (sorted or filtered) compressed data |
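To check beforehand which files a run is likely to consider, the selection rule for the input directory can be approximated with find. This is only a sketch under assumptions: the function name preview_inputs is hypothetical, -maxdepth requires GNU or BSD find, and the wrapper's actual selection logic (including its file-suffix filter) may differ.

```shell
#!/bin/sh
# List regular files located directly in the given directory,
# skipping subdirectories and any file whose name contains "unpaired".
preview_inputs() {
    find "$1" -maxdepth 1 -type f ! -name '*unpaired*' | sort
}
```

Calling preview_inputs with the directory you intend to pass as inputdir prints one candidate file per line.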