NGS Read Mapping Software on Mogon

This page is currently under construction.

As a first introduction to NGS alignment software tools, we recommend reading this short blog post. In other words: the list of supported tools may keep growing in response to your requests, but it will never cover everybody's favorite tool.

Our own benchmarks notwithstanding, a first impression can be found in the same blog.

Software Options

BWA

BWA is a mapping tool, particularly suited to mapping “low-divergent sequences against a large reference genome”. Modules on Mogon can be found as 1):

bio/BWA

The Wrapper Script

The wrapper script is not installed yet.

To scale the task from mapping one (or a few) samples to mapping several in parallel, we provide a wrapper script, which is available as a module:

bio/parallel_BWA

The code is under version management and hosted internally, here.

The wrapper script submits a job itself: it is not intended to be run within a SLURM environment, but rather creates one.

Calling parallel_BWA -h displays a help message with all the options the script provides. Likewise, calling parallel_BWA --credits displays credits and a version history.

After loading the module, the script can be run like:

$ parallel_BWA [options] <referencedir> <inputdir>

Limitations:

  • The wrapper recognizes FASTQ files with the suffixes “*.gz”, “*.fastq” or “*.fq” and will always assume FASTQ files (compressed or uncompressed).
  • The number of processes (and therefore nodes) is limited to the number of samples.
  • The wrapper only works for paired-end sequencing data, where the file tuples are designated with the strings “_1” and “_2” or “_R1” and “_R2”, respectively.
  • BWA does not scale well to big data. It is better to split the input into chunks of ~1 GB.
  • BWA does not scale well beyond a NUMA block (8 threads on Mogon I).
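The naming and chunking constraints above can be checked and prepared before calling the wrapper. The following is a minimal sketch, not part of parallel_BWA: the function names mate_of, unpaired_in and split_fastq are made up for illustration. It assumes that the “_1”/“_R1” designator sits directly before the file extension and that every FASTQ record spans exactly four lines. The wrapper itself may match file names differently.

```shell
# Sketch only: helper functions are illustrative, not part of parallel_BWA.
# Assumption: the designator (_1 or _R1) sits directly before the extension.

# Print the expected mate name for a forward-read file; fail otherwise.
mate_of() {
    name=$1
    case $name in
        *.fastq.gz) ext=.fastq.gz ;;
        *.fq.gz)    ext=.fq.gz ;;
        *.fastq)    ext=.fastq ;;
        *.fq)       ext=.fq ;;
        *.gz)       ext=.gz ;;
        *)          return 1 ;;
    esac
    stem=${name%"$ext"}
    case $stem in
        *_R1) printf '%s_R2%s\n' "${stem%_R1}" "$ext" ;;
        *_1)  printf '%s_2%s\n' "${stem%_1}" "$ext" ;;
        *)    return 1 ;;
    esac
}

# List forward-read files in a directory whose mate file is missing.
unpaired_in() {
    dir=$1
    for fwd in "$dir"/*; do
        mate=$(mate_of "$(basename "$fwd")") || continue
        [ -e "$dir/$mate" ] || basename "$fwd"
    done
}

# Split a FASTQ file (optionally gzip-compressed) into chunks holding a fixed
# number of reads. Assumes 4 lines per record, so chunks end on record
# boundaries. Use the same read count for both mates to keep pairs in sync.
split_fastq() {
    file=$1 reads=$2 prefix=$3
    case $file in
        *.gz) gzip -dc "$file" | split -l "$((reads * 4))" - "$prefix" ;;
        *)    split -l "$((reads * 4))" "$file" "$prefix" ;;
    esac
}
```

For example, unpaired_in <inputdir> prints nothing if every forward-read file has a mate. The number of reads per chunk has to be chosen so that the resulting files come out at roughly 1 GB for the read length at hand.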

BarraCuda

The Wrapper Script

Razer3

The Wrapper Script

Bowtie2

Bowtie2 is a well-known read aligner with a focus on gapped alignments.

As preliminary scaling tests indicate that the program can scale to a full node and is still reasonably fast, no wrapper script has been installed as a module so far 2). Instead, a few samples are given:

A Sample Script
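As a starting point, a single-node job for one paired-end sample might look like the sketch below. It is an assumption-laden illustration, not a tested Mogon recipe: the function name bowtie2_job, the module name bio/Bowtie2, the account and partition placeholders, the default thread count and the time limit are all hypothetical and need to be adapted. Only the bowtie2 options (-p, -x, -1, -2, -S) and the sbatch directives are standard.

```shell
# Sketch only: prints a SLURM job script for one paired-end sample.
# Hypothetical values: module name, account/partition placeholders,
# default thread count and time limit. Replace them before submitting.
bowtie2_job() {
    index=$1 mate1=$2 mate2=$3 out=$4 threads=${5:-16}
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=bowtie2
#SBATCH --account=<your_account>
#SBATCH --partition=<partition>
#SBATCH --nodes=1
#SBATCH --cpus-per-task=$threads
#SBATCH --time=02:00:00

module load bio/Bowtie2

# -p: threads, -x: index basename, -1/-2: mate files, -S: SAM output
bowtie2 -p "\$SLURM_CPUS_PER_TASK" -x "$index" -1 "$mate1" -2 "$mate2" -S "$out"
EOF
}
```

Since sbatch reads a job script from standard input when no file is given, the output can be piped straight into it, e.g. bowtie2_job <index> <reads_1.fastq> <reads_2.fastq> <out.sam> | sbatch. To use a full node, set the thread count to the number of cores of the node type in question.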

STAR

The Wrapper Script

segemehl

segemehl seems to be a pretty good alignment tool. It is mentioned here because of the blog cited on this page.

There will be no wrapper script for segemehl. If this comparison bears any truth, the software might be really good, but it is also pretty memory hungry, and several tens of GB per core is just too much. If you want to try segemehl, be sure to write your own wrapper script (perhaps staging the reference in to a local scratch, not the ramdisk) and reserve sufficient memory. Be aware that you will be accounted for the prolonged run time and memory.
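A self-written wrapper along these lines could start from the sketch below, which prints a job script that reserves memory explicitly and stages the reference and index in to node-local scratch. Everything site-specific is an assumption: the function name segemehl_job, the scratch path /localscratch/$SLURM_JOB_ID, the thread count and the time limit are hypothetical, and segemehl.x is assumed to be on the PATH inside the job. The segemehl.x options used (-d reference, -i index, -q reads, -p mates, -t threads) write SAM output to standard output.

```shell
# Sketch only: prints a SLURM job script for one paired-end segemehl run.
# Hypothetical values: scratch path, thread count, time limit. The job also
# assumes segemehl.x is already on the PATH (no module name is known).
segemehl_job() {
    ref=$1 idx=$2 mate1=$3 mate2=$4 out=$5 mem=$6
    ref_name=$(basename "$ref")
    idx_name=$(basename "$idx")
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=segemehl
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=$mem
#SBATCH --time=12:00:00

# Stage the reference and index in to node-local scratch (path is an assumption).
scratch=/localscratch/\$SLURM_JOB_ID
mkdir -p "\$scratch"
cp "$ref" "$idx" "\$scratch"/

segemehl.x -t "\$SLURM_CPUS_PER_TASK" \\
    -d "\$scratch/$ref_name" -i "\$scratch/$idx_name" \\
    -q "$mate1" -p "$mate2" > "$out"
EOF
}
```

The memory request is passed in deliberately rather than defaulted, since the footprint depends on the reference: for example segemehl_job <ref.fa> <ref.idx> <reads_1.fastq> <reads_2.fastq> <out.sam> 100G | sbatch.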

Comparison Benchmarks

This part needs some more time to be finished.

1) Loading a module without a version specification will load the most recent one.
2) If you feel a workflow logic can profit from a wrapper, please approach us.
Last modified: 2018/09/14 08:27 by meesters