software:topical:lifescience:ngs_read_mapping_tools

Differences

This shows you the differences between two versions of the page.

software:topical:lifescience:ngs_read_mapping_tools [2018/12/15 07:32]
meesters
software:topical:lifescience:ngs_read_mapping_tools [2019/10/24 15:48] (current)
meesters [BarraCuda]
Line 24: Line 24:
 See [[:software:topical:lifescience:ngs_read_mapping_tools#gpu-based|below for a wrapper script]] to ease your workflow.
  
 +==== Minimap2 ====
 +
 +[[https://github.com/lh3/minimap2|Minimap2]] is intended as a replacement for ''bwa mem''. Modules are installed under:
 +
 +''​bio/​minimap2''​
  
  
Line 32: Line 37:
 ''bio/SeqAn/<version>''
  
-You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]], eventually ((not yet)).
 +You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
  
Line 38: Line 43:
  
 [[https://www.nature.com/articles/nmeth.1923|Bowtie2]] is a well-known read aligner with a focus on gapped alignments.
 +
 +Module(s) can be found at:
 +
 +''​bio/​Bowtie2/<​version>''​
 +
 +You can find a wrapper to ease your workflow, [[software:​topical:​lifescience:#​standard_mappers|below]].
  
 ==== STAR ====
  
-<WRAP center round todo 65%>
-More info soon-ish.
-Particularly, a wrapper module is forthcoming. As STAR can work with a shared memory option, the wrapper is fundamentally different to that for other tools.
-</WRAP>
 +[[https://www.ncbi.nlm.nih.gov/pubmed/23104886|STAR]] is a well-known mapping tool for RNA-Seq data.
 +
 +Module(s) can be found at:
 +
 +''bio/STAR/<version>''
 +
 +You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
 ==== segemehl ====
Line 58: Line 72:
 ''bio/segemehl/0.2.0-foss-2018a''
  
 +==== TopHat ====
 +
 +[[https://​ccb.jhu.edu/​software/​tophat/​index.shtml|TopHat]] is a fast splice junction mapper for RNA-Seq reads.
 +
 +Module can be found at:
 +
 +''​bio/​TopHat/<​version>''​
 +
 +
 +<WRAP center round info 90%>
 +This program is not yet incorporated into the wrapping module.
 +</​WRAP>​
 ==== yara ====
  
Line 105: Line 131:
  
   * ''referencedir'' needs to be the (relative) path to a directory containing an indexed BWA reference
-  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:trimmomatic|trimmomatic module]].
 +  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:qc|quality check module]].
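
The filtering rule for ''inputdir'' can be sketched as follows. This is only an illustration under assumptions: the helper name ''list_inputs'' is hypothetical, the ''*.fastq'' suffix is borrowed from the read group example on this page, and the wrapper's actual selection logic may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the wrapper): list FASTQ inputs below a
# directory, skipping every path that contains the string "unpaired".
list_inputs() {
  # -path matches against the whole path, so files inside an "unpaired"
  # subdirectory are skipped as well as files with "unpaired" in their name.
  find "$1" -type f -name '*.fastq' ! -path '*unpaired*' | sort
}
```

Calling ''list_inputs /some/path/to/your/data'' prints one input file per line. Note that in this sketch an input directory whose own path contains ''unpaired'' would yield no files at all.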
  
 The options:
Line 115: Line 141:
   * ''-p,--partition'', the default is ''nodeshort'' or ''parallel'' on Mogon2; no smp-partition should be chosen.
   * ''-o,--outdir'' output directory path (default is the current working directory)
 +  * ''​--tag''​ optional tag/prefix for logfiles and directories
 +  * ''--groups'' to provide a list of read group tags (the number of tags must equal the number of files)
   * ''--single'' (no arguments) to evaluate single-end data
   * ''--args'' to supply additional flags, e.g. ''--args="-l 1024 -n 0.02"'' for BWA. Note that the quotation marks are necessary.
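
Why the quotation marks around the ''--args'' value are necessary can be sketched with plain shell word splitting. The function ''count_args'' is a hypothetical stand-in, not part of the wrapper; it only reports how many separate arguments the shell hands over.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for any command: print the number of arguments received.
count_args() {
  echo "$#"
}

# Quoted: the flag and its whole value arrive as one single argument.
count_args --args="-l 1024 -n 0.02"   # prints 1

# Unquoted: the shell splits at the blanks, so four separate words arrive and
# "1024", "-n" and "0.02" are no longer part of the --args value.
count_args --args=-l 1024 -n 0.02     # prints 4
```

Without the quotes, the wrapper would presumably see the trailing words as its own arguments instead of passing them on to the mapper.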
Line 121: Line 149:
  
   * For paired-end data, one BAM file carrying the prefix of the input is written per input pair. For single-end data, there is one output per input file.
 +
 +=== Generating Read Group Tags ===
 +
 +Read group tags can be inserted with the ''--groups'' flag((From version 0.6 onward.)). The tags are supplied as a list on the command line. Example code to generate a tag list with consecutively numbered IDs:
 +
 +<code bash>
 +# define the input directory appropriately in a master script;
 +# '_R1' is assumed to mark the forward reads in a paired-end scenario
 +inputdir=/some/path/to/your/data
 +
 +# a template - may deviate from project to project
 +template='@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+'
 +# the tag list to be generated
 +tags=""
 +# number of samples - this snippet could be integrated in a script
 +nsamples=$(find "$inputdir" -name '*_R1*.fastq' | grep -v unpaired | wc -l)
 +# now the actual generation: replace every +ID+ placeholder with the running number
 +for ((i=1; i <= nsamples; i++)); do
 +  tags="$tags ${template//+ID+/$i}"
 +done
 +</​code>​
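
If the read group IDs should follow the sample names rather than a running number, the same template can be filled from the file names. A minimal sketch, assuming the forward reads are named ''<sample>_R1*.fastq''; the function name ''make_tags'' is hypothetical and not part of the wrapper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build one read group tag per forward-read file, using
# the part of the file name before "_R1" as the ID. The body runs in a
# subshell "( ... )" so its variables do not leak into the calling script.
make_tags() (
  inputdir="$1"
  template='@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+'
  find "$inputdir" -name '*_R1*.fastq' ! -path '*unpaired*' | sort |
    while IFS= read -r file; do
      sample=$(basename "$file")
      sample=${sample%%_R1*}          # strip "_R1" and everything after it
      # replace every +ID+ placeholder with the sample name
      printf '%s\n' "$template" | sed -e "s/+ID+/$sample/g"
    done | paste -s -d ' ' -          # join the tags with single blanks
)
```

The result can be captured with ''tags=$(make_tags "$inputdir")''. Sample names containing ''/'' or ''&'' would need escaping before being handed to ''sed''.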
  
  
Line 140: Line 188:
  
 <WRAP center round important 90%>
-**Limitations**:
-  * See the parallel_BWA wrapper
-  * Also: The script will only use the ''m2_gpu'' partition and therefore needs an account with the ''m2_'' prefix.
 +**Considerations**:
 +  * See the [[software:topical:lifescience:ngs_read_mapping_tools#standard_mappers|"standard" Mappers]]
 +  * Also: The script will only use the ''m2_gpu'' partition and therefore needs an account with the ''m2_'' prefix((This is because development to support the wild "zoo" of hardware and partition settings is hardly worth the effort for this software, as tests show that standard bwa (properly mapped) outperforms the GPU version.)).
 </WRAP>
  
Line 148: Line 196:
 About Arguments:
   * ''referencedir'' needs to be the (relative) path to a directory containing an indexed BWA reference. No symbolic links are allowed.
-  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:trimmomatic|trimmomatic module]].
 +  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:qc|quality check module]].
  
  
  • Last modified: 2018/12/15 07:32
  • by meesters