software:topical:lifescience:ngs_read_mapping_tools

Differences

This shows you the differences between two versions of the page.

software:topical:lifescience:ngs_read_mapping_tools [2018/12/18 09:13]
meesters
software:topical:lifescience:ngs_read_mapping_tools [2019/10/24 15:48] (current)
meesters [BarraCuda]
Line 24: Line 24:
 See [[:​software:​topical:​lifescience:​ngs_read_mapping_tools#​gpu-based|below for a wrapper script]] to ease your workflow. See [[:​software:​topical:​lifescience:​ngs_read_mapping_tools#​gpu-based|below for a wrapper script]] to ease your workflow.
  
 +==== Minimap2 ====
 +
 +[[https://github.com/lh3/minimap2|Minimap2]] is intended as a replacement for ''bwa mem''. Modules are installed under:
 +
 +''​bio/​minimap2''​
  
  
Line 32: Line 37:
 ''​bio/​SeqAn/<​version>''​ ''​bio/​SeqAn/<​version>''​
  
-You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]], eventually ((not yet)).
+You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
  
Line 38: Line 43:
  
 [[https://​www.nature.com/​articles/​nmeth.1923|Bowtie2]] is a well known read aligner with a focus on gapped alignments. [[https://​www.nature.com/​articles/​nmeth.1923|Bowtie2]] is a well known read aligner with a focus on gapped alignments.
 +
 +Module(s) can be found at:
 +
 +''​bio/​Bowtie2/<​version>''​
 +
 +You can find a wrapper to ease your workflow, [[software:​topical:​lifescience:#​standard_mappers|below]].
  
 ==== STAR ==== ==== STAR ====
  
-<WRAP center round todo 65%>
-More info soon-ish.
-Particularly, a wrapper module is forthcoming. As STAR can work with a shared memory option, the wrapper is fundamentally different to that for other tools.
-</WRAP>
+[[https://www.ncbi.nlm.nih.gov/pubmed/23104886|STAR]] is a well known mapping tool for RNA-Seq data.
+
+Module(s) can be found at:
+
+''bio/STAR/<version>''
+
+You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
 ==== segemehl ==== ==== segemehl ====
Line 58: Line 72:
 ''​bio/​segemehl/​0.2.0-foss-2018a''​ ''​bio/​segemehl/​0.2.0-foss-2018a''​
  
 +==== TopHat ====
 +
 +[[https://​ccb.jhu.edu/​software/​tophat/​index.shtml|TopHat]] is a fast splice junction mapper for RNA-Seq reads.
 +
 +Module can be found at:
 +
 +''​bio/​TopHat/<​version>''​
 +
 +
 +<WRAP center round info 90%>
 +This program is not yet incorporated into the wrapping module.
 +</​WRAP>​
 ==== yara ==== ==== yara ====
  
Line 105: Line 131:
  
   * ''​referencedir''​ needs to be the (relative) path to a directory containing an indexed BWA reference   * ''​referencedir''​ needs to be the (relative) path to a directory containing an indexed BWA reference
-  * ''​inputdir''​ needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''​unpaired''​ are ignored; this is to support preprocessing with the [[software:​topical:​lifescience:​trimmomatic|trimmomatic ​module]].+  * ''​inputdir''​ needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''​unpaired''​ are ignored; this is to support preprocessing with the [[software:​topical:​lifescience:​qc|quality check module]].
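
To preview which files would be considered as input, the exclusion rule described above can be approximated with ''find''. This is only a sketch: the helper name ''list_inputs'' is hypothetical, the ''.fastq'' suffix is an assumption, and the wrapper's actual file discovery may differ.

```bash
# Hypothetical helper: list FASTQ files below a directory, skipping every
# file or subdirectory whose path contains the string "unpaired".
# Caveat: the test applies to the whole path, so a base directory that
# itself contains "unpaired" in its name would exclude everything.
list_inputs() {
  local dir=$1
  find "$dir" -type f -name '*.fastq' ! -path '*unpaired*' | sort
}
```

Calling ''list_inputs /some/path/to/your/data'' prints one candidate file per line, which makes it easy to check the expected number of inputs before submitting a job.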
  
 The options: The options:
Line 115: Line 141:
   * ''-p,--partition'', the default is ''nodeshort'' or ''parallel'' on Mogon2, no smp-partition should be chosen.   * ''-p,--partition'', the default is ''nodeshort'' or ''parallel'' on Mogon2, no smp-partition should be chosen.
   * ''​-o,​--outdir''​ output directory path (default is the current working directory)   * ''​-o,​--outdir''​ output directory path (default is the current working directory)
 +  * ''​--tag''​ optional tag/prefix for logfiles and directories
 +  * ''--groups'' to provide a list of read group tags (len(groups) must equal the number of files)
   * ''​--single''​ (no arguments) to evaluate single end data   * ''​--single''​ (no arguments) to evaluate single end data
   * ''​--args''​ to supply additional flags, e. g. ''​--args="​-l 1024 -n 0.02"''​ for BWA - note the quotation marks, they are necessary.   * ''​--args''​ to supply additional flags, e. g. ''​--args="​-l 1024 -n 0.02"''​ for BWA - note the quotation marks, they are necessary.
Line 122: Line 150:
   * Per input tuple (paired sequencing data, only) a BAM file with the prefix of the input will be written. In the case of single end data, there will be one output per input, only.   * Per input tuple (paired sequencing data, only) a BAM file with the prefix of the input will be written. In the case of single end data, there will be one output per input, only.
  
-<WRAP center round info 90%>
-//**About injecting group tags:**//
+=== Generating Read Group Tags ===
+
  
-As shown, arguments can be supplied with the ''--args'' argument to ''MapperWrapper''. In order to be recognized as a whole string it needs to be enclosed in quotations, e.g.
+Read group tags can be inserted with the ''--groups'' flag((From version 0.6 onward.)). The tags are supplied as a list on the command line. Example code to generate a tag list for consecutively ordered tags would be:
  
 <code bash> <code bash>
-# for bwa
---args="-R '@RG\tID:ABC\tLB:lb\tPL:ILLUMINA\tPM:HISEQ\tSM:NA12878'"
-# for yara
---args="--rg '@RG\tID:ABC\tLB:lb\tPL:ILLUMINA\tPM:HISEQ\tSM:NA12878'"
+# defining the input directory appropriately in a master script:
+inputdir=/some/path/to/your/data # assuming '_R1' defines the forward reads in a paired end scenario
+
+# a template - may deviate from project to project
+template="@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+"
+# the tag list to be generated
+tags=""
+# number of samples - this snippet could be integrated in a script
+nsamples=$(find "$inputdir" -name '*_R1*.fastq' | grep -v unpaired | wc -l)
+# now the actual generation:
+for ((i=1; i <= $nsamples; i++)); do
+  tags="$tags $(sed -e "s/+ID+/$i/g" <<< "$template")"
+done
 </​code>​ </​code>​
- 
-</​WRAP>​ 
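
The tag generation can also be wrapped in a small function, which makes it easier to reuse and to verify that the number of tags matches the number of samples. The following is a sketch only: the function name ''make_tags'' is hypothetical, and it uses bash parameter expansion in place of ''sed''. Unlike a loop that always prepends a space, it emits the list without a leading blank.

```bash
# Hypothetical helper: expand a read group template once per sample.
# Every occurrence of the placeholder +ID+ is replaced by the running index.
make_tags() {
  local template=$1 nsamples=$2 tags="" i
  for ((i = 1; i <= nsamples; i++)); do
    # add a separating space only when the list is not empty
    tags+="${tags:+ }${template//+ID+/$i}"
  done
  printf '%s\n' "$tags"
}

# single quotes keep the \t sequences literal
template='@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+'
make_tags "$template" 2
```

The result can be stored with ''tags=$(make_tags "$template" "$nsamples")''. How the list is then handed to ''--groups'' should be checked against the wrapper's own help output.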
  
  
Line 162: Line 196:
 About Arguments: About Arguments:
   * ''​referencedir''​ needs to be the (relative) path to a directory containing an indexed BWA reference. No symbolic links are allowed.   * ''​referencedir''​ needs to be the (relative) path to a directory containing an indexed BWA reference. No symbolic links are allowed.
-  * ''​inputdir''​ needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''​unpaired''​ are ignored; this is to support preprocessing with the [[software:​topical:​lifescience:​trimmomatic|trimmomatic ​module]].+  * ''​inputdir''​ needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''​unpaired''​ are ignored; this is to support preprocessing with the [[software:​topical:​lifescience:​qc|quality check module]].
  
  
  • software/topical/lifescience/ngs_read_mapping_tools.1545120802.txt.gz
  • Last modified: 2018/12/18 09:13
  • by meesters