software:topical:lifescience:ngs_read_mapping_tools

Differences

This shows you the differences between two versions of the page.

software:topical:lifescience:ngs_read_mapping_tools [2018/12/15 07:32]
meesters
software:topical:lifescience:ngs_read_mapping_tools [2019/08/05 09:29]
meesters fix: broken link
Line 23: Line 23:
  
 See [[:software:topical:lifescience:ngs_read_mapping_tools#gpu-based|below for a wrapper script]] to ease your workflow.
- 
  
  
Line 32: Line 31:
 ''bio/SeqAn/<version>''
  
-You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]], eventually ((not yet)).
+You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
  
Line 38: Line 37:
  
 [[https://www.nature.com/articles/nmeth.1923|Bowtie2]] is a well known read aligner with a focus on gapped alignments.
 +
 +Module(s) can be found at:
 +
 +''bio/Bowtie2/<version>''
 +
 +You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
 ==== STAR ====
  
-<WRAP center round todo 65%>
-More info soon-ish.
-Particularly, a wrapper module is forthcoming. As STAR can work with a shared memory option, the wrapper is fundamentally different to that for other tools.
-</WRAP>
+[[https://www.ncbi.nlm.nih.gov/pubmed/23104886|STAR]] is a well known mapping tool for RNA-Seq data.
+
+Module(s) can be found at:
+
+''bio/STAR/<version>''
+
+You can find a wrapper to ease your workflow, [[software:topical:lifescience:#standard_mappers|below]].
  
 ==== segemehl ====
Line 58: Line 66:
 ''bio/segemehl/0.2.0-foss-2018a''
  
 +==== TopHat ====
 +
 +[[https://ccb.jhu.edu/software/tophat/index.shtml|TopHat]] is a fast splice junction mapper for RNA-Seq reads.
 +
 +Module can be found at:
 +
 +''bio/TopHat/<version>''
 +
 +
 +<WRAP center round info 90%>
 +This program is not yet incorporated into the wrapping module.
 +</WRAP>
 ==== yara ====
  
Line 105: Line 125:
  
   * ''referencedir'' needs to be the (relative) path to a directory containing an indexed BWA reference
-  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:trimmomatic|trimmomatic module]].
+  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:qc|quality check module]].
  
 The options:
Line 115: Line 135:
   * ''-p,--partition'', the default is ''nodeshort'' or ''parallel'' on Mogon2; no smp-partition should be chosen.
   * ''-o,--outdir'' output directory path (default is the current working directory)
 +  * ''--tag'' optional tag/prefix for logfiles and directories
 +  * ''--groups'' to provide a list of read group tags (the number of tags must equal the number of input files)
   * ''--single'' (no arguments) to evaluate single-end data
   * ''--args'' to supply additional flags, e.g. ''--args="-l 1024 -n 0.02"'' for BWA; note that the quotation marks are necessary.
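
Why the quotation marks around the ''--args'' value matter can be seen with a small, generic shell sketch. It does not call the wrapper itself; ''count_args'' is a hypothetical helper that only reports how many arguments the shell hands over:

```shell
#!/bin/sh
# count_args is a hypothetical helper (not part of the wrapper):
# it prints the number of arguments it received.
count_args() {
  printf '%s\n' "$#"
}

# Quoted: the shell passes the whole option string as a single argument.
count_args --args="-l 1024 -n 0.02"

# Unquoted: the shell splits at the spaces and passes four separate arguments.
count_args --args=-l 1024 -n 0.02
```

Without the quotes, only ''-l'' would remain attached to ''--args''; the other words would arrive as separate arguments.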
Line 121: Line 143:
  
   * For paired-end data, one BAM file carrying the prefix of the input is written per input pair. For single-end data, there is one output file per input file.
 +
 +=== Generating Read Group Tags ===
 +
 +Read group tags can be inserted with the ''--groups'' flag((From version 0.6 onward.)). The tags are supplied as a list on the command line. Example code to generate a list of consecutively numbered tags:
 +
 +<code bash>
 +# define the input directory appropriately in a master script:
 +inputdir=/some/path/to/your/data # assuming '_R1' marks the forward reads in a paired-end scenario
 +
 +# a template - may deviate from project to project
 +template="@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+"
 +# the tag list to be generated
 +tags=""
 +# number of samples - this snippet could be integrated in a script
 +nsamples=$(find "$inputdir" -name '*_R1*.fastq' | grep -v unpaired | wc -l)
 +# now the actual generation: replace the +ID+ placeholder with the running number
 +for ((i=1; i <= nsamples; i++)); do
 +  tags="$tags $(sed -e "s/+ID+/$i/g" <<< "$template")"
 +done
 +</code>
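
The same idea can be wrapped in a function and combined with a check of the ''--groups'' requirement that the number of tags equals the number of input files. This is a minimal sketch in plain POSIX shell. The helper names ''make_tags'' and ''check_tag_count'' are hypothetical and not part of the wrapper, and the sample count of 3 is only an example value:

```shell
#!/bin/sh
# Sketch: build one read group tag per sample and verify the count.
# make_tags and check_tag_count are hypothetical helper names.

# Template with the placeholder +ID+, as in the snippet above.
template='@RG\tID:+ID+\tLB:unknown_lb\tPL:illumina\tSM:sample+ID+'

# Print a space-separated list of N tags, numbered 1..N.
make_tags() {
  n=$1
  i=1
  tags=""
  while [ "$i" -le "$n" ]; do
    # Replace every +ID+ placeholder with the running number.
    tag=$(printf '%s\n' "$template" | sed -e "s/+ID+/$i/g")
    tags="${tags:+$tags }$tag"
    i=$((i + 1))
  done
  printf '%s\n' "$tags"
}

# Succeed only if the number of tags equals the expected number of files.
check_tag_count() {
  expected=$1
  shift
  [ "$#" -eq "$expected" ]
}

# Example value: three samples.
tags=$(make_tags 3)
# Word splitting is intended here: each tag becomes one argument.
if check_tag_count 3 $tags; then
  printf 'tag count matches: %s\n' "$tags"
else
  printf 'tag count mismatch\n' >&2
fi
```

The tags keep the literal ''\t'' sequences from the template, because ''printf '%s''' and ''sed'' pass them through unchanged.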
  
  
Line 140: Line 182:
  
 <WRAP center round important 90%>
-**Limitations**:
-  * See the parallel_BWA wrapper
-  * Also: The script will only use the ''m2_gpu'' partition and therefore needs an account with the ''m2_'' prefix.
+**Considerations**:
+  * See the [[software:topical:lifescience:ngs_read_mapping_tools#standard_mappers|"standard" Mappers]]
+  * Also: The script will only use the ''m2_gpu'' partition and therefore needs an account with the ''m2_'' prefix((This is because development to support the wild "zoo" of hardware and partition settings is hardly worth the effort for this software, as tests show that standard bwa (properly mapped) outperforms the GPU version.)).
 </WRAP>
  
Line 148: Line 190:
 About Arguments:
   * ''referencedir'' needs to be the (relative) path to a directory containing an indexed BWA reference. No symbolic links are allowed.
-  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:trimmomatic|trimmomatic module]].
+  * ''inputdir'' needs to be a (relative) path to a directory containing all inputs. Subdirectories and files containing the string ''unpaired'' are ignored; this is to support preprocessing with the [[software:topical:lifescience:qc|quality check module]].
  
  
software/topical/lifescience/ngs_read_mapping_tools.txt · Last modified: 2019/10/24 15:48 by meesters