
software:topical:lifescience:genome_assembly [2019/10/09 21:59]
meesters [SPAdes]
-====== Genome Assembly ====== 
  
-===== Software Options ===== 
- 
-==== Canu ==== 
- 
-[[https://github.com/marbl/canu|canu]] is a fork of the [[http://wgs-assembler.sourceforge.net/wiki/index.php?title=Main_Page|Celera Assembler]], designed for high-noise single-molecule sequencing. It is available as a module: 
- 
-''bio/canu'' 
- 
-=== Quick Start Example - Escherichia coli K12 ===
-This section briefly explains how to submit the [[https://canu.readthedocs.io/en/latest/quick-start.html#assembling-pacbio-or-nanopore-data|quick start example]] from Canu on Mogon I.
- 
-Download the P6-C4 chemistry data set released by Pacific Biosciences with
- 
-<code shell> 
-$ curl -L -o pacbio.fastq http://gembox.cbcb.umd.edu/mhap/raw/ecoli_p6_25x.filtered.fastq 
-</code> 
- 
-to your desired directory and use the following batch script:
-  
-<file bash CanuQuickStart.slurm> 
-#!/bin/bash 
-  
-#SBATCH -J canuTest              # Job name 
-#SBATCH -o canuTestLog.%j.out    # Specify stdout output file (%j expands to jobId) 
-#SBATCH -p nodeshort             # Partition name ('parallel' on Mogon II) 
-#SBATCH -N 1                     # Total number of nodes requested (64 cores per Mogon I node)
-#SBATCH -c 64                    # Total number of cores for the single task
-#SBATCH -t 00:30:00              # Run time (hh:mm:ss) - 0.5 hours
-#SBATCH -A <account>             # Specify allocation to charge against
-##SBATCH --mem=<value>           # optional: remove one '#' if more than the partition default is required
-  
-# Loading modules:  
-module load bio/canu/1.6-foss-2017a 
-  
-# Launch the executable once
-srun canu -p ecoli -d ecoli-pacbio genomeSize=4.8m -pacbio-raw pacbio.fastq maxThreads=64 useGrid=false
-</file> 
- 
-The script will launch one executable with 64 threads, using the entire memory of a node. Canu auto-detects the available computational resources and scales itself accordingly. If needed, Canu can be restricted to a certain amount of memory with ''maxMemory=<amount in GiB>''.
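
If a job should not occupy the full memory of a node, the ''srun'' line of the batch script could be assembled as sketched below. The values of 64 GiB and 32 threads are illustrative assumptions, not site recommendations; ''maxMemory'' and ''maxThreads'' are the Canu options mentioned above. The sketch only prints the command, so it can be inspected before it is placed in the script.

```shell
#!/bin/bash
# Illustrative limits (assumptions): adjust to your allocation.
MAX_MEM_GIB=64   # upper bound for Canu's memory use, in GiB
THREADS=32       # upper bound for Canu's thread count

# Same quick start call as in the batch script, with both limits set.
CANU_CMD="canu -p ecoli -d ecoli-pacbio genomeSize=4.8m -pacbio-raw pacbio.fastq maxThreads=${THREADS} maxMemory=${MAX_MEM_GIB} useGrid=false"

# Dry run: print the line that would go into the batch script.
echo "srun ${CANU_CMD}"
```

When lowering the thread count, the ''#SBATCH -c'' value should presumably be lowered to match, and ''--mem'' set to at least the ''maxMemory'' value. The edited script is then submitted with ''sbatch CanuQuickStart.slurm''.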
- 
-==== MaSuRCA ==== 
- 
-The [[https://github.com/alekseyzimin/masurca|MaSuRCA]] (Maryland Super Read Cabog Assembler) assembler claims to combine the benefits of de Bruijn graph and Overlap-Layout-Consensus assembly approaches.
- 
-It is available as a module:
- 
-''bio/MaSuRCA'' 
- 
-==== Platanus ==== 
- 
-[[http://platanus.bio.titech.ac.jp/|Platanus]] is available as a module: 
- 
-''bio/Platanus'' 
- 
-==== Trinity ==== 
- 
- 
-We do have a module for Trinity: ''bio/Trinity''. 
- 
-However: 
- 
-<WRAP center round alert> 
-[[https://github.com/trinityrnaseq/trinityrnaseq/wiki|Trinity]] is not a single piece of software, but rather three consecutive programs((hence the name, in case you wondered)). As those have different resource demands, running Trinity //as-is// would waste time and fairshare, and the run itself would be throttled. In the LSF times we had an unannounced wrapper script. It stayed unannounced because, after it was established for a particular group, nobody else ever approached us about it.
-
-In case you contemplate using Trinity, please approach us and we will re-establish this script with adaptations for SLURM.
-
-Why do we not do it right now? It would require time, data and cooperation (feedback).</WRAP>
- 
- 
-==== SPAdes ==== 
- 
- 
-[[http://cab.spbu.ru/software/spades/|SPAdes]] is available as a module: 
- 
-''bio/SPAdes'' 
- 
-Be sure to [[https://github.com/ablab/spades|approach the developer(s)]] in case of issues. SPAdes is under heavy development((which is a good thing!)) and the support is friendly and helpful.
- 
- 
-==== Velvet ==== 
- 
-[[https://www.ebi.ac.uk/~zerbino/velvet/|Velvet]]((Development seems to have ceased.)) is available as modules adhering to this name scheme: ''bio/Velvet/<version>-<toolchain>-<kmer-info>''. 
- 
-As the k-mer size is fixed at compile time, it is selected by loading the respective module.
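
To see which k-mer variants are installed, the module tree can be queried before loading anything. This is a minimal sketch, assuming the usual ''module avail'' command of the module system is available in your shell; the fallback message is only there so the snippet also runs on machines without a module system.

```shell
#!/bin/bash
# Module name prefix taken from the naming scheme above.
VELVET_PREFIX="bio/Velvet"

if command -v module >/dev/null 2>&1; then
    # 'module avail' typically reports on stderr, so merge it into stdout.
    module avail "${VELVET_PREFIX}" 2>&1
else
    echo "No 'module' command found; run this on a cluster login node."
fi
```

Afterwards, load the listed module whose ''<kmer-info>'' part matches the k-mer size you need.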
- 
-===== Workflow Integration ===== 
- 
-There is no workflow integration yet. Due to the lack of feedback, support is restricted to [[https://en.wikipedia.org/wiki/Software_as_a_service|SaaS]]. If you would like to change that or need further support, please [[hpc@uni-mainz.de|approach us]] - bearing in mind that support is limited by the available manpower.
  • Last modified: 2019/10/09 21:59
  • by meesters