Bioinformatics
Ampliconnoise
AmpliconNoise is a collection of programs for the removal of noise from 454 sequenced PCR amplicons. It involves two steps: the removal of noise from the sequencing itself and the removal of PCR point errors. This project also includes the Perseus algorithm for chimera removal. [AmpliconNoise Website - 22 March 2013] | ||
Available versions and module name 1.27 (AmpliconNoise-1.27 or AmpliconNoise) | Default version available 1.27 | Official Website https://code.google.com/p/ampliconnoise/ Download from: http://ampliconnoise.googlecode.com/files/AmpliconNoiseV1.27.tar.gz |
ARB
The ARB software is a graphically oriented package comprising various tools for sequence database handling and data analysis. A central database of processed (aligned) sequences and any type of additional data linked to the respective sequence entries is structured according to phylogeny or other user defined criteria. [ARB Website - 13 December 2013] | ||
Available versions and module name 5.19 (arb-5.19 or arb) | Default version available 5.19 | Official Website |
BLAST (legacy BLAST from NCBI, not BLAST+)
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families. [Blast Website - 22 March 2013] | ||
Available versions and module name 2.2.22 (blast-2.2.22 or blast) | Default version available 2.2.22 | Official Website http://blast.ncbi.nlm.nih.gov/ Download from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/2.2.22/blast-2.2.22-ia32-linux.tar.gz |
Blast+
BLAST+ is a new suite of BLAST tools that utilizes the NCBI C++ Toolkit. The BLAST+ applications have a number of performance and feature improvements over the legacy BLAST applications. For details, please see the BLAST+ user manual and the article in BMC Bioinformatics (PubMed link). [Blast Website - January 2015] | ||
Available versions and module name 2.2.28 (blast+_2.2.28 or blast+) 2.6.0 (blast+_2.6.0) | Default version available not loaded by default | Official Website http://blast.ncbi.nlm.nih.gov/ Download from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.2.28+-x64-linux.tar.gz |
Bowtie2
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes. [Bowtie2 Website - January 2015] | ||
Available versions and module name 2.2.4 (bowtie2 or bowtie2) | Default version available 2.2.4 | Official Website http://bowtie-bio.sourceforge.net/bowtie2/index.shtml |
CD-HIT
CD-HIT was originally a protein clustering program. The main advantage of this program is its ultra-fast speed. It can be hundreds of times faster than other clustering programs, for example, BLASTCLUST. Therefore it can handle very large databases, like NR. The 1st version of this program, CD-HI, was published and released in 2001. The 2nd version, called CD-HIT, was published in 2002 with significant improvements. Since 2004, CD-HIT has been hosted at bioinformatics.org as an open source project. [CD-HIT User Guide (/apps/software/cd-hit/3.1.1/cd-hit-user-guide.txt) - 22 March 2013] | ||
Available versions and module name 3.1.1 (cd-hit-3.1.1 or cd-hit) | Default version available 3.1.1 | Download from: http://www.bioinformatics.org/download/cd-hit/cd-hit-2007-0131.tar.gz |
CDBTools
CDB (Constant DataBase) indexing and retrieval tools for FASTA files. cdbtools contains "platform independent file-based hashing tools (cdbfasta and cdbyank) that can be used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files." [cdb README (/apps/software/cdbtools/README) - 22 March 2013] | ||
Available versions and module name Version unknown (cdbtools) | Default version available Version unknown | Downloaded from: ftp://occams.dfci.harvard.edu/pub/bio/tgi/software/cdbfasta/cdbfasta.tar.gz |
Chimeraslayer
Available versions and module name Version unknown (cdbtools) | Default version available Version unknown | Downloaded from: |
Cufflinks
Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols. Cufflinks was originally developed as part of a collaborative effort between the Laboratory for Mathematical and Computational Biology, led by Lior Pachter at UC Berkeley, Steven Salzberg’s computational genomics group at the Institute of Genetic Medicine at Johns Hopkins University, and Barbara Wold’s lab at Caltech. The project is now maintained by Cole Trapnell’s lab at the University of Washington. [Cufflinks Website - January 2015] | ||
Available versions and module name 2.2.1 (cufflinks-2.2.1 or cufflinks) | Default version available 2.2.1 | Website: |
Cytoscape
Cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data. Although Cytoscape was originally designed for biological research, now it is a general platform for complex network analysis and visualization. [http://www.cytoscape.org/what_is_cytoscape.html - June 2014] | ||
Available versions and module name 3.1.1 (cytoscape-3.1.1 or cytoscape) | Default version available 3.1.1 | Website: |
FASTQC
FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis. [FastQC Website - 17 December 2014] | ||
Available versions and module name 0.11.2 (fastqc-0.11.2 or fastqc) | Default version available 0.11.2 | Website: |
Fasttree
FastTree -- inferring approximately-maximum-likelihood trees for large multiple sequence alignments. [FastTree Source File (//apps/software/FastTree/2.1.3/FastTree-2.1.3.c) - 22 March 2013] | ||
Available versions and module name 2.1.3 (FastTree-2.1.3 or FastTree) | Default version available 2.1.3 | Downloaded from: |
Fastx-Toolkit
The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. The main processing of such FASTA/FASTQ files is mapping (aka aligning) the sequences to reference genomes or other databases using specialized programs. Examples of such mapping programs are Blat, SHRiMP, LastZ, MAQ and many others. The FASTX-Toolkit tools perform some of the preprocessing tasks that come before mapping. [Fastx-Toolkit - January 2015] | ||
Available versions and module name 0.0.13 (fastx-toolkit-0.0.13 or fastx-toolkit) | Default version available 0.0.13 | Website: |
Gepard
Gepard (German: "cheetah", Backronym for "GEnome PAir - Rapid Dotter") allows the calculation of dotplots even for large sequences like chromosomes or bacterial genomes. [Gepard Website - January 2015] | ||
Available versions and module name 1.30 (gepard-1.30 or gepard) | Default version available 1.30 | Official Website http://www.helmholtz-muenchen.de/icb/software/gepard/index.html |
Hmmer
"HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST." [HMMER Website - 22 March 2013] | ||
Available versions and module name 3.0 (hmmer-3.0 or hmmer) | Default version available 3.0 | Official Website Download from: |
IGV
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations. [IGV Website - January 2015] | ||
Available versions and module name 2.3.40 (igv-2.3.40 or igv) | Default version available 2.3.40 | Official Website |
Infernal
Available versions and module name 1.0.2 (infernal-1.0.2 or infernal) | Default version available 1.0.2 | Downloaded from: ftp://selab.janelia.org/pub/software/infernal/infernal.tar.gz |
Kraken
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm. [Kraken Website - September 2016] | ||
Available versions and module name 0.10.5-beta (kraken-0.10.5-beta or kraken) | Default version available 0.10.5-beta | Official Website |
Meta Velvet
Meta Velvet : An extension of Velvet assembler to de novo metagenome assembly from short sequence reads [Meta Velvet Website - 22 March 2013] | ||
Available versions and module name 1.2.02 (metavelvet-1.2.02 or metavelvet) | Default version available 1.2.02 | Official Website |
Mothur
mothur "seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community." mothur "added the functionality of a number of other popular tools including s-libshuff, TreeClimber (i.e. the parsimony test), UniFrac, distance calculation, visualization tools, a NAST-based aligner, and many other features." [Mothur Website - 22 March 2013] | ||
Available versions and module name 1.25.0 (mothur-1.25.0) 1.40.0 (mothur-1.40.0) 1.42.0 (mothur-1.42.0 or mothur) | Default version available 1.42.0 | Official Website Download from: |
Muscle
MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW. MUSCLE can align hundreds of sequences in seconds. Most users learn everything they need to know about MUSCLE in a few minutes; only a handful of command-line options are needed to perform common alignment tasks. [MUSCLE Website - 22 March 2013] | ||
Available versions and module name 3.8.425 (muscle-3.8.425 or muscle) | Default version available 3.8.425 | Official Website Download from: |
NGOPT
ngopt - de novo assembly & analysis of Illumina sequence data, including the A5 pipeline, A5-miseq, tools to evaluate assembly quality, and scripts to facilitate data submission to NCBI and the RAST annotation system. [ngopt Website - 7 December 2015] | ||
Available versions and module name 20150522 (ngopt-20150522 or ngopt) | Default version available 20150522 | Official Website |
Parsinsert
ParsInsert is a C++ implementation of Parsimonious Insertion. The algorithm exploits the knowledge provided by publicly available curated phylogenetic trees to efficiently insert new sequences and infer their taxonomic classification. The ParsInsert placement of new sequences in the tree is deterministic which allows for distributed processing to quickly handle millions of reads. [ParsInsert Website - 22 March 2013] | ||
Available versions and module name 1.04 (ParsInsert-1.04 or ParsInsert) | Default version available 1.04 | Official Website http://sourceforge.net/projects/parsinsert/files/ Download from: http://downloads.sourceforge.net/project/parsinsert/ParsInsert.1.04.tgz |
Pear - Paired-End Read Merger
PEAR is an ultrafast, memory-efficient and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory. PEAR evaluates all possible paired-end read overlaps without requiring the target fragment size as input. In addition, it implements a statistical test for minimizing false-positive results. Together with a highly optimized implementation, it can merge millions of paired-end reads within a couple of minutes on a standard desktop computer. [Pear Website - 08 July 2015] | ||
Available versions and module name 0.9.6 (pear-0.9.6 or pear) | Default version available 0.9.6 | Official Website |
Pplacer
Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. Pplacer is designed to be fast, to give useful information about uncertainty, and to offer advanced visualization and downstream analysis. [Pplacer Website - 22 March 2013] | ||
Available versions and module name 1.1 (pplacer-1.1 or pplacer) | Default version available 1.1 | Official Website http://matsen.fhcrc.org/pplacer/ Download from: http://matsen.fhcrc.org/pplacer/builds/pplacer-v1.1-Linux.tar.gz |
QIIME
QIIME (pronounced "chime") stands for Quantitative Insights Into Microbial Ecology. QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data). QIIME takes users from their raw sequencing output through initial analyses such as OTU picking, taxonomic assignment, and construction of phylogenetic trees from representative sequences of OTUs, and through downstream statistical analysis, visualization, and production of publication-quality graphics. QIIME has been applied to single studies based on billions of sequences from thousands of samples. [QIIME Website - 22 March 2013] | ||
Available versions and module name 1.6.0 (qiime-1.6.0 or qiime) 1.7.0 (qiime-1.7.0) 1.8.0 (qiime-1.8.0) 2018.4 (via singularity image qiime2-2018.04.img) 2018.8 (via singularity image qiime2-2018.8.simg) 2019.1 (via singularity image qiime2-2019.1.simg) | Default version available 1.6.0 | Official Website |
RAXML
Available versions and module name 7.3.0 (raxml-7.3.0 or raxml) | Default version available 7.3.0 | Downloaded from: ftp://thebeast.colorado.edu/pub/QIIME-v1.5.0-dependencies/stamatak-standard-RAxML-5_7_2012.tgz |
RRNAselector
rRNASelector: a computer program for selecting rRNA genes from metagenomic and metatranscriptomic sequences. This Java program is designed to be used for selecting prokaryotic rRNA sequences from metagenomic and metatranscriptomic shotgun libraries. The rRNASelector selects both bacterial and archaeal rRNA genes (5S, 16S and 23S). The remaining sequences can be used for non-rRNA-based bioinformatic analysis (e.g. homology search against protein databases). Additionally, for further rRNA-based analysis, the resultant sequences are trimmed to contain only rRNA coding regions. [rRNASelector Website - 17 December 2014] | ||
Available versions and module name No module available | Default version available No module available | Website: |
RSEM
RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. [RSEM README (http://deweylab.github.io/RSEM/README.html) - 07 February 2017] | ||
Available versions and module name 1.3.0 (rsem-1.3.0) 1.3.1 (rsem-1.3.1 or rsem) | Default version available 1.3.1 | Website: |
RTAX
RTAX: Rapid and accurate taxonomic classification of short paired-end sequence reads from the 16S ribosomal RNA gene. [RTAX README (/apps/software/RTAX/0.983/README) - 22 March 2013] | ||
Available versions and module name 0.983 (rtax-0.983 or rtax) | Default version available 0.983 | Downloaded from: |
Sam Tools
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format. [SAM Tools Website - January 2015] | ||
Available versions and module name 1.1 (samtools-1.1 or samtools) | Default version available 1.1 | Official Website http://samtools.sourceforge.net/ Downloaded from: |
SAMStat
Displaying sequence statistics for next generation sequencing [SAMStat Website - January 2015] | ||
Available versions and module name 1.5 (samstat-1.5 or samstat) | Default version available 1.5 | Website: |
SourceTracker
SourceTracker is designed to predict the source of microbial communities in a set of input samples (i.e., the sink samples). [QIIME SourceTracker Website - 5 December 2013] | ||
Available versions and module name 0.9.5 (sourcetracker-0.9.5 or sourcetracker) | Default version available 0.9.5 | QIIME SourceTracker Information: http://qiime.org/tutorials/source_tracking.html Downloaded from: http://sourceforge.net/projects/sourcetracker/files/latest/download |
Spades
SPAdes – St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines. | ||
Available versions and module name 3.11.1 (spades-3.11.1 or spades) | Default version available 3.11.1 | Website: http://cab.spbu.ru/software/spades/ Downloaded from: http://cab.spbu.ru/files/release3.11.1/SPAdes-3.11.1-Linux.tar.gz |
Tophat
TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyses the mapping results to identify splice junctions between exons. TopHat is a collaborative effort among Daehwan Kim and Steven Salzberg in the Center for Computational Biology at Johns Hopkins University, and Cole Trapnell in the Genome Sciences Department at the University of Washington. TopHat was originally developed by Cole Trapnell at the Center for Bioinformatics and Computational Biology at the University of Maryland, College Park. [TopHat Website - January 2015] | ||
Available versions and module name 2.0.13 (tophat-2.0.13 or tophat) | Default version available 2.0.13 | Website: |
Trimmomatic
Trimmomatic: A flexible read trimming tool for Illumina NGS data. [Trimmomatic Website - 17 December 2014] | ||
Available versions and module name No module available | Default version available No module available | Website: |
TrinityRNASEQ
Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. [Trinityrnaseq Website - January 2015] | ||
Available versions and module name 2.0.0 (trinityrnaseq-2.0.0 or trinityrnaseq) | Default version available 2.0.0 | Website: |
Uclust
UCLUST v1.2.22q (C) Copyright 2009-10 Robert C. Edgar - Licensed for use in PyNAST and QIIME. | ||
Available versions and module name 1.2.22q (uclust-1.2.22q or uclust) | Default version available 1.2.22q | Downloaded from: |
Uproc
With rapidly increasing volumes of biological sequence data the functional analysis of new sequences in terms of similarities to known protein families challenges classical bioinformatics. The ultrafast protein classification (UProC) toolbox implements a novel algorithm ("Mosaic Matching") for large-scale sequence analysis and is now available in terms of an open source C library. UProC is up to three orders of magnitude faster than profile-based methods and achieved up to 80% higher sensitivity on unassembled short reads (100 bp) from simulated metagenomes. UProC does not depend on a multiple alignment of family-specific sequences. Therefore, in addition to the protein domain classification according to the Pfam database, UProC can, in principle, also provide the detection of KEGG Orthologs. We provide a precompiled database for KEGG Ortholog classification which we applied to the prediction of functional repertoires from short reads (see below). [UProC Website - 25 November 2016] | ||
Available versions and module name 1.2.0 (uproc-1.2.0 or uproc) | Default version available 1.2.0 | Official Website Downloaded from: |
USEARCH
USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST. [usearch Website - 22 March 2013] | ||
Available versions and module name 5.2.236 (usearch-5.2.236) 5.2.32 (usearch-5.2.32) 6.0.307 (usearch-6.0.307) 7.0.1090 (usearch or usearch-7.0.1090) | Default version available 7.0.1090 | Official Website http://www.drive5.com/usearch/ Downloaded from: |
Velvet
Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom. Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs. [Velvet Website - 17 December 2014] | ||
Available versions and module name 1.2.10 (velvet or velvet-1.2.10) | Default version available 1.2.10 | Official Website |
Velvet Optimiser
VelvetOptimiser is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler. [Velvet Optimiser Website - 17 December 2014] | ||
Available versions and module name 2.2.5 (velvetoptimiser or velvetoptimiser-2.2.5) | Default version available 2.2.5 | Official Website |
If, for whatever reason, you would like to use an older version of a compiler or of any other available software, you can "unload" the current module and load the older version. See the Software Module Information webpage.
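As a rough illustration of the unload-then-load step, the sketch below wraps the two commands in a small shell function. The function name `switch_module` is hypothetical and not something this site provides; it assumes the standard `module` command from the environment modules system is defined in your shell, and it uses the mothur module names from the table above only as an example.

```shell
# Sketch only: switch_module is a hypothetical helper, not a site-provided command.
# It assumes the environment modules "module" command is defined in the current shell.
switch_module() {
    current="$1"   # module to unload, e.g. mothur
    wanted="$2"    # module to load instead, e.g. mothur-1.25.0

    # Fail early if the module command is not available.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found in this shell" >&2
        return 1
    fi

    # Unload the current version, then load the requested one.
    module unload "$current" && module load "$wanted"
}
```

For example, `switch_module mothur mothur-1.25.0` would swap the default mothur (1.42.0) for version 1.25.0. Running `module list` afterwards should show which modules are currently loaded.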
Software list
- Glasgow Haskell Compiler
- GNU Bison
- GNU Compiler
- GNU Fortran Compiler
- Go Compiler
- Intel C and C++ Compiler
- Intel Fortran Compiler
- Mono C# Compiler and associated libraries
- Apache Maven
- GNU Emacs
- IntelliJ IDEA
- RStudio
- Tex Live
- Tex Info
- ANSYS (Fluent and Mechanical)
- ABAQUS
- AVL Fire
- Comsol Multiphysics
- Circos
- ImageJ
- ImageMagick
- PLPlot
- Xfig
- Perl
- Python (with many Modules Installed)
- GDAL
- GEOS
- Panoply
- Mathematica
- Grid Mathematica
- Matlab
- Matlab-Simulink
- Octave
- OpenBUGS
- R
- SPSS (by IBM)
- Java
- APSIM
- Boost C++ Libraries
- Expat XML Parser
- FFTW
- GMP (GNU Multiple Precision Arithmetic Library)
- GNU Scientific Library (GSL)
- HDF5
- HDF-EOS5
- Intel Math Kernel Libraries (Intel® MKL)
- libbz2
- libcurl
- libjpeg
- libpng
- MPI
- OpenBLAS
- PCRE (Perl Compatible Regular Expressions)
- szip
- tcl
- tk
- UDUNITS