Tool Name | Description |
Adapter Removal | Removes adapter sequences and trims low-quality bases from the 3' end of reads. Overlapping paired-end reads can be merged into consensus sequences, and adapter sequences can be identified for paired-end data if not known. |
AfterQC | Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data. |
bcl2fastq | bcl2fastq can be used to both demultiplex data and convert BCL files to FASTQ file formats for downstream analysis. |
BioBloom Tools | BioBloom Tools assigns reads to different references using bloom filters. This is faster than alignment and can be used for contamination detection. |
Cluster Flow | Cluster Flow is a simple and flexible bioinformatics pipeline tool. |
Cutadapt | Cutadapt is a tool to find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads. |
ClipAndMerge | Adapter clipping and read merging for ancient DNA analysis. |
FastQ Screen | FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect. |
FastQC | FastQC is a quality control tool for high throughput sequence data, written by Simon Andrews at the Babraham Institute in Cambridge. |
Fastp | An ultra-fast all-in-one FASTQ preprocessor |
FLASh | FLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from NGS data. |
Flexbar | Flexible barcode and adapter removal |
InterOp | The Illumina InterOp libraries are a set of common routines used for reading and writing InterOp metric files. These metric files are binary files produced during a run providing detailed statistics about a run. In a few cases, the metric files are produced after a run during secondary analysis (index metrics) or for faster display of a subset of the original data (collapsed quality scores). |
Jellyfish | JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA. |
KAT | The K-mer Analysis Toolkit (KAT) contains a number of tools that analyse and compare K-mer spectra. |
leeHom | leeHom is a program for the Bayesian reconstruction of ancient DNA. |
minionqc | Quality control for long reads from ONT (Oxford Nanopore Technologies) sequencing. |
SeqyClean | SeqyClean is a comprehensive preprocessing software application for NGS reads. |
Skewer | Skewer is an adapter trimming tool specially designed for processing next-generation sequencing (NGS) paired-end sequences. |
SortMeRNA | SortMeRNA is a tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and metagenomic data. |
Trimmomatic | Trimmomatic is a flexible read trimming tool for Illumina NGS data. |
BISCUIT | BISCUIT is a software tool suite for analyzing bisulfite-converted DNA sequencing. |
Bismark | Bismark is a tool to map bisulfite converted sequence reads and determine cytosine methylation states. |
Bowtie 1 | Bowtie 1 is an ultrafast, memory-efficient short read aligner. |
Bowtie 2 | Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. |
BBMap | BBMap is a suite of pre-processing, assembly, alignment, and statistics tools for DNA/RNA sequencing reads. |
HiCUP | HiCUP (Hi-C User Pipeline) is a tool for mapping and performing quality control on Hi-C data. |
HiC-Pro | HiC-Pro is an optimized and flexible pipeline for Hi-C data processing. |
HISAT2 | HISAT2 is a fast and sensitive alignment program for mapping NGS reads (both DNA and RNA) to reference genomes. |
Kallisto | kallisto is a program for quantifying abundances of transcripts from RNA-Seq data. |
Longranger | A set of analysis pipelines that perform sample demultiplexing, barcode processing, alignment, quality control, variant calling, phasing, and structural variant calling. |
Salmon | Salmon is a tool for quantifying the expression of transcripts using RNA-seq data. |
STAR | STAR is an ultrafast universal RNA-seq aligner. |
TopHat | TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes. |
Bamtools | BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files. |
Bcftools | BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. |
biobambam2 | biobambam2 contains tools for early-stage processing of alignment (BAM) files. |
BUSCO | BUSCO assesses genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs. |
Conpair | Conpair estimates concordance and contamination for tumour–normal pairs. |
DamageProfiler | DNA damage investigation tool for ancient DNA analysis. |
DeDup | Improved duplicate removal for merged/collapsed reads in ancient DNA analysis. |
deepTools | Tools to process and analyze deep sequencing data. |
Disambiguate | Disambiguation algorithm for reads aligned to two species (e.g. human and mouse genomes) from TopHat, HISAT2, STAR or BWA-MEM. |
featureCounts | featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations. |
Fgbio | Fgbio can be used for processing and evaluating data containing UMIs (unique molecular identifiers). |
GATK | Variant discovery in high-throughput sequencing data. |
goleft indexcov | Quickly estimates coverage from a whole-genome BAM index, providing 16KB resolution. This is useful as a quick QC to get coverage values across the genome. |
Hap.py | Hap.py is a set of programs based on htslib to benchmark variant calls against gold standard truth datasets. Som.py output not currently supported. |
HiCExplorer | HiCExplorer addresses the common tasks of Hi-C analysis from processing to visualization. |
HOMER | HOMER is a suite of tools for Motif Discovery and next-gen sequencing analysis. |
HTSeq | HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays. HTSeq-count takes a file with aligned sequencing reads, plus a list of genomic features and counts how many reads map to each feature. |
MACS2 | MACS2 identifies transcription factor binding sites in ChIP-seq data. |
methylQA | methylQA is a methylation sequencing data quality assessment tool. |
mosdepth | Mosdepth performs fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. |
miRTrace | miRTrace, developed by the team of Marc Friedländer (KTH, Sweden), is a quality control software for small RNA sequencing data. |
MTNucRatio | A simple tool to compute mitochondrial to nuclear genome ratios. |
phantompeakqualtools | Computes enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. |
Peddy | Peddy calculates genotype :: pedigree correspondence checks, ancestry checks and sex checks using VCF files. |
Picard | Picard is a set of Java command line tools for manipulating high-throughput sequencing data. |
Preseq | Preseq estimates the complexity of a library, showing how many additional unique reads are sequenced for increasing total read count. |
Prokka | Prokka is a software tool for the rapid annotation of prokaryotic genomes. |
QoRTs | QoRTs is a fast, efficient, and portable toolkit designed to assist in the analysis, QC and data management of RNA-Seq datasets. |
Qualimap | Qualimap is a platform-independent application to facilitate the quality control of alignment sequencing data and its derivatives like feature counts. |
QUAST | A Quality Assessment Tool for Genome Assemblies by the Center for Algorithmic Biotechnology. |
RNA-SeQC | RNA-SeQC is a Java program which computes a series of quality control metrics for RNA-seq data. |
RSEM | RSEM (RNA-Seq by Expectation-Maximization) is a software package for estimating gene and isoform expression levels from RNA-Seq data. |
RSeQC | RSeQC is a package that provides a number of useful modules that can comprehensively evaluate high throughput RNA-seq data. |
Samblaster | Samblaster is a tool to mark duplicates and extract discordant and split reads from SAM files. |
Samtools | Samtools is a suite of programs for interacting with high-throughput sequencing data. |
Sargasso | Sargasso is a tool to separate mixed-species RNA-seq reads according to their species of origin. |
Sex.DetErrMine | A Python script to calculate the relative coverage of X and Y chromosomes, and their associated error bars, from the depth of coverage at specified SNPs. |
Slamdunk | Slamdunk is a tool to analyze SLAM-Seq data. |
SnpEff | SnpEff is a genetic variant annotation and effect prediction toolbox. It annotates and predicts the effects of variants on genes (such as amino acid changes). |
Supernova | Supernova is a de novo genome assembler for 10X Genomics linked-reads. |
Stacks | Stacks is software for analyzing restriction enzyme-based data (e.g. RAD-seq). |
THeTA2 | THeTA2 estimates tumour purity and clonal / subclonal copy number. |
VCFTools | VCFTools is a program for working with and reporting on VCF files. |
VerifyBAMID | VerifyBamID checks whether reads match known genotypes or are contaminated as a mixture of two samples. |