Top 15 C++ Bioinformatic Projects
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Project mention: Qurstion about automating trimming process | reddit.com/r/bioinformatics | 2022-05-26
The next version of bwa-mem
Project mention: Anyone use DRAGEN-GATK? | reddit.com/r/bioinformatics | 2022-10-12
If you haven’t heard of it already you may want to check out https://github.com/bwa-mem2/bwa-mem2 which is a faster version of bwa-mem. I’ve been using it for a while now and found it to be quite stable, same results as the original and the speed improvement is nice.
A fast and sensitive gapped read aligner
Ultra-fast and memory-efficient (meta-)genome assembler
Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
Project mention: What's an efficient way to find multiple subsequences in several FASTQs? | reddit.com/r/bioinformatics | 2022-02-08
I’ve got a similar situation. I was implementing the Smith-Waterman algorithm when I figured someone had to have already written a “fast” version of this. I found the edlib package (https://github.com/Martinsos/edlib) which does sequence alignment using Levenshtein distance. Essentially same DP algorithm as your traditional NW or SW only this is a C++ implementation with a Python wrapper. (I’m assuming you’re using Python, could be wrong though). The pertinent aspects of the output of this function contains the distance (dissimilarity) and the location (what index does the alignment start and end). This tool may go a ways to helping your pipeline. You could also look to metagenomic papers for inspiration as this is a problem (find a substring in a huge amount of data) that the community contends with all the time. Kmer based approach may also be useful if you want to attempt the alignment free path. Cheers.
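To make the comment above concrete, here is a minimal pure-Python sketch of the kind of search it describes: the classic dynamic program for Levenshtein distance, run in semi-global ("infix") mode so that the query may match anywhere inside the target. It reports the best distance and the 0-based end positions of the best matches. The function name `infix_edit_distance` is hypothetical, and this is not edlib's implementation; edlib is far faster and is the better choice for real data.

```python
def infix_edit_distance(query: str, target: str):
    """Return (best_distance, end_indices) for query found inside target.

    Semi-global Levenshtein alignment: target characters before and after
    the match cost nothing, so row 0 of the DP table is all zeros.
    end_indices are 0-based, inclusive positions in target.
    """
    m = len(query)
    # Column for the empty target prefix: aligning query[:i] to nothing
    # costs i (every query character is unmatched).
    prev = list(range(m + 1))
    best = prev[m]
    ends = []
    for j, t in enumerate(target):
        curr = [0] * (m + 1)  # curr[0] = 0: a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if query[i - 1] == t else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in target
                curr[i - 1] + 1,     # query character left unmatched
            )
        if curr[m] < best:
            best, ends = curr[m], [j]
        elif curr[m] == best:
            ends.append(j)
        prev = curr
    return best, ends


if __name__ == "__main__":
    print(infix_edit_distance("ACGT", "TTACGTTT"))  # exact hit
    print(infix_edit_distance("ACGT", "TTACCTTT"))  # one substitution
```

This sketch runs in time proportional to the product of the two lengths and keeps only two columns in memory. edlib's Python binding offers a comparable infix search through `edlib.align` with `mode="HW"`; its README documents the exact fields of the result.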
Haplotype VCF comparison tools
Project mention: Help running hap.py | reddit.com/r/bioinformatics | 2022-11-22
I have been tasked with benchmarking a variant calling pipeline running hap.py as part of my bioinformatics MSc project.
The modern C++ library for sequence analysis. Contains version 3 of the library and API docs.
Bayesian haplotype-based mutation calling (by luntergroup)
Project mention: genotyping tool | reddit.com/r/bioinformatics | 2022-07-13
Check out octopus https://github.com/luntergroup/octopus
An ultrafast memory-efficient short read aligner
Project mention: Burrows–Wheeler Transform | news.ycombinator.com | 2022-09-24
Genomics Extension for SQLite
Project mention: sqlite-zstd: Transparent dictionary-based row-level compression for SQLite - An SQLite extension written in Rust to reduce the database size without losing functionality | reddit.com/r/rust | 2022-07-31
Yes, that is indeed an obviously missing part. I knew about ZIPVFS, but somehow forgot to investigate closer. Probably because I started this project before GenomicsSQLite was a thing (that seems like the best alternative).
Fast, efficient RNA-Seq metrics for quality control and process optimization
Project mention: Tools for strand direction detection RNA-Seq | reddit.com/r/bioinformatics | 2022-10-08
I like to use RNA-SeQC (https://github.com/getzlab/rnaseqc). It shows the percentage of forward/reverse reads that aligned to either the sense or antisense strands. It is also compatible with MultiQC, which is a big plus.
A Low-cost Open-source High-speed Multi-camera Motion Capture System.
A compressed, associative, exact, and weighted dictionary for k-mers.
Project mention: Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2 | reddit.com/r/bioinformatics | 2022-09-08
The paper describing a new tool from our lab has just been published in Genome Biology (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02743-6). Cuttlefish 2 is a tool for efficiently computing the compacted de Bruijn graph (or a spectrum preserving string set) from either raw sequencing reads or from reference genomes. It is quite fast and very memory efficient — for example, we were able to construct the compacted de Bruijn graph on a set of 661K bacterial genomes in 16 hours and 30 minutes using only 48.7GB of RAM. Construction of the compacted de Bruijn graph is an important initial processing step in e.g. genome assembly, and is also important in several other areas such as comparative genomics and as a critical step in building certain types of indices (e.g. [sshash](https://github.com/jermp/sshash)). You can find the cuttlefish 2 software on GitHub [here](https://github.com/COMBINE-lab/cuttlefish), and it can also be installed via Bioconda. We'd be happy to have your feedback!
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Project mention: Has anyone stored/queried VCFs and their variant records in a relational database? | reddit.com/r/bioinformatics | 2022-11-12
Perhaps of interest https://github.com/TileDB-Inc/TileDB-VCF
C++ Bioinformatics related posts
Help running hap.py
1 project | reddit.com/r/bioinformatics | 22 Nov 2022
Anyone use DRAGEN-GATK?
1 project | reddit.com/r/bioinformatics | 12 Oct 2022
Tools for strand direction detection RNA-Seq
2 projects | reddit.com/r/bioinformatics | 8 Oct 2022
Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2
3 projects | reddit.com/r/bioinformatics | 8 Sep 2022
Ask HN: Should I publish my research code?
10 projects | news.ycombinator.com | 14 Jan 2022
[TileDB webinar] Population genomics is a data management problem
3 projects | reddit.com/r/bioinformatics | 20 Oct 2021
Bioinformatics programming language
1 project | reddit.com/r/u_waynerad | 18 Oct 2021
What are some of the best open-source Bioinformatic projects in C++? This list will help you: