Top 14 Genome Open-Source Projects

deepvariant

5 3,080 9.1 Python

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
mosdepth

3 656 6.3 Nim

fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing

Project mention: Calculating Average Coverage or Read Depth for a Sequence (WES) | /r/bioinformatics | 2023-06-24

DNABERT

1 543 3.1 Python

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Project mention: [D] New to DNABERT | /r/MachineLearning | 2023-11-03

If I want to get started, they said it's optional to pre-train (so you can skip to step 3). This is where I got tripped up: "Note that the sequences are in kmer format, so you will need to convert your sequences into that." From what I understand, you need to do this so that all of the sequences are the same length? So kmer=6 means all of the sequences are length 6? Someone suggested that I take the first nucleotide in the promoter and grab 3 nucleotides before and 3 nucleotides after (+/-3 bases). I don't think that's how the kmer thing works though? I tried replicating how I think it works down below (I got confused on the last row of the 'after' df). Please correct me if I'm wrong!
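
On the k-mer question above: as far as I understand DNABERT's input format, k-mer conversion does not trim every sequence to length k. It slides a window of size k along the sequence one base at a time and joins the overlapping k-mers with spaces, so a sequence of length n yields n - k + 1 tokens. A minimal Python sketch of that idea follows. `seq_to_kmers` is a hypothetical helper name used for illustration, and the stride of 1 is an assumption, not something confirmed from the DNABERT repository.

```python
def seq_to_kmers(sequence: str, k: int = 6) -> str:
    """Split a DNA sequence into overlapping k-mers joined by spaces.

    Assumes a stride of 1 (each window starts one base after the previous one).
    Returns an empty string when the sequence is shorter than k.
    """
    sequence = sequence.upper()
    if len(sequence) < k:
        return ""
    # One window per start position: 0 .. len(sequence) - k
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    return " ".join(kmers)


# A 6-base sequence with k=3 gives 6 - 3 + 1 = 4 overlapping tokens.
print(seq_to_kmers("ATGGCT", k=3))  # ATG TGG GGC GCT
```

Under this reading, k=6 sets the length of each token, not the length of the whole sequence, so sequences of different lengths simply produce different numbers of tokens.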

Augustus

1 264 5.0 C++

Genome annotation with AUGUSTUS (by Gaius-Augustus)
masurca

1 229 3.5 M4
NanoSim

1 213 5.6 Python

Nanopore sequence read simulator
eager

1 124 7.6 Nextflow

A fully reproducible and state-of-the-art ancient DNA analysis pipeline
pyrodigal

1 122 8.7 Cython

Cython bindings and Python interface to Prodigal, an ORF finder for genomes and metagenomes. Now with SIMD!

Project mention: DNA to amino acid sequence? | /r/bioinformatics | 2023-06-19

True! I believe bakta relies on this python implementation of prodigal for translation https://github.com/althonos/pyrodigal

OSGenome

2 107 3.2 Python

An Open Source Web Application for Genetic Data (SNPs) using 23AndMe and Data Crawling Technologies
PGA

1 46 0.0 Perl

Plastid Genome Annotator
hmep

0 7 0.0 Haskell

Haskell Multi Expression Programming implemented with the focus on speed
bioinformatics

1 3 6.5 Python

Bioinformatic algorithms for the UCLA Bioinformatics Specialization (by ashinzekene)
Genome

1 3 2.7 Python

Genome Network Ala Neural Network (by DanShai)
DIF

2 1 3.6 Jupyter Notebook

"DNA IMAGE FOOTPRINT" The main idea is to convert a DNA sequence to an image to find any related sequences in the image with common algorithms (by MahdiKarimian)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Genome related posts

Calculating Average Coverage or Read Depth for a Sequence (WES)
1 project | /r/bioinformatics | 24 Jun 2023

How to get a DNA report using AncestryDNA/23andme raw data without uploading to another server?
1 project | /r/bioinformatics | 28 Apr 2023

Why is bedtools genomecov giving me a blank output file?
1 project | /r/bioinformatics | 7 Apr 2023

Nanopore long read assembly help!
2 projects | /r/bioinformatics | 10 May 2022

Cross-species functional annotation?
2 projects | /r/bioinformatics | 15 Feb 2022

Generate image footprint from DNA sequence for sequence matching
1 project | news.ycombinator.com | 18 Nov 2021

Raw nanowire sequencer data
2 projects | /r/bioinformatics | 26 Jun 2021

Index

What are some of the best open-source Genome projects? This list will help you:

Rank	Project	Stars
1	deepvariant	3,080
2	mosdepth	656
3	DNABERT	543
4	Augustus	264
5	masurca	229
6	NanoSim	213
7	eager	124
8	pyrodigal	122
9	OSGenome	107
10	PGA	46
11	hmep	7
12	bioinformatics	3
13	Genome	3
14	DIF	1
