Open-source projects categorized as Biology

Top 17 Biology Open-Source Projects

  • GitHub repo deepchem

    Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology

    Project mention: How do I transition into bioinformatics from a senior software engineer (14 years of experience)? | 2021-05-23

  • GitHub repo OpenWorm

    Project Home repo for Central Dockerfile and Project-wide issues

    Project mention: I just learned about the OpenWorm project. Does this have any implications for the philosophy of consciousness? | 2021-06-03

  • GitHub repo Thrive

    The main repository for the development of the evolution game Thrive.

    Project mention: A few questions about evolution. | 2021-07-19

    There is a game called Thrive that is trying to make a more complex Spore simulation. It's free to play but currently has only the first, microbial stage.

  • GitHub repo ncbi-genome-download

    Scripts to download genomes from the NCBI FTP servers

    Project mention: Downloading genomes from database via command line FTP | 2021-07-16

    I know you said Ensembl, but if you can live with NCBI, I would suggest

  • GitHub repo Vcflib

    C++ library and cmdline tools for parsing and manipulating VCF files

  • GitHub repo SeqAn

    SeqAn's official repository.

  • GitHub repo Catalyst.jl

    Chemical reaction network and systems biology interface for scientific machine learning (SciML). High performance, GPU-parallelized, and O(1) solvers in open source software

    Project mention: Should I switch over completely to Julia from Python for numerical analysis/computing? | 2021-07-08

    ModelingToolkit.jl adds a different spin on this by noting that what makes a good modeling system isn't top-down design but a system that allows for bottom-up contributions. ModelingToolkit is built on Symbolics.jl which uses OSCAR.jl etc., so every time the symbolics community gets better ModelingToolkit.jl gets better. It connects to the whole SciML ecosystem, so any improvement to any of the SciML interface packages is directly an improvement to ModelingToolkit.jl. ModelingToolkit is made to be a set of composable compiler abstractions called transformations, so anyone can add new packages that do new transformations that improve the ecosystem. One that I really like is MomentClosure.jl, which symbolically transforms stochastic ModelingToolkit models (ReactionSystem) to approximate symbolic ODESystem models of the moments. And there are domain-specific languages like Catalyst.jl being built on the interface to give more ways to build models, which is spurring the biocommunity to make model importers into the symbolic forms, which then feeds more ODE models into the same compiler. JuliaSim is then building on this ecosystem, adding cloud infrastructure that is special-purpose made for doing parallel computations of these models, automatic symbolic model discovery from data, automatic generation of approximate models with machine learning, and tying the Julia Computing compiler team into the web that is building this ecosystem.

  • GitHub repo seagull

    A Python Library for Conway's Game of Life

    Project mention: Which language do you use to code cellular automata? | 2021-05-24

    Python! I even made a small library to do it:

  • GitHub repo poly

    A Go package for engineering organisms.

    Project mention: Ask HN: What's an interesting DIY genetic engineering project? | 2021-06-11

    I have experience in DIY genetic engineering (I have run a DIY home genetic engineering lab for almost 10 years now).

    What you can do and what you can do are different things. Genetic engineering and biological manipulation go as deep as software, and tacit knowledge about execution is non-trivial to the point where you WILL mess up experiments (so expect to repeat a lot).

    That said, you can still do some fun stuff. I would recommend trying to do something very small but actually novel. For example, if you've done a GFP transformation into E. coli, try to get the GFP transformation working in a new organism (maybe a yogurt bacterium). Keep it small though, and keep it single-celled, or else you are putting yourself into the pit of despair.

    Also check out the Poly project. We're basically building (decent) open-source software for doing synthetic biology. Since you're a software developer, doing code reviews and reading our mega-comments might help you understand some more of the fundamental engineering problems we synthetic biologists are encountering. Also, in code reviews, if you don't understand something, a practicing synthetic biologist will explain it to you so that we can improve our docs.

  • GitHub repo pydna

    Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning in Python and Jupyter notebooks.

    Project mention: Most optimal programming language for the field of genetics? | 2021-02-25


  • GitHub repo BioSequences.jl

    Biological sequences for the julia language

    Project mention: Learning which programming language will make me the most accessible in bioinformatics community? (if there's any) | 2021-04-20

  • GitHub repo Wham

    Structural variant detection and association testing

  • GitHub repo BioAmp-EXG-Pill

    BioAmp EXG Pill is a small and elegant Analog Front End (AFE) board for BioPotential signal acquisition.

    Project mention: Anyone know of any very cheap / DIY EEG contraptions? | 2021-03-22

  • GitHub repo libsequence

    libsequence: a C++ class library for evolutionary genetic analysis

  • GitHub repo aquarium

    The Aquarium Lab Operating System

    Project mention: New Study Explains How to Engineer the Coronavirus + All Other Synthetic Biology Research This Week | 2021-02-01

    From the methods section of the Aquarium paper: Aquarium is distributed under the open-source MIT license. Aquarium, documentation, and installation instructions are freely available, along with links to Dockerized versions of the software. Code is maintained on GitHub. Aquarium’s Python API (Trident) is also under the open-source MIT license and is hosted on the open-source Python repository PyPI, and its documentation and installation instructions are also freely available.

  • GitHub repo cas

    Cellular Automata Simulator

    Project mention: Where can I learn how to use Golly? | 2021-04-14

  • GitHub repo full_spectrum_bioinformatics

    An open-access bioinformatics text

    Project mention: Beginner's bioinformatics books for someone without any knowledge in biology? | 2021-05-13

    A former advisor of mine wrote this up a bit ago and I felt it was solid (also a CS major but did some bio work and his text was written with that in mind).

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-07-19.


What are some of the best open-source Biology projects? This list will help you:

Rank  Project                       Stars
1     deepchem                      3,053
2     OpenWorm                      1,245
3     Thrive                        1,038
4     ncbi-genome-download            537
5     Vcflib                          435
6     SeqAn                           396
7     Catalyst.jl                     197
8     seagull                         136
9     poly                            120
10    pydna                            90
11    BioSequences.jl                  89
12    Wham                             80
13    BioAmp-EXG-Pill                  66
14    libsequence                      46
15    aquarium                         43
16    cas                              19
17    full_spectrum_bioinformatics     17