biofast
Benchmarking programming languages/implementations for common tasks in Bioinformatics (by lh3)
viroiddb
A curated database of all available viroid-like RNA sequences (by Benjamin-Lee)
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
biofast
Posts with mentions or reviews of biofast.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-09-23.
- Parsing huge files in Python
FYI: the python packages I mentioned earlier can all directly read gzip'd fastq files. See also this repo for examples.
- Does Rust Support Reading in FASTA files?
needletail is rated in the Heng Li benchmark (https://github.com/lh3/biofast/)
- Why I Use Nim instead of Python for Data Processing
viroiddb
Posts with mentions or reviews of viroiddb.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-09-23.
- Why I Use Nim instead of Python for Data Processing
That's great. However, the method that you use to find the canonical representative [1] is quadratic: when the string has length N, there are N rotations, and for each rotation you need to check up to N characters to determine whether it is earlier than the best one that you have found so far. For large strings you would probably want to switch to one of the linear minimal string rotation algorithms [2].
[1] https://github.com/Benjamin-Lee/viroiddb/blob/main/scripts/c...
[2] https://en.wikipedia.org/wiki/Lexicographically_minimal_stri...
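To make the quadratic-versus-linear point concrete, here is a minimal Python sketch. It is not the viroiddb script and not necessarily the algorithm the commenter had in mind: it uses a two-pointer scan, which is one of several linear-time approaches to the lexicographically minimal rotation (Booth's algorithm is another). The function names `least_rotation_index`, `canonical_rotation`, and `canonical_rotation_quadratic` are hypothetical, and the sketch assumes the sequence is a plain string and that only rotations, not reverse complements, are considered.

```python
def least_rotation_index(s: str) -> int:
    """Return the start index of the lexicographically smallest rotation of s.

    Two-pointer scan: i and j are candidate start positions, k is the
    length of the prefix on which the two candidate rotations agree.
    Each mismatch discards k + 1 start positions at once, so the total
    work is O(len(s)).
    """
    n = len(s)
    i, j, k = 0, 1, 0
    while i < n and j < n and k < n:
        a = s[(i + k) % n]
        b = s[(j + k) % n]
        if a == b:
            k += 1
            continue
        if a > b:
            i += k + 1  # no start in i..i+k can be the minimum
        else:
            j += k + 1  # no start in j..j+k can be the minimum
        if i == j:
            j += 1
        k = 0
    return min(i, j)


def canonical_rotation(s: str) -> str:
    """Rotate s so that it starts at its minimal rotation (linear time)."""
    start = least_rotation_index(s)
    return s[start:] + s[:start]


def canonical_rotation_quadratic(s: str) -> str:
    """Reference version: build all N rotations and take the smallest."""
    if not s:
        return s
    return min(s[r:] + s[:r] for r in range(len(s)))


if __name__ == "__main__":
    print(canonical_rotation("GATTACA"))  # ACAGATT
```

Both functions return the same string for any input; the difference is that the reference version compares up to N characters for each of N rotations, while the scan visits each start position a bounded number of times.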
What are some alternatives?
When comparing biofast and viroiddb you can also consider the following projects:
nimtorch - PyTorch - Python + Nim
benchmarks - Some benchmarks of different languages
scikit-bio - scikit-bio: a community-driven Python library for bioinformatics, providing versatile data structures, algorithms and educational resources.
nimpy - Nim - Python bridge
readfq - Fast multi-line FASTA/Q reader in several programming languages
PrimesResult - The results of Dave Plummer's Primes Drag Race
RecursiveFactorization.jl
Arraymancer - A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends