samtools
libdna
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
samtools
We haven't tracked posts mentioning samtools yet.
Tracking mentions began in Dec 2020.
libdna
-
A good, fast hash for nucleotides triplet converted to 0, 1, 3, 2 using `3 & (nuc >> 1)`
While that works for the canonical bases, your method won't support CCN, which should give proline. Hence, for my implementation I have opted for a slower but more general approach. As protein-coding sequences are usually short, performance isn't really an issue.
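To make the point about CCN concrete, here is a minimal sketch of both ideas. It assumes ASCII upper-case input, the standard genetic code (NCBI translation table 1), and the IUPAC ambiguity codes. The names `two_bit` and `translate_codon` are hypothetical and chosen for illustration only; this is not necessarily how libdna implements translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC ambiguity codes and the canonical bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def two_bit(nuc):
    """Fast 2-bit code from the ASCII value: A=0, C=1, G=3, T=2."""
    return 3 & (ord(nuc) >> 1)


def translate_codon(codon):
    """Translate one codon, resolving ambiguity codes where possible.

    Every concrete codon the input could stand for is looked up. If all
    of them encode the same amino acid, that one is returned, else 'X'.
    """
    options = [IUPAC[base] for base in codon.upper()]
    amino_acids = {CODON_TABLE["".join(c)] for c in product(*options)}
    return amino_acids.pop() if len(amino_acids) == 1 else "X"
```

With the 2-bit code, `two_bit("N")` equals `two_bit("G")`, so a table indexed by that hash cannot tell CCN from CCG. The slower lookup returns `"P"` for `translate_codon("CCN")` because CCA, CCC, CCG and CCT all encode proline. RNA input (U instead of T) and characters outside the IUPAC alphabet are not handled in this sketch and raise a `KeyError`.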
-
Counting the number of matching characters in two ASCII strings
In bioinformatics, if you know the number of mismatching characters between two strings of DNA, you can compute their evolutionary distance. As DNA is long, easily a few megabytes, computing such a Hamming distance via SIMD really pays off. Here is my implementation if anyone is interested: https://github.com/kloetzl/libdna
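As a scalar reference for what such a routine computes, the sketch below counts mismatches position by position and turns the count into a distance. The post does not say which distance model is meant; the Jukes-Cantor correction used here is one common choice and is an assumption on my part. The function names `mismatches` and `jukes_cantor` are hypothetical and are not libdna's API, and no SIMD is involved.

```python
import math


def mismatches(a, b):
    """Hamming distance: number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def jukes_cantor(a, b):
    """Jukes-Cantor distance, d = -3/4 * ln(1 - 4/3 * p).

    p is the fraction of mismatching sites. The model is an assumption
    made for this example; the distance is undefined for p >= 3/4, in
    which case infinity is returned.
    """
    if not a:
        raise ValueError("sequences must not be empty")
    p = mismatches(a, b) / len(a)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Both functions accept `str` or `bytes`, as long as the two arguments have the same type and length. A SIMD version computes the same count, just many bytes per instruction, which is where the speed-up on megabyte-sized inputs comes from.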
What are some alternatives?
Genbank - Genbank format tools and parser
htslib - C library for high-throughput sequencing data formats
ViennaRNAParser
samtools - Tools (written in C using htslib) for manipulating next-generation sequencing data
seqtk - Toolkit for processing sequences in FASTA/Q formats
pn2codon - Python Rust FFI for reverse-translating Amino Acid sequences to DNA sequences
RNAlien - Unsupervised RNA family model construction
ProteinToCodonTranslator
phybin - Binning (Newick) Phylogenetic Trees by Topology
BlastHTTP - Haskell cabal library for submission and result retrieval from the NCBI Blast REST webservice
ADPfusionForest - Dynamic programming on tree and forest structures
vcf - Haskell library to handle VCF (Variant Call Format) files