samtools
libdna
Metric | samtools | libdna |
---|---|---|
Mentions | 3 | 2 |
Stars | 1,543 | 20 |
Growth | 2.1% | - |
Activity | 8.4 | 7.4 |
Latest commit | 7 days ago | about 2 months ago |
Language | C | C |
License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
samtools
-
BWA MEM with merged paired-end reads and unmerged in the same run.
I don’t know about your first question, but for your second, use “bwa mem -p …” for smart pairing of an interleaved FASTQ/A file that contains both paired-end reads and singletons. BWA treats adjacent reads as a pair if they have the same prefix.
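The pairing rule described in this reply can be sketched as follows. This is not BWA's code, only an illustration under the assumption that mates share the same read name once an optional `/1` or `/2` suffix is stripped; `base_name` and `group_interleaved` are hypothetical names.

```python
def base_name(read_name):
    """Strip an optional /1 or /2 mate suffix (assumed naming convention)."""
    if read_name.endswith(("/1", "/2")):
        return read_name[:-2]
    return read_name


def group_interleaved(read_names):
    """Group adjacent reads that share a base name into pairs.

    Returns a list of tuples: 2-tuples for pairs, 1-tuples for singletons.
    """
    groups = []
    i = 0
    while i < len(read_names):
        current = read_names[i]
        nxt = read_names[i + 1] if i + 1 < len(read_names) else None
        if nxt is not None and base_name(current) == base_name(nxt):
            groups.append((current, nxt))  # adjacent mates form a pair
            i += 2
        else:
            groups.append((current,))  # no matching neighbour: singleton
            i += 1
    return groups
```

For example, the names `r1/1, r1/2, m1, r2/1, r2/2` would be grouped as one pair, one singleton (a merged read), and another pair.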
-
Show a Tanuki with Samtools!
In this post, I'm going to add a `tanuki` command to Samtools that displays an ASCII art (AA) tanuki.
-
How do I know if I'm gonna get my module?
Quite a number of bioinformatics-related tools were written in C/C++, e.g. samtools (https://github.com/samtools/samtools). But there are also a lot of modern packages for Python / R now.
libdna
-
A good, fast hash for nucleotides triplet converted to 0, 1, 3, 2 using `3 & (nuc >> 1)`
While that works for the canonical bases, your method won't support CCN, which should give Proline. Hence, for my implementation I have opted for a slower but more general approach. As protein-coding sequences are usually short, performance isn't really an issue.
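The 2-bit encoding under discussion can be illustrated with a short sketch: shifting the ASCII code of a base right by one bit and masking with 3 gives A, C, G, T the codes 0, 1, 3, 2. The helper `codon_index` is hypothetical and not part of libdna. The sketch also shows the limitation raised in the reply: an ambiguity code such as N has no code of its own and silently aliases to a canonical base.

```python
def two_bit(nuc):
    """Map a base to 2 bits: A -> 0, C -> 1, G -> 3, T -> 2."""
    return 3 & (ord(nuc) >> 1)


def codon_index(codon):
    """Pack three 2-bit codes into an index in range(64) (hypothetical helper)."""
    index = 0
    for nuc in codon:
        index = (index << 2) | two_bit(nuc)
    return index
```

Here `two_bit("N")` equals `two_bit("G")`, so `codon_index("CCN")` cannot be told apart from `codon_index("CCG")`. A table indexed this way has no means of recognising that the third base was ambiguous, which is why a more general lookup is needed to handle such codons explicitly.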
-
Counting the number of matching characters in two ASCII strings
In bioinformatics, if you know the number of mismatching characters between two strings of DNA, you can compute their evolutionary distance. As DNA is long, easily a few megabytes, computing such a Hamming distance via SIMD really pays off. Here is my implementation if anyone is interested: https://github.com/kloetzl/libdna
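As a reference point for what is being computed, here is a scalar sketch of the mismatch count (Hamming distance) between two equal-length sequences. It is not libdna's SIMD implementation and does not use its API; the function name `count_mismatches` is hypothetical.

```python
def count_mismatches(seq_a, seq_b):
    """Return the number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    # Each comparison yields True (1) or False (0); summing counts the mismatches.
    return sum(a != b for a, b in zip(seq_a, seq_b))
```

For instance, `count_mismatches("ACGTACGT", "ACGAACGA")` counts two differing positions. A SIMD version performs the same comparison on many bytes per instruction, which is where the speedup on megabyte-sized inputs comes from.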
What are some alternatives?
seqtk - Toolkit for processing sequences in FASTA/Q formats
samtools - [Moved to: https://github.com/ingolia/SamTools]
MMseqs2 - MMseqs2: ultra fast and sensitive search and clustering suite
htslib - C library for high-throughput sequencing data formats
RNAlien - RNAlien - unsupervised RNA family model construction
pn2codon - Python Rust FFI for reverse-translating Amino Acid sequences to DNA sequences
libBigWig - A C library for handling bigWig files
bioinformatics-toolkit - A collection of bioinformatics algorithms
ClustalParser - Parse output of Clustal tools
BlastHTTP - Haskell cabal library for submission and result retrieval from the NCBI Blast REST webservice
EntrezHTTP - Haskell cabal library for submission and result retrieval from the NCBI Entrez REST webservice