bowtie2
kraken2
Metric | bowtie2 | kraken2
---|---|---
Mentions | 2 | 7
Stars | 618 | 658
Growth | - | -
Activity | 7.6 | 5.1
Last commit | 6 days ago | about 1 month ago
Language | C++ | C++
License | GNU General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
bowtie2
- NHI Genome Studies: Mexico Govt Sept 12 Congressional hearing
2) Use bowtie2 to align reads against CHM13. This will let you separate human from nonhuman (important, as human sequences are a common contaminant in many nonhuman genomes).
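The host-filtering step quoted above could be sketched as follows. This is a minimal illustration, not the poster's actual command: it assumes paired-end gzipped FASTQ input and a CHM13 index already built with `bowtie2-build`, and the helper name, index prefix, and file names are hypothetical. It only assembles the argument list; running it requires bowtie2 to be installed.

```python
import shlex


def bowtie2_host_filter_cmd(index_prefix, reads_1, reads_2, out_prefix, threads=4):
    """Build a bowtie2 command that aligns paired reads to a human index and
    writes pairs that fail to align concordantly to separate gzipped files."""
    return [
        "bowtie2",
        "-x", index_prefix,                 # prefix of a bowtie2-build index (assumed to exist)
        "-1", reads_1,                      # mate 1 reads
        "-2", reads_2,                      # mate 2 reads
        "-p", str(threads),                 # number of alignment threads
        "-S", f"{out_prefix}.human.sam",    # alignments against the human reference
        # bowtie2 replaces % with 1 and 2 to name the per-mate output files
        "--un-conc-gz", f"{out_prefix}.nonhuman.R%.fastq.gz",
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie2_host_filter_cmd("chm13", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample")
print(shlex.join(cmd))
```

Note that `--un-conc-gz` captures every pair that does not align concordantly, so treating that output as "nonhuman" is an approximation; the resulting list can be passed to `subprocess.run(cmd, check=True)`.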
- Computationally intensive steps in RNA-seq analysis
kraken2
- NHI Genome Studies: Mexico Govt Sept 12 Congressional hearing
3) Use Kraken2 to classify remaining reads. I'd start with the standard database.
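A classification run like the one suggested above might be assembled as shown here. This is a sketch under assumptions: the reads are paired and gzipped, a Kraken2 database (for example the standard one) has already been downloaded or built into the given directory, and the helper name, paths, and output suffixes are hypothetical.

```python
def kraken2_classify_cmd(db_dir, reads_1, reads_2, out_prefix, threads=4):
    """Build a kraken2 command that classifies paired reads and writes both
    the per-read output and the summary report."""
    return [
        "kraken2",
        "--db", db_dir,                         # directory holding the Kraken2 database
        "--threads", str(threads),
        "--paired",                             # the two inputs are mates of each other
        "--gzip-compressed",                    # inputs are gzipped FASTQ
        "--report", f"{out_prefix}.kreport",    # per-taxon summary
        "--output", f"{out_prefix}.kraken",     # per-read assignments
        reads_1,
        reads_2,
    ]


# Hypothetical paths, for illustration only.
cmd = kraken2_classify_cmd(
    "k2_standard", "sample.nonhuman.R1.fastq.gz", "sample.nonhuman.R2.fastq.gz", "sample"
)
print(" ".join(cmd))
```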
- Refseq bacterial genomes to clean reads?
See the kraken2 manual for more information.
- Fastest way to check E. coli contamination levels in eukaryotic NGS libraries?
If you've got a fast solid state drive with >200G of space, then kraken2 + bracken works really well. First, use kraken2 to map reads to taxa in memory-mapped mode (to reduce system memory consumption):
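The command from the original post is not included in this excerpt. As a rough sketch of the kraken2 + bracken combination it describes, the following builds one command for each tool. It assumes single-end reads and a database directory that also contains the Bracken k-mer distribution files for the chosen read length; the helper name, paths, and defaults are hypothetical.

```python
def kraken2_bracken_cmds(db_dir, reads, out_prefix, read_len=150, level="S", threads=4):
    """Build a memory-mapped kraken2 command plus the bracken command that
    re-estimates abundances from the resulting report."""
    report = f"{out_prefix}.kreport"
    kraken2 = [
        "kraken2",
        "--db", db_dir,
        "--memory-mapping",                 # read the database from disk instead of loading it into RAM
        "--threads", str(threads),
        "--report", report,
        "--output", f"{out_prefix}.kraken",
        reads,
    ]
    bracken = [
        "bracken",
        "-d", db_dir,                       # same database directory as the kraken2 run
        "-i", report,                       # kraken2 report used as input
        "-o", f"{out_prefix}.bracken",      # abundance table written by bracken
        "-r", str(read_len),                # read length the k-mer distribution was built for
        "-l", level,                        # taxonomic level, e.g. S for species
    ]
    return kraken2, bracken


# Hypothetical paths, for illustration only.
for cmd in kraken2_bracken_cmds("k2_standard", "library.fastq", "library"):
    print(" ".join(cmd))
```

Memory-mapped mode trades speed for memory, which is why the post recommends a fast solid state drive.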
- Inferring bacterial population sizes from metagenomic data
Yes, that can be done. Estimating bacterial proportions is pretty much what programs like Kraken2 and Centrifuge do.
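To make the idea of proportions concrete, here is a minimal sketch that reads a Kraken2 report and converts species-level read counts into fractions. It assumes the standard six-column, tab-separated report layout (percentage, clade reads, direct reads, rank code, NCBI taxid, indented name); the function name is hypothetical and the example report is made up for illustration.

```python
def species_proportions(report_text):
    """Return {species name: fraction of species-level reads} from the text
    of a standard six-column Kraken2 report."""
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])    # reads assigned to this taxon or its descendants
        rank = fields[3]                # rank code, e.g. S for species
        name = fields[5].strip()        # names are indented by taxonomic depth
        if rank == "S":
            counts[name] = clade_reads
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Made-up report lines, for illustration only.
example = "\n".join([
    " 10.00\t100\t100\tU\t0\tunclassified",
    " 90.00\t900\t0\tR\t1\troot",
    " 60.00\t600\t600\tS\t562\t      Escherichia coli",
    " 30.00\t300\t300\tS\t1280\t      Staphylococcus aureus",
])
print(species_proportions(example))
```

These are fractions of classified reads, not cell counts; differences in genome size and database coverage mean they only approximate the underlying population.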
- Command line tool for species identification from Fasta files
Or Kraken2
- How can I generate a list of short (75-150bp) sequences from a bacterial genome and find out if any of those sequences are unique to that organism?
For bacterial metagenomic stuff you can quickly reduce the number of sequences you need to BLAST by using Kraken2.
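One possible reading of this advice, offered as an assumption rather than the poster's stated method, is to BLAST only the sequences that Kraken2 assigns directly to the taxon of interest. The sketch below filters Kraken2's standard per-read output (status, sequence ID, taxid, length, LCA mapping, tab-separated) down to those sequence IDs. The function name is hypothetical, the example lines are made up, and it assumes the run did not use `--use-names`, so the third column is a bare taxid.

```python
def read_ids_for_taxa(kraken_output_text, wanted_taxids):
    """Return IDs of classified sequences whose assigned taxid is in
    wanted_taxids, reading Kraken2's standard per-read output."""
    wanted = {str(t) for t in wanted_taxids}
    ids = []
    for line in kraken_output_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "C" and taxid in wanted:   # C = classified, U = unclassified
            ids.append(read_id)
    return ids


# Made-up per-read output lines, for illustration only.
example_output = "\n".join([
    "C\tread1\t562\t150\t562:116",
    "U\tread2\t0\t150\t0:116",
    "C\tread3\t1280\t150\t1280:116",
])
print(read_ids_for_taxa(example_output, [562]))
```

The returned IDs can then be used to pull the matching records out of the FASTA or FASTQ file before running BLAST on the smaller set.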
- Show HN: An API for running computationally intensive tools
While implementing and scaling data analysis pipelines at a biotech startup, I spent most of my time getting new tools running efficiently and scaling them. Implementing something like Kraken2 for genomic analysis (https://github.com/DerrickWood/kraken2) on our infrastructure took weeks and was hard to scale. I expected a library for running these tools on managed infrastructure via an API to exist – like Twilio for sending text messages or Stripe for processing payments – but I couldn't find any.
Toolchest is an API for running data analysis tools easily (i.e. copy and paste a few lines of code), without managing the infrastructure. We're starting with computational genomics tools, but tools in other spaces can be added. Please drop me a message if you have a use case in mind! For example, I've thought about making hashcat powered by Tesla V100 GPUs accessible via our API.
All feedback is welcome! If you're curious about how it works, feel free to check out our docs: https://toolchest-python-client.readthedocs.io/en/latest/use...
What are some alternatives?
STAR - RNA-seq aligner
fastp - An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
bwa-mem2 - The next version of bwa-mem
megahit - Ultra-fast and memory-efficient (meta-)genome assembler
seq - A high-performance, Pythonic language for bioinformatics
IntaRNA - Efficient target prediction incorporating accessibility of interaction sites
CHM13 - The complete sequence of a human genome
vg - tools for working with genome variation graphs