GenomicSQLite vs fastp
Metric | GenomicSQLite | fastp |
---|---|---|
Mentions | 1 | 9 |
Stars | 154 | 1,785 |
Growth | - | 2.9% |
Activity | 3.4 | 4.7 |
Last commit | 4 months ago | about 2 months ago |
Language | C++ | C++ |
License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
GenomicSQLite
- sqlite-zstd: Transparent dictionary-based row-level compression for SQLite - An SQLite extension written in Rust to reduce the database size without losing functionality
Yes, that is indeed an obviously missing part. I knew about ZIPVFS, but somehow forgot to investigate it more closely. Probably because I started this project before GenomicSQLite was a thing (that seems like the best alternative).
fastp
- R pipelines for bulk RNA-seq analyses
fastp + MultiQC + Salmon + DESeq2, all in some Nextflow workflow. It is a good exercise (not complicated) to create the pipeline from scratch the first time to properly understand each tool.
- NHI Genome Studies: Mexico Govt Sept 12 Congressional hearing
1) QC the data with fastp. This'll trim out adapters and toss reads that are poor quality.
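The "toss reads that are poor quality" step can be illustrated with a minimal Python sketch. This is not fastp's implementation: the function names `phred_scores` and `passes_quality` are hypothetical, the Phred+33 encoding is assumed, and the thresholds (minimum score 15, at most 40% low-quality bases) are illustrative values only.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes the Phred+33 encoding (offset 33).
    """
    return [ord(ch) - offset for ch in quality_line]


def passes_quality(quality_line, min_phred=15, max_low_fraction=0.40):
    """Return True if at most max_low_fraction of bases score below min_phred.

    Both thresholds are illustrative assumptions, not values read from any tool.
    """
    scores = phred_scores(quality_line)
    if not scores:
        # An empty read carries no usable bases, so reject it.
        return False
    low = sum(1 for score in scores if score < min_phred)
    return low / len(scores) <= max_low_fraction


if __name__ == "__main__":
    print(passes_quality("IIIIIIII"))  # every base has Phred score 40
    print(passes_quality("!!!!!!II"))  # most bases have Phred score 0
```

A read is kept or dropped as a whole here; adapter trimming is a separate step that this sketch does not attempt.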
- Illumina adapters and quality trimming
- Low-complexity sequence filtering tool
fastp has an adjustable low complexity filter option.
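To make the idea of a low-complexity filter concrete, here is a small Python sketch. It assumes the definition given in fastp's documentation, where complexity is the share of bases that differ from the next base; the function names and the 30% threshold are assumptions for illustration, and the code is not taken from fastp.

```python
def complexity(seq):
    """Fraction of bases that differ from the base immediately after them.

    For a read of length n there are n - 1 adjacent pairs to compare.
    """
    if len(seq) < 2:
        return 0.0
    changes = sum(1 for current, following in zip(seq, seq[1:]) if current != following)
    return changes / (len(seq) - 1)


def is_low_complexity(seq, threshold=0.30):
    """Flag a read whose complexity falls below the (assumed) threshold."""
    return complexity(seq) < threshold


if __name__ == "__main__":
    for read in ("AAAAAAAAAC", "ACGTACGTAC"):
        print(read, complexity(read), is_low_complexity(read))
```

Homopolymer runs such as poly-A tails score near zero under this measure, which is why a simple adjacent-base comparison is enough to catch them.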
- Can you evaluate my pipeline?
- in terms of preprocessing and QC, I prefer fastp (https://github.com/OpenGene/fastp)
- Current QC tools for short read and long read sequencing
I generally use fastp as an all-in-one tool for short reads: https://github.com/OpenGene/fastp
- Question about automating trimming process
- What methods (conda installable only please) can you use to determine the complexity of a fastq file? (e.g., kmer analysis)
I don't know if this fits exactly what you need, but I'm using fastp to check my fastq.gz files lately: https://github.com/OpenGene/fastp. You can install it via conda.
- A tool to count basepair in fastq file
If you also need some other basic statistics or want to filter the reads you can try fastp (https://github.com/OpenGene/fastp). If only the basepair count is needed, awk might be the fastest solution as suggested before.
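For the basepair-count case mentioned above, a minimal Python sketch could look like the following. It assumes the common four-line FASTQ layout (header, sequence, `+`, quality) with no wrapped sequence lines, and the function name `count_fastq` is hypothetical.

```python
import gzip


def count_fastq(path):
    """Return (read_count, base_count) for a FASTQ file.

    Assumes four lines per record; files ending in .gz are read through gzip.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    reads = 0
    bases = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every four-line record.
            if line_number % 4 == 1:
                reads += 1
                bases += len(line.strip())
    return reads, bases
```

Calling `count_fastq("sample.fastq.gz")` (a placeholder file name) would return a tuple of the number of reads and the total number of bases. Files with multi-line sequences would need a real FASTQ parser instead.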
What are some alternatives?
FlatBuffers - FlatBuffers: Memory Efficient Serialization Library
galaxy - Data intensive science for everyone.
hap.py - Haplotype VCF comparison tools
readfq - A simple tool to calculate reads number and total base count in FASTQ file
bwa-mem2 - The next version of bwa-mem
glslSmartDeNoise - Fast glsl deNoise spatial filter, with circular gaussian kernel, full configurable
megahit - Ultra-fast and memory-efficient (meta-)genome assembler
nextclade - Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
seq - A high-performance, Pythonic language for bioinformatics
readfq - Fast multi-line FASTA/Q reader in several programming languages
seqtk - Toolkit for processing sequences in FASTA/Q formats
fasql - DuckDB Extension for reading and writing FASTA and FASTQ Files