rnaseq vs sage

rnaseq	Project	sage
14	Mentions	5
772	Stars	187
4.3%	Growth	-
9.4	Activity	8.1
3 days ago	Latest Commit	13 days ago
Nextflow	Language	Rust
MIT License	License	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
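As a rough illustration of the "month over month growth in stars" figure, the sketch below computes a percentage change between two monthly star counts. The function name and the previous-month count of 740 are hypothetical, chosen only so the result lands near the 4.3% shown for rnaseq; the site's exact formula is not documented here, so treat this as an assumption.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Percentage change in stars between two consecutive monthly snapshots.

    Assumption: growth is (current - previous) / previous, expressed in percent.
    """
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) / previous_stars * 100.0


# Hypothetical snapshot: 740 stars a month ago, 772 stars today.
growth = month_over_month_growth(740, 772)
print(f"{growth:.1f}%")
```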

rnaseq

Posts with mentions or reviews of rnaseq. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-09.

R pipelines for bulk RNA-seq analyses
3 projects | /r/bioinformatics | 9 Dec 2023
Point of using Hisat2 build to index reference genomes when working with known genomes mouse/human?
1 project | /r/bioinformatics | 29 May 2023

Just run something like this and don’t worry about it: https://nf-co.re/rnaseq
I used featureCounts to quantify RNA-seq reads and got a low successful alignment percentage. Is this a problem?
1 project | /r/bioinformatics | 18 May 2023

Try https://nf-co.re/rnaseq! I know it was a lot of work to get to featureCounts, but it has actually been deprecated in favor of either Salmon or RSEM quantification. In my experience, STAR-RSEM is the best way to get the most accurate quantification of RNA-seq data.
What are some good examples of well-engineered bioinformatics pipelines?
8 projects | /r/bioinformatics | 5 Apr 2023
How to know where to align if I have RNAseq data??
1 project | /r/bioinformatics | 9 Mar 2023

Consider looking into nf-core's RNA-seq pipeline. I haven't tried this one myself, but it looks very comprehensive and has nice documentation: https://nf-co.re/rnaseq
Semi Budget-Friendly High-Thread Count Options?
1 project | /r/HomeServer | 13 Feb 2023

My go-to benchmark for performance is the standard nf-core RNA-seq pipeline: https://nf-co.re/rnaseq. Keep in mind that the included test profiles pull sample data down from the internet, so that can end up bottlenecking your PC if you don't have a fast connection.
How to get NGS programming experience?
1 project | /r/bioinformatics | 31 Jan 2023

I would suggest the nf-core/rnaseq pipeline. It's used by many core facilities around the world. Also, there are many more pipelines from nf-core, e.g. Sarek for variant calling.
Illumina: can I use it on my laptop?
1 project | /r/bioinformatics | 13 Dec 2022

You’ll have a batch effect if you use a different pipeline, but you can quantify RNA easily on a laptop. https://nf-co.re/rnaseq
What is the preferred way of documenting a Nextflow pipeline?
1 project | /r/bioinformatics | 13 Oct 2022

Hi u/_Fallen_Azazel_, thank you for the answer. I took a look at their stuff but couldn't really find how they handle the documentation. For instance, `nf-core/rnaseq` is a model pipeline from the nf-core community, yet the documentation rendered on the nf-core website doesn't have any corresponding Markdown file in their repo (at least not that I could find). It is not clear to me how I should ideally do it.
Generate GUIs and deploy bioinformatics workflows with python
3 projects | /r/bioinformatics | 7 Sep 2022

First, let's recognize that the framework presented has new features that don't exist in the previous DSLs you mention. Many developers highly value these additions, and they alone could justify a new stab at a workflow language; for many, they represent a tradeoff:

* Interface generation
* Declarative cloud resource provisioning
* Static typing
* Native Python support

This workflow has a similar level of complexity to nf-core/rnaseq (not the same, but similar in the number of constituent tasks for the purpose of counting transcript abundance). It ingests raw sequencing reads, runs QC + trimming, does pseudo-alignment, recovers counts from abundance estimates, and aggregates counts over many samples for direct use by diff-exp tools. (It is not 'running salmon'. I think that is a reductionist take.) It does this in addition to dynamically building React.js interfaces, adding static type validation to input parameters, and deploying cloud infrastructure in a simpler way.

As for the lines-of-code comparison, I think it is a weird way to compare software, as the number of lines of code is no proxy for legibility, ease of development, or likelihood of long-term maintenance (many more people know Python than Nextflow). Nonetheless, nf-core/rnaseq has nearly 1000 lines in its workflow entrypoint alone - https://github.com/nf-core/rnaseq/blob/master/workflows/rnaseq.nf . With imported modules + subworkflows, LOC actually reaches the many thousands. (Now I understand it is more complex and mature, but I highlight why I think the comparison is weird and wonder what you are even comparing to.) Whereas the entire logic of the presented pipeline is actually neatly encapsulated in 1200 lines of a single file.

Overall, this feels like a response that doesn't come from a place of rational discourse, but rather from group dislike for something new and different.
What I would like to do is address and talk about specific technical points (preferably over issues on github) so conversations can be productive and improve the tools I am working on.

sage

Posts with mentions or reviews of sage. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-06.

Does anyone know a great guide/documentation explaining how to implement Percolator?
2 projects | /r/proteomics | 6 Jun 2023

If you want to implement LDA from scratch, you could check out how Sage is doing it.
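For readers unfamiliar with the technique, here is a minimal sketch of two-class Fisher linear discriminant analysis in plain Python. It is not Sage's code and makes no claim about how Sage implements rescoring; the two-feature inputs, the target/decoy sample values, and all function names are hypothetical and only illustrate the idea of projecting feature vectors onto one discriminant direction.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float]


def mean(points: Sequence[Vec]) -> Vec:
    """Component-wise mean of a set of 2-D feature vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points: Sequence[Vec], mu: Vec) -> Tuple[float, float, float]:
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around mu."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_lda(targets: Sequence[Vec], decoys: Sequence[Vec]) -> Vec:
    """Fisher discriminant direction w = Sw^-1 (mu_targets - mu_decoys)."""
    mu_t, mu_d = mean(targets), mean(decoys)
    t, d = scatter(targets, mu_t), scatter(decoys, mu_d)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sxx, sxy, syy = t[0] + d[0], t[1] + d[1], t[2] + d[2]
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu_t[0] - mu_d[0], mu_t[1] - mu_d[1]
    # Closed-form inverse of a symmetric 2x2 matrix applied to (dx, dy).
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def score(w: Vec, x: Vec) -> float:
    """Projection of a feature vector onto the discriminant direction."""
    return w[0] * x[0] + w[1] * x[1]


# Hypothetical feature vectors for target and decoy spectral matches.
targets = [(3.0, 1.0), (4.0, 2.0), (5.0, 1.5)]
decoys = [(1.0, 1.0), (2.0, 2.0), (1.5, 0.5)]
w = fit_lda(targets, decoys)
target_scores = [score(w, x) for x in targets]
decoy_scores = [score(w, x) for x in decoys]
```

On this toy data every target projects higher than every decoy, which is the separation a rescoring step relies on. Real data has more features and needs a general matrix solver rather than the 2x2 closed form used here.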
What are some good examples of well-engineered bioinformatics pipelines?
8 projects | /r/bioinformatics | 5 Apr 2023

You could check out https://github.com/lazear/sage - it's a near-comprehensive program/pipeline for analyzing DDA/shotgun proteomics data. Most proteomics pipelines consist of running multiple, separate tools in sequence (search, spectrum rescoring, retention time prediction, quantification), but sage performs all of these. This cuts down on the need for disk space for storing intermediate results (none required) and the need for IO (files are read once), and results in a proteomics pipeline that is >10-1000x faster than anything else, including commercial solutions.
Proteomics search engine written in Rust
5 projects | /r/rust | 5 Nov 2022

You can also check out the intro blog post if you're interested in learning more about the algorithm behind Sage. Beyond being fast, it also includes integrated machine learning (linear discriminant analysis, KDE) for rescoring spectral matches.
Opinions on AlphaPept
2 projects | /r/proteomics | 30 Oct 2022

You could try out Sage, if you're looking for speed - I don't think you'll find anything faster. https://github.com/lazear/sage

What are some alternatives?

When comparing rnaseq and sage you can also consider the following projects:

mag - Assembly and binning of metagenomes

seqkit - A cross-platform and ultrafast toolkit for FASTA/Q file manipulation

diffexpr - Porting DESeq2 into python via rpy2

fasten - Fasten toolkit, for streaming operations on fastq files

HomeBrew - 🍺 The missing package manager for macOS (or Linux)

mokapot - Fast and flexible semi-supervised learning for peptide detection in Python

configs - Config files used to define parameters specific to compute environments at different Institutions

juicer - A One-Click System for Analyzing Loop-Resolution Hi-C Experiments

patterns - A curated collection of Nextflow implementation patterns

Rust-Bio - This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.

sarek - Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing

alphapept - A modular, python-based framework for mass spectrometry. Powered by nbdev.