gatk4-genome-processing-pipeline-azure
Workflows used for processing whole genome sequence data + germline variant calling. (by microsoft)
sage
Proteomics search & quantification so fast that it feels like magic (by lazear)
Metric | gatk4-genome-processing-pipeline-azure | sage
---|---|---
Mentions | 4 | 5
Stars | 7 | 197
Growth | - | -
Activity | 0.0 | 7.7
Last commit | 2 months ago | 8 days ago
Language | WDL | Rust
License | BSD 3-clause "New" or "Revised" License | MIT License
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gatk4-genome-processing-pipeline-azure
Posts with mentions or reviews of gatk4-genome-processing-pipeline-azure.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-05.
- What are some good examples of well-engineered bioinformatics pipelines?
- Do you know how to get CNVs out of WES data sorted.bam files? (Free)
The GATK suite is pretty standard for calling germline mutations. Somatic mutation calling is a lot newer/trickier, so I'm just going to link to the GDC's practices.
- Best way to document tool?
This pre-processing pipeline from Microsoft (adapted from the Broad Institute/GATK) is pretty well-documented - at least in my opinion - with input requirements, expected outputs, software requirements, etc.
sage
Posts with mentions or reviews of sage.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-06-06.
- Does anyone know a great guide/documentation explaining how to implement Percolator?
If you want to implement LDA from scratch, you could check out how Sage is doing it.
- What are some good examples of well-engineered bioinformatics pipelines?
You could check out https://github.com/lazear/sage - it's a near-comprehensive program/pipeline for analyzing DDA/shotgun proteomics data. Most proteomics pipelines consist of running multiple, separate tools in sequence (search, spectrum rescoring, retention time prediction, quantification), but sage performs all of these. This cuts down on the need for disk space for storing intermediate results (none required) and the need for IO (files are read once), and results in a proteomics pipeline that is 10-1000x faster than anything else, including commercial solutions.
- Proteomics search engine written in Rust
You can also check out the intro blog post if you're interested in learning more about the algorithm behind Sage. Beyond being fast, it also includes integrated machine learning (linear discriminant analysis, KDE) for rescoring spectral matches.
- Opinions on AlphaPept
You could try out Sage, if you're looking for speed - I don't think you'll find anything faster. https://github.com/lazear/sage
What are some alternatives?
When comparing gatk4-genome-processing-pipeline-azure and sage you can also consider the following projects:
juicer - A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
rnaseq - RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.