trinityrnaseq
sage
| Metric | trinityrnaseq | sage |
|---|---|---|
| Mentions | 6 | 5 |
| Stars | 803 | 188 |
| Growth | 0.6% | - |
| Activity | 2.8 | 7.7 |
| Last commit | 16 days ago | 17 days ago |
| Language | Perl | Rust |
| License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
trinityrnaseq
- What are some good examples of well-engineered bioinformatics pipelines?
- Can I get some guidance on how to run sample data in trinityrnaseq-Trinity-v2.13.2 on Ubuntu Linux?
The screenshot suggests that you have not actually compiled or installed Trinity. The cmake command is not used to run it. After you download and unpack the Trinity software package (https://github.com/trinityrnaseq/trinityrnaseq/releases/download/Trinity-v2.13.2/trinityrnaseq-v2.13.2.FULL.tar.gz), you should look at the INSTALL file, which tells you to:
- Beginner question for using dockerized trinity
I'm trying commands from the Docker site and the Trinity site. When I try to run the command, it behaves as if the code is incomplete (hitting enter just gives me a new line). To my understanding, the issue can be either in the pathway/to/files part of the code, or in the trinityrnaseq/trinityrnaseq part.
- GATK version issue with variant calling following Trinity pipeline Python script
I am trying to do variant calling on my transcriptome. I am following the Trinity GitHub pipeline for variant calling, which provides a Python script that runs GATK. I use SLURM to run this job:
- What's your bioinformatics bible? Any bibliography recommendations for experimental design?
Since you're an R user, I would recommend Bioconductor. Here's an excellent tutorial on how to analyze RNA-seq data (make sure that your R version is up to date!): https://www.bioconductor.org/packages/devel/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html . However, if you want to assemble raw transcriptome data, then Trinity is a great tool (https://github.com/trinityrnaseq/trinityrnaseq/wiki).
- Can DESeq2 and edgeR be used on Trinity assembled transcripts given that only the longest isoforms are used?
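The Docker question above describes a shell that prints a new line instead of running the command. The post does not say what caused it, but two common reasons a shell keeps waiting are an unclosed quote and a trailing backslash (line continuation). The sketch below is a rough diagnostic under that assumption: it uses Python's `shlex.split` to flag both cases. The function name `why_shell_waits` and the sample commands are hypothetical, and the check does not cover other causes such as an unclosed backtick.

```python
import shlex


def why_shell_waits(command: str) -> str:
    """Guess why a shell would keep prompting after this command line.

    Hypothetical helper: only detects unclosed quotes and a trailing
    backslash, the two cases that shlex reports as ValueError.
    """
    try:
        # Drop the final newline so a trailing backslash is seen as dangling.
        shlex.split(command.rstrip("\n"))
    except ValueError as err:
        if "quotation" in str(err):
            return "unclosed quote"
        return "trailing backslash"
    return "complete"


# Hypothetical command lines; the volume path is a placeholder.
unclosed = 'docker run --rm -v "/path/to/files:/data trinityrnaseq/trinityrnaseq Trinity --version'
continued = "docker run --rm trinityrnaseq/trinityrnaseq Trinity \\"
complete = 'docker run --rm -v "/path/to/files:/data" trinityrnaseq/trinityrnaseq Trinity --version'

for cmd in (unclosed, continued, complete):
    print(why_shell_waits(cmd))
```

If the command is reported as complete, the problem more likely lies elsewhere, for example in the mounted path or the image name.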
sage
- Does anyone know a great guide/documentation explaining how to implement Percolator?
If you want to implement LDA from scratch, you could check out how Sage does it.
- What are some good examples of well-engineered bioinformatics pipelines?
You could check out https://github.com/lazear/sage - it's a near-comprehensive program/pipeline for analyzing DDA/shotgun proteomics data. Most proteomics pipelines consist of running multiple, separate tools in sequence (search, spectrum rescoring, retention time prediction, quantification), but sage performs all of these. This removes the need for disk space to store intermediate results (none required), cuts down on IO (files are read once), and results in a proteomics pipeline that is 10-1000x faster than anything else, including commercial solutions.
- Proteomics search engine written in Rust
You can also check out the intro blog post if you're interested in learning more about the algorithm behind Sage. Beyond being fast, it also includes integrated machine learning (linear discriminant analysis, KDE) for rescoring spectral matches.
- Opinions on AlphaPept
You could try out Sage, if you're looking for speed - I don't think you'll find anything faster. https://github.com/lazear/sage
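Several of the posts above mention linear discriminant analysis (LDA) for rescoring spectral matches. As a rough illustration of the idea, and not Sage's actual implementation, here is a minimal two-class Fisher LDA in plain Python. It assumes each match is described by just two features; the feature meanings (a search score and a mass error) and all data values are made up for the example.

```python
from statistics import fmean


def fit_lda_2d(targets, decoys):
    """Fit a two-class Fisher LDA on 2-feature rows; return weights (w0, w1)."""

    def mean_vec(rows):
        return (fmean(r[0] for r in rows), fmean(r[1] for r in rows))

    def scatter(rows, mu):
        # Entries of the 2x2 scatter matrix: sums of products of centered values.
        sxx = sum((r[0] - mu[0]) ** 2 for r in rows)
        sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows)
        syy = sum((r[1] - mu[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    mu_t, mu_d = mean_vec(targets), mean_vec(decoys)
    st, sd = scatter(targets, mu_t), scatter(decoys, mu_d)
    # Pooled within-class scatter Sw = [[a, b], [b, d]].
    a, b, d = st[0] + sd[0], st[1] + sd[1], st[2] + sd[2]
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu_t[0] - mu_d[0], mu_t[1] - mu_d[1]
    # w = Sw^-1 (mu_target - mu_decoy), using the closed-form 2x2 inverse.
    return ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)


def discriminant(w, row):
    """Project one feature row onto the LDA direction."""
    return w[0] * row[0] + w[1] * row[1]


# Made-up (search score, mass error) pairs for target and decoy matches.
targets = [(3.0, 0.5), (3.5, 0.4), (4.0, 0.6), (3.2, 0.3)]
decoys = [(1.0, 1.5), (1.5, 1.4), (0.8, 1.6), (1.2, 1.3)]

w = fit_lda_2d(targets, decoys)
for row in targets + decoys:
    print(row, round(discriminant(w, row), 3))
```

The projected value can then serve as a single combined score per match. A real rescoring step would use more features and a general matrix solver rather than the 2x2 closed form used here.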
What are some alternatives?
spades - SPAdes Genome Assembler
rnaseq - RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
seqkit - A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
juicer - A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
fasten - Fasten toolkit, for streaming operations on fastq files
mokapot - Fast and flexible semi-supervised learning for peptide detection in Python
Rust-Bio - This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.
alphapept - A modular, python-based framework for mass spectrometry. Powered by nbdev.
gatk4-genome-processing-pipeline-azure - Workflows used for processing whole genome sequence data + germline variant calling.