serratus
bioconda-recipes
Metric | serratus | bioconda-recipes
---|---|---
Mentions | 4 | 5
Stars | 243 | 1,564
Growth | - | 1.3%
Activity | 0.0 | 10.0
Last commit | about 1 year ago | 2 days ago
Language | Jupyter Notebook | Shell
License | GNU General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
serratus
- Ask HN: What Are You Working on This Year?
- Ask HN: Who is hiring? (January 2023)
The Laboratory for RNA-Based Lifeforms | University of Toronto | Full-Time | ONSITE
We're a research computational biology lab at the forefront of RNA virus and virus-like agent discovery. Our goal is to help prevent the next pandemic by building the technical infrastructure to assist global virology research and public-health responses.
Seeking a full-stack developer who is creative, passionate, and willing to learn. No biology experience is necessary, but it is a plus. Key assets: Python/R, AWS/HPC, Postgres, JavaScript. See full job posting: http://rrna.ca/id0002
See: Serratus (https://serratus.io)
- Software engineers: consider working on genomics
Serratus (https://github.com/ababaian/serratus) is an OSS bioinformatics project created by a passionate group of volunteers. Short story is we're re-analyzing all of the world's DNA/RNA sequencing data to find new viruses that other people have missed. It works surprisingly well, but there's a ton left to do.
bioconda-recipes
- Why should academic researchers use Rust?
Rust makes distribution and maintenance near trivial. My lab develops a fairly widely-used tool, salmon, for the quantification of transcript expression from RNA-seq data. This tool is written in C++14, and has a substantial number of dependencies. The process of updating the tool (e.g. bumping dependencies) and cutting a new release is painful. To maintain widespread availability, we distribute this tool using bioconda, which uses its own CI and setup to build new releases for (in our case) Linux and macOS.
Things break all the time. For example, recently, they bumped the compiler used to build packages. This changed some default "implementation-defined" behavior, causing previously functioning code to fail. We didn't find this locally, because we didn't test that specific compiler version. When we tried to release a new version, we had to go back and fix things etc. This is not just because different compilers exist, but because the C++ specification is soooo complicated and the set of undefined and implementation-defined behavior is sooo broad that it's very brittle and it's easy for things to "break" via bitrot.
However, the stability provided by Rust has been phenomenal so far. In our code, we only use stable Rust features, and we have benefited tremendously from the empirical guarantee that valid Rust code (except in exceptional cases like latent bugs in the language) will remain valid. While not all crates follow it religiously, there is a reasonable respect for semantic versioning. Thus, cutting a new release of one of our Rust tools is often as simple as just updating the Cargo.toml (and Cargo.lock in the case of applications), tagging a new release in GitHub, and letting the bioconda CI do its business with the tagged artifacts. The build "scripts" are almost always trivial because the builds just work, across platforms, across CIs, etc. Now, new projects like cargo dist look like they make this process even simpler.
- Software engineers: consider working on genomics
I contribute to nf-core (https://nf-co.re/). It's more of a collection of Nextflow pipelines than traditional software, but there are users all around the world and a good community.
Most of the packages on bioconda (https://bioconda.github.io/) are open source. But you probably want to find a sub-field that interests you most before finding a project.
In grad school, we also had an ex-Google software engineer volunteer with us one day a week. It was very impactful for many members of the lab to learn good engineering practices, and it wasn't at all like the sentiment others in this thread are expressing, where engineers were "janitors".
- Conda's Contrasting Command Clarification
Answering this because I believe there is no way other than looking at the "build.sh" in each recipe (e.g., in the build.sh of MOSCA, you can see on the 7th line that the symbolic link will be mosca.py). To find the main script, you can also run "find ~/anaconda3 -name '*name_of_the_tool*'" (or use ~/miniconda3 instead), which finds most of a tool's scripts. Quote the pattern so the shell passes it to find unexpanded.
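For anyone who prefers to script that search, here is a minimal Python sketch of the same idea. It is not part of conda or bioconda; the function name `find_tool_files` and the prefix path in the usage note are assumptions for illustration only. Unlike `find -name`, it matches case-insensitively and reports files (including symlinks to files) but not directories.

```python
import fnmatch
import os
from pathlib import Path


def find_tool_files(prefix, tool_name):
    """Return sorted paths of files under `prefix` whose name contains `tool_name`.

    Hypothetical helper: matching is case-insensitive, and `tool_name` is
    treated as part of a glob pattern, so characters like `[` or `*` keep
    their glob meaning.
    """
    pattern = f"*{tool_name.lower()}*"
    matches = []
    # os.walk lists regular files and symlinks to files in `files`,
    # so entry points that a build.sh symlinks into bin/ are included.
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            if fnmatch.fnmatchcase(name.lower(), pattern):
                matches.append(Path(root) / name)
    return sorted(matches)
```

Something like `find_tool_files(Path.home() / "anaconda3", "mosca")` would then list the candidate scripts; swap in `miniconda3` or the path of a specific environment as needed.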
- How to mix separated versions of Python in the cleanest way
In my world (research science) we usually use Anaconda, which is just a slightly higher-level wrapper around Python virtual envs. The community around it also maintains additional repositories of various modules that scientists need, e.g. https://bioconda.github.io/
- Seq: A programming language for high-performance computational genomics
Seems like there's a conda package in the works: https://github.com/bioconda/bioconda-recipes/pull/29660
What are some alternatives?
atproto - Social networking technology created by Bluesky
Biopython - Official git repository for Biopython (originally converted from CVS)
tidb - TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics.
pyenv-virtualenv - a pyenv plugin to manage virtualenv (a.k.a. python-virtualenv)
TeamTeri - Bioinformatics on GCP, AWS or Azure
adam - ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
seq - A high-performance, Pythonic language for bioinformatics
getting-started-with-genomics-tools-and-resources - Unix, R and python tools for genomics and data science
seq-genomics - Coursera Bioinformatics / Stepik Genome Sequencing with seq-lang
nimconf2021 - Slides for Nimconf21
pyenv-virtualenvwrapper - an alternative approach to manage virtualenvs from pyenv.