Biopython
bioconda-recipes
 | Biopython | bioconda-recipes |
---|---|---|
Mentions | 31 | 5 |
Stars | 4,120 | 1,552 |
Growth | 4.6% | 0.8% |
Activity | 9.6 | 10.0 |
Last commit | about 23 hours ago | 7 days ago |
Language | Python | Shell |
License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Biopython
-
comparing the similarity between a set of protein sequences
Usearch will do all-against-all comparisons, cluster sequences, and produce alignments for each cluster. You can set the clustering threshold (the proportion of identical residues). The alignments are in FASTA format, which is pretty standard. If all you want is basic similarity, it might be easiest to just write something that calculates normalized Hamming distances (typically called p-distances in the molecular evolution literature) between pairs of sequences. I suspect the Biopython FASTA reader (you can install Biopython from https://biopython.org/) will be good enough.
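The p-distance idea above is small enough to sketch without any dependencies. This is a minimal illustration, not a definitive implementation: it assumes the sequences are already aligned to equal length, and the names `p_distance` and `pairwise_p_distances` and the toy sequences are made up for the example. In practice the records could be loaded with Biopython's `Bio.SeqIO.parse(path, "fasta")` rather than typed in.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Normalized Hamming distance: fraction of aligned positions that differ.

    Assumption: both sequences come from the same alignment, so they have
    equal length. Gap characters are compared like any other symbol.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def pairwise_p_distances(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """All-against-all p-distances, one entry per unordered pair of names."""
    return {
        (n1, n2): p_distance(seqs[n1], seqs[n2])
        for n1, n2 in combinations(seqs, 2)
    }


# Hypothetical, pre-aligned toy protein sequences.
toy = {"seqA": "MKTAYIAK", "seqB": "MKTAYIAR", "seqC": "MRTAHIAR"}
distances = pairwise_p_distances(toy)
for pair, d in distances.items():
    print(pair, d)
```

Treating gaps as ordinary symbols is a simplification; depending on the analysis, positions with gaps may need to be skipped instead.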
-
u/Responsible-Gas3852 comments on "Why is Cancer so Hard to Cure?"
Yes, the computing tool for biological computation.
-
My boss is considering letting me take a programming course if I have some good reasons why.
Besides, their core lectures for non-computer scientists are public (survey), and Software Carpentry workshops move around the globe. Maybe your intent to seed hands-on knowledge is in a similar vein, before heading for Biopython, BioPerl, or bioawk. It doesn't hurt to tap into resources initially written for non-lab-rats either, e.g. the Programming Historian's material on regular expressions.
-
Help they’re turning me into a programmer
Well, what language do you want to learn? What is your background so far? Assuming it is more on the side of biology, Software Carpentry's Python lessons may eventually lead to Biopython? Though there is equally a chance for AWK (Hack the planet's text! and bioawk...
-
Joining the Open Source Development Course
Python is the main programming language I use nowadays. In particular, numpy and pandas are of course extremely useful. I also use the Biopython package, a collection of software tools for biological computation written in Python by an international group of researchers and developers.
- osdc-2023-assignment1
-
parse a fasta file using regex
Just use Biopython.
-
Seq: A programming language for high-performance computational genomics
It might be pretty useful as a teaching tool, but I'm skeptical of its long-term benefit to professionals. I'm not sure the ecosystem of Seq users will be large enough, y'know? Again, it's pretty impressive work, and it's come a long way. I wish the devs all the best. :)
-
Looking for a tool to convert a whole fasta file with CDS sequences to a fasta file with protein sequences.
Quickly looking over it, that's a bunch of different scripts/functions using https://biopython.org/ to parse FASTA files in different ways. Probably the answer you're looking for is in that txt, or have a look at the Biopython tutorial.
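To make the CDS-to-protein conversion concrete, here is a dependency-free sketch. It assumes the standard genetic code, that every CDS starts in frame, and that translation should stop at the first stop codon; the function names (`read_fasta`, `translate_cds`, `cds_fasta_to_protein_fasta`) and the example records are hypothetical. With Biopython, the same steps would usually be done with `Bio.SeqIO.parse` and `Seq.translate`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for the organism in question).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def translate_cds(cds: str) -> str:
    """Translate an in-frame CDS, stopping at the first stop codon.

    Codons containing ambiguous bases become 'X'; a trailing partial
    codon is ignored.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def cds_fasta_to_protein_fasta(lines) -> str:
    """Return protein FASTA text for every CDS record in the input lines."""
    return "\n".join(
        f">{header}\n{translate_cds(cds)}" for header, cds in read_fasta(lines)
    )


# Hypothetical two-record CDS file, given inline instead of read from disk.
example = """>gene1
ATGGCCAAA
TGGTAA
>gene2
ATGTTTGGGTGA
""".splitlines()
print(cds_fasta_to_protein_fasta(example))
```

For a real file, the lines could come from `open(path)`; multi-line sequences are joined before translation, as in `gene1` above.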
-
Journey into bioinformatics
For bioinformatics in Python, the Biopython library (https://biopython.org/) is commonly used. An alternative to this package is Biotite (https://www.biotite-python.org/), a package I am maintaining.
bioconda-recipes
-
Why should academic researchers use Rust?
Rust makes distribution and maintenance near trivial. My lab develops a fairly widely-used tool, salmon, for the quantification of transcript expression from RNA-seq data. This tool is written in C++14, and has a substantial number of dependencies. The process of updating the tool (e.g. bumping dependencies) and cutting a new release is painful. To maintain widespread availability, we distribute this tool using bioconda, which uses its own CI and setup to build new releases for (in our case) Linux and macOS. Things break all the time. For example, recently, they bumped the compiler used to build packages. This changed some default "implementation defined" behavior, causing previously functioning code to fail. We didn't find this locally, because we didn't test that specific compiler version. When we tried to release a new version, we had to go back and fix things etc. This is not just because different compilers exist, but because the C++ specification is soooo complicated and the set of undefined and implementation defined behavior is sooo broad that it's very brittle and it's easy for things to "break" via bitrot.
However, the stability provided by Rust has been phenomenal so far. In our code, we only use stable Rust features, and we have benefited tremendously from the empirical guarantee that valid Rust code (except in exceptional cases like latent bugs in the language) will remain valid. While not all crates follow it religiously, there is a reasonable respect for semantic versioning. Thus, cutting a new release of one of our Rust tools is often as simple as just updating the Cargo.toml (and Cargo.lock in the case of applications), tagging a new release in GitHub, and letting the bioconda CI do its business with the tagged artifacts. The build "scripts" are almost always trivial because the builds just work, across platforms, across CIs, etc. Now, new projects like cargo dist look like they make this process even simpler.
-
Software engineers: consider working on genomics
I contribute to nf-core (https://nf-co.re/). It's more of a collection of pipelines than traditional software, but there are users all around the world and a good community.
Most of the packages on bioconda (https://bioconda.github.io/) are open source. But you probably want to find a sub-field that interests you most before finding a project.
In grad school, we also had an ex-Google software engineer volunteer with us one day a week. It was very impactful for many members of the lab to learn good engineering practices, and it wasn't at all like the sentiment others in this thread are expressing, where engineers were "janitors".
-
How to mix separated versions of Python in the cleanest way
In my world (research science) we usually use Anaconda, which is just a slightly higher-level wrapper around Python virtual envs. But they also maintain more repositories of various modules that scientists need, e.g. https://bioconda.github.io/
-
Seq: A programming language for high-performance computational genomics
Seems like there's a conda package in the works: https://github.com/bioconda/bioconda-recipes/pull/29660
What are some alternatives?
RDKit - The official sources for the RDKit library
biotite - A comprehensive library for computational molecular biology
Numba - NumPy aware dynamic Python compiler using LLVM
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
PyDy - Multibody dynamics tool kit.
weblogo - WebLogo 3: Sequence Logos redrawn
statsmodels - Statsmodels: statistical modeling and econometrics in Python
bccb - Incubator for useful bioinformatics code, primarily in Python and R
bcbio-nextgen - Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
SciPy - SciPy library main repository
Dask - Parallel computing with task scheduling
PatZilla - PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.