Metric | tiny_python_projects | Biopython
---|---|---
Mentions | 4 | 31
Stars | 1,389 | 4,188
Growth | - | 1.6%
Activity | 3.7 | 9.6
Last commit | 2 months ago | 3 days ago
Language | Python | Python
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tiny_python_projects
-
Coding Programs and Sites for Learning R and Python?
My book, Mastering Python for Bioinformatics (O'Reilly, 2021), uses many biofx challenges from the Rosalind.info site, but it's not necessarily a beginner book. The most important thing I teach is the use of tests to verify that a program/function is correct (or at least behaves predictably). You can see https://github.com/kyclark/biofx_python for all the code/tests. To learn more about Python and testing, I would recommend you start with other books such as my Tiny Python Projects (Manning, 2020). Code and tests are at https://github.com/kyclark/tiny_python_projects. I recorded videos showing how to write and test all those programs at tinypythonprojects.com. Best of luck!
-
AWK wildcard, is it possible?
From Clark's Tiny Python Projects (the corresponding code is shared on GitHub) I learned the concept of test-driven development, which can equally be applied to other programming languages (for Python, the book chose pytest for quality control). For me, continuous integration tests (which some projects on GitHub use) and unit tests tap into this field.
-
Help they’re turning me into a programmer
What the 101 beginner courses sometimes/often skip (because there isn't enough time, attendees become tired, etc.) is the next level: automated testing. As an example, pytest for Python allows you to set up "a test bank" to monitor whether your program's output is reasonable. This then is test-driven development (e.g., Clark's Tiny Python Projects).
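As a minimal sketch of the kind of pytest "test bank" mentioned in this comment: the function `count_vowels` and both test names are hypothetical, made up for illustration, and are not taken from the book.

```python
def count_vowels(text):
    """Return the number of vowels (a, e, i, o, u) in text, ignoring case."""
    return sum(char in "aeiou" for char in text.lower())


# pytest collects functions whose names start with "test_" and reports
# any failing assert statement as a test failure.
def test_empty_string():
    assert count_vowels("") == 0


def test_mixed_case():
    # "Tiny Python Projects" contains i, o, o, e ("y" is not counted).
    assert count_vowels("Tiny Python Projects") == 4
```

Saved as, say, `test_vowels.py`, this could be run with `pytest test_vowels.py`. In a test-driven workflow the tests would be written first, and the function filled in until they pass.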
-
Enable hyphenation only for code blocks
Only as a recommendation: if the lines of the source code (here, the C code you aim to document) are kept short, in manageable bites (similar to the parser.add_argument entries in Clark's "Tiny Python Projects", where the examples seldom exceed the frequently recommended threshold of 80 characters per line), reporting with listings becomes easier than with lines of, say, 120 characters, as does reading the difference logs/views by git and vimdiff. Though we are no longer constrained to 80 characters per line by terminals/screens and punch cards (when Fortran still was FORTRAN), this is one reason why e.g. yapf for Python allows you to choose between 4 spaces/indentation (PEP8 style) and 2 spaces/indentation (Google style).
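To illustrate the short-line layout this comment refers to, here is a sketch using the standard `argparse` module with one keyword per line, so that each line stays well under 80 characters. The program description, the argument names, and the `get_args` helper are assumptions for illustration, not code copied from the book.

```python
import argparse


def get_args(argv=None):
    """Parse command-line arguments (argv=None falls back to sys.argv)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example program",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    # One keyword per line keeps every line short and diffs easy to read.
    parser.add_argument("text",
                        metavar="str",
                        help="Input text")

    parser.add_argument("-n",
                        "--number",
                        metavar="int",
                        type=int,
                        default=1,
                        help="Number of repetitions")

    return parser.parse_args(argv)


args = get_args(["hello", "-n", "3"])
print(args.text * args.number)
```

Because each option occupies its own line, adding or changing one option shows up as a one-line change in `git diff` or vimdiff.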
Biopython
- Invitation to a project - Biopython in Spanish
- Biopython – Python Tools for Computational Molecular Biology
-
comparing the similarity between a set of protein sequences
Usearch will do all-against-all comparisons, cluster sequences, and produce alignments for each cluster. You can set the clustering threshold (proportion of residues identical). The alignments are in fasta format, which is pretty standard. If all you want is basic similarity it might be easiest to just write something that calculates normalized Hamming distances (typically called p-distances in the molecular evolution literature) between pairs of sequences. I suspect the biopython fasta reader (you can install biopython from https://biopython.org/) will be good enough.
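The normalized Hamming distance (p-distance) suggested in this comment could be sketched as follows. This assumes the sequences are already aligned to equal length, and it treats a gap character like any other symbol, which may not be what you want. The small `read_fasta` parser is a stand-in written for this example so that it runs without extra packages. With Biopython installed, `Bio.SeqIO.parse(handle, "fasta")` would be the usual way to read the records instead. All function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def read_fasta(lines):
    """Parse FASTA text lines into a dict of {record id: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word after ">" as the record id.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def pairwise_p_distances(records):
    """Compute the p-distance for every unordered pair of records."""
    return {
        (id_a, id_b): p_distance(records[id_a], records[id_b])
        for id_a, id_b in combinations(records, 2)
    }


# Toy, made-up aligned protein sequences.
example = """\
>seq1 toy example
MKTAYIAK
>seq2
MKTAYIAR
>seq3
MRTAYLAR
""".splitlines()

distances = pairwise_p_distances(read_fasta(example))
for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

A similarity score, if preferred, is simply `1 - p_distance(...)`.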
-
u/Responsible-Gas3852 comments on "Why is Cancer so Hard to Cure?"
Yes, the computing tool for biological computation.
-
My boss is considering letting me take a programming course if I have some good reasons why.
Besides making their core lessons for non-computer scientists public (survey), Software Carpentry runs workshops around the globe. Maybe your intent to build hands-on knowledge is in a similar vein, before heading for biopython, bioperl, or bioawk. It doesn't hurt to tap into resources originally written for non-labrats either, e.g. the material about regular expressions by Programming Historian.
- Can you run ScanProsite locally?
- How to iterate over the whole GRCh38 genome with python?
-
Help they’re turning me into a programmer
Well, what language do you want to learn? What is your background so far? Assuming it is more on the side of biology, software carpentry's Python may eventually lead to biopython? Though there equally is a chance for AWK (Hack the planet's text! and bioawk...
-
Biology related exercises and "challenges" to train by myself
I think you might find something of a community around BioPython, which might be helpful. Just looking at the capabilities will probably be instructive as well.
-
Joining the Open Source Development Course
Python is the main programming language I use nowadays. In particular, numpy and pandas are of course extremely useful. I also use the biopython package, a collection of software tools for biological computation written in Python by an international group of researchers and developers.
What are some alternatives?
bioawk - BWK awk modified for biological data
RDKit - The official sources for the RDKit library
yapf - A formatter for Python files
biotite - A comprehensive library for computational molecular biology
biofx_python - Code for Mastering Python for Bioinformatics (O'Reilly, 2021, ISBN 9781098100889)
bioconda-recipes - Conda recipes for the bioconda channel.
Numba - NumPy aware dynamic Python compiler using LLVM
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
PyDy - Multibody dynamics tool kit.
weblogo - WebLogo 3: Sequence Logos redrawn
statsmodels - Statsmodels: statistical modeling and econometrics in Python
bccb - Incubator for useful bioinformatics code, primarily in Python and R