bioawk VS tiny_python_projects

Compare bioawk vs tiny_python_projects and see what their differences are.

bioawk

BWK awk modified for biological data (by lh3)

tiny_python_projects

Code for Tiny Python Projects (Manning, 2020, ISBN 1617297518). Learning Python through test-driven development of games and puzzles. (by kyclark)
                 bioawk              tiny_python_projects
Mentions         8                   4
Stars            572                 1,375
Growth           -                   -
Activity         0.0                 3.7
Last commit      over 1 year ago     about 2 months ago
Language         C                   Python
License          -                   MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

bioawk

Posts with mentions or reviews of bioawk. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-11.
  • Bioawk: Awk Modified for Biological Data
    1 project | news.ycombinator.com | 31 Mar 2024
  • Any links to R-scripts for common NGS pipelines?
    2 projects | /r/bioinformatics | 11 May 2023
    Data wrangling is actually what awk excels at, and it's generally much more concise than R for that sort of thing. I'm aware that a lot of awk one-liners look like gibberish to the uninitiated, but it actually makes a lot of sense when you understand the pattern-action structure of awk programs. It is also installed on any *nix system, so there's no need to worry about installing dependencies or setting up virtual environments. And it's several times faster than R. Also Bioawk is glorious.
  • Is BioAwk frequently used, or even useful?
    2 projects | /r/bioinformatics | 5 May 2023
    A few months ago, I learned about this utility known as bioawk, written by Heng Li of samtools fame. Apparently, it is essentially a tweaked version of awk, with some extra goodies added for parsing and processing of bioinformatics file formats. While the functionality seems cool, I was wondering whether it is worth installing on my server, and incorporating into our workflows, because it seems so niche. I have not seen many references to it. Or is it better if we stick to Python scripts for this sort of work? Are there any computational speed advantages, etc. that bioawk offers over regular Python scripts for processing of, let's say, BED files or VCF files?
  • What are the most useful cutting edge tools I should learn for bioinformatics?
    3 projects | /r/bioinformatics | 26 Apr 2023
  • My boss is considering letting me take a programming course if I have some good reasons why.
    2 projects | /r/labrats | 13 Apr 2023
    Besides the fact that their core lectures for non-computer scientists are public (survey), Software Carpentry workshops move around the globe. Maybe your intent to seed hands-on knowledge is in a similar vein before heading for biopython, bioperl, bioawk. It doesn't hurt to tap into resources initially written for non-labrats either, e.g., about regular expressions by Programming Historian.
  • What are strictly data analysis jobs?
    3 projects | /r/labrats | 22 Feb 2023
    On the other hand, some of the techniques to set the ground for data analysis are equally valuable in other situations. The two installments about regular expressions on Programming Historian, Understanding Regular Expressions and Cleaning OCR’d text with Regular Expressions, for example. They have no relevance to handling chemicals in the lab, yet since then, I find myself working with data files more efficiently than before because of grep, a utility in Linux to crawl across data files. Or AWK, actually picking up these "regexes", which I find generally useful since Benjamin Porter's "Hack the planet's text" (presentation video, and exercise video) with its link back to chem/bio, e.g., to bioawk (btw, there equally is biopython, too).
  • Help they’re turning me into a programmer
    3 projects | /r/labrats | 13 Feb 2023
    Well, what language do you want to learn? What is your background so far? Assuming it is more on the side of biology, software carpentry's Python may eventually lead to biopython? Though there equally is a chance for AWK (Hack the planet's text! and bioawk...
  • Awk: The Power and Promise of a 40-Year-Old Language
    4 projects | news.ycombinator.com | 7 Sep 2021
    There's even a version of awk specifically designed for bioinformatics that natively knows how to handle fasta, fastq, and bam files, among other formats.

    https://github.com/lh3/bioawk
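
Several of the posts above refer to awk's pattern-action structure. As a rough illustration only, the sketch below imitates that model in Python: each rule pairs a pattern (a predicate on the split fields) with an action (what to emit when the pattern matches). It is not how awk or bioawk is implemented; `run_rules`, the rule list, and the BED-like sample records are all hypothetical names and data made up for this example.

```python
from typing import Callable, Iterable, List, Tuple

Fields = List[str]
# A rule is a (pattern, action) pair, mirroring awk's `pattern { action }`.
Rule = Tuple[Callable[[Fields], bool], Callable[[Fields], str]]


def run_rules(lines: Iterable[str], rules: List[Rule], sep: str = "\t") -> List[str]:
    """Apply every rule to every line, collecting the output of matching actions."""
    out: List[str] = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)  # awk does this splitting implicitly
        for pattern, action in rules:
            if pattern(fields):
                out.append(action(fields))
    return out


# Hypothetical BED-like records: chrom, start, end (tab-separated).
bed_lines = ["chr1\t100\t250\n", "chr2\t500\t520\n", "chr1\t900\t1900\n"]

# One rule: for intervals longer than 100 bases, emit chrom and length.
# A rough awk counterpart would be: $3 - $2 > 100 { print $1, $3 - $2 }
rules: List[Rule] = [
    (
        lambda f: int(f[2]) - int(f[1]) > 100,
        lambda f: f"{f[0]}\t{int(f[2]) - int(f[1])}",
    ),
]

result = run_rules(bed_lines, rules)
for row in result:
    print(row)
```

The Python version needs explicit splitting and looping, which awk supplies for free; that difference is what the posters mean when they call awk more concise for this kind of data wrangling.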

tiny_python_projects

Posts with mentions or reviews of tiny_python_projects. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-30.
  • Coding Programs and Sites for Learning R and Python?
    2 projects | /r/bioinformatics | 30 Mar 2023
    My book, Mastering Python for Bioinformatics (O'Reilly, 2021), uses many biofx challenges from the Rosalind.info site, but it's not necessarily a beginner book. The most important thing I teach is the use of tests to verify that a program/function is correct (or at least behaves predictably). You can see https://github.com/kyclark/biofx_python for all the code/tests. To learn more about Python and testing, I would recommend you start with other books such as my Tiny Python Projects (Manning, 2020). Code and tests are at https://github.com/kyclark/tiny_python_projects. I recorded videos showing how to write and test all those programs at tinypythonprojects.com. Best of luck!
  • AWK wildcard, is it possible?
    1 project | /r/bash | 25 Feb 2023
    From Clark's Tiny Python Projects (the corresponding code shared on GitHub) I learned the concept of test-driven development (specific to Python, the book elected pytest for quality control), which can equally be applied to other programming languages. For me, continuous integration tests (which some projects on GitHub use) and unit tests tap into this field.
  • Help they’re turning me into a programmer
    3 projects | /r/labrats | 13 Feb 2023
    What the 101 beginner courses sometimes/often skip (because there isn't enough time, attendees become tired, etc.) is the next level, automated testing. As an example, pytest for Python allows you to set up "a test bank" to check whether your program's output is reasonable. This then is test-driven development (e.g., Clark's Tiny Python Projects).
  • Enable hyphenation only for code blocks
    2 projects | /r/LaTeX | 6 Jan 2023
    Only as a recommendation: If the lines of the source code (here: your C code you aim to document) are kept short, in manageable bytes (similar to entries parser.add_argument in Clark's "Tiny Python Projects", whose examples seldom pass beyond the frequently recommended threshold of 80 characters/line), reporting with listings becomes easier (equally, the reading of the difference logs/views by git and vimdiff) than with lines of, say, 120 characters per line. Though we are no longer constrained to 80 characters per line by terminals/screens and punch cards (when Fortran still was FORTRAN), this is a reason, e.g., yapf for Python allows you to choose between 4 spaces/indentation (PEP8 style) or 2 spaces/indentation (Google style).
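
The posts above describe test-driven development with pytest only in words. As a minimal sketch of the idea, a function and its tests can sit side by side; pytest collects functions whose names start with `test_` and reports any failing `assert`. The function `gc_content` and its test cases are hypothetical, chosen for illustration, and are not taken from the Tiny Python Projects repository.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 for empty input)."""
    if not seq:
        return 0.0
    seq = seq.upper()  # treat lowercase and uppercase bases alike
    return (seq.count("G") + seq.count("C")) / len(seq)


# pytest discovers these by their `test_` prefix; each one is a plain assert.
def test_gc_content_mixed():
    assert gc_content("ATGC") == 0.5


def test_gc_content_empty():
    assert gc_content("") == 0.0


def test_gc_content_lowercase():
    assert gc_content("ggcc") == 1.0
```

In a test-driven workflow the three tests would be written first and fail until `gc_content` is implemented; saving the block as a file such as `test_gc.py` (an assumed name) and running `pytest` in that directory executes them.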

What are some alternatives?

When comparing bioawk and tiny_python_projects you can also consider the following projects:

cligen - Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at

Biopython - Official git repository for Biopython (originally converted from CVS)

csvquote - Enables common unix utilities like cut, awk, wc, head to work correctly with csv data containing delimiters and newlines

yapf - A formatter for Python files

orange - 🍊 Orange: Interactive data analysis

biofx_python - Code for Mastering Python for Bioinformatics (O'Reilly, 2021, ISBN 9781098100889)

zarp - The Zavolab Automated RNA-seq Pipeline

MethylDackel - A (mostly) universal methylation extractor for BS-seq experiments.

readfq - Fast multi-line FASTA/Q reader in several programming languages

vcftools - A set of tools written in Perl and C++ for working with VCF files, such as those generated by the 1000 Genomes Project.

dsutils - Command-line tools for doing data science