make-booster
tes-azure-legacy
Metric | make-booster | tes-azure-legacy
---|---|---
Mentions | 3 | 1
Stars | 8 | 18
Growth | - | -
Activity | 10.0 | 10.0
Latest commit | almost 2 years ago | 8 months ago
Language | Makefile | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
make-booster
- Snakemake – A framework for reproducible data analysis
For a very different approach, check out make-booster:
https://github.com/david-a-wheeler/make-booster
Make-booster provides utility routines intended to greatly simplify data processing (particularly a data pipeline) using GNU make. It includes some mechanisms specifically to help Python, as well as general-purpose mechanisms that can be useful in any system. In particular, it helps reliably reproduce results, and it automatically determines what needs to run and runs only that (producing a significant speedup in most cases). Released as open source software.
- A Love Letter to Make
https://github.com/david-a-wheeler/make-booster
I think a lot of hate on make is due to poor use. If your makefile is complex, refactor it. Auto-generate dependencies (it only takes a few lines in GNU make). And don't use recursive make, that way lies madness. I also think GNU make is the wiser tool; POSIX make lacks too much in many cases.
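As a rough illustration of what auto-generating dependencies can look like for Python scripts, the sketch below scans a script's import statements and emits one make-style dependency line. The function names, the file names (`BBB.py`, `CC.py`, `out.csv`), and the idea of passing in a set of local module names are all assumptions made for illustration; this is not make-booster's implementation.

```python
import ast


def imported_modules(source: str) -> list[str]:
    """Return the sorted top-level module names imported by Python source text."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from X import Y" only; relative imports are ignored here.
            names.add(node.module.split(".")[0])
    return sorted(names)


def make_rule(target: str, script: str, source: str, local_modules: set[str]) -> str:
    """Build one dependency line: the target depends on the script and on the
    local modules it imports. Modules outside local_modules (standard library,
    third-party packages) are skipped."""
    deps = [f"{name}.py" for name in imported_modules(source) if name in local_modules]
    return f"{target}: " + " ".join([script] + deps)


if __name__ == "__main__":
    # Hypothetical script BBB.py that imports a local module CC.
    print(make_rule("out.csv", "BBB.py", "import os\nimport CC\n", {"BBB", "CC"}))
    # prints "out.csv: BBB.py CC.py"
```

Lines like this could be written to a file that the makefile pulls in with `include`, so the dependency list never has to be maintained by hand.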
- The Unreasonable Effectiveness of Makefiles
https://github.com/david-a-wheeler/make-booster
From its readme:
"This project (contained in this directory and below) provides utility routines intended to greatly simplify data processing (particularly a data pipeline) using GNU make. It includes some mechanisms specifically to help Python, as well as general-purpose mechanisms that can be useful in any system. In particular, it helps reliably reproduce results, and it automatically determines what needs to run and runs only that (producing a significant speedup in most cases)."
"For example, imagine that Python file BBB.py says include CC, and file CC.py reads from file F.txt (and CC.py declares its INPUTS= as described below). Now if you modify file F.txt or CC.py, any rule that runs BBB.py will automatically be re-run in the correct order when you use make, even if you didn't directly edit BBB.py."
This is NOT functionality directly provided by Python, and the overhead with >1000 files was 0.07 seconds, which we could live with :-).
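The behavior described in the readme excerpt can be sketched in a few lines: collect everything a script depends on, directly or indirectly, then compare modification times. This is a minimal sketch under stated assumptions, not make-booster's code (which relies on GNU make itself). The function names, the `deps` mapping, the numeric timestamps, and the output name `out.csv` are hypothetical.

```python
def transitive_inputs(target: str, deps: dict[str, list[str]]) -> set[str]:
    """Return every file that target depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = list(deps.get(target, []))
    while stack:
        name = stack.pop()
        if name not in seen:  # the seen set also guards against cycles
            seen.add(name)
            stack.extend(deps.get(name, []))
    return seen


def is_stale(output: str, script: str,
             deps: dict[str, list[str]], mtime: dict[str, float]) -> bool:
    """True if output is missing, or older than the script or any of its
    transitive inputs. Assumes every input file has an entry in mtime."""
    if output not in mtime:
        return True
    inputs = {script} | transitive_inputs(script, deps)
    return any(mtime[name] > mtime[output] for name in inputs)


if __name__ == "__main__":
    # Hypothetical graph: BBB.py uses CC.py, and CC.py reads F.txt.
    deps = {"BBB.py": ["CC.py"], "CC.py": ["F.txt"]}
    mtime = {"BBB.py": 100.0, "CC.py": 100.0, "F.txt": 100.0, "out.csv": 200.0}
    print(is_stale("out.csv", "BBB.py", deps, mtime))  # False: output is newest
    mtime["F.txt"] = 300.0                             # simulate editing F.txt
    print(is_stale("out.csv", "BBB.py", deps, mtime))  # True: rerun BBB.py
```

Editing `F.txt` marks the output stale even though `BBB.py` never mentions that file, which is the point of tracking dependencies transitively.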
tes-azure-legacy
- Snakemake – A framework for reproducible data analysis
Snakemake is a beautiful project that evolves and improves so fast. Years ago I realized I needed to up my game from the usual Bash-based NGS data processing pipelines I was writing. Based on several recommendations I chose Snakemake. I have never regretted it. It worked perfectly on our PBS cluster, then on our Slurm cluster. I took some steps to run it on K8s, which it supports, and most recently I'm still/again happy with my choice of Snakemake because it (together with Nextflow) seems to be the chosen framework for GA4GH's cloud work stream's "products" like WES and TES [0]. This seems to be the tech stack that Amazon Omics and Microsoft Genomics focus on [1].
I owe a lot to Snakemake and Johannes Köster; I hope some day I can repay him and his project.
[0] https://www.ga4gh.org/work_stream/cloud/
[1] https://github.com/Microsoft/tes-azure
What are some alternatives?
tclmake - Partial make clone in pure Tcl
snakemake-wrappers - This is the development home of the Snakemake wrapper repository, see
checkexec - CLI tool to conditionally execute commands only when files in a dependency list have been updated. Like `make`, but standalone.
mandala - A powerful and easy to use Python framework for experiment tracking and incremental computing
oxen-release - Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
dagger - Application Delivery as Code that Runs Anywhere
just - 🤖 Just a command runner
handlebars.c - C implementation of handlebars.js
bake - A Bash-based Make alternative.