For data science specifically, I would strongly suggest looking into DVC: https://dvc.org/.
You can easily write DVC stage files by hand as a straightforward Makefile replacement, and integrate other features into your workflow as needed/desired.
If you're using Make as a command runner or a "standard entry point" to a project, instead of a build system that tracks dependencies between files, I highly recommend using `just` instead: https://github.com/casey/just
It has this functionality built-in, and avoids a lot of Make's idiosyncrasies. (Not affiliated, just a fan.)
Example: https://github.com/denibertovic/makefiles/blob/master/bare.m...
If you stick to the most basic functionality, it may be of use as a way to visualize the execution graph (DAG):
https://github.com/TomConlin/MakefileViz
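To make the DAG idea concrete, here is a minimal sketch in Python of turning explicit Makefile rules into a Graphviz DOT graph. It is not how MakefileViz works; the function name `makefile_to_dot` and the sample Makefile text are made up for illustration, and the parser assumes only simple `target: prerequisites` rules (no variables, pattern rules, or multiple targets per rule).

```python
import re

# Matches simple explicit rules such as "target: dep1 dep2".
# Assumption: one target per rule; ":=" assignments are rejected by (?!=).
RULE = re.compile(r"^([A-Za-z0-9_./-]+)\s*:(?!=)\s*(.*)$")


def makefile_to_dot(text):
    """Return Graphviz DOT text with one edge per target -> prerequisite."""
    lines = ["digraph make {"]
    for line in text.splitlines():
        # Recipe lines start with a tab; comment lines start with "#".
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        match = RULE.match(line)
        if not match:
            continue
        target = match.group(1)
        # Drop any trailing comment before splitting the prerequisites.
        deps = match.group(2).split("#", 1)[0].split()
        lines.append(f'  "{target}";')
        for dep in deps:
            lines.append(f'  "{target}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical Makefile content used only for this example.
EXAMPLE = (
    "all: build test\n"
    "build: main.c\n"
    "\tcc -o app main.c\n"
    "test: build\n"
    "\t./app --self-test\n"
)

if __name__ == "__main__":
    print(makefile_to_dot(EXAMPLE))
```

The printed text can be piped into Graphviz's `dot` to render an image. Special targets such as `.PHONY` would show up as ordinary nodes in this sketch.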
I have approached the problem in a similar way, but without Python.
I published my self-documenting docker Makefiles here:
https://github.com/Infused-Insight/docker_makefiles
Another part, I think, might be parallel execution: running multiple shells at the same time.
https://github.com/rofl0r/jobflow
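As a rough illustration of "multiple shells at the same time", the sketch below runs a list of shell commands concurrently using only the Python standard library. It is not jobflow's implementation or interface; `run_parallel` and the sample commands are hypothetical names chosen for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_workers=4):
    """Run each command in its own shell, at most max_workers at a time.

    Returns a list of (returncode, stdout) tuples in the same order
    as the input commands.
    """
    def run(cmd):
        # shell=True hands the string to the system shell, as Make does
        # with recipe lines. Only use it with trusted command strings.
        done = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        return done.returncode, done.stdout

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if commands finish out of order.
        return list(pool.map(run, commands))


if __name__ == "__main__":
    for code, out in run_parallel(["echo one", "echo two", "exit 3"]):
        print(code, out.strip())
```

Threads are enough here because each worker just waits on a child process. The `max_workers` limit plays a role similar to the job count passed to `make -j`.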
This is exactly my experience, which led me to create https://github.com/ysoftwareab/yplatform - with a consistent make interface https://github.com/ysoftwareab/yplatform/tree/master/build.m...
PS: it is quite feature-complete but not yet well marketed, so to speak. I'm actually recording an asciinema session this week so that visitors can grasp the mentioned benefits more quickly.