-
hamilton
Discontinued. A scalable, general-purpose micro-framework for defining dataflows. This repository has been moved to www.github.com/dagworks-inc/hamilton (by stitchfix)
-
This project reminds me a lot of Dask (https://dask.org/), a library that allows delayed computation on complex dataframes in Python.
-
My project pynto (https://github.com/punkbrwstr/pynto) is a similar framework for creating dataframes, but it uses a concatenative paradigm that treats the frame as a stack of columns. Functions ("words") operate on the stack to set up the graph for each column, and execution happens afterwards in parallel. Instead of function modifiers like @does, it uses combinators to apply quoted operations to multiple columns. The postfix syntax (think PostScript or Factor) is unambiguous, if a bit old-school.
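To make the stack-of-columns idea concrete, here is a minimal pure-Python sketch. It is not pynto's API: `Column`, `const`, `add`, `each`, and `run` are hypothetical names invented for illustration. Words only rearrange deferred columns on the stack, and the values are computed afterwards, in parallel, one task per remaining column.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mini-language; none of these names come from pynto itself.
class Column:
    """A deferred column: a name plus a zero-argument function producing its values."""
    def __init__(self, name, thunk):
        self.name = name
        self.thunk = thunk

def const(name, values):
    """Word: push a column of literal values."""
    def word(stack):
        stack.append(Column(name, lambda: list(values)))
    return word

def add(name):
    """Word: pop two columns and push their deferred element-wise sum."""
    def word(stack):
        right = stack.pop()
        left = stack.pop()
        stack.append(
            Column(name, lambda: [a + b for a, b in zip(left.thunk(), right.thunk())])
        )
    return word

def each(quotation):
    """Combinator: apply a quoted unary operation to every column on the stack."""
    def word(stack):
        for i, col in enumerate(stack):
            # Bind col as a default argument so each lambda keeps its own column.
            stack[i] = Column(col.name, lambda col=col: [quotation(v) for v in col.thunk()])
    return word

def run(program):
    """Apply the words in postfix order, then evaluate the columns in parallel."""
    stack = []
    for word in program:
        word(stack)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda c: c.thunk(), stack))
    return {c.name: vals for c, vals in zip(stack, results)}

program = [
    const("a", [1, 2, 3]),
    const("b", [10, 20, 30]),
    const("c", [100, 200, 300]),
    add("b_plus_c"),        # consumes b and c, leaves a and b_plus_c
    each(lambda v: v * 2),  # doubles every column left on the stack
]
frame = run(program)
print(frame)
```

Nothing is computed while the program is being read; `run` only evaluates the columns that remain on the stack once every word has been applied.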
-
This reminds me a bit of a Clojure library called Plumbing (formerly Graph): https://github.com/plumatic/plumbing. It also lets you define a DAG for structured computation. At the time, it was used for a web service.
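The shared idea, declaring a DAG by letting each function name its inputs, can be sketched in a few lines of Python. This is an illustration only, not Plumbing's or Hamilton's implementation: `resolve` and the `stats` graph are hypothetical, and the sketch assumes that a parameter name always refers either to a supplied input or to another node.

```python
import inspect

def resolve(nodes, inputs):
    """Compute every node, treating parameter names as dependency edges."""
    cache = dict(inputs)

    def compute(name, visiting=()):
        if name in cache:
            return cache[name]
        if name in visiting:
            raise ValueError(f"cycle detected at {name!r}")
        if name not in nodes:
            raise KeyError(f"no input or node named {name!r}")
        fn = nodes[name]
        # Each parameter name is looked up (and computed if needed) first.
        args = {
            param: compute(param, visiting + (name,))
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**args)
        return cache[name]

    return {name: compute(name) for name in nodes}

# Hypothetical graph: basic statistics over a list of numbers.
stats = {
    "n": lambda xs: len(xs),
    "mean": lambda xs, n: sum(xs) / n,
    "mean_square": lambda xs, n: sum(x * x for x in xs) / n,
    "variance": lambda mean, mean_square: mean_square - mean ** 2,
}

result = resolve(stats, {"xs": [1, 2, 3, 6]})
print(result)
```

Because the dependencies live in the signatures, the order in which nodes are declared does not matter, and each node is computed at most once.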
-
Having worked on "Dagger", you may be interested in https://github.com/timkpaine/tributary
-
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Hamilton is more similar to the Prosto data processing toolkit, which also relies on column operations defined via Python functions:
https://github.com/asavinov/prosto
However, Prosto supports column operations across many tables (implemented as pandas data frames) by providing column-oriented equivalents of joins and groupby. As a result, it has no joins and no groupbys, which are known to be quite difficult to use and to require a high level of expertise.
Prosto also provides Column-SQL, which might be simpler and more natural in many use cases.
The whole approach is based on the concept-oriented model of data, which makes functions first-class elements of the model, as opposed to the relational model, which has only sets.
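As a rough illustration of what column-oriented replacements for join and groupby can look like, here is a pure-Python sketch. It does not use Prosto's API: the tables, `link_column`, and `aggregate_column` are hypothetical, and tables are modeled as plain dicts of equal-length lists rather than pandas data frames.

```python
# Two "tables" stored as dicts of equal-length columns (hypothetical data).
groups = {"name": ["fruit", "veg"]}
items = {
    "group_name": ["fruit", "veg", "fruit"],
    "price": [3.0, 2.0, 5.0],
}

def link_column(fact_keys, target_keys):
    """Link column: for each fact row, the row index of the matching target row."""
    index = {key: i for i, key in enumerate(target_keys)}
    return [index.get(key) for key in fact_keys]

def aggregate_column(link, values, n_targets, fn, initial):
    """Aggregate column: fold fact values into the target row each fact links to."""
    out = [initial] * n_targets
    for target_row, value in zip(link, values):
        if target_row is not None:  # skip facts with no matching target row
            out[target_row] = fn(out[target_row], value)
    return out

# Instead of a join, items gains a column that points into groups.
items["group"] = link_column(items["group_name"], groups["name"])

# Instead of a groupby, groups gains a column computed through that link.
groups["total_price"] = aggregate_column(
    items["group"], items["price"], len(groups["name"]),
    lambda acc, v: acc + v, 0.0,
)
print(items["group"], groups["total_price"])
```

No new table is produced at any point: both operations only add a column to an existing table, which is the sense in which the approach replaces set operations with functions.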