Our great sponsors
-
hamilton
Discontinued A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton (by stitchfix)
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
The python package, hamilton, from Stitch Fix (https://hamilton-docs.gitbook.io/docs/) can help manage transformations on pandas dataframes. This DAG of transformations is managed separately in a file - so it can be versioned, in case the transformations change. The memory required is reduced, because only the API call tables and mapping parameter table have to be in memory. The calculated columns can be produced as needed. Just like dbt, transformations are separate from the source tables - but hamilton can be used on any python object - not just dataframes. dbt is SQL based.