Show HN: Hamilton, a Microframework for Creating Dataframes

This page summarizes the projects mentioned and recommended in the original post on

  • hamilton

    Library for creating dataframes from functions.
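As a rough illustration of what "creating dataframes from functions" can look like, the sketch below treats each function as one column: the function name is the column name and its parameter names are the columns it depends on. This is a plain-Python toy, not Hamilton's API. `build_frame` and the two column functions are hypothetical names, and a dict of lists stands in for a dataframe.

```python
import inspect

# Hypothetical column definitions: function name = output column,
# parameter names = input columns it depends on.
def spend_per_signup(spend, signups):
    return [s / n for s, n in zip(spend, signups)]

def spend_zero_mean(spend):
    mean = sum(spend) / len(spend)
    return [s - mean for s in spend]

def build_frame(functions, inputs, wanted):
    """Resolve each wanted column by recursively computing its dependencies."""
    by_name = {f.__name__: f for f in functions}
    cache = dict(inputs)  # raw input columns count as already computed

    def resolve(name):
        if name not in cache:
            if name not in by_name:
                raise KeyError(f"no input or function named {name!r}")
            func = by_name[name]
            # Parameter names tell us which columns to compute first.
            args = [resolve(p) for p in inspect.signature(func).parameters]
            cache[name] = func(*args)
        return cache[name]

    return {name: resolve(name) for name in wanted}

frame = build_frame(
    functions=[spend_per_signup, spend_zero_mean],
    inputs={"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 5]},
    wanted=["spend_per_signup", "spend_zero_mean"],
)
```

Here `frame["spend_per_signup"]` is `[10.0, 10.0, 6.0]` and `frame["spend_zero_mean"]` is `[-10.0, 0.0, 10.0]`. The toy does not detect dependency cycles; a real implementation would need to.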

  • Dask

    Parallel computing with task scheduling

    This project reminds me a lot of Dask, a library that allows delayed calculation of complex dataframes in Python.

  • pynto

    Time series analysis in Python using the concatenative paradigm

    My pynto is a similar framework for creating dataframes, but it uses a concatenative paradigm that treats the frame as a stack of columns. Functions ("words") operate on the stack to set up the graph for each column, and execution happens afterwards in parallel. Instead of function modifiers like @does, it uses combinators to apply quoted operations to multiple columns. The postfix syntax (think PostScript or Factor) is unambiguous, if a bit old-school.
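The stack-of-columns idea in the comment above can be sketched in a few lines of plain Python. This is only a toy under stated assumptions, not pynto's API: the words `const`, `add`, `scale`, the combinator `each`, and the helpers `run` and `evaluate` are all hypothetical names, and the toy evaluates columns sequentially rather than in parallel.

```python
# Toy concatenative pipeline: NOT pynto's API, all names are hypothetical.
# The stack holds lazy columns (zero-argument callables); words only rewire
# the stack, and nothing is computed until evaluate() calls the thunks.

def const(values):
    """Word that pushes a column of fixed values."""
    def word(stack):
        return stack + [lambda: list(values)]
    return word

def add(stack):
    """Word that pops two columns and pushes their element-wise sum."""
    *rest, a, b = stack
    return rest + [lambda: [x + y for x, y in zip(a(), b())]]

def scale(factor):
    """Word that multiplies the top column by a constant."""
    def word(stack):
        *rest, top = stack
        return rest + [lambda: [factor * x for x in top()]]
    return word

def each(quotation):
    """Combinator: apply a quoted program to every column on the stack."""
    def word(stack):
        result = []
        for column in stack:
            result.extend(run(quotation, [column]))
        return result
    return word

def run(program, stack=None):
    """Build the column graph by applying words left to right (postfix)."""
    stack = list(stack or [])
    for word in program:
        stack = word(stack)
    return stack

def evaluate(stack):
    """Execute the deferred graph, one column at a time."""
    return [column() for column in stack]

# Postfix program: push three columns, add the top two, then double every column.
program = [const([1, 2]), const([3, 4]), const([5, 6]), add, each([scale(2)])]
columns = evaluate(run(program))
```

After evaluation, `columns` is `[[2, 4], [16, 20]]`. Because `run` only builds closures, the graph for each column exists before any values are computed, which is what would allow columns to be evaluated independently.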

  • plumbing

    Prismatic's Clojure(Script) utility belt

    This reminds me a bit of a Clojure library called Plumbing (formerly Graph). It also lets you create a DAG for structured computation. At the time, it was used for a web service.

  • tributary

    Streaming reactive and dataflow graphs in Python

    Having worked on "Dagger", you may be interested in

  • prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

    Hamilton is more similar to the Prosto data processing toolkit which also relies on column operations defined via Python functions:

    However, Prosto allows for data processing via column operations across many tables (implemented as pandas data frames) by providing column-oriented equivalents for joins and group-bys. Hence it has no joins and no group-bys, which are known to be quite difficult and to require considerable expertise.

    Prosto also provides Column-SQL, which might be simpler and more natural in many use cases.

    The whole approach is based on the concept-oriented model of data which makes functions first-class elements of the model as opposed to having only sets in the relational model.
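To make "column-oriented equivalents for joins and group-bys" more concrete, here is a minimal sketch in plain Python. It is not Prosto's API: `link_column` and `aggregate_column` are hypothetical helpers, and dicts of lists stand in for pandas data frames. The point is only that both operations add a column to an existing table instead of producing a new table.

```python
# Toy tables as dicts of columns; NOT Prosto's API, all names are hypothetical.
orders = {"product": ["tea", "coffee", "tea", "milk"], "amount": [3.0, 5.0, 2.0, 1.5]}
products = {"name": ["tea", "coffee", "milk"]}

def link_column(fact, fact_key, target, target_key):
    """Instead of a join: a column on `fact` holding row indexes into `target`."""
    index = {value: row for row, value in enumerate(target[target_key])}
    return [index[value] for value in fact[fact_key]]

def aggregate_column(fact, link, measure, target_size, func=sum):
    """Instead of a group-by: a column on the target computed from linked fact rows."""
    groups = [[] for _ in range(target_size)]
    for row, target_row in enumerate(fact[link]):
        groups[target_row].append(fact[measure][row])
    return [func(group) for group in groups]

# Both operations add a column to an existing table; no new table is produced.
orders["product_row"] = link_column(orders, "product", products, "name")
products["total_amount"] = aggregate_column(
    orders, "product_row", "amount", target_size=len(products["name"])
)
```

With this data, `orders["product_row"]` is `[0, 1, 0, 2]` and `products["total_amount"]` is `[5.0, 5.0, 1.5]`. The link column plays the role of a function from one table to another, which is roughly the sense in which functions become first-class elements of the model.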
